Terrell Russell, Executive Director

Kory Draughn, Chief Technologist

iRODS Consortium

Delay Server Migration

July 5-8, 2022

iRODS User Group Meeting 2022

Leuven, Belgium

iRODS Delay Queue

  • iRODS rules can be enqueued via 'delay()' to execute later.

  • Queued rules live in the iCAT database and are processed by the irodsDelayServer in priority order.

  • The irodsDelayServer sleeps most of the time, but spawns an irodsAgent every 30 seconds (by default) to check the Delay Queue for any delayed rules that need to be run.
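
The polling behavior described above can be illustrated with a small Python sketch. This is not iRODS code: the dictionary fields (`id`, `priority`, `run_at`) and both function names are hypothetical, and the sketch assumes that a larger priority number means the rule runs first.

```python
def due_rules(delay_queue, now):
    """Return the rules that are due at `now`, highest priority first."""
    ready = [rule for rule in delay_queue if rule["run_at"] <= now]
    # Assumption of this sketch: a larger number means a higher priority;
    # ties go to the rule with the earlier execution time.
    return sorted(ready, key=lambda rule: (-rule["priority"], rule["run_at"]))


def poll_once(delay_queue, now, execute):
    """One wake-up of the server: run every due rule, then dequeue it."""
    executed = []
    for rule in due_rules(delay_queue, now):
        execute(rule)
        delay_queue.remove(rule)
        executed.append(rule["id"])
    return executed


if __name__ == "__main__":
    delay_queue = [
        {"id": 1, "priority": 5, "run_at": 10.0},
        {"id": 2, "priority": 9, "run_at": 20.0},
        {"id": 3, "priority": 5, "run_at": 99.0},  # not due yet
    ]
    print(poll_once(delay_queue, now=30.0, execute=lambda rule: None))
```

A real server would call something like `poll_once` after each sleep interval; the sleep itself is left out so the sketch stays easy to test.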

iRODS Delay Server Architecture

4.2.5

  •  single producer, multiple consumer
  •  moved from processes to threads
  •  refactored to use in-memory set

 

4.2.8

  •  ported to use query processor

 

4.2.9

  •  rule context stored in iCAT

 

4.3.0

  •  cron-like facility (restarts)
  •  implicit remote() + list of executors

advanced_settings

  • delay_server_sleep_time_in_seconds
  • maximum_size_of_delay_queue_in_bytes
  • number_of_concurrent_delay_rule_executors
  • delay_rule_executors

upon complete / error

  • update / remove from iCAT
  • remove from in-memory set
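
As a rough illustration of the single-producer, multiple-consumer design with an in-memory set, here is a minimal Python sketch. The class name `DelayQueueSketch`, its methods, and the default of four executors are all hypothetical; the actual server is not written in Python and talks to the iCAT, which this sketch only mentions in comments.

```python
import queue
import threading


class DelayQueueSketch:
    """One producer feeds rule IDs to several executor threads."""

    def __init__(self, execute, executor_count=4):
        self._execute = execute
        self._in_flight = set()  # rule IDs that are queued or running
        self._lock = threading.Lock()
        self._work = queue.Queue()
        self._threads = [
            threading.Thread(target=self._consume) for _ in range(executor_count)
        ]

    def produce(self, rule_ids):
        """Producer: enqueue IDs not already in flight; return how many."""
        added = 0
        for rule_id in rule_ids:
            with self._lock:
                if rule_id in self._in_flight:
                    continue  # already queued or running, skip it
                self._in_flight.add(rule_id)
            self._work.put(rule_id)
            added += 1
        return added

    def _consume(self):
        while True:
            rule_id = self._work.get()
            if rule_id is None:  # sentinel: shut this executor down
                return
            try:
                self._execute(rule_id)
            except Exception:
                pass  # a real server would record the error in the catalog
            finally:
                # Upon complete / error: a real server would also update or
                # remove the rule in the iCAT at this point.
                with self._lock:
                    self._in_flight.discard(rule_id)

    def start(self):
        for thread in self._threads:
            thread.start()

    def stop(self):
        for _ in self._threads:
            self._work.put(None)
        for thread in self._threads:
            thread.join()

    def in_flight(self):
        with self._lock:
            return set(self._in_flight)
```

The set is what keeps the producer from handing the same rule to two executors when it sees that rule again on a later pass through the queue.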

iRODS Delay Server Migration - Design Goals

  • No irodsServer restarts required
    • large deployments are under continuous load
  • No double spends
    • only one irodsDelayServer per Zone at any time
  • Hands-free migration in case of disaster
    • if/when the irodsDelayServer isn't coming back
  • Visibility
    • easy to interrogate, debug

iRODS Delay Server Migration - Approach

  • Use the transactional database to store zone-wide information
    • single source of truth
  • Split roles of leader and successor
    • affords different code for servers in different roles
  • Identical algorithm running on all iRODS servers
    • each responsible for their own behavior

R_GRID_CONFIGURATION Table

namespace     option_name  option_value
delay_server  leader       <hostname>
delay_server  successor    <hostname>
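
To make the "single source of truth" idea concrete, the following Python sketch models the table above in an in-memory SQLite database. The column names and the two rows come from the slide; the column types, the function names, and the use of SQLite are assumptions for illustration only and say nothing about how the iCAT is actually accessed.

```python
import sqlite3


def open_catalog():
    """Create a stand-in catalog holding only the two delay_server rows."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE R_GRID_CONFIGURATION "
        "(namespace TEXT, option_name TEXT, option_value TEXT)"
    )
    db.executemany(
        "INSERT INTO R_GRID_CONFIGURATION VALUES (?, ?, ?)",
        [
            ("delay_server", "leader", "other.server.example.org"),
            ("delay_server", "successor", ""),
        ],
    )
    db.commit()
    return db


def get_delay_server_info(db):
    """Read leader and successor as a dict, similar in shape to the demo output."""
    rows = db.execute(
        "SELECT option_name, option_value FROM R_GRID_CONFIGURATION "
        "WHERE namespace = 'delay_server'"
    ).fetchall()
    return dict(rows)


def set_successor(db, hostname):
    """Record the requested successor inside a single transaction."""
    with db:  # commits on success, rolls back on error
        db.execute(
            "UPDATE R_GRID_CONFIGURATION SET option_value = ? "
            "WHERE namespace = 'delay_server' AND option_name = 'successor'",
            (hostname,),
        )
```

Because every server reads the same two rows, no server-to-server messaging is needed to agree on who the leader is.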

iRODS Delay Server - Demo

$ hostname
05f4be918c0f

$ iadmin get_delay_server_info
{
    "leader": "other.server.example.org",
    "successor": ""
}

$ iadmin set_delay_server $(hostname)

$ iadmin get_delay_server_info
{
    "leader": "other.server.example.org",
    "successor": "05f4be918c0f"
}

# later, after the old leader has exited and the successor has promoted itself
$ iadmin get_delay_server_info
{
    "leader": "05f4be918c0f",
    "successor": ""
}

iRODS Delay Server Algorithm

if self == leader
    if successor defined and not self
        gracefully finish and exit
    else
        if necessary, start irodsDelayServer
else if self == successor
    run health check on leader
    if leader is not running
        promote self to leader in iCAT
    else
        save health stats
else
    if necessary, gracefully finish and exit

R_GRID_CONFIGURATION Table

namespace     option_name  option_value
delay_server  leader       <hostname>
delay_server  successor    <hostname>

$ iadmin set_delay_server <hostname>
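
The algorithm above can be read as a pure decision function that each server evaluates for itself. The Python sketch below is one hedged interpretation: the function name, its parameters, and the returned action strings are hypothetical, and the health check is passed in as a callable so that only a successor ever invokes it.

```python
def migration_step(self_host, leader, successor, delay_server_running,
                   leader_is_running):
    """Return the action this server should take on one pass.

    `leader` and `successor` are the values read from the catalog;
    `leader_is_running` is a zero-argument callable used as the health check.
    """
    if self_host == leader:
        if successor and successor != self_host:
            # A different successor has been named: hand over.
            return "stop_delay_server"  # gracefully finish and exit
        # No handover pending: make sure the delay server is up.
        return "noop" if delay_server_running else "start_delay_server"

    if self_host == successor:
        if not leader_is_running():
            # Promote self to leader in the catalog. In the demo output the
            # successor field is empty again after promotion.
            return "promote_self"
        return "save_health_stats"

    # Neither leader nor successor: no delay server should run here.
    return "stop_delay_server" if delay_server_running else "noop"
```

On the next pass after a promotion, the newly promoted server takes the first branch and starts its own irodsDelayServer.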

iRODS Delay Server - Dark Alleys and Glory

Dark Alleys

  • database credentials / connection required
  • control plane as process / blocking ourselves
  • who is the parent process?

 

Glory

  • better PIDs
  • elegant solution

 

Future Work

  • remove database credentials requirement
  • detect and skip a redundant implicit remote()
  • advanced setting for sleep time between migration algorithm runs (0=never)

Questions?

Thank you!
