Terrell Russell, Executive Director
Kory Draughn, Chief Technologist
iRODS Consortium

Delay Server Migration
July 5-8, 2022
iRODS User Group Meeting 2022
Leuven, Belgium

iRODS Delay Queue
- iRODS rules can be enqueued via 'delay()' to execute later.
- Queued rules live in the iCAT database and are processed by the irodsDelayServer in priority order.
- The irodsDelayServer sleeps most of the time, but spawns an irodsAgent every 30 seconds (by default) to check the Delay Queue for any delayed rules that need to be run.
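
The wake-up-and-check cycle described above can be sketched in a few lines. This is only an illustration, not the irodsDelayServer implementation: the `DelayedRule` class, the `due_rules` helper, and the convention that a lower priority number sorts first are all assumptions made for the example, and an in-memory heap stands in for the iCAT-backed queue.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class DelayedRule:
    # Hypothetical record: ordering uses (priority, run_at); rule_text is ignored.
    priority: int
    run_at: float
    rule_text: str = field(compare=False)


def due_rules(queue, now):
    """Pop every rule whose run_at has passed, in priority order; keep the rest queued."""
    ready, not_yet = [], []
    while queue:
        rule = heapq.heappop(queue)
        (ready if rule.run_at <= now else not_yet).append(rule)
    for rule in not_yet:
        heapq.heappush(queue, rule)
    return ready


queue = []
heapq.heappush(queue, DelayedRule(5, 10.0, "rule_a"))
heapq.heappush(queue, DelayedRule(1, 20.0, "rule_b"))
heapq.heappush(queue, DelayedRule(3, 95.0, "rule_c"))

# One simulated wake-up at t=30 (the default sleep time mentioned above).
ready = due_rules(queue, now=30.0)
print([r.rule_text for r in ready])
```

Only the two rules whose scheduled time has passed are returned; the third stays queued for a later wake-up.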

iRODS Delay Server Architecture
4.2.5
- single producer, multiple consumer
- moved from processes to threads
- refactored to use in-memory set
4.2.8
- ported to use query processor
4.2.9
- rule context stored in iCAT
4.3.0
- cron-like facility (restarts)
- implicit remote() + list of executors
advanced_settings
* delay_server_sleep_time_in_seconds
* maximum_size_of_delay_queue_in_bytes
* number_of_concurrent_delay_rule_executors
* delay_rule_executors

upon completion / error
- update / remove from iCAT
- remove from in-memory set
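
A minimal sketch of the single-producer, multiple-consumer pattern with an in-memory set, assuming a plain dictionary as a stand-in for the iCAT rows and a thread pool as the executors. The names `catalog`, `in_flight`, `produce`, and `execute` are hypothetical, and for brevity this sketch removes the rule on both success and error, whereas the real server may update the entry instead.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

catalog = {101: "rule_a", 102: "rule_b", 103: "rule_c"}  # stand-in for iCAT rows
in_flight = set()   # ids already handed to an executor
completed = []      # results collected for inspection
lock = threading.Lock()


def execute(rule_id):
    """Consumer: run one rule, then clean up the catalog entry and the set."""
    with lock:
        rule_text = catalog[rule_id]
    try:
        result = rule_text.upper()  # placeholder for real rule execution
        with lock:
            completed.append(result)
    finally:
        # Runs on completion or error: drop the row and forget the id.
        with lock:
            catalog.pop(rule_id, None)
            in_flight.discard(rule_id)


def produce(pool):
    """Producer: submit each pending rule once, skipping ids already in flight."""
    futures = []
    with lock:
        for rule_id in list(catalog):
            if rule_id not in in_flight:
                in_flight.add(rule_id)
                futures.append(pool.submit(execute, rule_id))
    return futures


# max_workers plays the role of number_of_concurrent_delay_rule_executors.
with ThreadPoolExecutor(max_workers=4) as pool:
    for future in produce(pool):
        future.result()

print(sorted(completed))
```

The set is what keeps the producer from handing the same rule to two executors if it wakes up again before the first execution has finished.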

iRODS Delay Server Migration - Design Goals
- No irodsServer restarts required
  - large deployments are under continuous load
- No double spends
  - only one irodsDelayServer per Zone at any time
- Hands-free migration in case of disaster
  - if/when the irodsDelayServer isn't coming back
- Visibility
  - easy to interrogate, debug

iRODS Delay Server Migration - Approach
- Use the transactional database to store zone-wide information
  - single source of truth
- Split roles of leader and successor
  - affords different code for servers in different roles
- Identical algorithm running on all iRODS servers
  - each responsible for their own behavior
| namespace | option_name | option_value |
|---|---|---|
| delay_server | leader | <hostname> |
| delay_server | successor | <hostname> |
R_GRID_CONFIGURATION Table
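
To illustrate why a transactional table works as the single source of truth, here is a hedged sketch using an in-memory SQLite database as a stand-in for the iCAT. The table and column names follow the slide; the column types and the helper functions `get_delay_server_info`, `set_successor`, and `promote` are assumptions for the example, not the iRODS implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the iCAT database
conn.execute(
    "CREATE TABLE r_grid_configuration "
    "(namespace TEXT, option_name TEXT, option_value TEXT)"
)
conn.executemany(
    "INSERT INTO r_grid_configuration VALUES (?, ?, ?)",
    [
        ("delay_server", "leader", "other.server.example.org"),
        ("delay_server", "successor", ""),
    ],
)
conn.commit()

UPDATE = (
    "UPDATE r_grid_configuration SET option_value = ? "
    "WHERE namespace = 'delay_server' AND option_name = ?"
)


def get_delay_server_info(conn):
    """Return the leader and successor rows as a dict."""
    rows = conn.execute(
        "SELECT option_name, option_value FROM r_grid_configuration "
        "WHERE namespace = 'delay_server'"
    )
    return dict(rows)


def set_successor(conn, host):
    with conn:  # commit on success, roll back on error
        conn.execute(UPDATE, (host, "successor"))


def promote(conn, host):
    """Make host the leader and clear the successor, in one transaction."""
    with conn:
        cur = conn.execute(UPDATE + " AND option_value = ?", ("", "successor", host))
        if cur.rowcount != 1:
            return False  # host is not the recorded successor; change nothing
        conn.execute(UPDATE, (host, "leader"))
    return True


set_successor(conn, "05f4be918c0f")
promoted = promote(conn, "05f4be918c0f")
print(promoted, get_delay_server_info(conn))
```

Because both rows change inside one transaction, no other server can observe a state in which two hosts consider themselves leader.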

iRODS Delay Server - Demo
$ hostname
05f4be918c0f
$ iadmin get_delay_server_info
{
"leader": "other.server.example.org",
"successor": ""
}
$ iadmin set_delay_server $(hostname)
$ iadmin get_delay_server_info
{
"leader": "other.server.example.org",
"successor": "05f4be918c0f"
}
$ iadmin get_delay_server_info
{
"leader": "05f4be918c0f",
"successor": ""
}

iRODS Delay Server Algorithm
if self == leader
    if successor defined and not self
        gracefully finish and exit
    else
        if necessary, start irodsDelayServer
else if self == successor
    run health check on leader
    if leader is not running
        promote self to leader in iCAT
    else
        save health stats
else
    if necessary, gracefully finish and exit

| namespace | option_name | option_value |
|---|---|---|
| delay_server | leader | <hostname> |
| delay_server | successor | <hostname> |
R_GRID_CONFIGURATION Table
$ iadmin set_delay_server <hostname>
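
The decision logic above can be expressed as a small pure function, which makes each branch easy to test. This is a sketch under stated assumptions: the function name `next_action`, its parameters, and the returned action strings are all hypothetical labels for the steps in the pseudocode, not identifiers from the iRODS source.

```python
def next_action(self_host, leader, successor, leader_is_running, delay_server_running):
    """Return the step this server should take on one run of the migration algorithm."""
    if self_host == leader:
        if successor and successor != self_host:
            # Another host has been named successor: hand over.
            return "gracefully_finish_and_exit"
        return "noop" if delay_server_running else "start_delay_server"
    if self_host == successor:
        # Result of the health check on the leader is passed in as a flag.
        if not leader_is_running:
            return "promote_self_to_leader"
        return "save_health_stats"
    # Neither leader nor successor: no delay server should be running here.
    return "gracefully_finish_and_exit" if delay_server_running else "noop"


# The leader sees a different host named as successor and steps down.
print(next_action("a", "a", "b", True, True))
# The successor then finds the leader is no longer running and promotes itself.
print(next_action("b", "a", "b", False, False))
```

Every server evaluates the same function against the same two rows in the catalog, so each one only ever decides its own behavior.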

iRODS Delay Server - Dark Alleys and Glory
Dark Alleys
- database credentials / connection required
- control plane as process / blocking ourselves
- who is the parent process?
Glory
- better PIDs
- elegant solution
Future Work
- remove database credentials requirement
- detect and skip a redundant implicit remote()
- advanced setting for sleep time between migration algorithm runs (0=never)

Questions?
Thank you!