Terrell Russell, Executive Director
Kory Draughn, Chief Technologist
iRODS Consortium
Delay Server Migration
July 5-8, 2022
iRODS User Group Meeting 2022
Leuven, Belgium
iRODS Delay Queue
- iRODS rules can be enqueued via 'delay()' to execute later.
- Queued rules live in the iCAT database and are processed by the irodsDelayServer in priority order.
- The irodsDelayServer sleeps most of the time, but spawns an irodsAgent every 30 seconds (by default) to check the Delay Queue for any delayed rules that need to be run.
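The polling behavior described above can be sketched roughly as follows. This is a minimal illustration, not the irodsDelayServer implementation: the `DelayedRule` type, its field names, and the helper functions are hypothetical, and the assumption that a larger priority value runs first is ours, not something the slides state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DelayedRule:
    """Hypothetical stand-in for one row of the delay queue."""
    rule_id: int
    run_at: float   # earliest time (seconds) at which the rule may run
    priority: int   # ASSUMPTION: larger value means more urgent


def due_rules(queue, now):
    """Return the rules whose run time has passed, most urgent first."""
    due = [rule for rule in queue if rule.run_at <= now]
    return sorted(due, key=lambda rule: (-rule.priority, rule.run_at))


def poll_once(queue, now, execute):
    """One wake-up of the delay server: run everything that is due.

    A real server would sleep (30 seconds by default) between calls.
    """
    ran = []
    for rule in due_rules(queue, now):
        execute(rule)
        ran.append(rule.rule_id)
    return ran
```

Rules that are not yet due are simply left in the queue until a later wake-up.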
iRODS Delay Server Architecture
4.2.5
- single producer, multiple consumer
- moved from processes to threads
- refactored to use in-memory set
4.2.8
- ported to use query processor
4.2.9
- rule context stored in iCAT
4.3.0
- cron-like facility (restarts)
- implicit remote() + list of executors
advanced_settings
* delay_server_sleep_time_in_seconds
* maximum_size_of_delay_queue_in_bytes
* number_of_concurrent_delay_rule_executors
* delay_rule_executors
upon completion / error
- update / remove from iCAT
- remove from in-memory set
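The single producer, multiple consumer design with an in-memory set can be sketched as below. This is an illustration under stated assumptions, not the server's actual code: the class name `DelayQueueDispatcher` and its methods are hypothetical, and a Python thread pool stands in for the configured number of concurrent executors.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class DelayQueueDispatcher:
    """Hypothetical producer side: hands rule IDs to a pool of consumers.

    The in-memory set records which rules are currently being executed,
    so a rule seen again on the next poll is not dispatched twice.
    """

    def __init__(self, executors=2):
        self._in_flight = set()
        self._lock = threading.Lock()
        self._pool = ThreadPoolExecutor(max_workers=executors)

    def submit(self, rule_id, run_rule):
        """Dispatch a rule unless it is already running. Returns a Future or None."""
        with self._lock:
            if rule_id in self._in_flight:
                return None
            self._in_flight.add(rule_id)
        return self._pool.submit(self._run, rule_id, run_rule)

    def _run(self, rule_id, run_rule):
        try:
            return run_rule(rule_id)
        finally:
            # Upon completion or error, drop the rule from the in-memory set.
            with self._lock:
                self._in_flight.discard(rule_id)

    def in_flight(self):
        with self._lock:
            return set(self._in_flight)

    def shutdown(self):
        self._pool.shutdown(wait=True)
```

The update or removal of the rule in the iCAT is left out here; only the in-memory bookkeeping is shown.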
iRODS Delay Server Migration - Design Goals
- No irodsServer restarts required
- large deployments are under continuous load
- No double spends
- only one irodsDelayServer per Zone at any time
- Hands-free migration in case of disaster
- if/when the irodsDelayServer isn't coming back
- Visibility
- easy to interrogate, debug
iRODS Delay Server Migration - Approach
- Use the transactional database to store zone-wide information
- single source of truth
- Split roles of leader and successor
- affords different code for servers in different roles
- Identical algorithm running on all iRODS servers
- each responsible for their own behavior
namespace | option_name | option_value |
---|---|---|
delay_server | leader | <hostname> |
delay_server | successor | <hostname> |
R_GRID_CONFIGURATION Table
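To illustrate why a transactional database works as the single source of truth, here is a small sketch that uses an in-memory SQLite database as a stand-in for the iCAT. The table and column names follow the table above; the helper functions (`read_roles`, `set_successor`, `promote_successor`) and the SQL are hypothetical and are not how iRODS itself performs these updates.

```python
import sqlite3

WHERE_ROLE = "namespace = 'delay_server' AND option_name = ?"


def open_catalog():
    """Create a stand-in catalog holding the two delay_server rows."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE R_GRID_CONFIGURATION "
        "(namespace TEXT, option_name TEXT, option_value TEXT)"
    )
    db.executemany(
        "INSERT INTO R_GRID_CONFIGURATION VALUES ('delay_server', ?, ?)",
        [("leader", "other.server.example.org"), ("successor", "")],
    )
    db.commit()
    return db


def read_roles(db):
    """Return {'leader': ..., 'successor': ...} as currently stored."""
    rows = db.execute(
        "SELECT option_name, option_value FROM R_GRID_CONFIGURATION "
        "WHERE namespace = 'delay_server'"
    ).fetchall()
    return dict(rows)


def set_successor(db, hostname):
    with db:  # commits on success, rolls back on error
        db.execute(
            f"UPDATE R_GRID_CONFIGURATION SET option_value = ? WHERE {WHERE_ROLE}",
            (hostname, "successor"),
        )


def promote_successor(db, hostname):
    """Make hostname the leader, but only if it is the recorded successor.

    Both rows change in one transaction, so no reader ever sees two leaders.
    """
    with db:
        cur = db.execute(
            "UPDATE R_GRID_CONFIGURATION SET option_value = ? "
            f"WHERE {WHERE_ROLE} AND EXISTS ("
            "  SELECT 1 FROM R_GRID_CONFIGURATION "
            "  WHERE namespace = 'delay_server' "
            "    AND option_name = 'successor' AND option_value = ?)",
            (hostname, "leader", hostname),
        )
        if cur.rowcount != 1:
            return False
        db.execute(
            f"UPDATE R_GRID_CONFIGURATION SET option_value = '' WHERE {WHERE_ROLE}",
            ("successor",),
        )
        return True
```

Because the promotion is conditional on the successor row, a server that is not the recorded successor cannot take over the leader role.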
iRODS Delay Server - Demo
$ hostname
05f4be918c0f
$ iadmin get_delay_server_info
{
"leader": "other.server.example.org",
"successor": ""
}
$ iadmin set_delay_server $(hostname)
$ iadmin get_delay_server_info
{
"leader": "other.server.example.org",
"successor": "05f4be918c0f"
}
$ iadmin get_delay_server_info
{
"leader": "05f4be918c0f",
"successor": ""
}
iRODS Delay Server Algorithm
if self == leader
    if successor defined and not self
        gracefully finish and exit
    else
        if necessary, start irodsDelayServer
else if self == successor
    run health check on leader
    if leader is not running
        promote self to leader in iCAT
    else
        save health stats
else
    if necessary, gracefully finish and exit
$ iadmin set_delay_server <hostname>
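The migration algorithm can be expressed as a pure decision function, which makes the per-server behavior easy to test. The sketch below is one possible reading of the pseudocode, not the shipped implementation: the `Action` names and the `next_action` signature are hypothetical, and the health check result is passed in as a boolean rather than performed here.

```python
from enum import Enum, auto


class Action(Enum):
    """Hypothetical names for the outcomes of one algorithm run."""
    ENSURE_DELAY_SERVER_RUNNING = auto()   # "if necessary, start irodsDelayServer"
    ENSURE_DELAY_SERVER_STOPPED = auto()   # "gracefully finish and exit"
    PROMOTE_SELF_TO_LEADER = auto()        # "promote self to leader in iCAT"
    SAVE_HEALTH_STATS = auto()


def next_action(self_host, leader, successor, leader_is_running):
    """Decide what this server should do, given the two catalog rows.

    An empty successor string means no successor is defined.
    """
    if self_host == leader:
        if successor and successor != self_host:
            # A different server has been named successor: step down.
            return Action.ENSURE_DELAY_SERVER_STOPPED
        return Action.ENSURE_DELAY_SERVER_RUNNING
    if self_host == successor:
        if not leader_is_running:
            return Action.PROMOTE_SELF_TO_LEADER
        return Action.SAVE_HEALTH_STATS
    # Neither leader nor successor: no delay server should run here.
    return Action.ENSURE_DELAY_SERVER_STOPPED
```

Every server evaluates the same function against the same catalog rows, so each one can decide its own behavior without coordinating directly with the others.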
iRODS Delay Server - Dark Alleys and Glory
Dark Alleys
- database credentials / connection required
- control plane as process / blocking ourselves
- who is the parent process?
Glory
- better PIDs
- elegant solution
Future Work
- remove database credentials requirement
- detect and skip a redundant implicit remote()
- advanced setting for sleep time between migration algorithm runs (0=never)
Questions?
Thank you!