Advanced Training:
Data to Compute
May 28-31, 2024
iRODS User Group Meeting 2024
Amsterdam, Netherlands
Alan King, Senior Software Developer
Martin Flores, Software Developer
iRODS Consortium
Integrating iRODS with a compute environment
In order of increasing complexity and integration...
iRODS as a compute orchestrator
- Launch a job via irule, or as part of a PEP
- Implement a Landing Zone for product capture
iRODS as part of a compute job script (TODAY)
- Stage the source data via replication for the application
- Capture the products and ingest them into iRODS
iRODS as part of the compute application
- Compute application directly leverages the iRODS API to open, read, and write data
The Data to Compute Use Case
Focus on the right side of the picture
iRODS is out of the data path for HPC computation
Goal - Develop generic interface concept for compute
- Develop a metadata-driven interface to derive paths for input data and compute results. Use it to:
  - push data to the proper storage resource
  - get the name of the host on which to launch compute job(s)
- Separate configuration from implementation
  - Keep deployment specifics in configuration files
  - Keep rulebase, scripts, and modules free of hard-wired values
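The role-based lookup described above can be sketched outside of iRODS. The following is a minimal illustration, not the training package's implementation: the in-memory RESOURCES list stands in for the catalog, the host names are hypothetical, and find_by_role is a name invented here. The attribute and role values are the ones used in this training.

```python
from typing import Optional, Tuple

# Hypothetical stand-in for the iRODS catalog: each resource has a host and
# a dictionary of metadata attribute -> value. Host names are made up.
RESOURCES = [
    {"name": "lts_resc", "host": "storage.example.org",
     "meta": {"COMPUTE_RESOURCE_ROLE": "LONG_TERM_STORAGE"}},
    {"name": "img_resc", "host": "compute.example.org",
     "meta": {"COMPUTE_RESOURCE_ROLE": "IMAGE_PROCESSING"}},
]


def find_by_role(resources, attribute, value) -> Optional[Tuple[str, str]]:
    """Return (host, resource name) of the first resource tagged with the role."""
    for resc in resources:
        if resc["meta"].get(attribute) == value:
            return resc["host"], resc["name"]
    return None  # no resource carries the requested role


if __name__ == "__main__":
    print(find_by_role(RESOURCES, "COMPUTE_RESOURCE_ROLE", "IMAGE_PROCESSING"))
```

Because the caller only names a role, the resource and host can change in the deployment without touching the calling code.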
Goal - Develop a thumbnailing service for iRODS
Interface is through iRODS and SLURM (compute job scheduler):
- Replicate the data to the compute resource
- Send a job to the compute scheduler to generate thumbnails
- Register the thumbnails into the catalog
- Replicate the thumbnails back to long term storage
- Trim replicas on compute resource
Components of the System
System Component       Implementation
Job Scheduler          SLURM
Job Launching Script   bash
Tools to Execute       ImageMagick convert
Job Endpoint           iRODS Rule Base (user extension of the iRODS API) and SLURM prolog / epilog
Getting Started
As the ubuntu user, install ImageMagick, pip, and the Python Rule Engine Plugin, then switch to the irods user:
sudo apt-get update
sudo apt-get -y install python3-pip irods-rule-engine-plugin-python imagemagick
sudo su - irods
Install the PRC (Python iRODS Client) module for the irods user:
python3 -m pip install --upgrade pip
python3 -m pip install python-irodsclient
Getting Started
As the ubuntu user, get the irods_training repository:
cd
git clone https://github.com/irods/irods_training
sudo apt-get -y install irods-externals-cmake3.21.4-0 irods-dev
export PATH=/opt/irods-externals/cmake3.21.4-0/bin:$PATH
Build and install the Data-to-Compute package:
cd
mkdir build_data_to_compute
cd build_data_to_compute
cmake ../irods_training/advanced/hpc_data_to_compute/
make package
sudo dpkg -i irods-hpc-data-to-compute-example.deb
Install, configure, and start SLURM (job scheduler):
cd ~/irods_training/advanced/hpc_data_to_compute/
./ubuntu_22/install_and_configure_slurm.sh
sudo systemctl restart slurmd slurmctld
Package Contents
$ dpkg -c irods-hpc-data-to-compute-example.deb
drwxrwxr-x root/root 0 2024-05-12 00:45 ./etc/
drwxrwxr-x root/root 0 2024-05-12 00:45 ./etc/irods/
-r--r--r-- root/root 1213 2024-05-12 00:40 ./etc/irods/core.py.data_to_compute
-r--r--r-- root/root 1144 2024-05-12 00:40 ./etc/irods/data_to_compute.re
drwxrwxr-x root/root 0 2024-05-12 00:45 ./var/
drwxrwxr-x root/root 0 2024-05-12 00:45 ./var/lib/
drwxrwxr-x root/root 0 2024-05-12 00:45 ./var/lib/irods/
drwxrwxr-x root/root 0 2024-05-12 00:45 ./var/lib/irods/compute/
-r--r--r-- root/root 0 2024-05-12 00:40 ./var/lib/irods/compute/__init__.py
-rw------- root/root 59 2024-05-12 00:40 ./var/lib/irods/compute/admin_as_rodsuser.json
-r--r--r-- root/root 11773 2024-05-12 00:40 ./var/lib/irods/compute/common.py
-r--r--r-- root/root 2005 2024-05-12 00:40 ./var/lib/irods/compute/irods_compute_functions
-r--r--r-- root/root 565 2024-05-12 00:40 ./var/lib/irods/compute/job_params.json
-r--r--r-- root/root 2150 2024-05-12 00:40 ./var/lib/irods/compute/util.py
-r--r--r-- root/root 1302 2024-05-12 00:40 ./var/lib/irods/detect_thumbnails.py
drwxrwxr-x root/root 0 2024-05-12 00:45 ./var/lib/irods/msiExecCmd_bin/
-r-xr-xr-x root/root 383 2024-05-12 00:40 ./var/lib/irods/msiExecCmd_bin/convert.SLURM
-r-xr-xr-x root/root 328 2024-05-12 00:40 ./var/lib/irods/msiExecCmd_bin/submit_thumbnail_job.sh
-r--r--r-- root/root 788 2024-05-12 00:40 ./var/lib/irods/rescName_from_kvpair.r
-r--r--r-- root/root 1494 2024-05-12 00:40 ./var/lib/irods/spawn_remote_slurm_jobs.r
Configure the rule engine
As the irods user, add an additional rule base to
/etc/irods/server_config.json :
"rule_engines": [
...
"re_rulebase_set": [
"data_to_compute",
"core"
],
...
]
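The same edit can be sketched programmatically. This is a minimal illustration that works on an in-memory dictionary rather than the real file. It assumes the common server_config.json layout in which rule engines live under "plugin_configuration" and each one keeps its rulebases under "plugin_specific_configuration"; check your own file before relying on it. The function name add_rulebase and the SAMPLE_CONFIG dictionary are hypothetical.

```python
import json

# Hypothetical, heavily trimmed stand-in for /etc/irods/server_config.json.
# The nesting shown here is an assumption about the file layout.
SAMPLE_CONFIG = {
    "plugin_configuration": {
        "rule_engines": [
            {
                "instance_name": "irods_rule_engine_plugin-irods_rule_language-instance",
                "plugin_name": "irods_rule_engine_plugin-irods_rule_language",
                "plugin_specific_configuration": {"re_rulebase_set": ["core"]},
            }
        ]
    }
}


def add_rulebase(server_config, rulebase,
                 plugin_name="irods_rule_engine_plugin-irods_rule_language"):
    """List `rulebase` first in re_rulebase_set of the matching rule engine.

    Returns True if a matching rule engine was found, False otherwise.
    """
    for engine in server_config["plugin_configuration"]["rule_engines"]:
        if engine.get("plugin_name") == plugin_name:
            rulebases = engine["plugin_specific_configuration"]["re_rulebase_set"]
            if rulebase not in rulebases:
                rulebases.insert(0, rulebase)  # listed ahead of "core"
            return True
    return False


if __name__ == "__main__":
    add_rulebase(SAMPLE_CONFIG, "data_to_compute")
    print(json.dumps(SAMPLE_CONFIG, indent=2))
```

The new rulebase is inserted ahead of "core" to match the ordering shown in the fragment above.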
Add in a small python rulebase:
As the irods user,
create an /etc/irods/core.py with the following content:
def pyParseRoleSpec (rule_args,callback,rei):
    compute_resc_spec = rule_args[0]
    kvp = compute_resc_spec.split('=')+['']
    rule_args[1:3] = list(map(lambda x: x.strip(), kvp[:2]))
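To see what this rule does with its arguments, it can be exercised outside the server. The sketch below repeats the function and calls it with a plain Python list; passing None for callback and rei is only for illustration, since the function does not use them.

```python
def pyParseRoleSpec(rule_args, callback, rei):
    # rule_args[0] holds either "KEY=VALUE" or just "KEY".
    compute_resc_spec = rule_args[0]
    # Appending '' guarantees at least two elements after the split.
    kvp = compute_resc_spec.split('=') + ['']
    # Write the stripped key and value back into arguments 1 and 2.
    rule_args[1:3] = list(map(lambda x: x.strip(), kvp[:2]))


if __name__ == "__main__":
    for spec in ("COMPUTE_RESOURCE_ROLE=IMAGE_PROCESSING",
                 "COMPUTE_RESOURCE_ROLE"):
        args = [spec, "", ""]
        pyParseRoleSpec(args, None, None)
        print(args)
```

A combined "KEY=VALUE" string yields the key and the value, while a bare key yields an empty value, which is what lets a caller pass the role either way.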
Python Rule Engine Configuration
Insert the python plugin configuration stanza into /etc/irods/server_config.json:
"rule_engines": [
{ "instance_name": "irods_rule_engine_plugin-irods_rule_language-instance",
...
},
{
"instance_name" : "irods_rule_engine_plugin-python-instance",
"plugin_name" : "irods_rule_engine_plugin-python",
"plugin_specific_configuration" : {}
},
Configure the LTS and Image Processing Resources
As the irods user:
Make two unix file system resources
iadmin mkresc lts_resc unixfilesystem $(hostname):/tmp/irods/lts_resc
iadmin mkresc img_resc unixfilesystem $(hostname):/tmp/irods/img_resc
Annotate them with appropriate metadata given their roles
- defined in the configuration as part of the contract
imeta add -R lts_resc COMPUTE_RESOURCE_ROLE LONG_TERM_STORAGE
imeta add -R img_resc COMPUTE_RESOURCE_ROLE IMAGE_PROCESSING
As the ubuntu user:
Stage data and destination directory for thumbnail creation
cp ~/irods_training/stickers.jpg /tmp
sudo mkdir -p /tmp/irods/thumbnails
sudo chown -R irods:irods /tmp/irods
The configuration interface
Define interfaces for any necessary conventions
- Metadata attributes and values
- Metadata values for implemented roles
- Interface to job scheduler for launching compute
Single Point of Truth - allows for the use of the same 'end-points' for various metadata standards and naming conventions
Users may utilize metadata conventions to provide inputs to a given compute job
The configuration interface
For the thumbnail service we will need to
- Get the metadata attribute string that holds the role
- Get the tag for an Image Compute resource
- Get the tag for a Long Term Storage resource
- Get the logical collection name for thumbnails
- Get the physical path for a thumbnail
- Get the name of a thumbnail
- Get a list of desired thumbnail sizes
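One way to picture such a configuration interface is a small module that keeps every convention in a single place. This is a sketch under stated assumptions, not the contents of the training package: the CONFIG dictionary and all function names are hypothetical, while the values (role attribute, role tags, thumbnail directory, sizes, and naming pattern) mirror the ones used in this training.

```python
import posixpath

# Hypothetical configuration; the key names are invented for illustration.
CONFIG = {
    "role_attribute": "COMPUTE_RESOURCE_ROLE",
    "image_compute_tag": "IMAGE_PROCESSING",
    "long_term_storage_tag": "LONG_TERM_STORAGE",
    "thumbnail_directory": "/tmp/irods/thumbnails",
    "thumbnail_sizes": ["128x128", "256x256", "512x512", "1024x1024"],
}


def thumbnail_collection(data_object_path):
    """Logical collection for thumbnails, next to the source data object."""
    coll, name = posixpath.split(data_object_path)
    stem, _ext = posixpath.splitext(name)
    return posixpath.join(coll, stem + "_thumbnails")


def thumbnail_name(file_name, size):
    """File name of one thumbnail, e.g. <stem>_thumbnail_<size><ext>."""
    stem, ext = posixpath.splitext(file_name)
    return f"{stem}_thumbnail_{size}{ext}"


def thumbnail_physical_path(file_name, size):
    """Physical path at which the compute job writes the thumbnail."""
    return posixpath.join(CONFIG["thumbnail_directory"],
                          thumbnail_name(file_name, size))


if __name__ == "__main__":
    print(thumbnail_collection("/tempZone/home/rods/stickers.jpg"))
    for size in CONFIG["thumbnail_sizes"]:
        print(thumbnail_physical_path("stickers.jpg", size))
```

Rules and scripts that call these helpers never spell out a path or tag themselves, so a different naming convention only requires a change to the configuration.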
The configuration interface
iRODS rule file provides interface for job submission
testRule {
    *thumbnail_sizes = "128x128,256x256,512x512,1024x1024"
    *host = ""
    *key = "COMPUTE_RESOURCE_ROLE"; *val="IMAGE_PROCESSING"
    # or even:
    # *key = "COMPUTE_RESOURCE_ROLE=IMAGE_PROCESSING"; *val=""
    *resc_name = ""
    get_host_and_resource_name_by_role(*host, *resc_name, *key, *val)
    writeLine ("stdout", "host=[*host] resc=[*resc_name]")
    writeLine ("stdout", "thumbnails to generate : [ *thumbnail_sizes ]")
    *input_file_name = "stickers"
    *input_file_ext = ".jpg"
    *input_file = "*input_file_name" ++ "*input_file_ext"
    if ("*host" == "" ) {
        writeLine ("stdout", "Host for job launch was not found.")
    }
    else {
        foreach (*x in select DATA_PATH where COLL_NAME = '/tempZone/home/rods' and DATA_NAME = '*input_file' and RESC_NAME = '*resc_name') {
            *src_phy_path = *x.DATA_PATH
        }
        remote(*host, "") {
            writeLine("serverLog", "-----> remoteExec on host [*host]:")
            foreach (*size in split (*thumbnail_sizes, ",")) {
                *dst_phy_path = "/tmp/irods/thumbnails/" ++ "*input_file_name" ++ "_thumbnail_" ++ "*size" ++ "*input_file_ext"
                writeLine("serverLog"," - thumbsize [ *size ]; convert ( *src_phy_path , *dst_phy_path )")
                *cmd_opts="/var/lib/irods/msiExecCmd_bin/convert.SLURM -thumbnail *size *src_phy_path *dst_phy_path"
                msiExecCmd("submit_thumbnail_job.sh","*cmd_opts","null","null","null",*OUT)
            } #foreach
        } #remote
    } #if-else
} # end rule
INPUT null
OUTPUT ruleExecOut
The configuration interface
Abstraction of job submission via shell script
#!/bin/bash
# $1 - executable
# $2 - thumbnail option
# $3 - sizing string
# $4 - source physical path
# $5 - destination physical path
SBATCH_OPTIONS="-o /tmp/slurm-%j.out"
SCRIPT="$1" # assume full path to executable
/usr/local/bin/sbatch $SBATCH_OPTIONS "$SCRIPT" \
${2+"$2"} \
${3+"$3"} \
${4+"$4"} \
${5+"$5"} \
>/dev/null 2>&1
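The argument forwarding in this script can also be sketched in Python. This is an illustration only, not part of the package: build_sbatch_command is a hypothetical helper that assembles the same command line without running it, and the source physical path in the example is a made-up value.

```python
import shlex

SBATCH = "/usr/local/bin/sbatch"
SBATCH_OPTIONS = ["-o", "/tmp/slurm-%j.out"]


def build_sbatch_command(script, *args):
    """Assemble the sbatch command line as a list of arguments.

    Like the shell script, at most four arguments after the script path
    are forwarded, and arguments that were not supplied are simply absent.
    """
    return [SBATCH, *SBATCH_OPTIONS, script, *args[:4]]


if __name__ == "__main__":
    cmd = build_sbatch_command(
        "/var/lib/irods/msiExecCmd_bin/convert.SLURM",
        "-thumbnail", "128x128",
        "/tmp/irods/img_resc/home/rods/stickers.jpg",  # illustrative path
        "/tmp/irods/thumbnails/stickers_thumbnail_128x128.jpg",
    )
    print(shlex.join(cmd))
```

Keeping the scheduler invocation behind one small function (or one shell script) means a site that uses a different scheduler only has to replace this piece.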
Thumbnail Service - testing
As the irods user, position the input data and start the thumbnail jobs:
$ COMPUTE_RESC=$(iquest %s " select RESC_NAME where META_RESC_ATTR_NAME = \
  'COMPUTE_RESOURCE_ROLE' and META_RESC_ATTR_VALUE = 'IMAGE_PROCESSING' " )
$ iput -R ${COMPUTE_RESC} /tmp/stickers.jpg
$ ils -l
/tempZone/home/rods:
  rods 0 img_resc 2157087 2024-05-05.22:39 & stickers.jpg
$ imkdir stickers_thumbnails
$ irule -r irods_rule_engine_plugin-irods_rule_language-instance -F spawn_remote_slurm_jobs.r
$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
4 debug convert. irods PD 0:00 1 (Resources)
5 debug convert. irods PD 0:00 1 (Priority)
2 debug convert. irods R 0:01 1 ip-172-31-44-38
3 debug convert. irods R 0:01 1 ip-172-31-44-38
(Wait for the SLURM job queue to be empty:)
$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
$ iquest "%s : %s/%s" "select RESC_NAME,COLL_NAME,DATA_NAME where DATA_NAME like '%thumbnail%'"
lts_resc : /tempZone/home/rods/stickers_thumbnails/stickers_thumbnail_1024x1024.jpg
lts_resc : /tempZone/home/rods/stickers_thumbnails/stickers_thumbnail_128x128.jpg
lts_resc : /tempZone/home/rods/stickers_thumbnails/stickers_thumbnail_256x256.jpg
lts_resc : /tempZone/home/rods/stickers_thumbnails/stickers_thumbnail_512x512.jpg
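The "wait for the SLURM job queue to be empty" step can be automated. The sketch below is an assumption-laden illustration rather than part of the training package: queue_is_empty and wait_for_empty_queue are hypothetical names, and the check simply treats squeue output with nothing but the header line as an empty queue.

```python
import subprocess
import time


def queue_is_empty(squeue_output):
    """True when squeue printed only its header line (or nothing at all)."""
    lines = [line for line in squeue_output.splitlines() if line.strip()]
    return len(lines) <= 1


def wait_for_empty_queue(poll_seconds=2.0):
    """Poll squeue until no jobs are listed. Requires squeue on the PATH."""
    while True:
        result = subprocess.run(["squeue"], capture_output=True,
                                text=True, check=True)
        if queue_is_empty(result.stdout):
            return
        time.sleep(poll_seconds)


if __name__ == "__main__":
    header = "JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\n"
    busy = header + "2 debug convert. irods R 0:01 1 ip-172-31-44-38\n"
    print(queue_is_empty(header), queue_is_empty(busy))
```

Once the queue is empty, the iquest query for thumbnail data objects can be run to confirm the products landed on the long term storage resource.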
Extending iRODS with the Rule Engine
All rules should be created and tested in user space before being installed as a rule base
Rules may be refactored into a microservice plugin
Rules may be refactored into a C++ rule engine plugin
Rules may be refactored into an API plugin
Questions?