Taking Data to Compute
March 26-28, 2018
RENCI iRODS Boot Camp
Chapel Hill, NC
Jason M. Coposky
@jason_coposky
Executive Director, iRODS Consortium
Taking Data to Compute
Integrating iRODS with a compute environment
In order of increasing complexity...
iRODS as a compute orchestrator
- Launch a job via irule, or as part of a PEP
- Implement a Landing Zone for product capture
iRODS as part of a compute job script
- Stage the source data via replication for the application
- Capture the products and ingest them into iRODS
iRODS as part of the compute application
- Compute application directly leverages the iRODS API to open, read, and write data
The Data to Compute Use Case
Focus on the right side of the picture
iRODS is out of the data path for computation
Goals - Develop generic interface concept for compute
- Develop a metadata-driven interface for labeling resources which provide computational capabilities
- Ultimately relies upon convention
- Separate configuration from implementation
- Isolate deployment-specific concepts
- Consider a rule base as an extension of iRODS
- Rules are not just data management policy
Goals - Develop a thumbnailing service for iRODS
Implemented as an iRODS rule base:
- Select an appropriate resource to replicate the data for compute
- Replicate the data to the compute resource
- Send a job to the compute scheduler to generate thumbnails
- Register the thumbnails into the catalog
- Replicate the thumbnails back to long term storage
- Trim replicas on compute resource
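The six steps above can be sketched as a dry run that only prints one command per step. This is an illustration under assumptions, not the rule base itself: the iCommands stand in for the microservices the rule base calls, the resource names img_resc and lts_resc and the role metadata are the ones configured later in this deck, and plan_thumbnail_pipeline and its arguments are hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: prints the command for each pipeline step, runs nothing.
# ASSUMPTION: iCommands are used here as stand-ins for the rule base's microservices.

plan_thumbnail_pipeline() {
    obj="$1"        # logical path of the source data object
    src_phy="$2"    # physical path of its replica on the compute resource
    size="$3"       # sizing string, e.g. 128x128
    thumb_obj="$4"  # logical path for the thumbnail
    thumb_phy="$5"  # physical path where the job writes the thumbnail

    # 1. select the resource labeled for image processing
    echo "iquest \"%s\" \"SELECT RESC_NAME WHERE META_RESC_ATTR_NAME = 'COMPUTE_RESOURCE_ROLE' AND META_RESC_ATTR_VALUE = 'IMAGE_PROCESSING'\""
    # 2. replicate the data to the compute resource
    echo "irepl -R img_resc $obj"
    # 3. send a job to the scheduler to generate the thumbnail
    echo "sbatch -o /tmp/slurm-%j.out /var/lib/irods/msiExecCmd_bin/convert.SLURM -thumbnail $size $src_phy $thumb_phy"
    # 4. register the thumbnail into the catalog
    echo "ireg -R img_resc $thumb_phy $thumb_obj"
    # 5. replicate the thumbnail back to long term storage
    echo "irepl -R lts_resc $thumb_obj"
    # 6. trim the replica on the compute resource
    echo "itrim -N 1 -S img_resc $thumb_obj"
}

# The source physical path below is an assumed vault location.
plan_thumbnail_pipeline \
    /tempZone/home/rods/stickers.jpg \
    /tmp/irods/img_resc/home/rods/stickers.jpg \
    128x128 \
    /tempZone/home/rods/stickers_thumbnails/stickers_thumbnail_128x128.jpg \
    /tmp/irods/thumbnails/stickers_thumbnail_128x128.jpg
```

In the rule base each of these steps is a function call, so the same sequence runs server side without a client shell.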
Components of the System
System Component        Implementation
Job Scheduler           User's Choice
Job Launching Script    bash
Tools to Execute        ImageMagick convert
Job Endpoint            iRODS Rule Base (user extension of the iRODS API)
Getting Started
Installing ImageMagick
sudo apt-get update
sudo apt-get -y install imagemagick
Getting Started
Get the irods_training repository
git clone https://github.com/irods/irods_training
sudo apt-get -y install irods-externals-* irods-dev
export PATH=/opt/irods-externals/cmake3.5.2-0/bin/:$PATH
Build and Install MUNGE and SLURM (job scheduler)
cd irods_training/advanced/hpc_data_to_compute/
./ubuntu14_install_munge_and_slurm.sh
Build and Install the Data to Compute package
cd
mkdir build_data_to_compute
cd build_data_to_compute
cmake ../irods_training/advanced/hpc_data_to_compute/
make package
sudo dpkg -i ./irods-hpc-data-to-compute-example_4.2.2~trusty_amd64.deb
Package Contents
$ dpkg -c ./irods-hpc-data-to-compute-example_4.2.2~trusty_amd64.deb
drwxrwxr-x root/root 0 2018-03-22 02:37 ./etc/
drwxrwxr-x root/root 0 2018-03-22 02:37 ./etc/irods/
-r--r--r-- root/root 9694 2018-03-22 02:26 ./etc/irods/thumbnail.re
-r--r--r-- root/root 1686 2018-03-22 02:26 ./etc/irods/thumbnail_configuration.re
drwxrwxr-x root/root 0 2018-03-22 02:37 ./var/
drwxrwxr-x root/root 0 2018-03-22 02:37 ./var/lib/
drwxrwxr-x root/root 0 2018-03-22 02:37 ./var/lib/irods/
-r--r--r-- root/root 310 2018-03-22 02:26 ./var/lib/irods/create_thumbnails.r
-r--r--r-- root/root 360 2018-03-22 02:26 ./var/lib/irods/find_thumbnails.r
drwxrwxr-x root/root 0 2018-03-22 02:37 ./var/lib/irods/msiExecCmd_bin/
-r-xr-xr-x root/root 199 2018-03-22 02:26 ./var/lib/irods/msiExecCmd_bin/convert.SLURM
-r-xr-xr-x root/root 343 2018-03-22 02:26 ./var/lib/irods/msiExecCmd_bin/submit_thumbnail_job.sh
-r--r--r-- root/root 74 2018-03-22 02:26 ./var/lib/irods/msiExecCmd_bin/thumbnail.submit
Configure the rule engine
Add the two additional rule bases to /etc/irods/server_config.json
"rule_engines": [
...
"re_rulebase_set": [
"thumbnail_configuration",
"thumbnail",
"core"
],
...
]
Remember that order matters: the rule bases are loaded in the order listed, so the two new entries go before core
Configure the LTS and Image Processing Resources
As the irods user:
Make two unix file system resources
iadmin mkresc lts_resc unixfilesystem `hostname`:/tmp/irods/lts_resc
iadmin mkresc img_resc unixfilesystem `hostname`:/tmp/irods/img_resc
Annotate them with appropriate metadata given their roles
- defined in the configuration as part of the contract
imeta add -R lts_resc COMPUTE_RESOURCE_ROLE LONG_TERM_STORAGE
imeta add -R img_resc COMPUTE_RESOURCE_ROLE IMAGE_PROCESSING
As the ubuntu user:
Stage data and destination directory for thumbnail creation
cp ~/irods_training/stickers.jpg /tmp
sudo mkdir -p /tmp/irods/thumbnails
sudo chown -R irods:irods /tmp/irods
The configuration interface
Define interfaces for any necessary conventions
- Metadata attributes and values
- Naming conventions for logical and physical paths
- Metadata values for implemented roles
- Interface to job scheduler for launching compute
Single Point of Truth - allows for the use of the same 'end-points' for various metadata standards and naming conventions
Users may utilize metadata conventions within a rule to provide inputs to a given compute job
The configuration interface
For the thumbnail service we will need to
- Get the metadata attribute string that holds the role
- Get the tag for an Image Compute resource
- Get the tag for a Long Term Storage resource
- Get the logical collection name for thumbnails
- Get the physical path for a thumbnail
- Get the name of a thumbnail
- Get a list of desired thumbnail sizes
The configuration interface
Provide an interface for our chosen metadata convention
get_compute_resource_role_attribute(*t) {
    *t = "COMPUTE_RESOURCE_ROLE"
}
get_image_compute_type(*t) {
    *t = "IMAGE_PROCESSING"
}
get_long_term_storage_type(*t) {
    *t = "LONG_TERM_STORAGE"
}
The configuration interface
Provide an interface for job submission
submit_thumbnail_job(*server_host, *size_str, *src_phy_path, *dst_phy_path ) {
remote(*server_host, "") {
*cmd_opt = '/var/lib/irods/msiExecCmd_bin/convert.SLURM -thumbnail *size_str *src_phy_path *dst_phy_path'
*err = errormsg(msiExecCmd(
"submit_thumbnail_job.sh",
*cmd_opt, "null", "null", "null", *std_out_err), *msg);
msiGetStdoutInExecCmdOut(*std_out_err,*std_out);
msiGetStderrInExecCmdOut(*std_out_err,*std_err);
if(*err != 0) {
writeLine( "serverLog", "FAILED: [*cmd_opt] [*err] [*msg]" );
failmsg(*err,*cmd_opt)
}
} # remote
}
The configuration interface
Provide an interface for naming conventions
get_thumbnail_collection_name(*col_name, *obj_name, *thumb_coll_name) {
    *fn = trimr(*obj_name, ".")
    *thumb_coll_name = *col_name ++ "/" ++ *fn ++ "_thumbnails"
}
get_thumbnail_physical_path(*dst_dir, *thumb_name, *phy_path) {
    *phy_path = *dst_dir ++ "/" ++ *thumb_name
}
get_thumbnail_name(*file_name, *size, *thumb_name) {
    # trim the extension
    *fn = trimr(*file_name, ".")
    *ext = substr(*file_name, strlen(*fn)+1, strlen(*file_name))
    *thumb_name = *fn ++ "_thumbnail_" ++ *size ++ "." ++ *ext
}
get_thumbnail_sizes(*size_list) {
    *size_list = list( "128x128", "256x256", "512x512", "1024x1024" )
}
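To make the naming convention concrete, here is a minimal shell sketch of the same string handling. The function names are hypothetical shell counterparts of the rules above, and the sketch assumes the file name contains an extension.

```shell
#!/bin/sh
# Shell counterparts of the naming rules; names are illustrative only.
# ASSUMPTION: the object name has an extension (at least one ".").

# <collection>/<object name without extension>_thumbnails
thumbnail_collection_name() {
    col_name="$1"
    obj_name="$2"
    echo "${col_name}/${obj_name%.*}_thumbnails"
}

# <base>_thumbnail_<size>.<ext>
thumbnail_name() {
    file_name="$1"
    size="$2"
    base="${file_name%.*}"   # strip the last "." and what follows, like trimr(*file_name, ".")
    ext="${file_name##*.}"   # text after the last "."
    echo "${base}_thumbnail_${size}.${ext}"
}

thumbnail_collection_name /tempZone/home/rods stickers.jpg
for size in 128x128 256x256 512x512 1024x1024; do
    thumbnail_name stickers.jpg "$size"
done
```

For stickers.jpg this yields the collection /tempZone/home/rods/stickers_thumbnails and names such as stickers_thumbnail_128x128.jpg, which are the paths that appear in the testing slide.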
The configuration interface
Abstraction of job submission via shell script
#!/bin/bash
# $1 - executable
# $2 - thumbnail option
# $3 - sizing string
# $4 - source physical path
# $5 - destination physical path
SBATCH_OPTIONS="-o /tmp/slurm-%j.out"
SCRIPT="$1" # assume full path to executable
/usr/local/bin/sbatch $SBATCH_OPTIONS "$SCRIPT" \
${2+"$2"} \
${3+"$3"} \
${4+"$4"} \
${5+"$5"} \
>/dev/null 2>&1
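The ${N+"$N"} expansions in the script above forward an argument only when it was actually supplied, so a missing argument is dropped instead of reaching the job as an empty string. A small sketch of the difference (count_args, forward_optional and forward_always are hypothetical helpers, not part of the package):

```shell
#!/bin/sh
# count_args prints how many arguments it received.
count_args() { echo "$#"; }

# Mirrors the ${N+"$N"} idiom: unset positions expand to nothing at all.
forward_optional() {
    count_args "$1" ${2+"$2"} ${3+"$3"}
}

# Plain quoting: unset positions are still passed, as empty strings.
forward_always() {
    count_args "$1" "$2" "$3"
}

forward_optional convert.SLURM -thumbnail      # prints 2
forward_always   convert.SLURM -thumbnail      # prints 3
forward_optional convert.SLURM "" 128x128      # prints 3: a set-but-empty argument is kept
```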
Thumbnail Service - helper functions
Local functions to simplify queries and complex operations
split_path(*p, *tok, *col, *obj)
get_resource_name_by_role(*resc_name, *attr, *value)
get_resource_name_by_id(*resc_id, *resc_name)
get_resource_id_by_name(*resc_name, *resc_id)
get_resource_host_by_id(*resc_id, *resc_host)
get_resc_id_for_data_object_resident_on_image_node(*obj_name, *col_name, *compute_resc_role_attr, *image_compute_type, *src_resc_id)
get_phy_path_for_object_on_resc_id(*obj_name, *resc_id, *phy_path)
Thumbnail Service - helper function implementation
split_path(*p, *tok, *col, *obj) {
    *col = trimr(*p, *tok)
    *obj = substr(*p, strlen(*col)+1, strlen(*p))
}
get_resource_name_by_role(*resc_name, *attr, *value) {
    *resc_name = "NULL"
    foreach(*row in SELECT DATA_RESC_NAME WHERE META_RESC_ATTR_NAME = '*attr' AND META_RESC_ATTR_VALUE = '*value') {
        *resc_name = *row.DATA_RESC_NAME
    } # foreach
}
get_resource_name_by_id(*resc_id, *resc_name) {
    *resc_name = "NULL"
    foreach(*row in SELECT RESC_NAME WHERE RESC_ID = '*resc_id') {
        *resc_name = *row.RESC_NAME
    } # foreach
}
get_resource_id_by_name(*resc_name, *resc_id) {
    *resc_id = "NULL"
    foreach(*row in SELECT RESC_ID WHERE RESC_NAME = '*resc_name') {
        *resc_id = *row.RESC_ID
    } # foreach
}
get_resource_host_by_id(*resc_id, *resc_host) {
    *resc_host = "NULL"
    foreach(*row in SELECT RESC_LOC WHERE RESC_ID = '*resc_id') {
        *resc_host = *row.RESC_LOC
    } # foreach
}
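As an illustration of what split_path computes, the following shell sketch performs the same split at the last occurrence of the separator. It is a hypothetical shell analogue, not code from the package, and it returns its results in the variables col and obj.

```shell
#!/bin/sh
# Shell analogue of split_path: split a logical path at the last separator.
# Results are left in the variables col (collection) and obj (object name).
split_path() {
    p="$1"
    tok="$2"
    col="${p%"$tok"*}"    # everything before the last separator
    obj="${p##*"$tok"}"   # everything after the last separator
}

split_path /tempZone/home/rods/stickers.jpg /
echo "collection: $col"
echo "object:     $obj"
```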
Thumbnail Service - helper function implementation
get_resc_id_for_data_object_resident_on_image_node(*obj_name, *col_name,
    *compute_resc_role_attr, *image_compute_type, *src_resc_id) {
    *src_resc_id = "NULL"
    foreach(*row in SELECT DATA_RESC_ID WHERE DATA_NAME = '*obj_name' AND COLL_NAME = '*col_name') {
        *id = *row.DATA_RESC_ID
        foreach(*v in SELECT META_RESC_ATTR_VALUE WHERE RESC_ID = '*id' AND
                      META_RESC_ATTR_NAME = '*compute_resc_role_attr') {
            if(*image_compute_type == *v.META_RESC_ATTR_VALUE) {
                # report the matching resource through the output parameter
                *src_resc_id = *id
                break
            }
        } # values
    } # out_ids
}
get_phy_path_for_object_on_resc_id(*obj_name, *resc_id, *phy_path) {
*phy_path = "NULL"
foreach(*row in SELECT DATA_PATH WHERE DATA_NAME = '*obj_name' AND RESC_ID = '*resc_id') {
*phy_path = *row.DATA_PATH;
}
}
Thumbnail Service - helper functions
Find the resource tagged for image processing and replicate the data object
replicate_object_to_image_node(*src_obj_path, *compute_resc_role_attr, *image_compute_type, *img_resc_name, *src_resc_id) {
    get_resource_name_by_role(*img_resc_name, *compute_resc_role_attr, *image_compute_type);
    if("NULL" == *img_resc_name) {
        failmsg(-1, "get_resource_name_by_role failed [*img_resc_name][*compute_resc_role_attr][*image_compute_type]")
    }

    # "Take the Data to the Compute" - replicate to an image compute node
    *err = errormsg(msiDataObjRepl(*src_obj_path, "destRescName=*img_resc_name", *out_param), *msg)
    if(0 != *err) {
        failmsg(*err, "msiDataObjRepl failed for [*src_obj_path] [*img_resc_name] - [*out_param]")
    }

    *src_resc_id = "NULL"
    # set the src resc id to the new image compute node id
    get_resource_id_by_name(*img_resc_name, *src_resc_id)
}
Thumbnail Service - helper functions
register_and_replicate_thumbnail(*server_host, *obj_path, *src_resc_name, *phy_path, *dst_resc_name) {
    delay("<EF>5s REPEAT UNTIL SUCCESS</EF>") {
        remote(*server_host, "") {
            *long_term_resource = "demoResc"
            writeLine("serverLog", "register_and_replicate_thumbnail :: [*obj_path] [*src_resc_name] [*phy_path] [*dst_resc_name]");

            *err = errormsg(msiPhyPathReg(*obj_path, *src_resc_name, *phy_path, "null", *status), *msg);
            if(0 != *err) {
                failmsg(*err, "msiPhyPathReg failed for [*obj_path] [*src_resc_name] [*phy_path] [*status]")
            }

            *err = errormsg(msiDataObjRepl(*obj_path, "destRescName=*dst_resc_name", *out_param), *msg)
            if(0 != *err) {
                failmsg(*err, "msiDataObjRepl failed for [*obj_path] [*dst_resc_name] - [*out_param]")
            }

            *err = errormsg(msiDataObjUnlink("objPath=*obj_path++++replNum=0++++unreg=", *out_param), *msg)
            if(0 != *err) {
                failmsg(*err, "msiDataObjUnlink failed for [*obj_path] [*out_param]")
            }
        } # remote
    }
}
Basic LandingZone - Register a thumbnail and replicate it to another resource then trim the source replica
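The delay block above re-runs registration every five seconds until it succeeds, presumably because the scheduler returns before the job has written the thumbnail file. The same retry-until-success pattern can be sketched in shell; retry_until_success and ready_on_third_try are hypothetical names, and the interval is set to 0 so the demo finishes immediately.

```shell
#!/bin/sh
# retry_until_success INTERVAL CMD [ARGS...]
# Re-runs CMD every INTERVAL seconds until it exits with status 0.
retry_until_success() {
    interval="$1"
    shift
    attempts=1
    until "$@"; do
        attempts=$((attempts + 1))
        sleep "$interval"
    done
    echo "succeeded after $attempts attempt(s)"
}

# Stand-in for a registration that fails until the job product exists:
# it fails on the first two calls and succeeds on the third.
tries=0
ready_on_third_try() {
    tries=$((tries + 1))
    [ "$tries" -ge 3 ]
}

retry_until_success 0 ready_on_third_try   # prints: succeeded after 3 attempt(s)
```

Unlike this loop, the iRODS delay queue runs the retries asynchronously, so the calling rule returns right away.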
Thumbnail Service - the function interface
Functions which rely on the configuration abstraction to do the work of generating the thumbnails
create_thumbnail_collection(*src_obj_path, *dst_phy_dir)
create_thumbnail(*src_obj_path, *dst_obj_path, *dst_phy_path, *size_str)
create_thumbnail_impl(*src_obj_path, *dst_obj_path, *dst_phy_path, *size_str)
get_list_of_thumbnails(*src_obj_path, *thumbnail_list)
These functions represent an extension of the iRODS API
- First prototyped via the rule engines
- Later implemented via the plugin interface
Thumbnail Service - the function interface
create_thumbnail_collection(*src_obj_path, *dst_phy_dir) {
    split_path(*src_obj_path, "/", *col_name, *obj_name)
    *thumb_coll_name = "NULL"
    get_thumbnail_collection_name(*col_name, *obj_name, *thumb_coll_name);
    writeLine( "serverLog", "XXXX - thumb_coll_name [*thumb_coll_name]" )

    *err = errormsg(msiCollCreate(*thumb_coll_name, 1, *out), *msg)
    if( *err < 0 ) {
        writeLine("serverLog", "msiCollCreate failed: [*err] [*msg] [*out]")
        failmsg(*err, *msg)
    }

    get_thumbnail_sizes(*thumb_sizes)
    foreach( *sz in *thumb_sizes ) {
        get_thumbnail_name(*obj_name, *sz, *thumbnail_name);
        *dst_obj_path = *thumb_coll_name ++ "/" ++ *thumbnail_name
        writeLine( "serverLog", "XXXX - [*src_obj_path] [*sz] [*thumbnail_name] [*dst_obj_path]" )
        *dst_phy_path = "NULL"
        get_thumbnail_physical_path(*dst_phy_dir, *thumbnail_name, *dst_phy_path)
        create_thumbnail( *src_obj_path, *dst_obj_path, *dst_phy_path, *sz)
    }
}
Thumbnail Service - interface functions
create_thumbnail(*src_obj_path, *dst_obj_path, *dst_phy_path, *size_str) {
    # only generate the thumbnail when it is not already in the catalog
    *err = errormsg(msiObjStat(*dst_obj_path,*obj_stat), *msg);
    if(0 == *err) {
        writeLine("serverLog", "thumbnail already exists for [*dst_obj_path]")
    } else {
        create_thumbnail_impl(
            *src_obj_path,
            *dst_obj_path,
            *dst_phy_path,
            *size_str );
    }
}
Thumbnail Service - interface functions
get_list_of_thumbnails(*src_obj_path, *thumbnail_list) {
    *thumbnail_list = list()
    split_path(*src_obj_path, "/", *col_name, *obj_name)

    # derive a collection name from the logical path
    *thumb_coll_name = "NULL"
    get_thumbnail_collection_name( *col_name, *obj_name, *thumb_coll_name)

    # get the list of possible sizes
    get_thumbnail_sizes(*thumb_sizes)
    foreach( *sz in *thumb_sizes ) {
        get_thumbnail_name(*obj_name, *sz, *thumbnail_name);
        *dst_obj_path = *thumb_coll_name ++ "/" ++ *thumbnail_name

        # does the thumbnail exist
        *err = errormsg(msiObjStat(*dst_obj_path,*obj_stat), *msg)
        if( 0 == *err ) {
            # it does exist, add it to the list
            *thumbnail_list = cons(*dst_obj_path, *thumbnail_list)
        }
    }
}
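Because cons prepends each hit to the list, the result comes back in the reverse of the order in which sizes are checked, which matches the find_thumbnails.r output shown in the testing slide (largest size first). A small shell sketch of that prepend behavior, using a space-separated string in place of a rule-language list and assuming all four thumbnails exist:

```shell
#!/bin/sh
# Build the list by prepending, as cons(*dst_obj_path, *thumbnail_list) does.
# ASSUMPTION: every size has a thumbnail, so no existence check is modeled here.
thumbnails=""
for size in 128x128 256x256 512x512 1024x1024; do
    thumbnails="stickers_thumbnail_${size}.jpg${thumbnails:+ $thumbnails}"
done

# Print in list order: the last size checked comes out first.
for t in $thumbnails; do
    echo "thumbnail [$t]"
done
```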
Thumbnail Service - interface functions
1 - 4 :: capture metadata
create_thumbnail_impl(*src_obj_path, *dst_obj_path, *dst_phy_path, *size_str) {
    split_path(*src_obj_path, "/", *col_name, *obj_name)

    # capture configuration parameters
    *image_compute_type = "NULL"
    get_image_compute_type(*image_compute_type)
    if("NULL" == *image_compute_type) {
        failmsg(-1,"get_image_compute_type failed")
    }
    writeLine("serverLog", "image_compute_type [*image_compute_type]")

    *lts_compute_type = "NULL"
    get_long_term_storage_type(*lts_compute_type)
    if("NULL" == *lts_compute_type) {
        failmsg(-1,"get_long_term_storage_type failed")
    }
    writeLine("serverLog", "lts_compute_type [*lts_compute_type]")

    *compute_resc_role_attr = "NULL"
    get_compute_resource_role_attribute(*compute_resc_role_attr)
    if("NULL" == *compute_resc_role_attr) {
        failmsg(-1,"get_compute_resource_role_attribute failed")
    }
    writeLine("serverLog", "compute_resc_role_attr [*compute_resc_role_attr]")
Thumbnail Service - interface functions
2 - 4 :: determine resources and replicate to compute
    *lts_resc_name = "NULL"
    get_resource_name_by_role(*lts_resc_name, *compute_resc_role_attr, *lts_compute_type);
    if("NULL" == *lts_resc_name) {
        failmsg(-1,"get_resource_name_by_role failed [*lts_resc_name][*compute_resc_role_attr][*lts_compute_type]")
    }
    writeLine("serverLog", "lts_resc_name [*lts_resc_name]")

    *src_resc_id = "NULL"
    get_resc_id_for_data_object_resident_on_image_node(
        *obj_name, *col_name,
        *compute_resc_role_attr, *image_compute_type,
        *src_resc_id)
    writeLine("serverLog", "src_resc_id [*src_resc_id]")

    *img_resc_name = "NULL"
    # does data object not reside on the required resource?
    if("NULL" == *src_resc_id) {
        replicate_object_to_image_node(
            *src_obj_path,
            *compute_resc_role_attr, *image_compute_type,
            *img_resc_name, *src_resc_id )
        if("NULL" == *src_resc_id) {
            failmsg(-1, "get_resource_id_by_name failed for [*img_resc_name]")
        }
    }
    writeLine("serverLog", "src_resc_id [*src_resc_id]")
Thumbnail Service - interface functions
3 - 4 :: determine resource parameters for the job
    *src_resc_name = "NULL"
    get_resource_name_by_id(*src_resc_id, *src_resc_name)
    if("NULL" == *src_resc_name) {
        failmsg(-1,"get_resource_name_by_id failed for [*src_resc_id]")
    }
    writeLine("serverLog", "src_resc_name [*src_resc_name]")

    *src_phy_path = "NULL"
    get_phy_path_for_object_on_resc_id(*obj_name, *src_resc_id, *src_phy_path)
    if("NULL" == *src_phy_path) {
        failmsg(-1,"failed for [*obj_name] [*src_resc_id]")
    }
    writeLine("serverLog", "src_phy_path [*src_phy_path]")

    *server_host = "NULL"
    get_resource_host_by_id(*src_resc_id, *server_host);
    if("NULL" == *server_host) {
        failmsg(-1,"get_resource_host_by_id failed for [*src_resc_id]")
    }
    writeLine("serverLog", "server_host [*server_host]")
Thumbnail Service - interface functions
4 - 4 :: launch job and capture products
    # launch image computation job
    submit_thumbnail_job(
        *server_host, *size_str,
        *src_phy_path, *dst_phy_path)

    # launch registration and replication
    register_and_replicate_thumbnail(
        *server_host, *dst_obj_path, *src_resc_name,
        *dst_phy_path, *lts_resc_name);
}
Thumbnail Service - user space invocation
create_thumbnails {
    *src_obj_path="/tempZone/home/rods/stickers.jpg"
    *dst_phy_dir="/tmp/irods/thumbnails"
    *err = errormsg(create_thumbnail_collection(*src_obj_path, *dst_phy_dir), *msg)
    if(0 != *err) {
        writeLine( "stdout", "FAIL: [*err] [*msg]")
    }
}
INPUT null
OUTPUT ruleExecOut
create_thumbnails.r
Thumbnail Service - user space invocation
find_thumbnails {
    *thb_list = list()
    *err = errormsg(get_list_of_thumbnails(*src_obj_path, *thb_list), *msg)
    if(0 != *err) {
        writeLine( "stdout", "FAIL: [*err] [*msg]")
    }
    foreach( *t in *thb_list ) {
        writeLine("stdout", "thumbnail [*t]")
    }
}
INPUT *src_obj_path="/tempZone/home/rods/stickers.jpg"
OUTPUT ruleExecOut
find_thumbnails.r
Thumbnail Service - testing
irods@icat:~$ iput /tmp/stickers.jpg
irods@icat:~$ ils -l
/tempZone/home/rods:
  rods 0 demoResc 2157087 2018-03-22.18:42 & stickers.jpg
irods@icat:~$ irule -F create_thumbnails.r
irods@icat:~$ ils -l
/tempZone/home/rods:
  rods 0 demoResc 2157087 2018-03-22.18:42 & stickers.jpg
  rods 1 img_resc 2157087 2018-03-22.18:43 & stickers.jpg
  C- /tempZone/home/rods/stickers_thumbnails
irods@icat:~$ iqstat
...
irods@icat:~$ ils -l /tempZone/home/rods/stickers_thumbnails
/tempZone/home/rods/stickers_thumbnails:
  rods 1 lts_resc 229954 2018-03-22.18:43 & stickers_thumbnail_1024x1024.jpg
  rods 1 lts_resc 6456 2018-03-22.18:43 & stickers_thumbnail_128x128.jpg
  rods 1 lts_resc 19355 2018-03-22.18:43 & stickers_thumbnail_256x256.jpg
  rods 1 lts_resc 63036 2018-03-22.18:43 & stickers_thumbnail_512x512.jpg
irods@icat:~$ irule -F find_thumbnails.r
thumbnail [/tempZone/home/rods/stickers_thumbnails/stickers_thumbnail_1024x1024.jpg]
thumbnail [/tempZone/home/rods/stickers_thumbnails/stickers_thumbnail_512x512.jpg]
thumbnail [/tempZone/home/rods/stickers_thumbnails/stickers_thumbnail_256x256.jpg]
thumbnail [/tempZone/home/rods/stickers_thumbnails/stickers_thumbnail_128x128.jpg]
Extending iRODS with the Rule Engine
All rules should be created and tested in user space before being installed as a rule base
Rules may be refactored into a microservice plugin
Rules may be refactored into a C++ rule engine plugin
Rules may be refactored into an API plugin
RENCI Boot Camp - Taking Data to Compute
By iRODS Consortium
Training to accompany the one page data management design pattern: https://irods.org/images/data_to_compute.jpg