Taking Data to Compute

June 13-15, 2017

iRODS User Group Meeting 2017

Utrecht, Netherlands

Jason M. Coposky

@jason_coposky

Executive Director, iRODS Consortium

Taking Data to Compute

Integrating iRODS with a compute environment

iRODS as a compute orchestrator

  • Launch a job via irule, or as part of a PEP
  • Implement a Landing Zone for product capture

iRODS as part of a compute job script

  • Stage the source data via replication for the application
  • Capture the products and ingest them into iRODS 

iRODS as part of the compute application

  • Compute application directly leverages the iRODS API to open, read and write data

In order of increasing complexity

The Data to Compute Use Case

Focus on the right side of the picture

iRODS is out of the data path for computation

Goals - Develop a generic interface concept for compute

  • Develop a metadata-driven interface for labeling resources which provide computational capabilities - this ultimately relies upon convention
  • Separate configuration from implementation - isolate deployment specific concepts
  • Consider a rule base as an extension of iRODS - rules are not just data management policy

Goals - Develop a thumbnailing service for iRODS

  1. Select an appropriate resource to replicate the data for compute
  2. Replicate the data to the compute resource
  3. Send a job to the compute scheduler to generate thumbnails
  4. Register the thumbnails into the catalog
  5. Replicate the thumbnails back to long term storage
  6. Trim replicas on compute resource

Implemented as an iRODS rule base
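
The six steps above can be traced with a small in-memory model of the catalog. This is a hypothetical Python sketch, not iRODS code: the `catalog` and `roles` dictionaries and both helper names are assumptions made for illustration, and the source object is assumed to start on the long term storage resource. It trims only the thumbnail replicas on the compute resource.

```python
# Hypothetical in-memory catalog: logical path -> resources holding a replica.
catalog = {"/tempZone/home/rods/stickers.jpg": {"lts_resc"}}

# Assumed role tags, mirroring the metadata convention used in this training.
roles = {"lts_resc": "LONG_TERM_STORAGE", "img_resc": "IMAGE_PROCESSING"}


def resource_for_role(role):
    """Step 1: select the resource tagged with the requested role."""
    return next(name for name, tag in roles.items() if tag == role)


def make_thumbnails(src, sizes):
    compute = resource_for_role("IMAGE_PROCESSING")
    storage = resource_for_role("LONG_TERM_STORAGE")
    catalog[src].add(compute)                   # Step 2: replicate source to compute
    coll, name = src.rsplit("/", 1)
    stem, ext = name.rsplit(".", 1)
    products = []
    for size in sizes:                          # Step 3: one job per size (not modeled)
        thumb = f"{coll}/{stem}_thumbnails/{stem}_thumbnail_{size}.{ext}"
        catalog[thumb] = {compute}              # Step 4: register the product
        catalog[thumb].add(storage)             # Step 5: replicate to long term storage
        catalog[thumb].discard(compute)         # Step 6: trim the compute replica
        products.append(thumb)
    return products


thumbs = make_thumbnails("/tempZone/home/rods/stickers.jpg", ["128x128", "256x256"])
for path in thumbs:
    print(path, sorted(catalog[path]))
```

In this model each thumbnail ends up with a single replica on the long term storage resource, while the source object keeps its replica on the compute resource.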

Components of the System

System Component        Implementation

Job Scheduler           User's Choice
Job Launching Script    bash
Tools to Execute        ImageMagick convert
Job Endpoint            iRODS Rule Base (user extension of the iRODS API)

Getting Started

Installing Condor Repositories

wget -qO - http://research.cs.wisc.edu/htcondor/ubuntu/HTCondor-Release.gpg.key | sudo apt-key add -
echo "deb [arch=amd64] http://research.cs.wisc.edu/htcondor/ubuntu/stable/ $(lsb_release -sc) contrib" | sudo tee /etc/apt/sources.list.d/condor.list

echo "deb [arch=amd64] http://research.cs.wisc.edu/htcondor/ubuntu/development/ $(lsb_release -sc) contrib" | sudo tee -a /etc/apt/sources.list.d/condor.list

Installing Condor and ImageMagick

sudo apt-get update
sudo apt-get -y install condor imagemagick
sudo service condor restart
condor_status

Getting Started

Build and Install the Data to Compute package

git clone https://github.com/irods/irods_training
sudo apt-get -y install irods-externals-* irods-dev
export PATH=/opt/irods-externals/cmake3.5.2-0/bin/:$PATH
mkdir build_data_to_compute
cd build_data_to_compute
cmake ../irods_training/advanced/hpc_data_to_compute/
make package
sudo dpkg -i ./irods-hpc-data-to-compute-example_4.2.1~trusty_amd64.deb

Package Contents

$ dpkg -c ./irods-hpc-data-to-compute-example_4.2.1~trusty_amd64.deb
drwxr-xr-x root/root         0 2017-06-07 18:03 ./etc/
drwxr-xr-x root/root         0 2017-06-07 18:03 ./etc/irods/
-r--r--r-- root/root      9586 2017-06-07 16:58 ./etc/irods/thumbnail.re
-r--r--r-- root/root      1656 2017-06-07 16:58 ./etc/irods/thumbnail_configuration.re
drwxr-xr-x root/root         0 2017-06-07 18:03 ./var/
drwxr-xr-x root/root         0 2017-06-07 18:03 ./var/lib/
drwxr-xr-x root/root         0 2017-06-07 18:03 ./var/lib/irods/
-r--r--r-- root/root       298 2017-06-07 16:58 ./var/lib/irods/create_thumbnails.r
-r--r--r-- root/root       360 2017-06-07 16:58 ./var/lib/irods/find_thumbnails.r
drwxr-xr-x root/root         0 2017-06-07 18:03 ./var/lib/irods/msiExecCmd_bin/
-r-xr--r-- root/root       335 2017-06-07 16:58 ./var/lib/irods/msiExecCmd_bin/submit_thumbnail_job.sh
-r--r--r-- root/root        70 2017-06-07 16:58 ./var/lib/irods/msiExecCmd_bin/thumbnail.submit

Configure the rule engine

Add the two additional rule bases to /etc/irods/server_config.json

"rule_engines": [
    ...
        "re_rulebase_set": [
            "thumbnail_configuration",
            "thumbnail",
            "core"
        ],
    ...
]

Remember that order matters: rule bases are consulted in the order listed, so the two new entries go ahead of core
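
The ordering requirement can be expressed as a small list operation. This is a hypothetical Python helper (the name `add_rulebases` is an assumption, not part of iRODS); it works only on the list of rule base names and deliberately does not read or write server_config.json.

```python
def add_rulebases(rulebase_set, new_names, before="core"):
    """Return a copy of rulebase_set with new_names placed just ahead of `before`."""
    # Drop any existing copies first so repeated calls do not duplicate entries.
    result = [name for name in rulebase_set if name not in new_names]
    idx = result.index(before) if before in result else len(result)
    return result[:idx] + list(new_names) + result[idx:]


updated = add_rulebases(["core"], ["thumbnail_configuration", "thumbnail"])
print(updated)
```

Calling the helper again on its own output leaves the list unchanged, which makes it safe to rerun during provisioning.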

Configure the LTS and Image Processing Resources

As the irods user:

Make two unix file system resources

iadmin mkresc lts_resc unixfilesystem `hostname`:/tmp/irods/lts_resc
iadmin mkresc img_resc unixfilesystem `hostname`:/tmp/irods/img_resc

Annotate them with appropriate metadata given their roles

  - defined in the configuration as part of the contract

imeta add -R lts_resc COMPUTE_RESOURCE_ROLE LONG_TERM_STORAGE
imeta add -R img_resc COMPUTE_RESOURCE_ROLE IMAGE_PROCESSING

As the ubuntu user:

Stage data and destination directory for thumbnail creation

cp ~/irods_training/stickers.jpg /tmp
sudo mkdir -p /tmp/irods/thumbnails
sudo chown -R irods:irods /tmp/irods

The configuration interface

Define interfaces for any necessary conventions

  • Metadata attributes and values

  • Naming conventions for logical and physical paths

  • Metadata values for implemented roles

  • Interface to job scheduler for launching compute

Single Point of Truth - allows for the use of the same 'end-points' for various metadata standards and naming conventions

Users may utilize metadata conventions within a rule to provide inputs to a given compute job

The configuration interface

For the thumbnail service we will need to

  • Get the metadata attribute string that holds the role

  • Get the tag for an Image Compute resource

  • Get the tag for a Long Term Storage resource

  • Get the logical collection name for thumbnails

  • Get the physical path for a thumbnail

  • Get the name of a thumbnail

  • Get a list of desired thumbnail sizes

The configuration interface

Provide an interface for our chosen metadata convention

get_compute_resource_role_attribute(*t) {
    *t = "COMPUTE_RESOURCE_ROLE"
}
get_image_compute_type(*t) {
    *t = "IMAGE_PROCESSING"
}
get_long_term_storage_type(*t) {
    *t = "LONG_TERM_STORAGE"
}

The configuration interface

Provide an interface for job submission

submit_thumbnail_job(*server_host, *size_str, *src_phy_path, *dst_phy_path ) {
    remote(*server_host, "") {
        *cmd_opt = '/usr/bin/convert -thumbnail *size_str *src_phy_path *dst_phy_path'
        *err = errormsg(msiExecCmd(
                           "submit_thumbnail_job.sh",
                           *cmd_opt, "null", "null", "null", *std_out_err), *msg);
        msiGetStdoutInExecCmdOut(*std_out_err,*std_out);
        
        msiGetStderrInExecCmdOut(*std_out_err,*std_err);
       
        if(*err != 0) {
            writeLine( "serverLog", "FAILED: [*cmd_opt] [*err] [*msg]" );
            failmsg(*err,*cmd_opt)
        }
    } # remote
}

The configuration interface

Provide an interface for naming conventions

get_thumbnail_collection_name(*col_name, *obj_name,  *thumb_coll_name) {
    *fn = trimr(*obj_name, ".")
    *thumb_coll_name = *col_name ++ "/" ++ *fn ++ "_thumbnails"
}
get_thumbnail_physical_path(*dst_dir, *thumb_name, *phy_path) {
    *phy_path = *dst_dir ++ "/" ++ *thumb_name
}
get_thumbnail_name(*file_name, *size, *thumb_name) {
    # trim the extension
    *fn = trimr(*file_name, ".")
    *ext = substr(*file_name, strlen(*fn)+1, strlen(*file_name))
    *thumb_name = *fn ++ "_thumbnail_" ++ *size ++ "." ++ *ext
}
get_thumbnail_sizes(*size_list) {
    *size_list = list( "128x128", "256x256", "512x512", "1024x1024" )
}
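
The naming rules above can be checked outside iRODS with a short Python sketch. The `trimr` helper here is an assumed stand-in for the rule language function of the same name (keep everything before the last delimiter); the function names and the example paths are illustrative only.

```python
def trimr(s, delim):
    """Assumed behaviour of trimr: drop the last delimiter and everything after it."""
    head, sep, _ = s.rpartition(delim)
    return head if sep else s


def thumbnail_collection_name(col_name, obj_name):
    # <collection>/<object name without extension>_thumbnails
    return col_name + "/" + trimr(obj_name, ".") + "_thumbnails"


def thumbnail_name(file_name, size):
    fn = trimr(file_name, ".")            # name without the extension
    ext = file_name[len(fn) + 1:]         # text after the last "."
    return fn + "_thumbnail_" + size + "." + ext


def thumbnail_physical_path(dst_dir, thumb_name):
    return dst_dir + "/" + thumb_name


coll = thumbnail_collection_name("/tempZone/home/rods", "stickers.jpg")
name = thumbnail_name("stickers.jpg", "128x128")
print(coll)
print(thumbnail_physical_path("/tmp/irods/thumbnails", name))
```

Keeping these three conventions behind functions means a deployment can change, say, the collection suffix in one place without touching the service logic.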

The configuration interface

Abstraction of job submission via shell script

#!/bin/bash
# $1 - executable
# $2 - thumbnail option
# $3 - sizing string
# $4 - source physical path
# $5 - destination physical path
/usr/bin/condor_submit /var/lib/irods/msiExecCmd_bin/thumbnail.submit -append "executable ${1}" -append "arguments ${2} ${3} ${4} ${5}"
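
To see how the five positional parameters map onto the scheduler call, the following Python sketch rebuilds the argument list the script would hand to condor_submit. It assumes the argument string is split on whitespace before reaching the script; the function name `condor_command` and the example physical paths are hypothetical.

```python
import shlex

SUBMIT_FILE = "/var/lib/irods/msiExecCmd_bin/thumbnail.submit"


def condor_command(cmd_opt):
    """Map the five whitespace-separated fields onto the condor_submit arguments."""
    # Raises ValueError if the string does not contain exactly five fields,
    # for example when a path contains an unquoted space.
    exe, option, size, src, dst = shlex.split(cmd_opt)
    return [
        "/usr/bin/condor_submit", SUBMIT_FILE,
        "-append", f"executable {exe}",
        "-append", f"arguments {option} {size} {src} {dst}",
    ]


argv = condor_command(
    "/usr/bin/convert -thumbnail 128x128 "
    "/tmp/irods/img_resc/home/rods/stickers.jpg "
    "/tmp/irods/thumbnails/stickers_thumbnail_128x128.jpg"
)
for arg in argv:
    print(arg)
```

Because the script relies on positional parameters, physical paths containing spaces would need extra quoting before this convention could be used safely.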

Thumbnail Service - helper functions

Local functions to simplify queries and complex operations

split_path(*p, *tok, *col, *obj)
get_resource_name_by_role(*resc_name, *attr, *value)
get_resource_name_by_id(*resc_id, *resc_name)
get_resource_id_by_name(*resc_name, *resc_id)
get_resource_host_by_id(*resc_id, *resc_host)
get_resc_id_for_data_object_resident_on_image_node(*obj_name, *col_name,
                  *compute_resc_role_attr, *image_compute_type, *src_resc_id)
get_phy_path_for_object_on_resc_id(*obj_name, *resc_id, *phy_path)

Thumbnail Service - helper function implementation

split_path(*p, *tok, *col, *obj) {
    *col = trimr(*p, *tok)
    *obj = substr(*p, strlen(*col)+1, strlen(*p))
}

get_resource_name_by_role(*resc_name, *attr, *value) {
    *resc_name = "NULL"
    foreach(*row in SELECT RESC_NAME WHERE META_RESC_ATTR_NAME = '*attr' AND
                    META_RESC_ATTR_VALUE = '*value') {
        *resc_name = *row.RESC_NAME
    } # foreach
}

get_resource_name_by_id(*resc_id, *resc_name) {
    *resc_name = "NULL"
    foreach(*row in SELECT RESC_NAME WHERE RESC_ID = '*resc_id') {
        *resc_name = *row.RESC_NAME
    } # foreach
}

get_resource_id_by_name(*resc_name, *resc_id) {
    *resc_id = "NULL"
    foreach(*row in SELECT RESC_ID WHERE RESC_NAME = '*resc_name') {
        *resc_id = *row.RESC_ID
    } # foreach
}

get_resource_host_by_id(*resc_id, *resc_host) {
    *resc_host = "NULL"
    foreach(*row in SELECT RESC_LOC WHERE RESC_ID = '*resc_id') {
        *resc_host = *row.RESC_LOC
    } # foreach
}
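
The lookup helpers share one pattern: start from the sentinel "NULL" and overwrite it for each matching row. A minimal Python sketch of the role lookup, using a hypothetical in-memory list of (resource, attribute, value) rows in place of the catalog query, shows what that implies when more than one resource carries the same role.

```python
# Hypothetical stand-in for resource metadata rows in the catalog.
resource_metadata = [
    ("lts_resc", "COMPUTE_RESOURCE_ROLE", "LONG_TERM_STORAGE"),
    ("img_resc", "COMPUTE_RESOURCE_ROLE", "IMAGE_PROCESSING"),
]


def get_resource_name_by_role(rows, attr, value):
    resc_name = "NULL"                      # sentinel returned when nothing matches
    for name, row_attr, row_value in rows:
        if row_attr == attr and row_value == value:
            resc_name = name                # overwritten on every match: last one wins
    return resc_name


print(get_resource_name_by_role(resource_metadata, "COMPUTE_RESOURCE_ROLE", "IMAGE_PROCESSING"))
print(get_resource_name_by_role(resource_metadata, "COMPUTE_RESOURCE_ROLE", "GPU"))
```

With a single resource per role, as in this training, the result is unambiguous. If several resources shared a role, the choice would depend on row order, so a deployment would likely want an explicit selection policy.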

Thumbnail Service - helper function implementation

get_resc_id_for_data_object_resident_on_image_node(*obj_name, *col_name,
    *compute_resc_role_attr, *image_compute_type, *src_resc_id) {
    # stays "NULL" unless a replica already lives on an image processing resource
    *src_resc_id = "NULL"
    foreach(*row in SELECT DATA_RESC_ID WHERE DATA_NAME = '*obj_name' AND COLL_NAME = '*col_name') {
        *id = *row.DATA_RESC_ID
        foreach(*v in SELECT META_RESC_ATTR_VALUE WHERE RESC_ID = '*id' AND
                META_RESC_ATTR_NAME = '*compute_resc_role_attr' ) {
            if(*image_compute_type == *v.META_RESC_ATTR_VALUE) {
                *src_resc_id = *id
                break
            }
        } # values
    } # out_ids
}

get_phy_path_for_object_on_resc_id(*obj_name, *resc_id, *phy_path) {
    *phy_path = "NULL"
    foreach(*row in SELECT DATA_PATH WHERE DATA_NAME = '*obj_name' AND RESC_ID = '*resc_id') {
        *phy_path = *row.DATA_PATH;
    }
}

Thumbnail Service - helper functions

Find the resource tagged for image processing and replicate the data object

replicate_object_to_image_node(
    *src_obj_path,
    *compute_resc_role_attr,
    *image_compute_type,
    *img_resc_name,
    *src_resc_id ) {
    get_resource_name_by_role(
            *img_resc_name,
            *compute_resc_role_attr,
            *image_compute_type);
    if("NULL" == *img_resc_name) {
        failmsg(-1,"get_resource_name_by_role failed [*img_resc_name][*compute_resc_role_attr][*image_compute_type]")
    }
    # "Take the Data to the Compute" - replicate to an image compute node
    *err = errormsg(msiDataObjRepl(
                   *src_obj_path,
                   "destRescName=*img_resc_name",
                   *out_param), *msg)
    if(0 != *err) {
        failmsg(*err, "msiDataObjRepl failed for [*src_obj_path] [*img_resc_name] - [*out_param]")
    }
    *src_resc_id = "NULL"
    # set the src resc id to the new image compute node id
    get_resource_id_by_name(*img_resc_name, *src_resc_id)
}

Thumbnail Service - helper functions

register_and_replicate_thumbnail(*server_host, *obj_path, *src_resc_name, *phy_path, *dst_resc_name) {
    delay( "<EF>5s REPEAT UNTIL SUCCESS</EF>") {
        remote(*server_host, "") {
            writeLine("serverLog", "register_and_replicate_thumbnail :: [*obj_path] [*src_resc_name] [*phy_path] [*dst_resc_name]");
            *err = errormsg(msiPhyPathReg(*obj_path, *src_resc_name, *phy_path, "null", *status), *msg);
            if(0 != *err) {
                failmsg(*err, "msiPhyPathReg failed for [*obj_path] [*src_resc_name] [*phy_path] [*status]")
            }

            *err = errormsg(msiDataObjRepl(
                       *obj_path,
                       "destRescName=*dst_resc_name",
                       *out_param), *msg)
            if(0 != *err) {
                failmsg(*err, "msiDataObjRepl failed for [*obj_path] [*dst_resc_name] - [*out_param]")
            }

            *err = errormsg(msiDataObjUnlink(
                       "objPath=*obj_path++++replNum=0++++unreg=",
                       *out_param), *msg)
            if(0 != *err) {
                failmsg(*err, "msiDataObjUnlink failed for [*obj_path] [*out_param]")
            }
        } # remote
    }
}

Basic Landing Zone - register a thumbnail, replicate it to another resource, then trim the source replica
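
The delay hint in the rule above asks for a retry every five seconds until the block succeeds, which matters because the scheduler may not have written the thumbnail when registration is first attempted. A minimal Python sketch of that retry pattern follows; `repeat_until_success`, `register_thumbnail` and the counter are hypothetical names, and the sleep is stubbed out so the example finishes immediately.

```python
import time


def repeat_until_success(action, interval_s=5.0, max_attempts=None, sleep=time.sleep):
    """Call action until it stops raising; return its result and the attempt count."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return action(), attempts
        except Exception:
            if max_attempts is not None and attempts >= max_attempts:
                raise
            sleep(interval_s)


polls = {"count": 0}


def register_thumbnail():
    # Pretend the compute job only finishes writing the file before the third attempt.
    polls["count"] += 1
    if polls["count"] < 3:
        raise FileNotFoundError("thumbnail not written yet")
    return "registered"


result, attempts = repeat_until_success(register_thumbnail, sleep=lambda seconds: None)
print(result, attempts)
```

The optional `max_attempts` bound is an addition in this sketch; the rule as written retries without a limit.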

Thumbnail Service - the function interface

Functions which rely on the configuration abstraction to do the work of generating the thumbnails

create_thumbnail_collection(*src_obj_path, *dst_phy_dir)

create_thumbnail(*src_obj_path, *dst_obj_path, *dst_phy_path, *size_str)

create_thumbnail_impl(*src_obj_path, *dst_obj_path, *dst_phy_path, *size_str)

get_list_of_thumbnails(*src_obj_path, *thumbnail_list)

These functions represent an extension of the iRODS API

  • First prototyped via the rule engines
  • Later implemented via the plugin interface

Thumbnail Service - the function interface

create_thumbnail_collection(*src_obj_path, *dst_phy_dir) {
    split_path(*src_obj_path, "/", *col_name, *obj_name)

    *thumb_coll_name = "NULL"
    get_thumbnail_collection_name(*col_name, *obj_name, *thumb_coll_name);
    writeLine( "serverLog", "XXXX - thumb_coll_name [*thumb_coll_name]" )

    *err = errormsg(msiCollCreate(*thumb_coll_name, 1, *out), *msg)
    if( *err < 0 ) {
        writeLine("serverLog", "msiCollCreate failed: [*err] [*msg] [*out]")
        failmsg(*err, *msg)
    }

    get_thumbnail_sizes(*thumb_sizes)
    foreach( *sz in *thumb_sizes ) {
        get_thumbnail_name(*obj_name, *sz, *thumbnail_name);
        *dst_obj_path = *thumb_coll_name ++ "/" ++ *thumbnail_name
        writeLine( "serverLog", "XXXX - [*src_obj_path] [*sz] [*thumbnail_name] [*dst_obj_path]" )

        *dst_phy_path = "NULL"
        get_thumbnail_physical_path(*dst_phy_dir, *thumbnail_name, *dst_phy_path)

        create_thumbnail(
            *src_obj_path,
            *dst_obj_path,
            *dst_phy_path,
            *sz)
    }
}

Thumbnail Service - interface functions

create_thumbnail(*src_obj_path, *dst_obj_path, *dst_phy_path, *size_str) {
    # only generate the thumbnail if it is not already in the catalog
    *err = errormsg(msiObjStat(*dst_obj_path,*obj_stat), *msg);
    if(0 == *err) {
        writeLine("serverLog", "thumbnail already exists for [*dst_obj_path]")
    }
    else {
        create_thumbnail_impl(
            *src_obj_path,
            *dst_obj_path,
            *dst_phy_path,
            *size_str );
    }
}

Thumbnail Service - interface functions

get_list_of_thumbnails(*src_obj_path, *thumbnail_list) {
    *thumbnail_list = list()
    split_path(*src_obj_path, "/", *col_name, *obj_name)
    # derive a collection name from the logical path
    *thumb_coll_name = "NULL"
    get_thumbnail_collection_name(
        *col_name,
        *obj_name,
        *thumb_coll_name)
    # get the list of possible sizes
    get_thumbnail_sizes(*thumb_sizes)
    foreach( *sz in *thumb_sizes ) {
        get_thumbnail_name(*obj_name, *sz, *thumbnail_name);
        *dst_obj_path = *thumb_coll_name ++ "/" ++ *thumbnail_name
        # does the thumbnail exist
        *err = errormsg(msiObjStat(*dst_obj_path,*obj_stat), *msg)
        if( 0 == *err ) {
            # it does exist, add it to the list
            *thumbnail_list = cons(*dst_obj_path, *thumbnail_list)
        }
    }
}
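
Because the rule builds its result with cons, which prepends, the returned list runs in the reverse of the configured size order. The Python sketch below mirrors that loop with a hypothetical set of existing paths standing in for the msiObjStat check; the function signature is an assumption for illustration.

```python
def get_list_of_thumbnails(existing, coll, stem, ext, sizes):
    """Collect thumbnails that exist, prepending each hit like cons() does."""
    found = []
    for size in sizes:
        path = f"{coll}/{stem}_thumbnail_{size}.{ext}"
        if path in existing:              # stand-in for a successful stat
            found = [path] + found        # prepend, so later sizes come first
    return found


sizes = ["128x128", "256x256", "512x512", "1024x1024"]
coll = "/tempZone/home/rods/stickers_thumbnails"
existing = {f"{coll}/stickers_thumbnail_{s}.jpg" for s in sizes}

for path in get_list_of_thumbnails(existing, coll, "stickers", "jpg", sizes):
    print(path)
```

This is why a listing produced by the rule starts with the largest thumbnail even though the configuration lists the smallest size first.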

Thumbnail Service - interface functions

create_thumbnail_impl(*src_obj_path, *dst_obj_path, *dst_phy_path, *size_str) {
    split_path(*src_obj_path, "/", *col_name, *obj_name)

    # capture configuration parameters
    *image_compute_type = "NULL"
    get_image_compute_type(*image_compute_type)
    if("NULL" == *image_compute_type) {
        failmsg(-1,"get_image_compute_type failed")
    }
    writeLine("serverLog", "image_compute_type [*image_compute_type]")

    *lts_compute_type = "NULL"
    get_long_term_storage_type(*lts_compute_type)
    if("NULL" == *lts_compute_type) {
        failmsg(-1,"get_long_term_storage_type failed")
    }
    writeLine("serverLog", "lts_compute_type [*lts_compute_type]")

    *compute_resc_role_attr = "NULL"
    get_compute_resource_role_attribute(*compute_resc_role_attr)
    if("NULL" == *compute_resc_role_attr) {
        failmsg(-1,"get_compute_resource_role_attribute failed")
    }
    writeLine("serverLog", "compute_resc_role_attr [*compute_resc_role_attr]")

1 of 4 :: capture metadata

Thumbnail Service - interface functions

2 of 4 :: determine resources and replicate to compute

    *lts_resc_name = "NULL"
    get_resource_name_by_role(
        *lts_resc_name,
        *compute_resc_role_attr,
        *lts_compute_type);
    if("NULL" == *lts_resc_name) {
        failmsg(-1,"get_resource_name_by_role failed [*lts_resc_name][*compute_resc_role_attr][*lts_compute_type]")
    }
    writeLine("serverLog", "lts_resc_name [*lts_resc_name]")

    *src_resc_id = "NULL"
    get_resc_id_for_data_object_resident_on_image_node(
        *obj_name,
        *col_name,
        *compute_resc_role_attr,
        *image_compute_type,
        *src_resc_id)
    writeLine("serverLog", "src_resc_id [*src_resc_id]")

    *img_resc_name = "NULL"
    # does data object not reside on the required resource?
    if("NULL" == *src_resc_id) {
        replicate_object_to_image_node(
            *src_obj_path,
            *compute_resc_role_attr,
            *image_compute_type,
            *img_resc_name,
            *src_resc_id )
        if("NULL" == *src_resc_id) {
            failmsg(-1, "get_resource_id_by_name failed for [*img_resc_name]")
        }
    }
    writeLine("serverLog", "src_resc_id [*src_resc_id]")

Thumbnail Service - interface functions

3 of 4 :: determine resource parameters for the job

    *src_resc_name = "NULL"
    get_resource_name_by_id(*src_resc_id, *src_resc_name)
    if("NULL" == *src_resc_name) {
        failmsg(-1,"get_resource_name_by_id failed for [*src_resc_id]")
    }
    writeLine("serverLog", "src_resc_name [*src_resc_name]")

    *src_phy_path = "NULL"
    get_phy_path_for_object_on_resc_id(*obj_name, *src_resc_id, *src_phy_path)
    if("NULL" == *src_phy_path) {
        failmsg(-1,"get_phy_path_for_object_on_resc_id failed for [*obj_name] [*src_resc_id]")
    }
    writeLine("serverLog", "src_phy_path [*src_phy_path]")

    *server_host = "NULL"
    get_resource_host_by_id(*src_resc_id, *server_host);
    if("NULL" == *server_host) {
        failmsg(-1,"get_resource_host_by_id failed for [*src_resc_id]")
    }
    writeLine("serverLog", "server_host [*server_host]")

Thumbnail Service - interface functions

4 of 4 :: launch job and capture products

    # launch image computation job
    submit_thumbnail_job(
        *server_host,
        *size_str,
        *src_phy_path,
        *dst_phy_path)

    # launch registration and replication 
    register_and_replicate_thumbnail(
               *server_host,
               *dst_obj_path,
               *src_resc_name,
               *dst_phy_path,
               *lts_resc_name);

}

Thumbnail Service - user space invocation

create_thumbnails {

    *err = errormsg(create_thumbnail_collection(*src_obj_path, *dst_phy_dir), *msg)
    if(0 != *err) {
        writeLine( "stdout", "FAIL: [*err] [*msg]")
    }
}

INPUT *src_obj_path="/tempZone/home/rods/stickers.jpg",*dst_phy_dir="/tmp/irods/thumbnails"
OUTPUT ruleExecOut
create_thumbnails.r

Thumbnail Service - user space invocation

find_thumbnails {
    *thb_list = list()

    *err = errormsg(get_list_of_thumbnails(*src_obj_path, *thb_list), *msg)
    if(0 != *err) {
        writeLine( "stdout", "FAIL: [*err] [*msg]")
    }

    foreach( *t in *thb_list ) {
        writeLine("stdout", "thumbnail [*t]")
    }

}

INPUT *src_obj_path="/tempZone/home/rods/stickers.jpg"
OUTPUT ruleExecOut
find_thumbnails.r

Thumbnail Service - testing

irods@icat:~$ iput /tmp/stickers.jpg
irods@icat:~$ ils -l
/tempZone/home/rods:
  rods              0 demoResc      2157087 2017-05-09.18:42 & stickers.jpg
irods@icat:~$ irule -F create_thumbnails.r
irods@icat:~$ ils -l
/tempZone/home/rods:
  rods              0 demoResc      2157087 2017-05-09.18:42 & stickers.jpg
  rods              1 img_resc      2157087 2017-05-09.18:43 & stickers.jpg
  C- /tempZone/home/rods/stickers_thumbnails

irods@icat:~$ iqstat
...

irods@icat:~$ ils -l /tempZone/home/rods/stickers_thumbnails
/tempZone/home/rods/stickers_thumbnails:
  rods              1 lts_resc       229954 2017-05-09.18:43 & stickers_thumbnail_1024x1024.jpg
  rods              1 lts_resc         6456 2017-05-09.18:43 & stickers_thumbnail_128x128.jpg
  rods              1 lts_resc        19355 2017-05-09.18:43 & stickers_thumbnail_256x256.jpg
  rods              1 lts_resc        63036 2017-05-09.18:43 & stickers_thumbnail_512x512.jpg

irods@icat:~$ irule -F find_thumbnails.r
thumbnail [/tempZone/home/rods/stickers_thumbnails/stickers_thumbnail_1024x1024.jpg]
thumbnail [/tempZone/home/rods/stickers_thumbnails/stickers_thumbnail_512x512.jpg]
thumbnail [/tempZone/home/rods/stickers_thumbnails/stickers_thumbnail_256x256.jpg]
thumbnail [/tempZone/home/rods/stickers_thumbnails/stickers_thumbnail_128x128.jpg]

Extending iRODS with the Rule Engine

  • All rules should be created and tested in user space before being installed as a rule base

  • Rules may be refactored into a microservice plugin

  • Rules may be refactored into a C++ rule engine plugin

  • Rules may be refactored into an API plugin

UGM 2017 - Taking Data to Compute

By iRODS Consortium

Training to accompany the one page data management design pattern: https://irods.org/images/data_to_compute.jpg
