Policy Training
Taking Compute to Data
Jason M. Coposky
@jason_coposky
Executive Director, iRODS Consortium
June 25-28, 2019
iRODS User Group Meeting
University of Utrecht, NL
The Compute to Data Use Case
Data is assumed to already be routed to an appropriate storage resource
Goals - Develop generic interface concept for compute
"Compute To Data" Pattern - Salient Features
Implemented as an iRODS rulebase: following the Template Method pattern
Components of the System
System Component         Implementation
Job Initialization       iRODS Rule Base
Container Technology     Docker
User Provided Compute    Jupyter Notebook
Getting Started
Clone irods_training repository and configure build tools
As the ubuntu user (if necessary)
git clone https://github.com/irods/irods_training
sudo apt-get -y install \
    irods-externals-cmake3.5.2-0 \
    irods-externals-clang3.8-0 \
    irods-externals-qpid-with-proton0.34-0 \
    irods-dev
export PATH=/opt/irods-externals/cmake3.5.2-0/bin:$PATH
Getting Started
Build and Install packages for the compute-to-data example
cd
mkdir build_compute_to_data
cd build_compute_to_data
cmake ../irods_training/advanced/hpc_compute_to_data
make package
sudo dpkg -i irods-hpc-compute-to-data-example_4.2.6~xenial_amd64.deb
cd
mkdir build_register_microservice
cd build_register_microservice
cmake ../irods_training/advanced/hpc_compute_to_data/msvc__msiregister_as_admin/
make package
sudo dpkg -i irods-microservice-register_as_admin-4.2.6-ubuntu16-x86_64.deb
Build Docker image for processing
cd /home/ubuntu/irods_training/advanced/hpc_compute_to_data/jupyter_notebook
docker build -t testimages/jupyter-digital-filter .
Getting Started
Make sure the Python rule engine plugin is installed.
sudo apt-get -y install irods-rule-engine-plugin-python
Install Python's pip package
sudo apt-get -y install python-pip
Add irods user to the docker group
sudo usermod -aG docker irods
Getting Started
As the irods user - install the Python Docker API
pip install docker --user
Restart the iRODS server
sudo service irods restart
or
sudo su irods -c '~/irodsctl restart'
Further Setup and Configuration
Edit /etc/irods/server_config.json
"rule_engines": [
{
"instance_name": "irods_rule_engine_plugin-irods_rule_language-instance",
"plugin_name": "irods_rule_engine_plugin-irods_rule_language",
...
"shared_memory_instance": "irods_rule_language_rule_engine"
},
{
"instance_name": "irods_rule_engine_plugin-python-instance",
"plugin_name": "irods_rule_engine_plugin-python",
"plugin_specific_configuration": {}
},
. . .
Create /etc/irods/core.py with the following import:
from compute_to_data import *
Configure Python Rule Engine Plugin
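A quick way to confirm the edit took effect is to look for the Python plugin in the rule_engines list. The following is only a sketch: find_plugin is a hypothetical helper, and the sample list mirrors just the fields shown above rather than a complete server_config.json.

```python
import json

PYTHON_PLUGIN = "irods_rule_engine_plugin-python"


def find_plugin(rule_engines, plugin_name=PYTHON_PLUGIN):
    """Return the position of plugin_name in the rule_engines list, or -1."""
    for index, engine in enumerate(rule_engines):
        if engine.get("plugin_name") == plugin_name:
            return index
    return -1


# Sample data: only the fields visible on the slide, not a full configuration.
sample_engines = json.loads("""
[
  {
    "instance_name": "irods_rule_engine_plugin-irods_rule_language-instance",
    "plugin_name": "irods_rule_engine_plugin-irods_rule_language"
  },
  {
    "instance_name": "irods_rule_engine_plugin-python-instance",
    "plugin_name": "irods_rule_engine_plugin-python",
    "plugin_specific_configuration": {}
  }
]
""")

position = find_plugin(sample_engines)
print("python plugin position:", position)
```

A result of -1 would mean the plugin stanza is missing from the list.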
Getting Started
This demonstration will be run as rodsuser 'alice'
iadmin mkuser alice rodsuser
iadmin moduser alice password apass
Create two unixfilesystem resources
iadmin mkresc lts_resc unixfilesystem `hostname`:/tmp/irods/lts_resc
iadmin mkresc dsp_resc unixfilesystem `hostname`:/tmp/irods/dsp_resc
Annotate them with appropriate metadata given their roles
imeta add -R lts_resc COMPUTE_RESOURCE_ROLE LONG_TERM_STORAGE
imeta add -R dsp_resc COMPUTE_RESOURCE_ROLE SIGNAL_PROCESSING
This is defined in the configuration as part of the contract
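To illustrate how a role annotation can drive resource selection, here is a minimal sketch. It assumes the resource metadata has already been fetched into a plain dictionary; resources_with_role is a hypothetical helper and the dictionary stands in for a catalog query, so this is not how the rulebase itself performs the lookup.

```python
def resources_with_role(resource_metadata, role,
                        attribute="COMPUTE_RESOURCE_ROLE"):
    """Return the sorted names of resources whose role attribute equals role."""
    return sorted(
        name
        for name, avus in resource_metadata.items()
        if avus.get(attribute) == role
    )


# In-memory stand-in for the metadata added with imeta above.
resource_metadata = {
    "lts_resc": {"COMPUTE_RESOURCE_ROLE": "LONG_TERM_STORAGE"},
    "dsp_resc": {"COMPUTE_RESOURCE_ROLE": "SIGNAL_PROCESSING"},
    "demoResc": {},  # no role annotation
}

compute_targets = resources_with_role(resource_metadata, "SIGNAL_PROCESSING")
print(compute_targets)
```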
Finally ...
Remember to log in as 'alice' in the ubuntu training account:
ubuntu$ iinit
ERROR: environment_properties::capture: missing environment file. should be at [/home/ubuntu/.irods/irods_environment.json]
One or more fields in your iRODS environment file (irods_environment.json) are missing; please enter them.
Enter the host name (DNS) of the server to connect to: localhost
Enter the port number: 1247
Enter your irods user name: alice
Enter your irods zone: tempZone
Those values will be added to your environment file (for use by other iCommands) if the login succeeds.
Enter your current iRODS password:
ubuntu$ ils
/tempZone/home/alice:
ubuntu$
The configuration interface
Define interfaces for any necessary conventions
Single Point of Truth - Template Method Pattern
Users may utilize metadata conventions within a rule to provide inputs to the generalized container service.
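For readers unfamiliar with the Template Method pattern, the sketch below shows its general shape in Python: a base class fixes the order of the steps and subclasses supply the steps themselves. The class names and the three steps are hypothetical illustrations, loosely inferred from this demonstration, and are not taken from the rulebase.

```python
class ContainerTask:
    """Skeleton of a compute-to-data job; subclasses fill in each step."""

    def run(self):
        # The template method: the sequence of steps is defined once, here.
        return [
            self.stage_inputs(),
            self.launch(),
            self.register_outputs(),
        ]

    def stage_inputs(self):
        raise NotImplementedError

    def launch(self):
        raise NotImplementedError

    def register_outputs(self):
        raise NotImplementedError


class DockerTask(ContainerTask):
    """One concrete variant; only the individual steps are overridden."""

    def stage_inputs(self):
        return "replicate inputs to the compute resource"

    def launch(self):
        return "run the docker container"

    def register_outputs(self):
        return "register outputs in the catalog"


steps = DockerTask().run()
print(steps)
```

A second container technology would add another subclass and leave run() untouched, which is what keeps the interface a single point of truth.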
Reminder ...
Implemented as an iRODS rulebase: following the Template Method pattern
The iRODS Rule Language Rule File
/home/ubuntu/spawn_remote_containers.r
main {
    container_dispatch("containers.run","/tempZone/home/alice/task_config.json","dsp_resc","","")
}
INPUT null
OUTPUT ruleExecOut
irule provides a user-land entry point for the invocation of the Compute to Data Policy
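The first argument, "containers.run", has the same shape as the containers.run method of the Python Docker API client installed earlier. How the rulebase interprets that string is not shown on these slides. One plausible reading, sketched here purely as an assumption, is that a dotted name is resolved attribute by attribute on a client object. FakeClient and FakeContainers are stand-ins so the example runs without Docker.

```python
from functools import reduce


def resolve_dotted(root, dotted_name):
    """Follow a dotted attribute path such as 'containers.run' from root."""
    return reduce(getattr, dotted_name.split("."), root)


class FakeContainers:
    """Stand-in for the 'containers' collection of a Docker client."""

    def run(self, image):
        return "ran " + image


class FakeClient:
    """Stand-in for a Docker client object."""

    containers = FakeContainers()


method = resolve_dotted(FakeClient(), "containers.run")
print(method("testimages/jupyter-digital-filter"))
```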
Task Configuration
{
    "container": {
        "type": "docker",
        "image": "testimages/jupyter-digital-filter",
        "command": [ "jupyter", "nbconvert",
                     "--execute",
                     "--to", "html",
                     "--output", "/outputs/lowpass_filter_processing.html",
                     "/home/jovyan/work/lpfilter.ipynb"
        ],
        "environment": {
            "INPUT_FILE_PATH" : "/inputs/%(INPUT_FILE_BASENAME)s",
            "CUTOFF_FREQUENCY_INDEX" : "0",
            "OUTPUT_FILE_PATH" : "/outputs/lowpass_filtered_%(INPUT_FILE_BASENAME)s"
        }
    },
    "external": {
        "src_collection" : "/tempZone/home/alice/notebook_input",
        "dst_collection" : "/tempZone/home/alice/notebook_output"
    },
    "internal": {
        "src_directory": "/inputs",
        "dst_directory": "/outputs"
    }
}
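The %(INPUT_FILE_BASENAME)s placeholders in the environment values use the same syntax as Python's mapping-key string formatting. Assuming the placeholders are expanded that way (the rulebase code is not shown here), the substitution can be sketched as follows. expand_environment is a hypothetical helper.

```python
import posixpath


def expand_environment(environment, input_logical_path):
    """Fill %(INPUT_FILE_BASENAME)s in each value using the input's basename."""
    values = {"INPUT_FILE_BASENAME": posixpath.basename(input_logical_path)}
    # With a mapping on the right-hand side, values without placeholders
    # (such as "0") pass through unchanged.
    return {key: template % values for key, template in environment.items()}


environment = {
    "INPUT_FILE_PATH": "/inputs/%(INPUT_FILE_BASENAME)s",
    "CUTOFF_FREQUENCY_INDEX": "0",
    "OUTPUT_FILE_PATH": "/outputs/lowpass_filtered_%(INPUT_FILE_BASENAME)s",
}

expanded = expand_environment(
    environment, "/tempZone/home/alice/notebook_input/input.dat"
)
for key in sorted(expanded):
    print(key, "=", expanded[key])
```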
Task Configuration
INPUT_FILE_BASENAME : internally computed value derived from the first input file found in the input collection
type : 'docker' or 'singularity'
image : reference name of the image in the repository
command : the command and its arguments, given as a list
environment : configuration passed through to the container
external : logical iRODS source and destination paths for data
internal : paths inside the container, mapped to local storage from the iRODS physical paths on the target resource
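To make the internal mapping concrete, the sketch below turns a task configuration into keyword arguments of the kind the Python Docker API's containers.run accepts (image, command, environment, volumes). It only builds the dictionary and does not call Docker. build_run_kwargs is a hypothetical helper, and the host directories are assumed to be the physical paths under the dsp_resc vault.

```python
def build_run_kwargs(config, host_input_dir, host_output_dir):
    """Translate a task configuration into containers.run style arguments."""
    container = config["container"]
    internal = config["internal"]
    return {
        "image": container["image"],
        "command": container["command"],
        "environment": dict(container.get("environment", {})),
        # Host directory -> bind point inside the container.
        "volumes": {
            host_input_dir: {"bind": internal["src_directory"], "mode": "ro"},
            host_output_dir: {"bind": internal["dst_directory"], "mode": "rw"},
        },
    }


config = {
    "container": {
        "type": "docker",
        "image": "testimages/jupyter-digital-filter",
        "command": ["jupyter", "nbconvert", "--execute"],
        "environment": {"CUTOFF_FREQUENCY_INDEX": "0"},
    },
    "internal": {"src_directory": "/inputs", "dst_directory": "/outputs"},
}

# Assumed physical paths on the compute resource.
host_in = "/tmp/irods/dsp_resc/home/alice/notebook_input"
host_out = "/tmp/irods/dsp_resc/home/alice/notebook_output"

run_kwargs = build_run_kwargs(config, host_in, host_out)
print(run_kwargs["volumes"])
```

Mounting the input directory read-only is a choice made for this sketch, not something stated in the configuration.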
The Digital Signal Processing container
FROM jupyter/base-notebook
ARG irods_gid=999
ENV IRODS_GID ${irods_gid}
USER root
RUN apt-get update && apt-get install -y vim less
RUN groupadd -g $IRODS_GID irods && usermod -aG irods jovyan
RUN sed -i "s/jovyan:x:[0-9]*:[0-9]*\(.*\)/jovyan:x:999:999\1/" /etc/passwd
ADD lpfilter.ipynb /home/jovyan/work/.
COPY mymodule/ /home/jovyan/work/mymodule/
RUN chown jovyan.users /home/jovyan/work/lpfilter.ipynb
RUN chown -R jovyan.users /home/jovyan/work/mymodule
RUN chown -R 999:999 /home/jovyan && chown -R 999:999 /opt/conda
USER jovyan
RUN conda init
RUN conda install -y -c conda-forge matplotlib numpy
RUN jupyter trust /home/jovyan/work/lpfilter.ipynb
CMD [ "/bin/bash" ]
~/irods_training/advanced/hpc_compute_to_data/jupyter_notebook/Dockerfile
The Jupyter Notebook
Located in the training repository at:
/home/ubuntu/irods_training/advanced/hpc_compute_to_data/jupyter_notebook/lpfilter.ipynb
The notebook:
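The notebook's contents are not reproduced here. As a rough sketch of how code inside the container could pick up the values passed through the environment section of the task configuration, the example below reads the three variables named there. FilterSettings and read_settings are hypothetical names, and the real notebook may read them differently.

```python
import os
from collections import namedtuple

FilterSettings = namedtuple(
    "FilterSettings", "input_path output_path cutoff_index"
)


def read_settings(environ=None):
    """Collect the filter settings from environment variables."""
    environ = os.environ if environ is None else environ
    return FilterSettings(
        input_path=environ["INPUT_FILE_PATH"],
        output_path=environ["OUTPUT_FILE_PATH"],
        # Environment values arrive as strings, so convert the index.
        cutoff_index=int(environ.get("CUTOFF_FREQUENCY_INDEX", "0")),
    )


# Example values, as they would look after placeholder expansion.
settings = read_settings({
    "INPUT_FILE_PATH": "/inputs/input.dat",
    "OUTPUT_FILE_PATH": "/outputs/lowpass_filtered_input.dat",
    "CUTOFF_FREQUENCY_INDEX": "0",
})
print(settings)
```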
Compute to Data - Digital Filter Testing
ubuntu $ icd ; imkdir notebook_input notebook_output
ubuntu $ cd ; iput task_config.json
ubuntu $ for x in {1..512}; do echo $((x%24)) ; done >input.dat
ubuntu $ iput input.dat notebook_input
ubuntu $ ils -lr
/tempZone/home/alice:
alice 0 demoResc 853 2019-06-21.16:05 & task_config.json
C- /tempZone/home/alice/notebook_input
/tempZone/home/alice/notebook_input:
alice 0 demoResc 1318 2019-06-21.16:05 & input.dat
C- /tempZone/home/alice/notebook_output
/tempZone/home/alice/notebook_output:
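The shell loop that creates input.dat writes the values x mod 24 for x from 1 to 512, one per line, which gives a sawtooth test signal. An equivalent Python sketch, with make_input_text as a hypothetical helper name:

```python
def make_input_text(count=512, period=24):
    """Return the sawtooth test signal as text, one integer per line."""
    return "".join("%d\n" % (x % period) for x in range(1, count + 1))


data = make_input_text()
print(len(data), "bytes,", len(data.splitlines()), "samples")

# To write the file locally before running iput:
# with open("input.dat", "w") as handle:
#     handle.write(data)
```

The generated text is 1318 bytes long, which matches the size reported for input.dat in the listing above.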
Compute to Data - Digital Filter Testing
ubuntu $ irule -F spawn_remote_containers.r
ubuntu $ ils -lr
/tempZone/home/alice:
alice 0 demoResc 853 2019-06-21.16:05 & task_config.json
C- /tempZone/home/alice/notebook_input
/tempZone/home/alice/notebook_input:
alice 0 demoResc 1318 2019-06-21.16:05 & input.dat
alice 1 dsp_resc 1318 2019-06-21.16:06 & input.dat
C- /tempZone/home/alice/notebook_output
/tempZone/home/alice/notebook_output:
alice 0 dsp_resc 0 2019-06-21.16:06 & .8d63a286-943e-11e9-8013-12cc2f55e24c
C- /tempZone/home/alice/notebook_output/8d63a286-943e-11e9-8013-12cc2f55e24c
/tempZone/home/alice/notebook_output/8d63a286-943e-11e9-8013-12cc2f55e24c:
alice 0 dsp_resc 0 2019-06-21.16:06 & .8d63a286-943e-11e9-8013-12cc2f55e24c
alice 0 dsp_resc 3200 2019-06-21.16:06 & lowpass_filtered_input.dat
alice 0 dsp_resc 359430 2019-06-21.16:06 & lowpass_filter_processing.html
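Each run in the listing above lands in its own subcollection named by a UUID, alongside a dot-prefixed entry with the same identifier. The sketch below shows one way such per-run paths could be derived. It assumes a time-based UUID (uuid.uuid1), which is consistent with the identifier shown but is not confirmed by the slides, and new_output_paths is a hypothetical helper.

```python
import posixpath
import uuid


def new_output_paths(dst_collection, run_id=None):
    """Return (output collection, marker object) paths for a single run."""
    run_id = str(uuid.uuid1()) if run_id is None else run_id
    collection = posixpath.join(dst_collection, run_id)
    marker = posixpath.join(dst_collection, "." + run_id)
    return collection, marker


collection, marker = new_output_paths(
    "/tempZone/home/alice/notebook_output",
    run_id="8d63a286-943e-11e9-8013-12cc2f55e24c",
)
print(collection)
print(marker)
```

Giving every run a fresh collection keeps repeated invocations from overwriting earlier results.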
Compute to Data - Digital Filter Results
sudo su - irods
cd /tmp/irods/dsp_resc/home/alice/notebook_output
python -m SimpleHTTPServer 8080
Navigate to HTML file under notebook_output
Compute to Data - Digital Filter Results
picture here of results
Thank you
Any Questions?