Stian Soiland-Reyes (helped by Robin Long & Michael Crusoe)
eScience lab, The University of Manchester
BioExcel Virtual Training
2020-04-29
This work is licensed under a
Creative Commons Attribution 4.0 International License.
This work has been done as part of the BioExcel CoE (www.bioexcel.eu), a project funded by the European Union under contracts H2020-INFRAEDI-02-2018-823830 and H2020-EINFRA-2015-1-675728.
This tutorial follows the CWL User Guide
https://www.commonwl.org/user_guide/
For convenience we will use Virtual Machines from the
BioExcel Cloud Portal
We will connect using Visual Studio Code and SSH.
Note: Cloud portal VMs are only available during the Virtual Training session.
If you are following this tutorial at a later point,
you will need to use a local computer
where you can install cwltool and Docker
Access to the portal was available to participants registered for the Virtual Training session.
If you are reading this later, skip to
step #0 installation
Make sure you are on the team
CWL Training
Send questions in GoToTraining chat
Send to:Organizers
CWL can be written using any text editor.
Here we'll use Visual Studio Code
as it can connect remotely to our virtual machine.
Download from https://code.visualstudio.com/download
and install on your machine.
Old-skool alternative: Use ssh to VM, run vim from shell
To create and access our VM we need an SSH key (a public/private key pair)
Make an SSH key:
Windows users:
Now copy your public SSH key to the clipboard:
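A minimal sketch of these steps, assuming OpenSSH's ssh-keygen is installed (on Windows this would need an OpenSSH-capable shell such as Git Bash, which is an assumption). The key path, key size and empty passphrase are illustrative choices, not taken from the tutorial.

```shell
# Illustrative key location (assumption); ssh-keygen's default is ~/.ssh/id_rsa
KEY="$HOME/.ssh/id_rsa_cwl_training"
mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"

# Make a 4096-bit RSA key pair unless one already exists at that path.
# -N "" means no passphrase (assumption for brevity; a passphrase is safer).
if [ ! -f "$KEY" ]; then
  ssh-keygen -q -t rsa -b 4096 -N "" -f "$KEY"
fi

# Print the public key; select this single line and copy it to the clipboard.
cat "$KEY.pub"
```

Only the contents of the .pub file are pasted into the portal; the private key file stays on your own machine.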
Note: You do not need Virtual Machines to run CWL;
here we use VMs to ensure a consistent training experience.
Follow the BioExcel Cloud Portal instructions
https://longr.github.io/cwl-virtual-tutorial/build_vm.html
Paste in your public SSH key from before, e.g.:
ssh-rsa AAAAB3NzaC1...vr0kA2L mchssss4@ds@vm-RNNK3N1
Wait for the VM to be deployed (5-10 mins), then verify that you can log in using ssh
https://longr.github.io/cwl-virtual-tutorial/accessing_vm.html
To access the remote machine from VS Code, install and use the extension Remote - SSH
Follow instructions on
https://longr.github.io/cwl-virtual-tutorial/connecting_via_vscode.html
Open folder
/home/ubuntu
ubuntu@tsi1588147782483-1:~/training$ python3 --version
Python 3.6.8
Tip: Not all CWL implementations use Python.
Cromwell uses Scala
CWLEXEC uses Java
ubuntu@tsi1588147782483-1:~/training$ docker run hello-world
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
(amd64)
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/
For more examples and ideas, visit:
https://docs.docker.com/get-started/
ubuntu@tsi1588147782483-1:~/training$ pip3 install cwlref-runner
Processing ./.cache/pip/wheels/f2/5f/5f/8fc64a099199682669af6a9088cfb2f161d60570a840b0cf9e/cwlref_runner-1.0-py3-none-any.whl
Collecting cwltool
Using cached cwltool-3.0.20200324120055-py3-none-any.whl (800 kB)
Collecting typing-extensions
Using cached typing_extensions-3.7.4.2-py3-none-any.whl (22 kB)
Collecting pathlib2!=2.3.1
Using cached pathlib2-2.3.5-py2.py3-none-any.whl (18 kB)
..
Collecting decorator>=4.3.0
Using cached decorator-4.4.2-py2.py3-none-any.whl (9.2 kB)
Installing collected packages: typing-extensions, six, pathlib2, decorator, networkx, pyparsing, isodate, rdflib, python-dateutil, lxml, prov, bagit, idna, certifi, chardet, urllib3, requests, lockfile, rdflib-jsonld, mistune, CacheControl, ruamel.yaml.clib, ruamel.yaml, schema-salad, mypy-extensions, humanfriendly, coloredlogs, psutil, shellescape, cwltool, cwlref-runner
Successfully installed CacheControl-0.11.7 bagit-1.7.0 certifi-2020.4.5.1 chardet-3.0.4 coloredlogs-14.0 cwlref-runner-1.0 cwltool-3.0.20200324120055 decorator-4.4.2 humanfriendly-8.2 idna-2.9 isodate-0.6.0 lockfile-0.12.2 lxml-4.5.0 mistune-0.8.4 mypy-extensions-0.4.3 networkx-2.4 pathlib2-2.3.5 prov-1.5.1 psutil-5.7.0 pyparsing-2.4.7 python-dateutil-2.8.1 rdflib-4.2.2 rdflib-jsonld-0.5.0 requests-2.23.0 ruamel.yaml-0.16.5 ruamel.yaml.clib-0.2.0 schema-salad-5.0.20200416112825 shellescape-3.4.1 six-1.14.0 typing-extensions-3.7.4.2 urllib3-1.25.9
ubuntu@tsi1588147782483-1:~/training$ cwltool --version
/home/ubuntu/cwl_bioexcel/bin/cwltool 2.0.20200126090152
ubuntu@tsi1588147782483-1:~/training$ cwl-runner --version
/home/ubuntu/cwl_bioexcel/bin/cwl-runner 2.0.20200126090152
Note: Other CWL engines (e.g. toil) installed in the same Python environment may pin cwltool to an older version
ubuntu@tsi1588147782483-1:~/training$ pip3 install toil[cwl]
Collecting toil[cwl]
Using cached toil-4.0.0-py3-none-any.whl (464 kB)
Collecting pytz>=2012
Using cached pytz-2020.1-py2.py3-none-any.whl (510 kB)
Collecting docker==2.5.1
Using cached docker-2.5.1-py2.py3-none-any.whl (111 kB)
Collecting pathlib2==2.3.2
Using cached pathlib2-2.3.2-py2.py3-none-any.whl (16 kB)
Processing ./.cache/pip/wheels/6e/9c/ed/4499c9865ac1002697793e0ae05ba6be33553d098f3347fb94/future-0.18.2-py3-none-any.whl
Processing ./.cache/pip/wheels/01/63/4e/4513b03a36916a4988ba9dd0c0483e30f4973cc4b4ba56fb53/addict-2.2.0-py3-none-any.whl
Processing ./.cache/pip/wheels/a1/d9/f2/b5620c01e9b3e858c6877b1045fda5b115cf7df6490f883382/psutil-5.7.0-cp36-cp36m-linux_x86_64.whl
Collecting six>=1.10.0
Using cached six-1.14.0-py2.py3-none-any.whl (10 kB)
Collecting requests<3,>=2
Using cached requests-2.23.0-py2.py3-none-any.whl (58 kB)
Collecting decorator>=4.3.0
Using cached decorator-4.4.2-py2.py3-none-any.whl (9.2 kB)
Installing collected packages: pytz, six, websocket-client, docker-pycreds, certifi, chardet, idna, urllib3, requests, docker, pathlib2, future, addict, psutil, dill, python-dateutil, markupsafe, pyparsing, packaging, boltons, repoze.lru, routes, docutils, pyyaml, webencodings, bleach, galaxy-util, galaxy-containers, galaxy-tool-util, bagit, humanfriendly, coloredlogs, lxml, isodate, rdflib, decorator, networkx, prov, ruamel.yaml.clib, ruamel.yaml, shellescape, typing-extensions, mistune, rdflib-jsonld, lockfile, CacheControl, schema-salad, mypy-extensions, cwltool, toil
Successfully installed CacheControl-0.11.7 addict-2.2.0 bagit-1.7.0 bleach-3.1.4 boltons-20.1.0 certifi-2020.4.5.1 chardet-3.0.4 coloredlogs-14.0 cwltool-2.0.20200126090152 decorator-4.4.2 dill-0.2.7.1 docker-2.5.1 docker-pycreds-0.4.0 docutils-0.16 future-0.18.2 galaxy-containers-19.9.0 galaxy-tool-util-19.9.1 galaxy-util-19.9.0 humanfriendly-8.2 idna-2.9 isodate-0.6.0 lockfile-0.12.2 lxml-4.5.0 markupsafe-1.1.1 mistune-0.8.4 mypy-extensions-0.4.3 networkx-2.4 packaging-20.3 pathlib2-2.3.2 prov-1.5.1 psutil-5.7.0 pyparsing-2.4.7 python-dateutil-2.8.1 pytz-2020.1 pyyaml-5.3.1 rdflib-4.2.2 rdflib-jsonld-0.5.0 repoze.lru-0.7 requests-2.23.0 routes-2.4.1 ruamel.yaml-0.16.5 ruamel.yaml.clib-0.2.0 schema-salad-5.0.20200416112825 shellescape-3.4.1 six-1.14.0 toil-4.0.0 typing-extensions-3.7.4.2 urllib3-1.25.9 webencodings-0.5.1 websocket-client-0.57.0
ubuntu@tsi1588147782483-1:~/training$ toil --version
4.0.0
ubuntu@tsi1588147782483-1:~$ toil-cwl-runner --version
4.0.0
ubuntu@tsi1588147782483-1:~$ virtualenv -p python3 ~/toil
Running virtualenv with interpreter /home/ubuntu/cwl_bioexcel/bin/python3
Using real prefix '/usr'
Path not in prefix '/home/ubuntu/cwl_bioexcel/include/python3.6m' '/usr'
New python executable in /home/ubuntu/toil/bin/python3
Also creating executable in /home/ubuntu/toil/bin/python
Installing setuptools, pkg_resources, pip, wheel...done.
ubuntu@tsi1588147782483-1:~$ . ~/toil/bin/activate
(toil) ubuntu@tsi1588147782483-1:~$
(toil) ubuntu@tsi1588147782483-1:~$ pip3 install toil[cwl]
Collecting toil[cwl]
Using cached toil-4.0.0-py3-none-any.whl (464 kB)
Collecting pytz>=2012
Using cached pytz-2020.1-py2.py3-none-any.whl (510 kB)
...
Collecting requests<3,>=2
Using cached requests-2.23.0-py2.py3-none-any.whl (58 kB)
Collecting decorator>=4.3.0
Using cached decorator-4.4.2-py2.py3-none-any.whl (9.2 kB)
Installing collected packages: pytz, six, websocket-client, docker-pycreds, certifi, chardet, idna, urllib3, requests, docker, pathlib2, future, addict, psutil, dill, python-dateutil, markupsafe, pyparsing, packaging, boltons, repoze.lru, routes, docutils, pyyaml, webencodings, bleach, galaxy-util, galaxy-containers, galaxy-tool-util, bagit, humanfriendly, coloredlogs, lxml, isodate, rdflib, decorator, networkx, prov, ruamel.yaml.clib, ruamel.yaml, shellescape, typing-extensions, mistune, rdflib-jsonld, lockfile, CacheControl, schema-salad, mypy-extensions, cwltool, toil
Successfully installed CacheControl-0.11.7 addict-2.2.0 bagit-1.7.0 bleach-3.1.4 boltons-20.1.0 certifi-2020.4.5.1 chardet-3.0.4 coloredlogs-14.0 cwltool-2.0.20200126090152 decorator-4.4.2 dill-0.2.7.1 docker-2.5.1 docker-pycreds-0.4.0 docutils-0.16 future-0.18.2 galaxy-containers-19.9.0 galaxy-tool-util-19.9.1 galaxy-util-19.9.0 humanfriendly-8.2 idna-2.9 isodate-0.6.0 lockfile-0.12.2 lxml-4.5.0 markupsafe-1.1.1 mistune-0.8.4 mypy-extensions-0.4.3 networkx-2.4 packaging-20.3 pathlib2-2.3.2 prov-1.5.1 psutil-5.7.0 pyparsing-2.4.7 python-dateutil-2.8.1 pytz-2020.1 pyyaml-5.3.1 rdflib-4.2.2 rdflib-jsonld-0.5.0 repoze.lru-0.7 requests-2.23.0 routes-2.4.1 ruamel.yaml-0.16.5 ruamel.yaml.clib-0.2.0 schema-salad-5.0.20200416112825 shellescape-3.4.1 six-1.14.0 toil-4.0.0 typing-extensions-3.7.4.2 urllib3-1.25.9 webencodings-0.5.1 websocket-client-0.57.0
(cwl) ubuntu@tsi1588147782483-1:~$ . ~/toil/bin/activate
(toil) (cwl) ubuntu@tsi1588147782483-1:~$ type toil-cwl-runner
toil-cwl-runner is /home/ubuntu/toil/bin/toil-cwl-runner
(base) ubuntu@tsi1588147782483-1:~$ conda create -n cwl cwltool
Collecting package metadata (current_repodata.json): done
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: done
...
xorg-xextproto conda-forge/linux-64::xorg-xextproto-7.3.0-h14c3975_1002
xorg-xproto conda-forge/linux-64::xorg-xproto-7.0.31-h14c3975_1007
xz conda-forge/linux-64::xz-5.2.5-h516909a_0
zlib conda-forge/linux-64::zlib-1.2.11-h516909a_1006
zstd conda-forge/linux-64::zstd-1.4.4-h6597ccf_3
Proceed ([y]/n)? y
Downloading and Extracting Packages
libpng-1.6.37 | 308 KB | ######################################################## | 100%
shellescape-3.4.1 | 7 KB | ######################################################## | 100%
libuuid-2.32.1 | 26 KB | ######################################################## | 100%
decorator-4.4.2 | 11 KB | ######################################################## | 100%
pathlib2-2.3.5 | 34 KB | ######################################################## | 100%
readline-8.0 | 441 KB | ######################################################## | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate cwl
(base) ubuntu@tsi1588147782483-1:~$ conda activate cwl
(cwl) ubuntu@tsi1588147782483-1:~/training$ type cwltool
cwltool is /home/ubuntu/miniconda3/envs/cwl/bin/cwltool
(cwl) ubuntu@tsi1588147782483-1:~/training$ cwltool --version
/home/ubuntu/miniconda3/envs/cwl/bin/cwltool 3.0.20200317203547
(cwl) ubuntu@tsi1588147782483-1:~/training$ cwl-runner --version
/home/ubuntu/cwl_bioexcel/bin/cwl-runner 2.0.20200126090152
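In the session above, cwltool resolves to the conda environment while cwl-runner still resolves to ~/cwl_bioexcel, so the two commands report different versions. A small sketch for spotting this kind of mix-up; the helper name report_runner is hypothetical and uses only the shell built-in command -v:

```shell
# Print where each given command resolves on PATH, or say that it is missing.
# report_runner is a hypothetical helper, not part of cwltool or toil.
report_runner() {
  for cmd in "$@"; do
    if resolved=$(command -v "$cmd" 2>/dev/null); then
      printf '%s -> %s\n' "$cmd" "$resolved"
    else
      printf '%s -> not found on PATH\n' "$cmd"
    fi
  done
}

# The three runner commands used in this tutorial
report_runner cwltool cwl-runner toil-cwl-runner
```

If the paths point into different environments, activate only the environment you intend to use, or call the runner by its full path.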
CWL is written in a text file format called YAML
YAML is similar to JSON, in that it can express nested object structures of maps, lists, strings and numbers.
YAML syntax is intended for people to write rather than for machines to parse, so you can skip most of JSON's punctuation (braces, brackets, quotes, commas) and use indented blocks instead.
Indentation therefore defines the structure, so pay attention to consistent indentation when working through this tutorial.
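To make the comparison concrete, the sketch below writes the same small tool description twice, once as JSON and once as YAML. The echo tool is modelled on the first example of the CWL User Guide; the file names and the scratch directory are illustrative assumptions.

```shell
# Scratch directory (assumption) so nothing in the training folder is overwritten
workdir=$(mktemp -d)

# JSON form: braces, quotes and commas are all required
cat > "$workdir/1st-tool.json" <<'EOF'
{
  "cwlVersion": "v1.0",
  "class": "CommandLineTool",
  "baseCommand": "echo",
  "inputs": {
    "message": {
      "type": "string",
      "inputBinding": {"position": 1}
    }
  },
  "outputs": []
}
EOF

# YAML form: the same structure, with nesting expressed by indentation
cat > "$workdir/1st-tool.cwl" <<'EOF'
cwlVersion: v1.0
class: CommandLineTool
baseCommand: echo
inputs:
  message:
    type: string
    inputBinding:
      position: 1
outputs: []
EOF

# Optional check, only attempted if cwltool is installed
if command -v cwltool >/dev/null 2>&1; then
  cwltool --validate "$workdir/1st-tool.cwl"
fi
```

In the YAML form, moving "position: 1" two spaces to the left would make it a sibling of inputBinding instead of its child, which is why consistent indentation matters.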
Exercise
Challenge
Exercise
Challenge
Exercise
Challenge
Exercise
Not tired?
You can either work on the earlier Challenges,
or follow step #0 to install cwltool locally.
Continue the user guide at your own pace from https://www.commonwl.org/user_guide/07-containers/ onwards.
Take particular note of how to write a workflow and how to iterate with scattering.
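As a preview of scattering, here is a minimal sketch modelled on the scatter example in the CWL User Guide. The echo tool is embedded in the workflow step to keep everything in one file, and the file names, messages and scratch directory are illustrative assumptions.

```shell
# Scratch directory (assumption) so nothing in the training folder is overwritten
workdir=$(mktemp -d)

# Workflow: run the embedded echo tool once per element of message_array
cat > "$workdir/scatter-workflow.cwl" <<'EOF'
cwlVersion: v1.0
class: Workflow

requirements:
  ScatterFeatureRequirement: {}

inputs:
  message_array: string[]

steps:
  echo:
    # Tool embedded in the step; it could equally be a separate .cwl file
    run:
      class: CommandLineTool
      baseCommand: echo
      inputs:
        message:
          type: string
          inputBinding:
            position: 1
      outputs: []
    scatter: message
    in:
      message: message_array
    out: []

outputs: []
EOF

# Job file: the array that the workflow scatters over
cat > "$workdir/scatter-job.yml" <<'EOF'
message_array:
  - Hello world!
  - Hola mundo!
  - Bonjour le monde!
EOF

# Optional run, only attempted if cwltool is installed
if command -v cwltool >/dev/null 2>&1; then
  cwltool "$workdir/scatter-workflow.cwl" "$workdir/scatter-job.yml"
fi
```

The scatter field names the step input to iterate over, so the step input message receives one string at a time while the workflow input message_array holds the whole list.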