Stian Soiland-Reyes
eScience lab, The University of Manchester
BioExcel/MolSSI workshop
2018-12-13, Barcelona
This work is licensed under a Creative Commons Attribution 4.0 International License.
This work has been done as part of the BioExcel CoE (www.bioexcel.eu), a project funded by the European Union under contract H2020-EINFRA-2015-1-675728.
- Quantify the domain where your workflow is applicable. Think of hard
metrics about the scale your workflow operates at, e.g. how many users
can it support simultaneously, what is the throughput of jobs, how large
are the calculations you can run individually, how many of those
calculations can you run in parallel, and could you run more calculations
if they were smaller?
- Tell the group which scientific areas would be interested in your
workflow. Are you better suited to methods developers or black-box users?
How much overlap is there between people who develop on your workflow and
the end users?
Interoperability: not married to one workflow system on one compute platform ... even Windows!
Seamless move from laptop to cluster, cloud, HPC
Can reuse workflow snippets and tools from GitHub
(often lacking: attribution, license)
Encourages best practice workflow design (reproducibility, annotation)
... makes it harder to cheat/hack
(even JavaScript is sandboxed, can only mutate a single field)
Learning curve: Moving from procedural scripts to "functional" dataflow paradigm
Many StackOverflow questions come down to learning common design patterns
Syntax (CWL in YAML) was designed for interchange, not user editing
Users want more syntactic sugar (should not affect model)
--> Move to "compiler" paradigm
Error handling: Differences in engine implementations.
E.g. handling nulls, default values, fallback, cascading errors --> CWL v2.0
Implementation zoo: Varying degrees of complexity, usability, scalability
... how to pick a CWL engine for your compute needs?
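The shift from procedural scripts to a "functional" dataflow paradigm, listed above under the learning curve, can be illustrated with a minimal Python sketch. This is not CWL and not how any CWL engine is implemented; the names `STEPS` and `run_dataflow` are hypothetical. It only shows the core idea: steps declare which values they consume and produce, and the engine derives the execution order from those dependencies.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Procedural style: the script itself fixes the order of execution.
def procedural(text):
    words = text.split()
    upper = [w.upper() for w in words]
    return len(upper)

# Dataflow style: each step only names the values it consumes ("in") and
# the value it produces ("out"). The steps are deliberately listed out of
# order. STEPS and run_dataflow are hypothetical names for illustration.
STEPS = {
    "count": {"in": ["upper"], "out": "total", "run": len},
    "split": {"in": ["text"], "out": "words", "run": str.split},
    "shout": {"in": ["words"], "out": "upper",
              "run": lambda ws: [w.upper() for w in ws]},
}

def run_dataflow(steps, inputs):
    # Map each produced value to the step that produces it.
    producers = {s["out"]: name for name, s in steps.items()}
    # For each step, collect the steps it depends on through its inputs.
    graph = {
        name: {producers[v] for v in s["in"] if v in producers}
        for name, s in steps.items()
    }
    values = dict(inputs)
    # Run the steps in an order that respects the data dependencies.
    for name in TopologicalSorter(graph).static_order():
        step = steps[name]
        args = [values[v] for v in step["in"]]
        values[step["out"]] = step["run"](*args)
    return values
```

Calling `run_dataflow(STEPS, {"text": "a b c"})` gives the same `total` as `procedural("a b c")`, even though the steps were declared in a different order than they run.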
- What problems with your software have your users explicitly
expressed, e.g. in person, in GitHub Issues, in Slack
communications, or in questions at conferences?
- What are your sustainability plans? What is the "Bus Factor"
(https://en.wikipedia.org/wiki/Bus_factor) for your project? Does your
project have a designed termination?
- How do you promote yourself? What is your marketing strategy? What do your
users tell you about how they found you?