Computational Workflows

Stian Soiland-Reyes,
The University of Manchester

https://doi.org/10.1038/d41586-019-01307-2

They ride with what I refer to as the four horsemen of the reproducibility apocalypse:
  1. Publication bias
  2. Low statistical power
  3. P-value hacking
  4. HARKing (hypothesizing after results are known)

 

 

Reproducibility?

Reproducibility?

Automation

– Automate computational aspects
– Repetitive pipelines, sweep campaigns

Scalingcompute cycles

– Make use of computational infrastructure
– Handle large data

Abstraction⁠—people cycles

– Shield complexity and incompatibilities
– Report, re-usue, evolve, share, compare
– Repeat—Tweak—Repeat
– First-class commodities

Provenancereporting

– Capture, report and utilize log and data lineage
– Auto-documentation
– Tracable evolution, audit, transparency
– Reproducible science

Findable

Accessible

Interoperable

Reusable

(Reproducible)

Why use workflows?

cwlVersion: v1.0
class: Workflow
inputs:
  inp: File
  ex: string

outputs:
  classout:
    type: File
    outputSource: compile/classfile

steps:
  untar:
    run: tar-param.cwl
    in:
      tarfile: inp
      extractfile: ex
    out: [example_out]

  compile:
    run: arguments.cwl
    in:
      src: untar/example_out
    out: [classfile]

Nature 573, 149-150 (2019)
https://doi.org/10.1038/d41586-019-02619-z

cwlVersion: v1.0
class: Workflow

inputs:
  toConvert: File


outputs:
  converted: 
    type: File
    outputSource: convertMethylation/converted
  combined: 
    type: File
    outputSource: mergeSymmetric/combined


steps:
  convertMethylation:
    run: interconverter.cwl
    in:
      toConvert: toConvert
    out: [converted]
  mergeSymmetric:
    run: symmetriccpgs.cwl
    in:
      toCombine: convertMethylation/converted
    out: [combined]
cwlVersion: v1.0
class: CommandLineTool
inputs:
  toConvert:
    type: File
    inputBinding:
      prefix: -i
outputs:
  converted:
    type: File
    outputBinding:
      glob: "*.meth"
baseCommand: interconverter.sh
arguments: ["-d", $(runtime.outdir)]

hints:
  - class: DockerRequirement
    dockerPull: "quay.io/neksa/screw-tool"
cwlVersion: v1.0
class: CommandLineTool
inputs:
  toCombine:
    type: File
    inputBinding:
      prefix: -i
outputs:
  combined:
    type: File
    outputBinding:
      glob: "*.sym"

baseCommand: symmetriccpgs.sh
arguments: ["-d", $(runtime.outdir)]

hints:
  - class: DockerRequirement
    dockerPull: "quay.io/neksa/screw-tool"

2023-07-11 Computational Workflows

By Stian Soiland-Reyes

2023-07-11 Computational Workflows

  • 250