Snakemake 7.0
New Features:
- service jobs
- group local execution
- template engine integration
- improved cluster execution
Breaking change:
- now requires Python >=3.7; this also means it is safe to use f-strings in workflows
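One detail worth keeping in mind when combining f-strings with Snakemake patterns: both use curly braces, so a wildcard placeholder needs doubled braces to survive f-string formatting. A minimal sketch in plain Python (the `prefix` variable and the path are hypothetical, not taken from the release notes):

```python
# Hypothetical value that might come from a workflow config.
prefix = "run1"

# {prefix} is filled in immediately by the f-string;
# {{sample}} is emitted as the literal "{sample}" for later wildcard matching.
pattern = f"results/{prefix}/{{sample}}.txt"

print(pattern)
```

The resulting string can then be used wherever a pattern containing the `sample` wildcard is expected.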
Service jobs
rule the_service:
    output:
        service("foo.socket")
    shell:
        # here we simulate some kind of server process that provides data via a socket
        "ln -s /dev/random {output}; sleep 10000"
# The two consumers below access the socket. By marking the output of the_service above
# as a "service", Snakemake knows that the job has to remain alive until all consumers
# are finished. Then, Snakemake will automatically terminate (SIGTERM) the service job.
# In a cluster/cloud scenario, Snakemake will submit the service together with all consumers
# to the same node.
rule consumer1:
    input:
        "foo.socket"
    output:
        "test.txt"
    shell:
        "head -n1 {input} > {output}"

rule consumer2:
    input:
        "foo.socket"
    output:
        "test2.txt"
    shell:
        "head -n1 {input} > {output}"
Define rules that provide shared memory resources, sockets, databases, etc.
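The lifecycle described in the comments above (the service stays alive until all consumers are finished, then receives SIGTERM) can be sketched in plain Python. This is only an illustration of the idea on a POSIX system, not how Snakemake implements it; `run_with_service` and the stand-in commands are hypothetical.

```python
import signal
import subprocess
import sys


def run_with_service(service_cmd, consumer_cmds):
    """Start a service, run all consumers, then stop the service with SIGTERM."""
    service = subprocess.Popen(service_cmd)
    try:
        # The service stays alive while every consumer runs to completion.
        for cmd in consumer_cmds:
            subprocess.run(cmd, check=True)
    finally:
        # Terminate the service once no consumer needs it any more.
        service.send_signal(signal.SIGTERM)
        service.wait()
    return service.returncode


# Stand-in commands: a long-sleeping "server" and two short-lived consumers.
service_cmd = [sys.executable, "-c", "import time; time.sleep(10000)"]
consumer_cmds = [[sys.executable, "-c", "print('consumer done')"]] * 2

returncode = run_with_service(service_cmd, consumer_cmds)
```

On POSIX, `subprocess` reports a process killed by a signal with a negative return code, so the value returned here reflects the SIGTERM that ended the service.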
Group-local service jobs
rule the_service:
    output:
        service("foo.{groupid}.socket")
    shell:
        # here we simulate some kind of server process that provides data via a socket
        "ln -s /dev/random {output}; sleep 10000"
# To make a service job group-local, use groupid in the input functions of the consumers
# and pass it as a wildcard to the service job. This way, there will be one service job
# per job group in a cluster/cloud scenario. In case of local execution, there will be
# just one service job without any modification needed.
# This pattern for group-local execution is not limited to service jobs and can be used for
# any kind of job.
def get_socket(wildcards, groupid):
    return f"foo.{groupid}.socket"

rule consumer1:
    input:
        get_socket
    output:
        "test.txt"
    shell:
        "head -n1 {input} > {output}"

rule consumer2:
    input:
        get_socket
    output:
        "test2.txt"
    shell:
        "head -n1 {input} > {output}"
Ensure that there is one separate service job per cluster/cloud job group.
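To see why this yields one service per group, the input function can be called directly with different group IDs. A small sketch in plain Python; the group IDs `group-a` and `group-b` are hypothetical placeholders, since the real values are assigned by Snakemake.

```python
from types import SimpleNamespace


def get_socket(wildcards, groupid):
    # Same input function as in the workflow above.
    return f"foo.{groupid}.socket"


# Hypothetical group IDs; the real values are assigned by Snakemake.
wildcards = SimpleNamespace()
sockets = {gid: get_socket(wildcards, gid) for gid in ("group-a", "group-b")}
print(sockets)
```

Each group asks for a differently named socket, so the `{groupid}` wildcard in the service rule's output matches once per group and a separate service job is created for each.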
Integrated template rendering
# Jinja2 is one of the most popular generic template engines.
# This rule renders the input template into a concrete file.
rule render_jinja2_template:
    input:
        "some-jinja2-template.txt"
    output:
        "results/{sample}.rendered-version.txt"
    params:
        someparam=0.03
    template_engine:
        "jinja2"
# YTE is a novel template engine specialized for rendering YAML files
# while exploiting YAML syntax for control flow in combination with
# Python expressions.
# This rule renders the input template into a concrete YAML file,
# which might be used by a consuming rule to configure some tool.
rule render_yaml_template:
    input:
        "some-yte-template.yaml"
    output:
        "results/{sample}.rendered-version.yaml"
    params:
        someparam=0.03
    template_engine:
        "yte"
Rendering config files for subsequent jobs is now directly supported via a native template engine integration.
Sometimes command-line interfaces are too limited to describe complex parameters. In such cases, tools require config files as input (e.g. Circos or Varlociraptor).
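To illustrate what such a rendering step amounts to, here is a sketch in plain Python. It uses the standard library's `string.Template` purely as a stand-in so that it runs without extra packages; it is not Jinja2 or YTE, and the template text, the `threshold` key, and the sample name are hypothetical.

```python
from string import Template

# Hypothetical template for a tool config file.
template = Template("sample: $sample\nthreshold: $someparam\n")

# Values a rule would supply through its wildcards and params.
rendered = template.substitute(sample="A", someparam=0.03)
print(rendered)
```

With the native integration, this substitution step is handled by the chosen template engine, so the rule only needs to name the template, the output file, and the parameters.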
Improved cluster execution
# Define submission, cancel and status commands for any cluster.
# This works best when persisted in a profile: https://github.com/Snakemake-Profiles/doc
snakemake --cluster ... --cluster-status ... --cluster-cancel ... --cluster-cancel-nargs 1000 --cluster-sidecar ...
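As an example of what a status command could look like: Snakemake passes it the job ID and expects it to print one of `success`, `running`, or `failed`. The sketch below assumes a Slurm cluster with `sacct` available; the scheduler choice, the state mapping, and the function names are assumptions for illustration, not part of the release notes.

```python
import subprocess

# Assumed set of Slurm states that mean the job is not finished yet.
RUNNING_STATES = {"PENDING", "CONFIGURING", "RUNNING", "COMPLETING", "SUSPENDED"}


def to_snakemake_status(scheduler_state):
    """Translate a scheduler state into 'success', 'running', or 'failed'."""
    words = scheduler_state.split()
    if not words:
        # Assumption: no accounting record yet, so treat the job as still running.
        return "running"
    state = words[0]  # e.g. "CANCELLED by 1234" becomes "CANCELLED"
    if state == "COMPLETED":
        return "success"
    if state in RUNNING_STATES:
        return "running"
    return "failed"


def query_state(jobid):
    """Ask Slurm's sacct for the state of one job (requires a Slurm cluster)."""
    out = subprocess.run(
        ["sacct", "-j", str(jobid), "--format=State", "--noheader", "--parsable2"],
        capture_output=True, text=True, check=True,
    ).stdout
    lines = out.splitlines()
    return lines[0] if lines else ""
```

A script passed to `--cluster-status` would call `print(to_snakemake_status(query_state(sys.argv[1])))` after importing `sys`.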
By Johannes Köster