Service jobs

rule the_service:
    output:
        service("foo.socket")
    shell:
        # here we simulate some kind of server process that provides data via a socket
        "ln -s /dev/random {output}; sleep 10000"


# The two consumers below access the socket. By marking the output of the_service above
# as a "service", Snakemake knows that the job has to remain alive until all consumers
# are finished. Then, Snakemake will automatically terminate (SIGTERM) the service job.

# In a cluster/cloud scenario, Snakemake will submit the service together with all consumers
# to the same node.

rule consumer1:
    input:
        "foo.socket"
    output:
        "test.txt"
    shell:
        "head -n1 {input} > {output}"


rule consumer2:
    input:
        "foo.socket"
    output:
        "test2.txt"
    shell:
        "head -n1 {input} > {output}"

Define rules that provide shared memory resources, sockets, databases, etc.
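The lifecycle described in the comments above (the service stays alive until all consumers have finished, then receives SIGTERM) can be pictured with a small Python sketch. This is only an illustration under simplifying assumptions, not Snakemake's implementation; `run_with_service` and the toy commands are hypothetical names chosen for the example.

```python
import subprocess
import sys


def run_with_service(service_cmd, consumer_cmds):
    """Start a long-running service, run every consumer, then stop the service.

    Hypothetical helper for illustration only; Snakemake handles this internally.
    """
    service = subprocess.Popen(service_cmd)
    try:
        # Consumers run while the service process is still alive.
        outputs = [
            subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
            for cmd in consumer_cmds
        ]
    finally:
        # Popen.terminate() sends SIGTERM on POSIX systems.
        service.terminate()
        service.wait()
    return outputs, service.returncode


# Toy stand-ins for the rules on the slide: a sleeping "service" and two consumers.
service_cmd = [sys.executable, "-c", "import time; time.sleep(10000)"]
consumer_cmds = [
    [sys.executable, "-c", "print('consumer1 done')"],
    [sys.executable, "-c", "print('consumer2 done')"],
]

outputs, service_rc = run_with_service(service_cmd, consumer_cmds)
print(outputs)
```

The service never finishes on its own; it only ends because it is terminated once the last consumer has returned, which mirrors the behaviour described for `service(...)` outputs.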

https://snakemake.github.io

Group-local service jobs

rule the_service:
    output:
        service("foo.{groupid}.socket")
    shell:
        # here we simulate some kind of server process that provides data via a socket
        "ln -s /dev/random {output}; sleep 10000"

# To make a service job group-local, use groupid in the input functions of the consumers
# and pass it as a wildcard to the service job. This way, there will be one service job
# per job group in a cluster/cloud scenario. With local execution, there will be
# just one service job, and no modification is needed.

# This pattern for group-local execution is not limited to service jobs and can be used for
# any kind of job.
def get_socket(wildcards, groupid):
    return f"foo.{groupid}.socket"


rule consumer1:
    input:
        get_socket
    output:
        "test.txt"
    shell:
        "head -n1 {input} > {output}"


rule consumer2:
    input:
        get_socket
    output:
        "test2.txt"
    shell:
        "head -n1 {input} > {output}"

Ensure that there is one separate service job per cluster/cloud job group.
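Why this yields one service per group can be seen by evaluating the input function by hand. The sketch below assumes made-up group ids (`group-a`, `group-b`, `local`); the real values are assigned by Snakemake, and `required_service_outputs` is a hypothetical helper, not part of Snakemake.

```python
def get_socket(wildcards, groupid):
    # Same input function as on the slide.
    return f"foo.{groupid}.socket"


def required_service_outputs(consumer_groups):
    """Collect the distinct socket paths requested by a set of consumers.

    consumer_groups maps a consumer name to its (hypothetical) group id.
    Each distinct path corresponds to one instance of the service rule.
    """
    return {get_socket(None, groupid) for groupid in consumer_groups.values()}


# Cluster/cloud scenario: the consumers end up in different job groups.
cluster = required_service_outputs({"consumer1": "group-a", "consumer2": "group-b"})

# Local execution: all consumers are assumed to share a single group id.
local = required_service_outputs({"consumer1": "local", "consumer2": "local"})

print(sorted(cluster))
print(sorted(local))
```

Two distinct group ids produce two distinct socket paths, hence two service jobs; a shared group id collapses to a single path and a single service job.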


Integrated template rendering

# Jinja2 is one of the most popular generic template engines.

# This rule renders the input template into a concrete file.
rule render_jinja2_template:
    input:
        "some-jinja2-template.txt"
    output:
        "results/{sample}.rendered-version.txt"
    params:
        someparam=0.03
    template_engine:
        "jinja2"
        

# YTE is a novel template engine specialized in rendering YAML files
# while exploiting YAML syntax for control flow in combination with
# Python expressions.

# This rule renders the input template into a concrete YAML file,
# that might be used by a consuming rule to configure some tool.
rule render_yaml_template:
    input:
        "some-yte-template.yaml"
    output:
        "results/{sample}.rendered-version.yaml"
    params:
        someparam=0.03
    template_engine:
        "yte"

Rendering config files for subsequent jobs is now directly supported via a native template engine integration.


Sometimes command-line interfaces are too limited to describe complex parameters. In such cases, tools require config files as input (e.g. Circos or Varlociraptor).
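To make the idea of rendering a config file from rule values concrete, here is a minimal sketch that uses Python's built-in `str.format` as a stand-in for a real template engine. It is not Jinja2 or YTE syntax and not what Snakemake runs; the template text, the `render` helper and the sample name `A` are assumptions made for the example. In an actual Jinja2 template the same value would presumably be written along the lines of `{{ params.someparam }}`.

```python
# Stand-in template using str.format placeholders (NOT Jinja2 or YTE syntax).
TEMPLATE = "sample: {wildcards[sample]}\nthreshold: {params[someparam]}\n"


def render(template, wildcards, params):
    """Fill the template with the values a rule would provide (hypothetical helper)."""
    return template.format(wildcards=wildcards, params=params)


# Values corresponding to the rule on the slide, for a hypothetical sample "A".
rendered = render(TEMPLATE, wildcards={"sample": "A"}, params={"someparam": 0.03})
print(rendered)
```

The rendered text could then be written to a file such as `results/A.rendered-version.yaml` and handed to the consuming tool as its config file.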

Improved cluster execution

Define cancel commands and sidecars for cluster submission

(contributed by Manuel Holtgrewe, BIH)


# Define submission, cancel and status commands for any cluster. 
# This works best when persisting into a profile: https://github.com/Snakemake-Profiles/doc

snakemake --cluster ... --cluster-status ... --cluster-cancel ... --cluster-cancel-nargs 1000 --cluster-sidecar ...
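As a rough idea of what a status command might do: Snakemake invokes it with the cluster job id and expects it to report `success`, `running` or `failed`. The sketch below covers only the translation step and assumes SLURM-style state names as input; the state names, the function name `to_snakemake_status` and the omission of the actual scheduler query are all assumptions for the sake of the example.

```python
def to_snakemake_status(scheduler_state: str) -> str:
    """Translate a scheduler job state into one of the three expected answers.

    The state names below are SLURM-style and assumed for illustration;
    other schedulers use different vocabularies.
    """
    state = scheduler_state.strip().upper()
    if state == "COMPLETED":
        return "success"
    if state in {"PENDING", "RUNNING", "COMPLETING", "SUSPENDED"}:
        return "running"
    # Anything else (e.g. FAILED, CANCELLED, TIMEOUT) is treated as a failure.
    return "failed"


# Example translation of a few assumed scheduler states.
for example_state in ["COMPLETED", "running", "TIMEOUT"]:
    print(example_state, "->", to_snakemake_status(example_state))
```

A real status script would first query the scheduler for the given job id and then print the translated value; that query is deliberately left out here because it depends on the cluster.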

Snakemake 7.0

By Johannes Köster

New features in Snakemake 7.0
