How to do a partial expand in Snakemake?

Posted 2019-02-19 02:57

I'm trying to first generate 4 files, for the LETTERS x NUMS combinations, then summarize over the NUMS to obtain one file per element in LETTERS:

LETTERS = ["A", "B"]
NUMS = ["1", "2"]


rule all:
    input:
        expand("combined_{letter}.txt", letter=LETTERS)

rule generate_text:
    output:
        "text_{letter}_{num}.txt"
    shell:
        """
        echo "test" > {output}
        """

rule combine_text:
    input:
        expand("text_{letter}_{num}.txt", num=NUMS)
    output:
        "combined_{letter}.txt"
    shell:
        """
        cat {input} > {output}
        """

Executing this snakefile results in the following error:

WildcardError in line 19 of /tmp/Snakefile:
No values given for wildcard 'letter'.
  File "/tmp/Snakefile", line 19, in <module>

It seems that a partial expand is not possible. Is this a limitation of expand? If so, how can I work around it?

2 answers
我想做一个坏孩纸
#2 · 2019-02-19 03:24

Indeed, braces need to be escaped by doubling them when you want expand to leave them alone. expand relies on str.format, so the formatting rules of str.format apply to expand as well.
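The escaping rule can be checked in plain Python, without Snakemake. This is only a minimal sketch of the str.format behaviour described above; the variable names `pattern`, `partially_filled` and `missing` are made up for the illustration.

```python
# Plain Python, no Snakemake needed: this only demonstrates str.format escaping.
pattern = "text_{{letter}}_{num}.txt"

# Doubled braces are an escape: format() emits them as single literal braces
# and substitutes only the single-braced field.
partially_filled = pattern.format(num="1")
print(partially_filled)  # text_{letter}_1.txt

# Without the escape, format() requires a value for every named field.
try:
    "text_{letter}_{num}.txt".format(num="1")
except KeyError as err:
    missing = err.args[0]
    print("missing field:", missing)  # missing field: letter
```

The string that comes out of the first call still contains {letter}, which is the form a rule needs so that the wildcard can be filled in later.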

倾城 Initia
#3 · 2019-02-19 03:36

It seems that this is not a limitation of expand, but a limitation of my familiarity with the way string formatting works in Python. I need to use double braces for the wildcard that should not be expanded:

LETTERS = ["A", "B"]
NUMS = ["1", "2"]


rule all:
    input:
        expand("combined_{letter}.txt", letter=LETTERS)

rule generate_text:
    output:
        "text_{letter}_{num}.txt"
    shell:
        """
        echo "test" > {output}
        """

rule combine_text:
    input:
        expand("text_{{letter}}_{num}.txt", num=NUMS)
    output:
        "combined_{letter}.txt"
    shell:
        """
        cat {input} > {output}
        """

Executing this snakefile now generates the following expected files:

text_A_2.txt
text_A_1.txt
text_B_2.txt
text_B_1.txt
combined_A.txt
combined_B.txt
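
To see why the doubled braces lead to these files, the two substitution passes can be imitated in plain Python. This is a rough sketch, not Snakemake's actual implementation: `partial_expand` is a hypothetical stand-in for expand, and the second pass only approximates what happens once the value of the letter wildcard is known for a given output file.

```python
from itertools import product


def partial_expand(pattern, **wildcards):
    """Hypothetical stand-in for expand(): one string per combination of the given values."""
    names = list(wildcards)
    return [
        pattern.format(**dict(zip(names, combo)))
        for combo in product(*(wildcards[name] for name in names))
    ]


NUMS = ["1", "2"]

# First pass: only num is filled in; the escaped {{letter}} survives as {letter}.
patterns = partial_expand("text_{{letter}}_{num}.txt", num=NUMS)
print(patterns)  # ['text_{letter}_1.txt', 'text_{letter}_2.txt']

# Second pass: once the target is combined_A.txt, letter is known to be "A".
inputs_for_A = [p.format(letter="A") for p in patterns]
print(inputs_for_A)  # ['text_A_1.txt', 'text_A_2.txt']
```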