shell - Recommended way to execute multiple shell commands with shell()

Question

In Snakemake, what is the recommended way to execute multiple commands with the shell() function?

score 48 · Accepted Answer

You can call shell() multiple times within the run block of a rule (rules can specify run: rather than shell:):

rule processing_step:
    input:
        # [...]
    output:
        # [...]
    run:
        shell("somecommand {input} > tempfile")
        shell("othercommand tempfile {output}")

Otherwise, since the run block accepts Python code, you could build a list of commands as strings and iterate over them:

rule processing_step:
    input:
        # [...]
    output:
        # [...]
    run:
        commands = [
            "somecommand {input} > tempfile",
            "othercommand tempfile {output}"
        ]
        for c in commands:
            shell(c)

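If you do not need Python code during the execution of the rule, you can use a triple-quoted string within a shell block and write the commands as you would in a shell script. This is arguably the most readable form for pure-shell rules: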

rule processing_step:
    input:
        # [...]
    output:
        # [...]
    shell:
        """
        somecommand {input} > tempfile
        othercommand tempfile {output}
        """

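One detail worth knowing about the multi-line form: as far as I know, Snakemake runs shell: commands in bash strict mode (set -euo pipefail) by default, so a failing line stops the rest of the block. The sketch below only imitates that with plain sh and set -e; false stands in for a failing somecommand, and the variable names are made up for the demo.

```shell
# Two-line script body, like the triple-quoted string in the rule above.
# `false` is a stand-in for a somecommand that fails.
script='
false
echo "second command ran"
'

# Without strict mode, the second line runs even though the first failed.
plain_output=$(sh -c "$script")

# With `set -e`, the shell exits at the first failing line,
# so the echo is never reached and nothing is printed.
strict_output=$(sh -c "set -e; $script" || true)

echo "plain:  ${plain_output}"
echo "strict: ${strict_output}"
```

So in the multi-line form you usually do not need to add && between lines just to stop on errors.
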
If the shell commands depend on the success or failure of a preceding command, you can join them with the usual shell operators such as || and &&:

rule processing_step:
    input:
        # [...]
    output:
        # [...]
    shell:
        "command_one && echo 'command_one worked' || echo 'command_one failed'"

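One caveat with the one-liner above: a && b || c is not a true if/else, because c also runs when a succeeds but b fails. A minimal sketch in plain sh, where succeed, fail, chained, pitfall and report are hypothetical stand-ins and not anything provided by Snakemake:

```shell
# Hypothetical stand-ins for command_one: one that succeeds, one that fails.
succeed() { return 0; }
fail() { return 1; }

# The chained form from the rule above, with the command passed as arguments.
chained() {
    "$@" && echo "worked" || echo "failed"
}

# Pitfall: the first command succeeds, but the middle one fails,
# so the || branch still runs and reports a failure.
pitfall() {
    succeed && fail || echo "failed"
}

# An explicit if/else only reacts to the exit status of the tested command.
report() {
    if "$@"; then
        echo "worked"
    else
        echo "failed"
    fi
}

chained succeed
chained fail
pitfall
report fail
```

With echo in the middle this rarely matters, since echo almost never fails, but it starts to matter once the middle command does real work.
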
score 3 · Accepted Answer

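Thought I would throw in this example. It may not be a direct answer to the question, but I came across this question while searching for something similar, trying to figure out how to run multiple shell commands and run some of them in a particular directory (for various reasons).

To keep things clean, you can use a shell script.

Say I have a shell script scripts/run_somecommand.sh that does the following: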

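#!/usr/bin/env sh
# Resolve all paths to absolute form before changing directory
input=$(realpath "$1")
output=$(realpath "$2")
log=$(realpath "$3")
sample="$4"

mkdir -p "data/analysis/${sample}"
# Stop here if the directory cannot be entered
cd "data/analysis/${sample}" || exit 1
somecommand --input "${input}" --output "${output}" 2> "${log}"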

Then in your Snakemake rule you can do this:

rule somerule:
    input:
        "data/{sample}.fastq"
    output:
        "data/analysis/{sample}/{sample}_somecommand.json"
    log:
        "logs/somecommand_{sample}.log"
    shell:
        "scripts/run_somecommand.sh {input} {output} {log} {sample}"

Note: if you are working on a Mac and do not have realpath, you can install it with brew install coreutils. It is a super handy command.
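
If installing coreutils is not an option, a rough fallback can be written in plain sh. This is only a sketch: abspath is a hypothetical helper name, it requires the parent directory to exist, and unlike realpath it does not resolve a symlink in the final path component. The demo also shows why the script resolves paths before cd: the absolute path stays valid after changing directory.

```shell
# abspath is a hypothetical helper, not part of coreutils or Snakemake.
# The body runs in a subshell (parentheses), so the cd does not leak out.
abspath() (
    cd "$(dirname "$1")" >/dev/null && printf '%s/%s\n' "$(pwd -P)" "$(basename "$1")"
)

# Demo: resolve a relative path, then change directory; the result stays valid.
workdir=$(mktemp -d)
mkdir -p "$workdir/data/analysis"
touch "$workdir/data/sample.fastq"

cd "$workdir"
input=$(abspath data/sample.fastq)
cd data/analysis
ls "$input"
```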
