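I'm having trouble using node-local scratch space with cfncluster and snakemake. My strategy is to write data to the local scratch area on each node of the cluster, then move the data to the NFS partition. Unfortunately, I get the following error: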

snakemake 4.0.0, cfncluster

 /shared/bin/bin/snakemake --rerun-incomplete  -s /shared/scripts/sra_to_fa_cluster.py -j 1 -p --latency-wait 20  -k  -c " qsub -cwd -V" -F 

/shared/dbGAP/sra_toolkit/sratoolkit.2.8.2-1-ubuntu64/bin/fastq-dump  --split-files --gzip  --outdir /scratch/   /shared/dbGAP/sras2/test/SRR2135300.sra
Waiting at most 20 seconds for missing files.
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/shared/bin/lib/python3.6/site-packages/snakemake/dag.py", line 319, in check_and_touch_output
    wait_for_files(expanded_output, latency_wait=wait)
  File "/shared/bin/lib/python3.6/site-packages/snakemake/io.py", line 395, in wait_for_files
    latency_wait, "\n".join(get_missing())))
OSError: Missing files after 20 seconds:
/scratch/SRR2135300_2.fastq.gz

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/shared/bin/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/shared/bin/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/shared/bin/lib/python3.6/site-packages/snakemake/executors.py", line 647, in _wait_for_jobs
    active_job.callback(active_job.job)
  File "/shared/bin/lib/python3.6/site-packages/snakemake/scheduler.py", line 287, in _proceed
    self.get_executor(job).handle_job_success(job)
  File "/shared/bin/lib/python3.6/site-packages/snakemake/executors.py", line 549, in handle_job_success
    super().handle_job_success(job, upload_remote=False)
  File "/shared/bin/lib/python3.6/site-packages/snakemake/executors.py", line 178, in handle_job_success
    ignore_missing_output=ignore_missing_output)
  File "/shared/bin/lib/python3.6/site-packages/snakemake/dag.py", line 323, in check_and_touch_output
    "wait time with --latency-wait.", rule=job.rule)
snakemake.exceptions.MissingOutputException: Missing files after 20 seconds:
/scratch/SRR2135300_2.fastq.gz
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.

This looks similar to the bug reported here: https://bitbucket.org/snakemake/snakemake/issues/462/unhandled-missingoutputexception-in

The Snakemake script is as follows:

rule all:
    input: expand("/shared/dbGAP/sras2/fastq.gz/{sample}_{end}.fastq.gz", sample=SAMPLES, end=END)

rule move:
    input: left="/scratch/{sample}_1.fastq.gz", right="/scratch/{sample}_2.fastq.gz"
    output: left="/shared/dbGAP/sras2/fastq.gz/{sample}_1.fastq.gz", right="/shared/dbGAP/sras2/fastq.gz/{sample}_2.fastq.gz"
    shell: "rsync --remove-source-files  -av {input.left} {output.left}; rsync --remove-source-files  -av {input.right} {output.right};"


rule get_fastq_files_from_sra_file:
    input: sras="/shared/dbGAP/sras2/test/{sample}.sra"
    output: left="/scratch/{sample}_1.fastq.gz", right="/scratch/{sample}_2.fastq.gz"
    shell: "/shared/dbGAP/sra_toolkit/sratoolkit.2.8.2-1-ubuntu64/bin/fastq-dump  --split-files --gzip  --outdir /scratch/   {input}"

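My feeling is that snakemake cannot "see" the scratch space on the compute nodes, so it reports the files as missing, but I don't know how to solve this.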
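
To make the intended pattern concrete, here is a minimal Python sketch of "write to local scratch, then move to the shared partition" done as a single step. It is not taken from my workflow: `run_and_stage` and its parameters are hypothetical names, and the sketch assumes the command and the move run in the same process, hence on the same node, so the scratch files never have to be visible from the head node.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_stage(command, scratch_dir, dest_dir, filenames):
    """Run `command`, then move the files it wrote to scratch_dir into dest_dir.

    `command` is an argument list and is expected to create every name in
    `filenames` inside scratch_dir. Both steps happen in this process, so the
    node-local files are only ever checked on the node that produced them.
    """
    scratch_dir = Path(scratch_dir)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # Fail loudly if the command itself exits with a non-zero status.
    subprocess.run(command, check=True)

    staged = []
    for name in filenames:
        src = scratch_dir / name
        if not src.exists():
            raise FileNotFoundError(f"expected output missing on this node: {src}")
        dst = dest_dir / name
        # shutil.move copies and deletes when src and dst are on different filesystems.
        shutil.move(str(src), str(dst))
        staged.append(dst)
    return staged


if __name__ == "__main__":
    names = ["demo_1.fastq.gz", "demo_2.fastq.gz"]
    with tempfile.TemporaryDirectory() as scratch, tempfile.TemporaryDirectory() as shared:
        # Stand-in for fastq-dump: a command that writes the named files into `scratch`.
        fake_dump = [
            sys.executable, "-c",
            "import pathlib, sys; "
            "[pathlib.Path(sys.argv[1], n).write_text('x') for n in sys.argv[2:]]",
            scratch, *names,
        ]
        moved = run_and_stage(fake_dump, scratch, shared, names)
        print([p.name for p in moved])
```

In a workflow, something like this would presumably be called from one rule whose declared outputs are only the /shared paths, so that snakemake never tracks anything under /scratch. I have not verified this on cfncluster.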
