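I'm trying to take the GIAB data index files (they are CSVs) and download each file listed in them using Nextflow. I think my overall structure is right, but when I run nextflow run file.nf nothing happens.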

Channel.fromPath(file('https://raw.githubusercontent.com/genome-in-a-bottle/giab_data_indexes/master/NA12878/sequence.index.NA12878_Illumina_HiSeq_Exome_Garvan_trimmed_fastq_09252015'))
    .splitCsv(header: true)
    .map { it.FASTQ }
    .set { giab_urls }


process download_giab {
    storeDir 'giab'

    input:
        file giab_url from giab_urls

    output:
        file '*.fastq' into giab_fastqs

    script:
        """
        lftp -c 'get $giab_url'
        """
}

The resulting log file is as follows:

Nov-13 18:18:43.537 [main] DEBUG nextflow.cli.Launcher - $> /opt/miniconda3/bin/nextflow run main.nf
Nov-13 18:18:43.653 [main] INFO  nextflow.cli.CmdRun - N E X T F L O W  ~  version 18.10.1
Nov-13 18:18:43.661 [main] INFO  nextflow.cli.CmdRun - Launching `main.nf` [agitated_cori] - revision: 5cf3310536
Nov-13 18:18:43.757 [main] DEBUG nextflow.Session - Session uuid: c19f86b4-0eff-43de-8ad4-cb7936701490
Nov-13 18:18:43.758 [main] DEBUG nextflow.Session - Run name: agitated_cori
Nov-13 18:18:43.759 [main] DEBUG nextflow.Session - Executor pool size: 4
Nov-13 18:18:43.769 [main] DEBUG nextflow.cli.CmdRun - 
  Version: 18.10.1 build 5003
  Modified: 24-10-2018 14:03 UTC (25-10-2018 01:03 AEDT)
  System: Linux 4.15.0-38-generic
  Runtime: Groovy 2.5.3 on OpenJDK 64-Bit Server VM 1.8.0_181-8u181-b13-1ubuntu0.18.04.1-b13
  Encoding: UTF-8 (UTF-8)
  Process: 8747@michael-Latitude-7480 [127.0.1.1]
  CPUs: 4 - Mem: 23.4 GB (1.9 GB) - Swap: 2 GB (2 GB)
Nov-13 18:18:43.832 [main] DEBUG nextflow.Session - Work-dir: /home/michael/Programming/CromwellValidation/work [ext2/ext3]
Nov-13 18:18:43.832 [main] DEBUG nextflow.Session - Script base path does not exist or is not a directory: /home/michael/Programming/CromwellValidation/bin
Nov-13 18:18:43.904 [main] DEBUG nextflow.Session - Session start invoked
Nov-13 18:18:43.911 [main] DEBUG nextflow.processor.TaskDispatcher - Dispatcher > start
Nov-13 18:18:43.911 [main] DEBUG nextflow.script.ScriptRunner - > Script parsing
Nov-13 18:18:44.244 [main] DEBUG nextflow.script.ScriptRunner - > Launching execution
Nov-13 18:18:44.586 [main] DEBUG nextflow.processor.ProcessFactory - << taskConfig executor: null
Nov-13 18:18:44.586 [main] DEBUG nextflow.processor.ProcessFactory - >> processorType: 'local'
Nov-13 18:18:44.593 [main] DEBUG nextflow.executor.Executor - Initializing executor: local
Nov-13 18:18:44.596 [main] INFO  nextflow.executor.Executor - [warm up] executor > local
Nov-13 18:18:44.600 [main] DEBUG n.processor.LocalPollingMonitor - Creating local task monitor for executor 'local' > cpus=4; memory=23.4 GB; capacity=4; pollInterval=100ms; dumpInterval=5m
Nov-13 18:18:44.604 [main] DEBUG nextflow.processor.TaskDispatcher - Starting monitor: LocalPollingMonitor
Nov-13 18:18:44.605 [main] DEBUG n.processor.TaskPollingMonitor - >>> barrier register (monitor: local)
Nov-13 18:18:44.616 [main] DEBUG nextflow.executor.Executor - Invoke register for executor: local
Nov-13 18:18:44.672 [main] DEBUG nextflow.Session - >>> barrier register (process: download_giab)
Nov-13 18:18:44.676 [main] DEBUG nextflow.processor.TaskProcessor - Creating operator > download_giab -- maxForks: 4
Nov-13 18:18:44.736 [main] DEBUG nextflow.script.ScriptRunner - > Await termination 
Nov-13 18:18:44.736 [main] DEBUG nextflow.Session - Session await
Nov-13 18:18:44.758 [Actor Thread 3] DEBUG nextflow.Session - <<< barrier arrive (process: download_giab)
Nov-13 18:18:44.759 [main] DEBUG nextflow.Session - Session await > all process finished
Nov-13 18:18:44.813 [Task monitor] DEBUG n.processor.TaskPollingMonitor - <<< barrier arrives (monitor: local)
Nov-13 18:18:44.813 [main] DEBUG nextflow.Session - Session await > all barriers passed
Nov-13 18:18:44.818 [main] DEBUG nextflow.trace.StatsObserver - Workflow completed > WorkflowStats[succeedCount=0; failedCount=0; ignoredCount=0; cachedCount=0; succeedDuration=0ms; failedDuration=0ms; cachedDuration=0ms]
Nov-13 18:18:44.826 [main] DEBUG nextflow.CacheDB - Closing CacheDB done
Nov-13 18:18:44.842 [main] DEBUG nextflow.script.ScriptRunner - > Execution complete -- Goodbye

Any ideas what I'm doing wrong here? Nextflow's output isn't very illuminating.

1 Answer

You need to map the FASTQ path strings to file objects using the file function, for example:

Channel.fromPath('https://raw.githubusercontent.com/genome-in-a-bottle/giab_data_indexes/master/NA12878/sequence.index.NA12878_Illumina_HiSeq_Exome_Garvan_trimmed_fastq_09252015')
    .splitCsv(header: true, sep:'\t')
    .map { file(it.FASTQ) }
    .set { giab_urls }

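Also note that you need to specify the sep option to handle the TAB-separated file, and that the file function is not needed when passing a URL to the fromPath method.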
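
To see why the separator matters, here is a minimal sketch in Python rather than Nextflow. It is only an analogy using Python's csv module, not a description of how splitCsv is implemented. The column names and URLs in INDEX_TEXT are hypothetical placeholders, and fastq_column is a helper made up for this illustration. When a tab-separated file is parsed with a comma separator, the whole header line is treated as a single column name, so looking up FASTQ finds nothing.

```python
import csv
import io

# Hypothetical tab-separated index, loosely modelled on the GIAB file.
# Column names and URLs are placeholders for illustration only.
INDEX_TEXT = (
    "FASTQ\tFASTQ_MD5\n"
    "ftp://example.org/sample_R1.fastq.gz\tabc123\n"
    "ftp://example.org/sample_R2.fastq.gz\tdef456\n"
)


def fastq_column(text, delimiter):
    """Return the FASTQ value of every row, or None where the column is missing."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [row.get("FASTQ") for row in reader]


# Comma separator: the header is read as one column named "FASTQ\tFASTQ_MD5",
# so there is no column called "FASTQ".
comma_result = fastq_column(INDEX_TEXT, ",")

# Tab separator: the header splits into FASTQ and FASTQ_MD5 as intended.
tab_result = fastq_column(INDEX_TEXT, "\t")

print(comma_result)
print(tab_result)
```

In the Nextflow snippet above, sep:'\t' plays the role of the tab delimiter here, which is what lets it.FASTQ resolve to a value.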

You can find a description of this use case here.

Answered 2018-11-13T16:26:13.237