python - Modifying Python script output with bash in Nextflow

Question

I have a Python script (make_chunk.py) that takes an input file from an input channel and prints 3 arrays.

import pandas as pd
import numpy as np
import os
import sys

data=sys.argv[1]
df=pd.read_csv(data,sep='\t',header=None)
chnk_ult=df[df.columns[3]].max()

chnk_start=np.arange(0,chnk_ult,3000000)
chnk_end=chnk_start+3e6
chnk_arr=np.arange(1,len(chnk_end))
print(chnk_start, chnk_end, chnk_arr)

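I want to create 3 separate bash arrays from the output above. This works in the terminal. I want to use the same commands in a Nextflow script to create those arrays, which will be used later. So far I have tried: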

process imputation {
    publishDir params.out, mode:'copy'
    input:
    tuple val(chrom),path(in_haps),path(input_bed),path(refs),path(maps) from imp_ch
    output:
    tuple("${chrom}"),path("${chrom}.*") into imputed
    script:
    def (haps,sample)=in_haps
    def (bed, bim, fam)=input_bed
    def (haplotype, legend, samples)=refs
    """
    x="\$(make_chunk.py ${bim})"
    eval \$(echo \$x | sed 's|,| |g; s|\\[|list1=(|; s|\\[|list2=(|; s|\\[|list3=(|;s|\\]|)\\n|g;')
    start="\$(echo \${list1[@]})"
    end="\$(echo \${list2[@]})"
    chunks="\$(echo \${list3[@]})"
    impute4 -g "${haps}" -h "${haplotype}" -l "${legend}" -m "${maps}" -o "${chrom}.step10.imputed.chunk\${chunks}" -no_maf_align -o_gz -int \${start[\${chunks}]} \${end[\${chunks}]} -Ne 20000 -buffer 1000 -seed 54321
    """
}

For the Nextflow process above, I get the following error:

Command error: .command.sh: line 7: 0 1 2 3 4 5 6: syntax error in expression (error token is "1 2 3 4 5 6"

In a bash terminal, however, these commands work fine. Any help with this?

score 0 · Accepted Answer

If your bim file is just a space-delimited file, use Nextflow operators to split it.
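On the error itself: bash evaluates the subscript of an indexed array as an arithmetic expression, so `${start[${chunks}]}` fails when `chunks` holds the whole string `0 1 2 3 4 5 6`, which matches the reported "syntax error in expression". Note also that `start="$(echo ...)"` stores a plain string rather than an array. One possible way to avoid the sed/eval step altogether is to have the Python script print one line per chunk instead of three NumPy arrays. The following is only a minimal sketch under assumptions: the function name `chunk_lines`, the 0-based chunk index, and the example maximum position of 7,500,000 are illustrative and not part of the original script.

```python
import numpy as np


def chunk_lines(chnk_ult, size=3_000_000):
    """Return one 'index start end' string per chunk covering 0..chnk_ult.

    `chnk_ult` is the largest position (column 4 of the bim file in the
    question); `size` is the chunk width. Both names are illustrative.
    """
    starts = np.arange(0, chnk_ult, size).tolist()  # plain Python ints
    lines = []
    for idx, start in enumerate(starts):  # assumption: chunks numbered from 0
        end = start + size
        lines.append(f"{idx} {start} {end}")
    return lines


if __name__ == "__main__":
    # Hypothetical maximum position; the real script would compute it
    # from the bim file passed as sys.argv[1].
    for line in chunk_lines(7_500_000):
        print(line)
```

With output in this shape, the bash side could read each line with `while read -r idx start end; do ... done` and run `impute4` once per chunk, so no array indexing is needed.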
