I'm very new to shell scripting, and I've spent all day trying to figure out how to use the "for" command. Essentially, this is what I'm trying to do:

I have a list.txt file containing a bunch of names:

name1
name2
name3

For each name in the list there are two different files, each with a different ending to its name. For example:

name1_R1
name1_R2

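The program I'm trying to run is called sickle. Basically, it takes two files (which correspond to each other) and runs an analysis on them, which is why I need this naming scheme. The sickle command looks like this: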

sickle pe -f input_file1.fastq -r input_file2.fastq -t sanger \

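If anyone can help me out, even just by telling me how to get unix to read the list of files and process each line independently, I think I can take it from there. I've tried several things, but none of them have worked.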

3 Answers

There are a couple of ways to do it. Since the names are 'one per line' in the data file, we can assume there are no newlines in the file names.

for loop

for file in $(<list.txt)
do
    sickle pe -f "${file}_file1.fastq" -r "${file}_file2.fastq" -t sanger
done

while loop with read

while read file
do
    sickle pe -f "${file}_file1.fastq" -r "${file}_file2.fastq" -t sanger
done < list.txt

The for loop only works if there are no blanks in the names (nor other white-space characters such as tabs), because the unquoted $(<list.txt) is split on white space. The while loop is clean as long as you don't have newlines in the names, though using while read -r file would give you even better protection against the unexpected, since -r stops read from treating backslashes as escape characters. The double quotes around the file name in the for loop are decorative (but harmless) because the file names cannot contain blanks, but those in the while loop prevent file names containing blanks from being split when they should not be split. It's often a good idea to quote variables every time you use them, though it strictly only matters when the variable might contain blanks but you don't want the value split up.
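To make the difference concrete, here is a small sketch that counts how many items each loop sees when one name contains a blank. The list contents ('name1' and 'my sample') and the temporary file are made up for the demonstration, and $(cat ...) is used in place of $(<...) only so that the snippet also runs in plain sh.

```shell
# Build a throwaway list whose second entry contains a blank (hypothetical names).
list=$(mktemp)
printf '%s\n' 'name1' 'my sample' > "$list"

# for loop: the unquoted command substitution is split on white space,
# so 'my sample' turns into two separate items.
for_count=0
for file in $(cat "$list"); do
    for_count=$((for_count + 1))
done

# while loop: read -r consumes one whole line at a time, and IFS= keeps
# leading and trailing blanks intact.
while_count=0
while IFS= read -r file; do
    while_count=$((while_count + 1))
done < "$list"

echo "for loop saw $for_count items"
echo "while loop saw $while_count items"
rm -f "$list"
```

The for loop reports one item more than the file has lines, which is exactly the splitting problem described above.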

I've had to guess what names should be passed to the sickle command since your question is not clear about it — I'm 99% sure I've guessed wrong, but it matches the different suffixes in your sample command assuming the base name of file is input. I've omitted the trailing backslash; it is the 'escape' character and it is not clear what you really want there.
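If the real inputs follow the _R1/_R2 scheme from the question, the while loop could be adapted roughly as below. This is only a sketch under assumptions: the .fastq extension on the input files is a guess, trim_pairs and SICKLE_CMD are hypothetical names introduced here, and only the sickle options that appear in the question are used.

```shell
# trim_pairs LIST_FILE
# Runs the command once per name in LIST_FILE, pairing NAME_R1 with NAME_R2.
# SICKLE_CMD is a hypothetical override; set it to echo for a dry run.
trim_pairs() {
    # "|| [ -n "$name" ]" still handles a last line that lacks a newline.
    while IFS= read -r name || [ -n "$name" ]; do
        # Skip empty lines so they do not produce a bogus command.
        [ -n "$name" ] || continue
        # Assumed file names: <name>_R1.fastq and <name>_R2.fastq.
        # </dev/null keeps the command from consuming the rest of the list.
        "${SICKLE_CMD:-sickle}" pe -f "${name}_R1.fastq" -r "${name}_R2.fastq" -t sanger < /dev/null
    done < "$1"
}
```

Calling SICKLE_CMD=echo trim_pairs list.txt first prints the arguments that would be passed for each name, which is a cheap way to check the file names before running the real program with trim_pairs list.txt.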

answered 2013-08-03T02:39:57.507

Use a Bash For-Loop

Bash has a very reasonable for-loop as one of its looping constructs. You can replace the echo command below with whatever custom command you want. For example:

for file in name1 name2 name3; do
  echo "${file}_R1" "${file}_R2"
done

The idea is that the loop assigns each filename to the file variable, then you append the _R1 and _R2 suffixes to them. Note that quoting may be important, and does no harm if it isn't needed, so you ought to use it as a defensive programming measure.

Use xargs for Argument Lists

If you want to read from a file instead of using the for-loop directly, you can use Bash's read builtin, but xargs is an external utility and so behaves the same no matter which shell you call it from. For example, the following uses flags available in the version of xargs from GNU findutils (the long option names are GNU-specific) to read in arguments from a file and then append a suffix to each of them:

$ xargs --arg-file=list.txt -I{} /bin/echo "{}_R1" "{}_R2"
name1_R1 name1_R2
name2_R1 name2_R2
name3_R1 name3_R2

Again, you can replace "echo" with the command line of your choice.
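Where GNU xargs is not available, roughly the same result can be had with only the -I option, which POSIX specifies, by feeding the list on standard input. The sketch below builds its own temporary list with the names from the question so that it is self-contained; the variable names are illustrative only.

```shell
# Throwaway list file holding the names from the question.
list=$(mktemp)
printf '%s\n' name1 name2 name3 > "$list"

# With -I{}, xargs runs the command once per input line and replaces
# every {} in the arguments with that line.
pairs=$(xargs -I{} echo "{}_R1" "{}_R2" < "$list")
printf '%s\n' "$pairs"
rm -f "$list"
```

Note that xargs still interprets quotes and backslashes in its input, so this suits simple names like the ones shown here.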

answered 2013-08-03T02:36:44.370

Use a while loop with read

while read fn; do
    <command> "${fn}_R1" "${fn}_R2"
done < list.txt
answered 2013-08-03T02:30:19.937