1

我有一个看起来像这样的文件:

@HISEQ:331:C85AMANXX:8:1101:16636:1980 1:N:0:ATCACGAC
NTCTATAAACTCTTCATGCCAGTTCCCTGTCTCATCAGATAGATTCTGAGGCCTCTAGGCATCAGCCGGATATCCCTAAGGACAGTGTTGGAGGAACTGCTGAGTGGATTCATGGTCAACTACCAA
+
#<<BBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFF
@HISEQ:331:C85AMANXX:8:1101:2337:2047 1:N:0:ATCACGAC
CTGTGAAAACTCTTCATGCCAGTTCCCTGTCTCATCAGATAGATTCTGAGGCCTCTAGGCATCAGCCGGATATCCCTAAGGACAGTGTTGGAGGAACTGCTGAGTGGATTCATGGTCAACTACCAA
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFF<FFF<FF<BFFFF<FFFFBFFFBFFFFF<B

我正在使用以下 grep 命令:

grep -B 1 -A 2 'AGGCATCAGCCGGA' file.fastq | head > out.fastq

输出看起来像这样,您可以在第 5 行和第 10 行看到输出了两个破折号,我不希望这样:

@HISEQ:331:C85AMANXX:8:1101:16636:1980 1:N:0:ATCACGAC
NTCTATAAACTCTTCATGCCAGTTCCCTGTCTCATCAGATAGATTCTGAGGCCTCTAGGCATCAGCCGGATATCCCTAAGGACAGTGTTGGAGGAACTGCTGAGTGGATTCATGGTCAACTACCAA
+
#<<BBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFF
--
@HISEQ:331:C85AMANXX:8:1101:2337:2047 1:N:0:ATCACGAC
CTGTGAAAACTCTTCATGCCAGTTCCCTGTCTCATCAGATAGATTCTGAGGCCTCTAGGCATCAGCCGGATATCCCTAAGGACAGTGTTGGAGGAACTGCTGAGTGGATTCATGGTCAACTACCAA
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFF<FFF<FF<BFFFF<FFFFBFFFBFFFFF<B
--

有没有办法在第 5 行和第 10 行不带破折号输出?

4

3 回答 3

5

默认情况grep下,通过分隔符分隔上下文组--。一个块中可能有多个匹配项,因此行数不是恒定的。分隔符将显示块的开始和结束位置。

如果您--no-group-separatorgrep.

于 2016-03-11T17:15:31.120 回答
2

使用grep -v

grep -B 1 -A 2 'AGGCATCAGCCGGA' file.fastq |
grep -v -- '^--' |
head > out.fastq
于 2016-03-11T17:15:03.510 回答
1

您可以向 sed 添加管道:

grep -B 1 -A 2 'AGGCATCAGCCGGA' file.fastq | head | sed '/^--$/d'> out.fastq
于 2016-03-11T17:19:43.853 回答