Sample input:

>Sample GJVT7LS03DEUKL
AAACTCCGCAATGCGCGCAAGC
>Sample GJVT7LS03CXJ53
AAACTCCGCAATGCGCGCAAGCGTGACGGGG
>Sample GJVT7LS03DJOYJ
AAACTCC
>Sample GJVT7LS03DMERH
AAACTCCGCAATGCGCGCAAGCGTGACGGGGGGAC
>Sample GJVT7LS03DN2RB
AAACTCCGCAATGCGCGCAAGCGTGACGG

Desired output:

>Sample_1 GJVT7LS03DEUKL
AAACTCCGCAATGCGCGCAAGC
>Sample_2 GJVT7LS03CXJ53
AAACTCCGCAATGCGCGCAAGCGTGACGGGG
>Sample_3 GJVT7LS03DJOYJ
AAACTCC
>Sample_4 GJVT7LS03DMERH
AAACTCCGCAATGCGCGCAAGCGTGACGGGGGGAC
>Sample_5 GJVT7LS03DN2RB
AAACTCCGCAATGCGCGCAAGCGTGACGG

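In other words, I want to append a counter (prefixed with "_") to the matched word on every line that matches a pattern ("Sample" in this case). Is there a sed/awk/etc. one-liner for this task?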

2 Answers

One way:

$ awk '/^>/{$1=$1"_"++i}1' file
>Sample_1 GJVT7LS03DEUKL
AAACTCCGCAATGCGCGCAAGC
>Sample_2 GJVT7LS03CXJ53
AAACTCCGCAATGCGCGCAAGCGTGACGGGG
>Sample_3 GJVT7LS03DJOYJ
AAACTCC
>Sample_4 GJVT7LS03DMERH
AAACTCCGCAATGCGCGCAAGCGTGACGGGGGGAC
>Sample_5 GJVT7LS03DN2RB
AAACTCCGCAATGCGCGCAAGCGTGACGG
answered 2013-08-15T12:57:50.980

One possible approach:

$ awk 'BEGIN{a=1}/Sample/ {$1=$1"_"a; a++}1' file
>Sample_1 GJVT7LS03DEUKL
AAACTCCGCAATGCGCGCAAGC
>Sample_2 GJVT7LS03CXJ53
AAACTCCGCAATGCGCGCAAGCGTGACGGGG
>Sample_3 GJVT7LS03DJOYJ
AAACTCC
>Sample_4 GJVT7LS03DMERH
AAACTCCGCAATGCGCGCAAGCGTGACGGGGGGAC
>Sample_5 GJVT7LS03DN2RB
AAACTCCGCAATGCGCGCAAGCGTGACGG

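For every line containing "Sample", we replace the first field with itself followed by "_" and the value of the variable a. The variable a starts at 1 and is incremented by one after each match.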
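
One caveat worth noting: assigning to $1 makes awk rebuild the whole record using the output field separator (a single space by default), so tabs or repeated spaces in a header line get collapsed. A minimal sketch that leaves the rest of the line untouched, assuming every header starts with ">Sample", uses sub() instead. The file names sample.fa and numbered.fa are made up for illustration:

```shell
# Hypothetical test input: note the two spaces in the second header.
printf '>Sample A\nAAAC\n>Sample  B\nAC\n' > sample.fa

# On header lines, replace the leading ">Sample" with itself ("&")
# followed by "_" and a running counter. sub() edits $0 in place,
# so the original spacing after the first word is preserved.
awk '/^>Sample/ { sub(/^>Sample/, "&_" (++n)) } 1' sample.fa > numbered.fa

cat numbered.fa
```

With the default separator and single-space headers, as in the question, this should give the same result as the one-liner above; the difference only shows up when the header spacing matters.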

answered 2013-08-15T12:53:37.260