
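I have two tables. Table 1 has multiple columns; table 2 has a single column. My question is: how can I extract rows from table 1 based on the values in table 2? I guess a simple grep should work, but how can I grep for each line of table 2? I would like the output to keep the matching table 2 identifier.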

Thanks!

Desired output:

IPI00004233 IPI00514755;IPI00004233;IPI00106646;    Q9BRK5-1;Q9BRK5-2;
IPI00001849 IPI00420049;IPI00001849;    Q5SV97-1;Q5SV97-2;
...
......

Table 1:

IPI00436567;    Q6VEP3;
IPI00169105;IPI01010102;    Q8NH21;
IPI00465263;    Q6IEY1;
IPI00465263;    Q6IEY1;
IPI00478224;    A6NHI5;
IPI00853584;IPI00000733;IPI00166122;    Q96NU1-1;Q96NU1-2;
IPI00411886;IPI00921079;IPI00385785;    Q9Y3T9;
IPI01010975;IPI00418437;IPI01013997;IPI00329191;    Q6TDP4;
IPI00644132;IPI00844469;IPI00030240;    Q494U1-1;Q494U1-2;
IPI00420049;IPI00001849;    Q5SV97-1;Q5SV97-2;
IPI00966381;IPI00917954;IPI00028151;    Q9HCC6;
IPI00375631;    P05161;
IPI00374563;IPI00514026;IPI00976820;    O00468;
IPI00908418;    E7ERA6;
IPI00062955;IPI00002821;IPI00909677;    Q96HA4-1;Q96HA4-2;
IPI00641937;IPI00790556;IPI00889194;    Q6ZVT0-1;Q6ZVT0-2;Q6ZVT0-3;
IPI00001796;IPI00375404;IPI00217555;    Q9Y5U5-1;Q9Y5U5-2;Q9Y5U5-3;
IPI00515079;IPI00018859;    P43489;
IPI00514755;IPI00004233;IPI00106646;    Q9BRK5-1;Q9BRK5-2;
IPI00064848;    Q96L58;
IPI00373976;    Q5T7M4;
IPI00375728;IPI86;IPI00383350;  Q8N2K1-1;Q8N2K1-2;
IPI01022053;IPI00514605;IPI00514599;    P51172-1;P51172-2;

Table 2:

IPI00000207
IPI00000728
IPI00000733
IPI00000846
IPI00000893
IPI00001849
IPI00002214
IPI00002335
IPI00002349
IPI00002821
IPI00003362
IPI00003419
IPI00003865
IPI00004233
IPI00004399
IPI00004795
IPI00004977

1 Answer


You can't make grep prepend the needle (the pattern that matched) to its output, so -f file2 on its own won't do the job.

Use a loop and add the needle manually:

while read -r token; do grep -F -- "$token" file1 | xargs -I{} echo "$token" {}; done <file2

Alternatively, you can store the results of both grep and grep -o and paste them together:

grep -f 2.txt 1.txt >a
grep -of 2.txt 1.txt >b
paste b a

If you can also use awk, try the following:

awk 'FNR==NR { a[$0];next } { for (x in a) if ($0 ~ x) print x, $0 }' 2.txt 1.txt

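Explanation: for the first file (as long as FNR==NR), store all needles in the array a ({ a[$0];next }). Then (implicitly) loop over all lines of the second file, loop over all needles again for each line, and print the needle and the line if the needle is found in it.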
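The same idea can be written out as a commented sketch. This is only an illustration, not a drop-in replacement: the function name match_ids and the array name ids are made up for this example, and it swaps the regex test ($0 ~ x) for a literal substring test with index(), on the assumption that the identifiers should be matched as plain text.

```shell
#!/bin/sh
# Hypothetical helper (name chosen for this example only).
# $1 = file with one identifier per line (table 2)
# $2 = file with the multi-column rows   (table 1)
match_ids() {
    awk '
        # First file: FNR == NR holds only while reading it.
        # Remember each non-empty identifier as an array key.
        FNR == NR { if (NF) ids[$1]; next }

        # Second file: test every stored identifier against the line.
        {
            for (id in ids)
                if (index($0, id))      # literal substring test, not a regex
                    print id, $0        # identifier first, then the original row
        }
    ' "$1" "$2"
}
```

It would be called as match_ids 2.txt 1.txt. Note that awk does not guarantee the order of for (id in ids), so a row that contains several identifiers is printed once per identifier, in no particular order.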

Answered 2013-09-16T23:20:22.290