2

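I have a problem very similar to an earlier post, "Merging two files by a single column in unix", but I want to merge my data based on two columns (the rows are in the same order, so no sorting is needed). For example: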

subjectid subID2 name age
12 121 Jane 16
24 241 Kristen 90
15 151 Clarke 78
23 231 Joann 31

subjectid subID2 prob_disease
12 121 0.009
24 241 0.738
15 151 0.392
23 231 1.2E-5

The output should look like this:

subjectid subID2 prob_disease name age
12 121 0.009 Jane 16
24 241 0.738 Kristen 90
15 151 0.392 Clarke 78
23 231 1.2E-5 Joann 31

When I use join, it only considers the first column (subjectid) and repeats the subID2 column. Is there a way to do this with join or by some other means? Thanks.

4

2 Answers

2

If the rows are in the same order, you can still join on a single column and specify which columns to output, for example:

join -o '1.1 1.2 2.3 1.3 1.4' file_a file_b

as described in join(1).
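A minimal sketch of that command run against the sample data from the question. It assumes, as the question states, that both files list their rows in the same order, and it joins on subjectid only, so a mismatched subID2 would go unnoticed. The output file name merged.txt is a hypothetical choice for illustration.

```shell
# Recreate the two sample files from the question
# (file_a and file_b are the names used in the answer above).
cat > file_a <<'EOF'
subjectid subID2 name age
12 121 Jane 16
24 241 Kristen 90
15 151 Clarke 78
23 231 Joann 31
EOF

cat > file_b <<'EOF'
subjectid subID2 prob_disease
12 121 0.009
24 241 0.738
15 151 0.392
23 231 1.2E-5
EOF

# Join on the first field only (the default) and choose the output columns:
# 1.1 = subjectid, 1.2 = subID2, 2.3 = prob_disease, 1.3 = name, 1.4 = age.
# merged.txt is a hypothetical output name.
join -o '1.1 1.2 2.3 1.3 1.4' file_a file_b > merged.txt
cat merged.txt
```

Each entry in the -o list has the form FILENUM.FIELD, so the repeated subID2 from the second file is simply never requested.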

answered 2013-04-05T18:23:38.757
2

The join command has no option to use more than one field as the join key, so you have to add some logic of your own. Assuming your files have a fixed number of fields on each line, you can use something like this:

join f1 f2 | awk '{print $1" "$2" "$3" "$4" "$6}'

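provided the field counts are as given in your examples and f1 is the file containing name and age; the repeated subID2 then lands in field 5 and is skipped. Otherwise, you need to adjust the list of fields printed by the awk command, adding or removing fields as needed.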
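If both columns really should take part in the match, one alternative (not what the answer above does) is to let awk build a composite key from the first two fields. The sketch below assumes whitespace-separated files with the layouts shown in the question; the array name prob and the output file merged_two_keys.txt are hypothetical names chosen for illustration.

```shell
# Sample inputs from the question; f1 and f2 are the file names used above.
cat > f1 <<'EOF'
subjectid subID2 name age
12 121 Jane 16
24 241 Kristen 90
15 151 Clarke 78
23 231 Joann 31
EOF

cat > f2 <<'EOF'
subjectid subID2 prob_disease
12 121 0.009
24 241 0.738
15 151 0.392
23 231 1.2E-5
EOF

awk '
    # First file on the command line (f2): store prob_disease under the
    # composite key (subjectid, subID2).
    NR == FNR { prob[$1, $2] = $3; next }
    # Second file (f1): print a row only when both key columns match.
    ($1, $2) in prob { print $1, $2, prob[$1, $2], $3, $4 }
' f2 f1 > merged_two_keys.txt
cat merged_two_keys.txt
```

Because the lookup goes through an array, this variant does not depend on the two files being in the same order, and rows whose subID2 differs between the files are left out instead of being merged.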

answered 2013-04-05T18:10:38.817