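I am currently rebuilding a fairly large database and want to join 3 tables whose contents only partially match. I have several sets of these tables, and they always come in groups of three. The situation is as follows:
--Note: all tables are ASCII, space-delimited--
T1_01 = Table 1 =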
1 + 'stuff1' + additional content 1 (where additional content only sometimes available)
2 ""
3 ""
....400
T1_02 = Table 2 =
1 + "different stuff" + additional content 2
2 ""
3 ""
... 400
T1_03 = Table 3 =
5 cols yet other stuff + 001 + additional content 3
5 cols yet other stuff + 003 ""
5 cols yet other stuff + 007 ""
...
5 cols yet other stuff + 399 some rows are skipped, varies which ones
5 cols yet other stuff + 400
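What I want is one joined table per "group". Each group has 3 tables, and they are conveniently named: T1_01, T1_02, T1_03 are tables 1, 2, 3 of group 1, then come T2_01, T2_02, T2_03, and so on. I need to do this about 60 times in total, and the output table I am hoping for is: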
T1_0123=
1 + 'stuff1' + additional content 1 1 + "different stuff" + additional content 2 5 cols yet other stuff + 001 + additional content 3
2 + 'stuff1' + additional content 1 2 + "different stuff" + additional content 2 "something to fill in the empty spaces, like a set of -99.9 values"
3 + 'stuff1' + additional content 1 3 + "different stuff" + additional content 2 5 cols yet other stuff + 003 + additional content 3
...
400 ""
So far I have done a first trial run with
join -1 1 -2 1 T1_01 T1_02 > T1_012
which works nicely, but only covers the first two tables, and
join -1 1 -2 6 T1_01 T1_03
...does not work, because join compares the key fields as text and 001 is not 1.
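To make the mismatch concrete, here is a minimal Python sketch of what I think is needed: the zero-padded key in column 6 of table 3 has to be turned into a plain number before it can match column 1 of table 1. The name normalize_key and the sample rows are made up for illustration; the real files have more columns.

```python
def normalize_key(field):
    """Turn a zero-padded row number such as '001' into the int 1."""
    return int(field)

# Hypothetical sample rows, split on whitespace like the real tables.
t1_row = "1 stuff1 extra".split()
t3_row = "a b c d e 001 extra".split()

# As text the keys differ ('1' vs '001'); as integers they agree.
same_row = normalize_key(t1_row[0]) == normalize_key(t3_row[5])
```

With the keys compared as integers, the two rows above count as the same row.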
I would like to process all 3 tables in one go, and then run something like
sed something awk $(cat list_of_T01) $(cat list_of_T02) $(cat list_of_T03)
as a batch job. I have been learning Python, so this could also be done there, but surely AWK would be easier? Any suggestions are welcome.
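For reference, this is roughly what I imagine the Python route would look like. It is only a sketch under several assumptions: the key is column 1 in tables 1 and 2 and column 6 in table 3, every data line has a numeric key, and a missing table 3 row is padded with six -99.9 values. The function names, the fill value and the width of the padding are my own choices, not anything fixed by the data.

```python
def read_table(lines, key_col):
    """Map integer row number -> list of fields for one table."""
    rows = {}
    for line in lines:
        fields = line.split()
        if len(fields) > key_col:  # skip blank or too-short lines
            rows[int(fields[key_col])] = fields  # int('001') == 1
    return rows


def merge_group(t1_lines, t2_lines, t3_lines, fill="-99.9", t3_width=6):
    """Join three tables on the row number, padding rows missing from table 3."""
    t1 = read_table(t1_lines, 0)
    t2 = read_table(t2_lines, 0)
    t3 = read_table(t3_lines, 5)  # key is the 6th column in table 3
    placeholder = [fill] * t3_width
    merged = []
    for key in sorted(t1):  # table 1 is assumed to list every row
        row = t1[key] + t2.get(key, [fill]) + t3.get(key, placeholder)
        merged.append(" ".join(row))
    return merged


def merge_group_files(prefix):
    """Read prefix_01, prefix_02, prefix_03 and write prefix_0123."""
    tables = []
    for suffix in ("01", "02", "03"):
        with open(f"{prefix}_{suffix}") as handle:
            tables.append(handle.readlines())
    with open(f"{prefix}_0123", "w") as out:
        out.write("\n".join(merge_group(*tables)) + "\n")
```

The batch job would then be a loop such as for i in range(1, 61): merge_group_files(f"T{i}"), assuming the groups are numbered T1 through T60. Because the additional content is only sometimes present, the merged rows will not have a fixed number of columns unless the optional fields are padded as well.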