bash - How to join words from different files

Question

I'm a beginner learning shell scripting in bash.

I have two different files:

Professionals.txt

ProID:ProName:CSLocal:n1:n2

12345:John Joe:CSBerlin:0:0
98765:Miller Key:CSMoscow:0:1

and

People.txt

personID:personName:Age:Local:Phone:n3

10001:Greg Linn:86:Berlin:912345678:0
10002:Peter Ronner:65:London:962345678:0
10003:Kelly Sena:91:Moscow:966645678:0
10004:Anne Tyler:87:Moscow:984973897:0

I need to write a script that produces an output file like this:

Output.txt

ProName:ProID:personName:personID:CSLocal

personName is each person who lives in the same city as the professional

Miller Key:98765:Kelly Sena:10003:CSMoscow
Miller Key:98765:Anne Tyler:10004:CSMoscow

Regards.

score 1 · Accepted Answer

join -t: -1 3 -2 4 -o1.2,1.1,2.2,2.1,2.4 \
    <(sort -t: -k3,3 Professionals.txt ) \
    <(sort -t: -k4,4 People.txt | sed 's/^\(\([^:]*:\)\{3\}\)/\1CS/')

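join does exactly what you need: it matches two lists on a given column. It requires both lists to be sorted on that column, which is what the rest of the code takes care of.
-t specifies the column separator for both sort and join
-1 and -2 tell join which column to join on in the first and second list respectively
-k tells sort which column to sort on; 3,3 means "use column 3 only"
-o tells join which columns to output
sed adds the CS prefix to the city in the People.txt list so that the names match in both lists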
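
To see what each stage contributes, the same pipeline can be split into separate steps. This is only a sketch under some assumptions: the files contain just the data rows from the question (no header line), the temporary directory and the pro.sorted / people.sorted names are made up for illustration, and LC_ALL=C is set so that sort and join agree on the ordering. Temporary files replace the process substitutions so it also runs in a plain POSIX sh.

```shell
#!/bin/sh
# Sketch: the join pipeline from the answer, split into visible steps.
# The temp directory and the *.sorted file names are illustrative only.
export LC_ALL=C                      # same collation for sort and join
tmp=$(mktemp -d)

cat > "$tmp/Professionals.txt" <<'EOF'
12345:John Joe:CSBerlin:0:0
98765:Miller Key:CSMoscow:0:1
EOF

cat > "$tmp/People.txt" <<'EOF'
10001:Greg Linn:86:Berlin:912345678:0
10002:Peter Ronner:65:London:962345678:0
10003:Kelly Sena:91:Moscow:966645678:0
10004:Anne Tyler:87:Moscow:984973897:0
EOF

# Step 1: sort the professionals on column 3 (CSLocal), the join column.
sort -t: -k3,3 "$tmp/Professionals.txt" > "$tmp/pro.sorted"

# Step 2: sort the people on column 4 (city), then put "CS" in front of
# the 4th field so the city is spelled the same way in both lists.
sort -t: -k4,4 "$tmp/People.txt" |
    sed 's/^\(\([^:]*:\)\{3\}\)/\1CS/' > "$tmp/people.sorted"

# Step 3: join column 3 of list 1 with column 4 of list 2 and pick
# ProName, ProID, personName, personID and the prefixed city.
result=$(join -t: -1 3 -2 4 -o 1.2,1.1,2.2,2.1,2.4 \
    "$tmp/pro.sorted" "$tmp/people.sorted")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With the sample data this prints the two Miller Key lines from the question, plus a John Joe:12345:Greg Linn:10001:CSBerlin line, because Berlin also appears in both files.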

score 0 · Accepted Answer

Using GNU awk:

awk -F: 'FNR==NR { map[$4][$2]=$2":"$1;next } { for ( i in map[substr($3,3)] ) { print $2":"$1":"map[substr($3,3)][i]":"$3 } }' People.txt Professionals.txt

Explanation:

awk -F: 'FNR==NR {
                  # First file (People.txt): build a two-dimensional array indexed by city, then by name.
                  # Each value holds "name:ID".
                  map[$4][$2]=$2":"$1;
                  next
                }
                {
                  # Second file (Professionals.txt): substr($3,3) is the 3rd field from its 3rd character on, i.e. the city without "CS".
                  # Loop over everyone stored for that city and print one line per match, adding the fields of the current line.
                  for ( i in map[substr($3,3)] ) {
                    print $2":"$1":"map[substr($3,3)][i]":"$3
                  }
                 }' People.txt Professionals.txt
