File 1

a,b, c, d,session-111, e, f
p,f, y, j,session-222, e, o
p,e, c, j,session-333, e, r
t,y, u, j,session-444, r, r
t,y, u, j,session-555, e, w
e,g, m, j,session-555, e, m
e,e, m, j,session-555, e, m

File 2

session-111, data-123, 123, erwt
session-222, data-234, 345, fghjf
session-333, data-345, 456, aasdf
session-555, data-567, 789, aasdf
session-555, data-890, 121, aasdf
session-666, data-678, 121, aasdf

Desired output

a,b, c, d,session-111, e, f, data-123, 123
p,f, y, j,session-222, e, o, data-234, 345
p,e, c, j,session-333, e, r, data-345, 456
t,y, u, j,session-444, e, r, NODATA
t,y, u, j,session-555, e, r, date-567, 789
t,y, u, j,session-555, e, r, date-890, 121
e,e, m, j,session-555, e, m, NODATA

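Every line of file1 should be printed, whether or not a matching reference is found in file2. If a reference is found in file2, specific fields from it (fields 2 and 3) should be appended to the line in the output file.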

2 Answers

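If I understand correctly, you want to match field 5 of file1 against field 1 of file2, consuming the matches in order, and use a "NODATA" field when there is no match. The following comes close to what you want. I think the output you listed has some errors; see sudo_O's comment: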

parse.awk

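# Requires gawk 4.0+ (uses arrays of arrays)
BEGIN { FS = OFS = "," }

# First file (file2): queue fields 2 and 3 under each session key
FNR == NR {
  lines[$1][++count[$1]] = $2 FS $3
  next
}

# Second file (file1): no unused match left for this session
count[$5] == 0 { print $0, " NODATA" }
# Otherwise consume the next queued match, in file2 order
count[$5]  > 0 {
  count[$5]--
  print $0, lines[$5][++prn[$5]]
}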

Run it like this:

awk -f parse.awk file2 file1

Output:

a,b, c, d,session-111, e, f, data-123, 123
p,f, y, j,session-222, e, o, data-234, 345
p,e, c, j,session-333, e, r, data-345, 456
t,y, u, j,session-444, r, r, NODATA
t,y, u, j,session-555, e, w, data-567, 789
e,g, m, j,session-555, e, m, data-890, 121
e,e, m, j,session-555, e, m, NODATA
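
The script above relies on arrays of arrays (`lines[$1][...]`), which need gawk 4.0 or later. As a hedged sketch that is not part of the original answer, the same in-order matching can be written with composite `[key, n]` subscripts, which any POSIX awk should accept. The function name `join_in_order` and the `used` array are names made up for this illustration.

```shell
# Portable sketch: same in-order matching as parse.awk, but with
# composite subscripts instead of gawk-only arrays of arrays.
# join_in_order is a hypothetical helper; usage: join_in_order file2 file1
join_in_order() {
  awk '
    BEGIN { FS = OFS = "," }
    # First file (file2): queue fields 2 and 3 under the session key
    FNR == NR { lines[$1, ++count[$1]] = $2 FS $3; next }
    # Second file (file1): take the next unused queue entry, if any
    {
      if (used[$5] < count[$5]) print $0, lines[$5, ++used[$5]]
      else                      print $0, " NODATA"
    }
  ' "$1" "$2"
}
```

Calling `join_in_order file2 file1` on the sample files should give the same lines as the output listed above.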
answered 2013-02-14T09:36:41.647
Try this one-liner:

awk -F, 'NR==FNR{k[$1]=$2 OFS $3;next} {if($5 in k)print $0,k[$5];else print $0," NODATA"}'  OFS="," file2 file1
a,b, c, d,session-111, e, f, data-123, 123
p,f, y, j,session-222, e, o, data-234, 345
p,e, c, j,session-333, e, r, data-345, 456
t,y, u, j,session-444, r, r, NODATA
t,y, u, j,session-555, e, w, data-890, 121
e,g, m, j,session-555, e, m, data-890, 121
e,e, m, j,session-555, e, m, data-890, 121
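
Because `k[$1]` is overwritten for every file2 line, the one-liner keeps only the last row per session, which is why all three session-555 lines above show data-890. If the intent is instead one output line per matching file2 row (one possible reading of the question's desired output), a sketch could look like the following. `join_all_matches`, `rows`, and `n` are hypothetical names introduced here, not part of the original answer.

```shell
# Sketch: print each file1 line once per matching file2 row,
# or once with NODATA when the session has no match.
# join_all_matches is a hypothetical helper; usage: join_all_matches file2 file1
join_all_matches() {
  awk '
    BEGIN { FS = OFS = "," }
    # First file (file2): keep every row for a session, in file order
    FNR == NR { rows[$1, ++n[$1]] = $2 FS $3; next }
    # Second file (file1): session never seen in file2
    !($5 in n) { print $0, " NODATA"; next }
    # Otherwise emit one line per stored file2 row
    { for (i = 1; i <= n[$5]; i++) print $0, rows[$5, i] }
  ' "$1" "$2"
}
```

With this variant, every session-555 line of file1 is printed twice, once with data-567 and once with data-890.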
answered 2013-02-14T09:25:25.430