My table file looks like this:

Name1   xxxxx  34
Name1   xxxxx  37
Name2   aaaaa  59
Name2   xxxxx  90
Name4   Name3  12

The name file looks like this:

Name1 
Name2
Name3
Name4 

I want awk to match each name (Name1, Name2, Name3, Name4) from the name file against $1 of the table file and print the sum of $3. If a name is not found, it should print 0. How can I write such an if statement in awk?

What I have done so far:

for i in $(cat Name_file)
do 
cat table | awk -v NAME="$i" '($1==NAME) {SUM+=$3} END {print NAME"\t"SUM}'
done

This gives the output:

Name1   71
Name2   149
Name3   
Name4   12

It's almost perfect. I want a 0 printed for Name3, to get this output:

Name1   71
Name2   149
Name3   0
Name4   12

So my question is: how do I add an "if not found, do this" action in awk?

2 Answers

You don't need any "not found" behavior. You simply did not initialize the SUM variable before counting. Use BEGIN {SUM = 0} for that.
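A minimal sketch of the BEGIN {SUM = 0} fix applied to the loop from the question. The scratch directory, the recreated sample files, and the `result` variable are assumptions made only so the example runs on its own:

```shell
# Recreate the sample data from the question in a scratch directory (illustrative setup).
cd "$(mktemp -d)"
printf '%s\n' 'Name1 xxxxx 34' 'Name1 xxxxx 37' 'Name2 aaaaa 59' 'Name2 xxxxx 90' 'Name4 Name3 12' > table
printf '%s\n' Name1 Name2 Name3 Name4 > Name_file

# BEGIN {SUM = 0} makes SUM numeric from the start, so a name with
# no matching row prints 0 instead of an empty string.
result=$(while read -r name; do
  awk -v NAME="$name" 'BEGIN {SUM = 0} $1 == NAME {SUM += $3} END {print NAME "\t" SUM}' table
done < Name_file)
printf '%s\n' "$result"
```

Printing (SUM + 0) in the END block instead of initializing in BEGIN should have the same effect, because the addition forces a numeric value.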

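If you explicitly need found/not-found behavior, do it in a similar way. First, initialize some variable with BEGIN {FOUND = 0}, then change it on a pattern match: (...) {FOUND = FOUND+1}, and finally test it in the END block with if(FOUND!=0).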
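The found/not-found pattern could look roughly like this. The "not found" message, the two names tested, and the `result` variable are illustrative choices, not part of the question:

```shell
# Recreate the sample table from the question in a scratch directory (illustrative setup).
cd "$(mktemp -d)"
printf '%s\n' 'Name1 xxxxx 34' 'Name1 xxxxx 37' 'Name2 aaaaa 59' 'Name2 xxxxx 90' 'Name4 Name3 12' > table

# FOUND counts the rows whose first field matches NAME;
# the END block branches on whether any row matched.
result=$(for name in Name1 Name3; do
  awk -v NAME="$name" '
    BEGIN { FOUND = 0; SUM = 0 }
    $1 == NAME { FOUND = FOUND + 1; SUM += $3 }
    END {
      if (FOUND != 0)
        print NAME "\t" SUM
      else
        print NAME "\t" "not found"
    }' table
done)
printf '%s\n' "$result"
```

Name3 appears only in the second column of the table, so it takes the else branch.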

answered 2013-06-13T10:01:05.530
Try something like this:

awk 'NR==FNR{a[$1]=0;next}$1 in a{a[$1]+=$3}END{for(i in a) print i,a[i]}' Name_file table

Output:

Name1 71
Name2 149
Name3 0
Name4 12

In this case you do not need the surrounding loop. It reads Name_file first, then processes all lines of table in one step, so it is more efficient.
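One caveat: awk does not guarantee the iteration order of for (i in a), so the lines may come out in a different order than shown. A sketch of the same two-file approach that prints the names in Name_file order, assuming Name_file is non-empty (the `order`, `sum`, and `n` names and the scratch setup are illustrative):

```shell
# Recreate the sample data from the question in a scratch directory (illustrative setup).
cd "$(mktemp -d)"
printf '%s\n' 'Name1 xxxxx 34' 'Name1 xxxxx 37' 'Name2 aaaaa 59' 'Name2 xxxxx 90' 'Name4 Name3 12' > table
printf '%s\n' Name1 Name2 Name3 Name4 > Name_file

result=$(awk '
  # First file (NR == FNR): register every name with a sum of 0
  # and remember the order in which the names appeared.
  NR == FNR { sum[$1] = 0; order[++n] = $1; next }
  # Second file: add $3 only for names that were registered.
  $1 in sum { sum[$1] += $3 }
  # Print in the original Name_file order.
  END { for (k = 1; k <= n; k++) print order[k], sum[order[k]] }
' Name_file table)
printf '%s\n' "$result"
```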

ADDED

Or a pure bash (>= 4.0) solution:

printf -v tmp "[%s]=0 " $(<Name_file)
declare -A htmp
eval htmp=($tmp)
while read a b c; do [ -n "${htmp[$a]}" ] && ((htmp[$a] += $c)); done <table
for i in ${!htmp[*]}; do echo $i ${htmp[$i]}; done

EXTENDED

The extended question was to group by $1 and $2 (and Name_file contains all the first keys from table, so it does not actually need to be processed).

cat >table <<XXX
Name1   xxxxx  34
Name1   xxxxx  37
Name2   aaaaa  59
Name2   xxxxx  90
Name4   Name3  12
XXX

awk -v SUBSEP=, '{a[$1,$2]+=$3;++n[$1,$2]}END{for(i in a) print i,a[i],n[i]}' table

Output:

Name2,xxxxx 90 1
Name2,aaaaa 59 1
Name4,Name3 12 1
Name1,xxxxx 71 2
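
If the two key parts are wanted as separate columns rather than one comma-joined field, the combined key can be split on SUBSEP in the END block. This is only a sketch: the `sum`, `count`, and `part` names are illustrative, and the sort is added solely to make the order predictable:

```shell
# Recreate the sample table from the question in a scratch directory (illustrative setup).
cd "$(mktemp -d)"
printf '%s\n' 'Name1 xxxxx 34' 'Name1 xxxxx 37' 'Name2 aaaaa 59' 'Name2 xxxxx 90' 'Name4 Name3 12' > table

result=$(awk '
  # sum[$1, $2] joins the two fields with SUBSEP to form a single array key.
  { sum[$1, $2] += $3; count[$1, $2]++ }
  END {
    for (key in sum) {
      split(key, part, SUBSEP)   # recover the two original fields
      print part[1], part[2], sum[key], count[key]
    }
  }' table | LC_ALL=C sort)
printf '%s\n' "$result"
```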
answered 2013-06-13T10:01:40.897