我有一个输入文件,其中包含:
123,apple,orange
123,pineapple,strawberry
543,grapes,orange
790,strawberry,apple
870,peach,grape
543,almond,tomato
123,orange,apple
我希望输出为:重复以下数字:123 543
有没有办法使用 awk 获得这个输出?我正在用 solaris bash 编写脚本
sed -e 's/,/ , /g' <filename> | awk '{print $1}' | sort | uniq -d
awk -vFS=',' \
'{KEY=$1;if (KEY in KEYS) { DUPS[KEY]; }; KEYS[KEY]; } \
END{print "Repeated Keys:"; for (i in DUPS){print i} }' \
< yourfile
也有 sort/uniq/cut 的解决方案(见上文)。
如果你可以不使用 awk,你可以使用它来获取重复的数字:
cut -d, -f 1 my_file.txt | sort | uniq -d
印刷
123
543
编辑:(回应您的评论)
您可以缓冲输出并决定是否要继续。例如:
out=$(cut -d, -f 1 a.txt | sort | uniq -d | tr '\n' ' ')
if [[ -n $out ]] ; then
echo "The following numbers are repeated: $out"
exit
fi
# continue...
此脚本将仅打印重复多次的第一列的编号:
awk -F, '{a[$1]++}END{printf "The following numbers are repeated: ";for (i in a) if (a[i]>1) printf "%s ",i; print ""}' file
或者更短的形式:
awk -F, 'BEGIN{printf "Repeated "}(a[$1]++ == 1){printf "%s ", $1}END{print ""} ' file
如果您想在找到 dup 时退出脚本,则可以退出非零退出代码。例如:
awk -F, 'a[$1]++==1{dup=1}END{if (dup) {printf "The following numbers are repeated: ";for (i in a) if (a[i]>1) printf "%s ",i; print "";exit(1)}}' file
在您的主脚本中,您可以执行以下操作:
awk -F, 'a[$1]++==1{dup=1}END{if (dup) {printf "The following numbers are repeated: ";for (i in a) if (a[i]>1) printf "%s ",i; print "";exit(-1)}}' file || exit -1
或者以更易读的格式:
awk -F, '
a[$1]++==1{
dup=1
}
END{
if (dup) {
printf "The following numbers are repeated: ";
for (i in a)
if (a[i]>1)
printf "%s ",i;
print "";
exit(-1)
}
}
' file || exit -1