50

Here is my problem: an arbitrary number of lines of text is given on standard input. Output: the number of non-repeated lines.

Input:

She is wearing black shoes.
My name is Johny.
I hate mondays.
My name is Johny.
I don't understand you.
She is wearing black shoes.

Output:

2
4

2 Answers

114

You could try using uniq (see man uniq) and do the following:

sort file | uniq -u | wc -l
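To make the role of each stage explicit, here is a minimal sketch of the same pipeline reading from standard input, as the question asks. The function name count_unique_lines is a hypothetical helper introduced only for illustration, and the trailing tr is an added assumption to normalize the padding that some wc implementations print.

```shell
# count_unique_lines: hypothetical helper name, not part of the original answer.
# Reads lines from stdin and prints how many of them occur exactly once.
count_unique_lines() {
    # uniq only compares adjacent lines, so sort first to group duplicates.
    # uniq -u prints only the lines that are not repeated.
    # wc -l counts them; tr strips any padding around the number.
    sort | uniq -u | wc -l | tr -d '[:space:]'
    echo    # restore the trailing newline removed by tr
}

# Usage with the sample input from the question:
printf '%s\n' \
    "She is wearing black shoes." \
    "My name is Johny." \
    "I hate mondays." \
    "My name is Johny." \
    "I don't understand you." \
    "She is wearing black shoes." | count_unique_lines
```

Without the sort step, uniq -u would treat every line of the sample as unique, because none of the duplicates are adjacent.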
answered 2013-05-01T22:38:10.463
8

Here's how I'd solve the problem:

... | awk '{n[$0]++} END {for (line in n) if (n[line]==1) num++; print num+0}'

But that's pretty opaque. Here's a (slightly) more legible way to look at it (requires bash version 4):

... | {
    declare -A count    # count is an associative array

    # iterate over each line of the input
    # accumulate the number of times we've seen this line
    #
    # the construct "IFS= read -r line" ensures we capture the line exactly

    while IFS= read -r line; do
        (( count["$line"]++ ))
    done

    # now add up the number of lines whose count is exactly 1
    num=0
    for c in "${count[@]}"; do
        if (( $c == 1 )); then
            (( num++ ))
        fi
    done

    echo "$num"
}
answered 2013-05-01T23:13:32.773