3

A word is any string delimited by spaces.

Suppose the file test.txt contains the following space-separated words:

hello hello hello hell osd
hello
hello 
hello
hellojames beroo helloooohellool axnber hello
way
how 

I want to count how many times the word hello appears on each line.

I used the command awk -F "hello" '{print NF-1}' test.txt to show the number of occurrences of hello on each line:

3
1
1
1
4
0
0

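So it finds 3+1+1+1+4 = 10 occurrences in total.

The problem is on the fifth line: hello occurs only once there as a separate word. Words such as hellojames and helloooohellool should not be counted, because the hello inside them is not delimited by spaces.

So I want it to find 7 occurrences of hello as a separate word.

Can you help me write a command that returns the correct total of 7?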

4

7 Answers

6
awk '{ for(i=1; i<=NF; i++) if($i=="hello") c++ } END{ print c }' file.txt

If you need it to print the count for each line:

awk '{ c=0; for(i=1; i<=NF; i++) if($i=="hello") c++; print c }' file.txt
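
The total and the per-line counts can also be produced in a single pass. The following is only a minimal sketch: the function name count_word_per_line and the awk variables w and total are illustrative and not part of the original answer.

```shell
# Hypothetical helper: print the count of a whole word for each line,
# followed by the overall total.
# Usage: count_word_per_line WORD FILE
count_word_per_line() {
    awk -v w="$1" '
        {
            c = 0
            # awk splits fields on runs of blanks, so $i is always a whole word
            for (i = 1; i <= NF; i++)
                if ($i == w) c++
            total += c
            print c
        }
        # (total + 0) prints 0 instead of an empty string when nothing matched
        END { print "total: " (total + 0) }
    ' "$2"
}
```

For the sample file in the question, count_word_per_line hello test.txt should print 3, 1, 1, 1, 1, 0, 0 on separate lines and then total: 7.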
answered 2012-05-15T00:56:34.923
3
grep -o '\<hello\>' filename | wc -l

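The \< and \> bits are word-boundary patterns, so the expression will not match foohello or hellobar.

With GNU awk, which supports these word-boundary operators, you can also use awk -F '\\<hello\\>' ... to achieve the same effect.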
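
One caveat worth checking: a word boundary is not quite the same thing as a space-delimited word. In grep, word characters are letters, digits and underscores, so the hello in hello-world still matches \<hello\>. The sketch below makes the difference visible; the function name count_boundary_matches is illustrative, and it assumes a grep that supports -o and the \< \> operators, such as GNU grep.

```shell
# Hypothetical helper: count matches of \<WORD\> on standard input.
count_boundary_matches() {
    # grep -o prints every match on its own line; wc -l counts those lines
    grep -o "\<$1\>" | wc -l
}

printf 'foohello hellobar hello\n' | count_boundary_matches hello   # 1 match
printf 'hello-world hello\n' | count_boundary_matches hello         # 2 matches
```

If the input may contain punctuation next to the word, comparing whole whitespace-separated fields is the stricter test.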

answered 2012-05-15T02:14:25.117
2

Solution:

sed 's/\s\+/\n/g' test.txt | grep -w hello  | wc -l

Explanation:

sed 's/\s\+/\n/g' test.txt

This replaces every span of whitespace with a newline, effectively reformatting the file test.txt so it has one word per line. The command sed 's/FIND/REPLACE/g' replaces the FIND pattern with REPLACE everywhere it appears. The pattern \s\+ means "one or more whitespace characters", and \n is a newline.

grep -w hello

This extracts only those lines that contain hello as a complete word.

wc -l

This counts the number of lines.
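
The \s escape in the pattern and the \n escape in the replacement are not guaranteed by POSIX sed, so the sed step may behave differently outside GNU sed. As a possible alternative, the same one-word-per-line idea can be sketched with tr. The function name count_exact_word is illustrative, and grep -x is swapped in for grep -w here so that only lines consisting of exactly the word are counted.

```shell
# Hypothetical helper: count whitespace-delimited occurrences of WORD in FILE.
# Usage: count_exact_word WORD FILE
count_exact_word() {
    # tr maps spaces and tabs to newlines and, with -s, squeezes repeated
    # newlines, leaving one word per line.
    # grep -x keeps only lines equal to the word; -F treats it as a fixed
    # string rather than a regular expression. wc -l counts what is left.
    tr -s ' \t' '\n\n' < "$2" | grep -xF -- "$1" | wc -l
}
```

With the sample file, count_exact_word hello test.txt should report 7.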


If you want to count the number of occurrences per line, you can use the same technique, but process one line at a time:

while IFS= read -r line; do
  printf '%s\n' "$line" | sed 's/\s\+/\n/g' | grep -w hello | wc -l
done < test.txt
answered 2012-05-15T01:52:41.087
0
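tr '\040' '\012' < "$FileName" | grep -x -- "$word" | wc -l

This command turns every space into a newline (\040 and \012 are the octal codes for space and newline), so each word ends up on its own line. grep -x then keeps only the lines that consist of exactly the given word, and wc -l counts them.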

answered 2013-08-06T13:07:35.817
0
a=$(printf "\01")
b=hello
sed -e "s/\<$b\>/ $a /g" -e "s/[^$a]//g" -e "s/$a/ $b /g" file | wc -w
answered 2012-05-15T01:39:16.123
0
helloCount=0
for word in $(cat test.txt); do
  if [[ ${word} == hello ]]; then
    helloCount=$(( helloCount + 1 ))
  fi
done

echo "${helloCount}"
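
A possible refinement: looping over the unquoted output of cat also subjects each word to filename expansion, so a word such as * would be replaced by the names of files in the current directory. The sketch below avoids that by turning expansion off inside a subshell, and it sticks to POSIX sh syntax. The function name count_word_sh is illustrative.

```shell
# Hypothetical helper: count whitespace-delimited occurrences of WORD in FILE.
# The body runs in a subshell, so set -f and the variables do not leak out.
# Usage: count_word_sh WORD FILE
count_word_sh() (
    set -f          # disable filename expansion while splitting words
    count=0
    # The || test also processes a last line that lacks a trailing newline
    while read -r line || [ -n "$line" ]; do
        for word in $line; do      # unquoted on purpose: split on whitespace
            if [ "$word" = "$1" ]; then
                count=$((count + 1))
            fi
        done
    done < "$2"
    echo "$count"
)
```

With the sample file, count_word_sh hello test.txt should print 7.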
answered 2012-05-15T01:30:15.040
0

Just change needle and file:

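#!/usr/bin/env bash

needle="|"
file="file_example.txt"

counter=0
# Read line by line so that empty lines are numbered too
while IFS= read -r line
do
    counter=$((counter + 1))
    # grep -o prints each match on its own line; wc -l counts them
    echo "$counter|$(printf '%s\n' "$line" | grep -o -- "$needle" | wc -l)"
done < "$file"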

It prints the line number and the number of occurrences, separated by a pipe character.

answered 2014-01-15T16:17:44.143