I have read a file and split its contents into an array of words:
# Read the whole file (the block form closes the handle), then split on whitespace
file1_contents = File.open("spam1.txt", "rb") { |file| file.read }
file1 = file1_contents.split(' ')
I can count the frequency of each word with a hash and sort the words by frequency:
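# A default value of 0 lets += work for words not seen before
freqs1 = Hash.new(0)
file1.each { |word| freqs1[word] += 1 }
# sort_by returns an array of [word, freq] pairs, ascending by frequency
freqs1 = freqs1.sort_by { |_word, freq| freq }
# Reverse in place so the most frequent words come first
freqs1.reverse!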
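For comparison, the same counting and sorting can be sketched more compactly. This is only an illustration: it assumes Ruby 2.7 or later (for Enumerable#tally), and the helper name word_frequencies and the sample string are made up for the example rather than taken from spam1.txt.

```ruby
# Hypothetical helper: returns [word, count] pairs, most frequent first.
def word_frequencies(text)
  # tally (Ruby 2.7+) builds a hash of element => number of occurrences
  counts = text.split(' ').tally
  # Negating the count sorts in descending order of frequency
  counts.sort_by { |_word, freq| -freq }
end

# Made-up sample text standing in for the file contents
word_frequencies("money fund money transfer fund money").each do |word, freq|
  puts "#{word} #{freq}"
end
```

Words with equal counts may come out in any order, since sort_by is not guaranteed to be stable.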
I can also output the results to the user like this:
freqs1.each { |word, freq| puts "#{word} #{freq}" }
I want to display a message to the user if the file1 array or the freqs1 hash contains certain words multiple times.
I have a (bad) idea: iterate over the freqs1 hash and display the appropriate message to the user:
# ('business' || 'fund' || ...) evaluates to just 'business', so test membership with include?
freqs1.each do |word, freq|
  if (%w[business fund funds account transfer money].include?(word) && freq > 2) || (word == 'Iraq' && freq > 1)
    puts "File 1 is suspected as spam mail - suspicious word frequency"
  else
    puts "File 1 does not appear to be spam email"
  end
end
However, this seems silly to me, because it runs the check and prints a message for every pair in freqs1.
How can I display a specific message to the user if words such as business, fund, funds, account, etc. appear more than twice?
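To make the goal concrete, here is a rough sketch of the behaviour I am after: one check over the whole collection and a single message at the end. The names SUSPICIOUS_THRESHOLDS, suspicious? and spam_message are hypothetical, and the thresholds simply mirror the ones in my loop; treat this as an illustration under those assumptions, not as the right way to do it.

```ruby
# Hypothetical table: a word counts as suspicious once its frequency
# exceeds the value stored here.
SUSPICIOUS_THRESHOLDS = {
  'business' => 2, 'fund' => 2, 'funds' => 2,
  'account' => 2, 'transfer' => 2, 'money' => 2,
  'Iraq' => 1
}.freeze

# True if any watched word occurs more often than its threshold.
# freqs may be a Hash or an array of [word, freq] pairs (as sort_by returns).
def suspicious?(freqs)
  freqs.any? do |word, freq|
    limit = SUSPICIOUS_THRESHOLDS[word]
    !limit.nil? && freq > limit
  end
end

# Builds exactly one message for the whole file.
def spam_message(freqs)
  if suspicious?(freqs)
    "File 1 is suspected as spam mail - suspicious word frequency"
  else
    "File 1 does not appear to be spam email"
  end
end

puts spam_message([['money', 3], ['hello', 5]])
puts spam_message({ 'money' => 2, 'Iraq' => 1 })
```

Keeping the words and limits in one hash means a new word or a different limit is a data change rather than another clause in the condition.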
Thanks for your help.