Suppose I have a list of keywords:

free numerology compatibility
numerology calculator free
free numerology report
numerology reading
free numerology reading
etc...

What C# algorithm could I use to get the following results, or what is this technique called so that I can research it further?

6 instances of "numerology"
3 instances of "free numerology"
2 instances of "numerology reading"
1 instance of "numerology compatibility"
1 instance of "numerology calculator"
etc...

2 Answers


You can iterate over the array of words and store the counts in a dictionary.

For example:

Dictionary<string, int> d = new Dictionary<string, int>();

foreach (string word in wordList)
{
    if (d.ContainsKey(word))
    {
       d[word]++;
    }
    else
    {
       d[word] = 1;
    }
}
answered 2012-10-24T15:09:32.123

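The topic you are looking at is called Term Frequency Analysis or Word Frequency Analysis. The following code gives you the frequency of each word. Finding the frequency of a given phrase is also easy, but analyzing a whole document to find sequences of terms that occur more than once is somewhat more complicated.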

// Counts how often each space-separated word occurs in inputText.
// Both arguments are reference types, so no ref modifier is needed.
void Analyze(string inputText, Dictionary<string, int> wordFreq)
{
    // RemoveEmptyEntries skips the empty strings produced by repeated spaces.
    string[] words = inputText.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

    foreach (string word in words)
    {
        if (wordFreq.ContainsKey(word))
            wordFreq[word]++;
        else
            wordFreq.Add(word, 1);
    }
}

void DoWork()
{
    string inputText = "free numerology compatibility numerology calculator free free numerology report numerology reading free numerology reading";
    Dictionary<string, int> wordFreq = new Dictionary<string, int>();

    Analyze(inputText, wordFreq);

    // Build one line of output per distinct word.
    string result = "";
    foreach (KeyValuePair<string, int> pair in wordFreq)
    {
        result += pair.Value + " instances of \"" + pair.Key + "\"\r\n";
    }

    MessageBox.Show(result);
}

private void Form1_Load(object sender, EventArgs e)
{
    DoWork();
}
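
For the phrase counts in the question ("free numerology", "numerology reading", and so on), one possible approach is to count word n-grams, meaning every run of 1 to N consecutive words inside each keyword. The following is only a minimal sketch of that idea, not a definitive implementation. The names `NGramCounter`, `CountNGrams`, and `maxWords` are made up for illustration, and it assumes that words are separated by spaces and that phrases should not span two different keywords.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class NGramCounter
{
    // Hypothetical helper: counts every run of 1..maxWords consecutive
    // words inside each keyword, ignoring case.
    public static Dictionary<string, int> CountNGrams(IEnumerable<string> keywords, int maxWords)
    {
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);

        foreach (string keyword in keywords)
        {
            string[] words = keyword.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

            for (int start = 0; start < words.Length; start++)
            {
                // Grow the phrase one word at a time, staying inside this keyword.
                for (int len = 1; len <= maxWords && start + len <= words.Length; len++)
                {
                    string phrase = string.Join(" ", words, start, len);

                    int current;
                    counts.TryGetValue(phrase, out current); // current is 0 when the phrase is new
                    counts[phrase] = current + 1;
                }
            }
        }

        return counts;
    }
}

static class Program
{
    static void Main()
    {
        // The five keywords listed in the question.
        string[] keywords =
        {
            "free numerology compatibility",
            "numerology calculator free",
            "free numerology report",
            "numerology reading",
            "free numerology reading"
        };

        Dictionary<string, int> counts = NGramCounter.CountNGrams(keywords, 2);

        // Most frequent phrases first; ties are ordered alphabetically.
        foreach (var pair in counts.OrderByDescending(p => p.Value).ThenBy(p => p.Key))
        {
            Console.WriteLine(pair.Value + " instances of \"" + pair.Key + "\"");
        }
    }
}
```

For the five keywords listed in the question this counts "numerology" 5 times, "free numerology" 3 times, and "numerology reading" twice. Splitting each keyword separately, rather than joining everything into one long string first, avoids counting phrases such as "compatibility numerology" that only appear across the boundary between two keywords.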
answered 2012-10-24T17:10:40.200