2

I am using the following code to split an array of strings into a list.

private List<string> GenerateTerms(string[] docs)
{
    return docs.SelectMany(doc => ProcessDocument(doc)).Distinct().ToList();
}

private IEnumerable<string> ProcessDocument(string doc)
{
    return doc.Split(' ')
              .GroupBy(word => word)
              .OrderByDescending(g => g.Count())
              .Select(g => g.Key)
              .Take(1000);
}

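What I want to do is replace the returned list with a

Dictionary<string, int>

That is, instead of returning a list, I want to return a dictionary.

Can anyone help? Thanks in advance.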

4 Answers

2
string doc = "This is a test sentence with some words with some words repeating like: is a test";
var result = doc.Split(' ')
                   .GroupBy(word => word)
                   .OrderByDescending(g=> g.Count())
                   .Take(1000)
                   .ToDictionary(r => r.Key ,r=> r.Count());

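Edit:

I believe you want to build a final dictionary from the string array, with the words as keys and their total counts as values. Since a dictionary cannot contain duplicate keys, you don't need to use Distinct. You will have to rewrite your methods as: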

private Dictionary<string,int> GenerateTerms(string[] docs)
{
    List<Dictionary<string, int>> combinedDictionaryList = new List<Dictionary<string, int>>();
    foreach (string str in docs)
    {
        //Add returned dictionaries to a list
        combinedDictionaryList.Add(ProcessDocument(str));
    }
    // Return a single dictionary from the list of dictionaries, summing the counts per word
    return combinedDictionaryList
            .SelectMany(dict=> dict)
            .ToLookup(pair => pair.Key, pair => pair.Value)
            .ToDictionary(group => group.Key, group => group.Sum(value => value));
}

private Dictionary<string,int> ProcessDocument(string doc)
{
    return doc.Split(' ')
            .GroupBy(word => word)
            .OrderByDescending(g => g.Count())
            .Take(1000)
            .ToDictionary(r => r.Key, r => r.Count());
}

Then you can call it like this:

string[] docs = new[] 
    {
        "This is a test sentence with some words with some words repeating like: is a test",
        "This is a test sentence with some words with some words repeating like: is a test",
        "This is a test sentence with some words",
        "This is a test sentence with some words",
    };

Dictionary<string, int> finalDictionary = GenerateTerms(docs);
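
As a rough alternative to the lookup-based merge above, the counts can also be summed with a plain loop into one dictionary, which avoids building the intermediate list of dictionaries. This is only a sketch: the class name `TermCounter` and the `maxTerms` parameter are made up for illustration, and dropping empty entries when splitting is an assumption that the original code does not make.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TermCounter
{
    // Counts the words of one document and keeps at most maxTerms of the most frequent ones.
    // Assumption: empty tokens produced by consecutive spaces are not wanted.
    public static Dictionary<string, int> ProcessDocument(string doc, int maxTerms = 1000)
    {
        return doc.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                  .GroupBy(word => word)
                  .OrderByDescending(g => g.Count())
                  .Take(maxTerms)
                  .ToDictionary(g => g.Key, g => g.Count());
    }

    // Adds the per-document counts into a single dictionary keyed by word.
    public static Dictionary<string, int> GenerateTerms(IEnumerable<string> docs)
    {
        var totals = new Dictionary<string, int>();
        foreach (string doc in docs)
        {
            foreach (KeyValuePair<string, int> pair in ProcessDocument(doc))
            {
                int current;
                // current stays 0 when the word has not been seen yet
                totals.TryGetValue(pair.Key, out current);
                totals[pair.Key] = current + pair.Value;
            }
        }
        return totals;
    }
}
```

Calling `TermCounter.GenerateTerms(docs)` with the array above should give the same word totals as the LINQ version. Note that a `Dictionary<string, int>` does not guarantee any enumeration order, so the `OrderByDescending` only affects which words survive the `Take`.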
answered 2012-11-21T05:57:53.120
1

Try this:

string[] docs = {"aaa bbb", "aaa ccc", "sss, ccc"};        

var result = docs.SelectMany(doc => doc.Split())
                 .GroupBy(word => word)
                 .OrderByDescending(g => g.Count())
                 .Take(1000) // take before ToDictionary so that result is still a Dictionary<string, int>
                 .ToDictionary(g => g.Key, g => g.Count());

Edit:

var result = docs.SelectMany(
        doc => doc.Split()
            .GroupBy(word => word)
            .OrderByDescending(g => g.Count())
            .Take(1000))
    .Select(g => new {Word = g.Key, Cnt = g.Count()})
    .GroupBy(t => t.Word)
    .ToDictionary(g => g.Key, g => g.Sum(t => t.Cnt));
answered 2012-11-21T06:28:17.093
0

Without any extra fuss, the following should work.

return doc.Split(' ')
          .GroupBy(word => word)
          .ToDictionary(g => g.Key, g => g.Count());

Customize it with Take, OrderBy, etc., depending on your situation.
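
One way to read that advice, as a sketch only: apply the ordering and the `Take` to the groups before calling `ToDictionary`, because calling `Take` on a dictionary returns a sequence of key/value pairs rather than a dictionary. The names `TermHelpers`, `TopTerms` and `maxTerms` are hypothetical and used here purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TermHelpers
{
    // Returns the maxTerms most frequent words of doc together with their counts.
    public static Dictionary<string, int> TopTerms(string doc, int maxTerms)
    {
        return doc.Split(' ')
                  .GroupBy(word => word)
                  .OrderByDescending(g => g.Count()) // most frequent words first
                  .Take(maxTerms)                    // truncate while still working with groups
                  .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

For example, `TermHelpers.TopTerms("a b a c a b", 2)` keeps only `a` (3 occurrences) and `b` (2 occurrences).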

answered 2012-11-21T05:54:43.583
0

Try something like this:

    var keys = new List<string>();
    var values = new List<string>();
    var dictionary = keys.ToDictionary(x => x, x => values[keys.IndexOf(x)]);
answered 2012-11-21T05:55:15.437