
The code works fine when the directory is small, but when the files in the directory are large it gives this error message.

My code:

IEnumerable<string> textLines = 
          Directory.GetFiles(@"C:\Users\karansha\Desktop\watson_query\", "*.*")
                   .Select(filePath => File.ReadAllLines(filePath))
                   .SelectMany(line => line)
                   .Where(line => !line.Contains("appGUID: null"))
                   .ToList();

List<string> users = new List<string>();

textLines.ToList().ForEach(textLine =>
{
    Regex regex = new Regex(@"User:\s*(?<username>[^\s]+)");
    MatchCollection matches = regex.Matches(textLine);
    foreach (Match match in matches)
    {
        var user = match.Groups["username"].Value;
        if (!users.Contains(user)) 
            users.Add(user);
    }
});

int numberOfUsers = users.Count(name => name.Length <= 10);
Console.WriteLine("Unique_Users_Express=" + numberOfUsers);

2 Answers


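I would use Directory.EnumerateFiles and File.ReadLines, since they consume less memory. They work like a StreamReader, streaming results as you enumerate them, whereas Directory.GetFiles and File.ReadAllLines read everything into memory first: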

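// Build the pattern once instead of once per line.
Regex regex = new Regex(@"User:\s*(?<username>[^\s]+)");

// Both calls are lazy: file names and lines are produced only as the loop asks for them.
var matchingLines = Directory.EnumerateFiles(@"C:\Users\karansha\Desktop\watson_query\", "*.*")
    .SelectMany(fn => File.ReadLines(fn))
    // Keep only lines that do NOT contain "appGUID: null", as in the question (case-insensitive here).
    .Where(l => l.IndexOf("appGUID: null", StringComparison.InvariantCultureIgnoreCase) < 0);
foreach (var line in matchingLines)
{
    // etc pp ...
}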

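You also don't need to create a List<string> for all the lines again. Just enumerate the query with foreach (textLines.ToList creates a third collection, which is also redundant).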
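Putting the two points above together, the whole job might look roughly like the sketch below. The class name UserCounter, the method CountUniqueUsers, and its directory and maxLength parameters are made up for illustration and do not appear in the question. The filter is the question's own (lines containing "appGUID: null" are skipped), and the user list is kept as a List<string>, as in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

static class UserCounter
{
    // Built once and reused for every line.
    static readonly Regex UserPattern = new Regex(@"User:\s*(?<username>[^\s]+)");

    // Hypothetical helper: counts the unique user names that have at most
    // maxLength characters, across all files in the given directory.
    public static int CountUniqueUsers(string directory, int maxLength = 10)
    {
        var users = new List<string>();

        // Lazy query: no file is opened until the foreach below starts pulling lines.
        IEnumerable<string> lines = Directory.EnumerateFiles(directory, "*.*")
            .SelectMany(path => File.ReadLines(path))
            .Where(line => !line.Contains("appGUID: null"));

        // Enumerate the query directly; no ToList(), so only the current line
        // and the collected user names are held in memory.
        foreach (string line in lines)
        {
            foreach (Match match in UserPattern.Matches(line))
            {
                string user = match.Groups["username"].Value;
                if (!users.Contains(user))
                    users.Add(user);
            }
        }

        return users.Count(name => name.Length <= maxLength);
    }
}
```

With the question's directory, the call would be something like Console.WriteLine("Unique_Users_Express=" + UserCounter.CountUniqueUsers(@"C:\Users\karansha\Desktop\watson_query\"));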

answered 2013-04-03T10:05:08.413

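Try the following code. It uses ReadLines, which does not load the whole file into memory but reads it line by line. It also uses a HashSet to store the unique results that match the regex.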

Regex regex = new Regex(@"User:\s*(?<username>[^\s]+)");
IEnumerable<string> textLines = 
      Directory.GetFiles(@"C:\Users\karansha\Desktop\watson_query\", "*.*")
               .Select(filePath => File.ReadLines(filePath))
               .SelectMany(line => line)
               .Where(line => !line.Contains("appGUID: null"));

HashSet<string> users = new HashSet<string>(
    textLines.SelectMany(line => regex.Matches(line).Cast<Match>())
             .Select(match => match.Groups["username"].Value)
);

int numberOfUsers = users.Count(name => name.Length <= 10);
Console.WriteLine("Unique_Users_Express=" + numberOfUsers);
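To see the HashSet part in isolation, here is a small sketch that runs the same regex over an arbitrary sequence of lines. The names UserExtraction and UniqueUsers are invented for this example, and the sample lines in the usage note are made-up data, not taken from the question's log files. The point is that HashSet<string>.Add silently ignores duplicates, so the explicit Contains check (a linear scan on a List<string>) is no longer needed.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class UserExtraction
{
    static readonly Regex UserPattern = new Regex(@"User:\s*(?<username>[^\s]+)");

    // Hypothetical helper: collects every distinct user name found in the lines.
    // It accepts any IEnumerable<string>, so a lazy File.ReadLines query works too.
    public static HashSet<string> UniqueUsers(IEnumerable<string> lines)
    {
        var users = new HashSet<string>();
        foreach (string line in lines)
        {
            foreach (Match match in UserPattern.Matches(line))
            {
                // Add returns false and leaves the set unchanged for a duplicate.
                users.Add(match.Groups["username"].Value);
            }
        }
        return users;
    }
}
```

For example, UserExtraction.UniqueUsers(new[] { "User: alice", "User: bob User: alice", "no user here" }) yields a set containing alice and bob.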
answered 2013-04-03T10:05:48.783