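This isn't hard. Since you only need counts based on date and user name, I would steer clear of the many more complicated solutions that try to parse the format completely. A simple regex-based solution is enough: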
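    // Namespaces the snippet needs (place these at the top of the file)
    using System;
    using System.Globalization;
    using System.IO;
    using System.Linq;
    using System.Text.RegularExpressions;

    var loginInfo =
        // Read the lines in the file, one by one
        File.ReadLines(args[0])
        // Get a match with appropriate groups for the individual parts.
        // \S+ stops at the first whitespace, so a time that follows the date after a space is not captured.
        .Select(l =>
            Regex.Match(l,
                @"Username:(?<username>[^|]+)\|\|
                  UserId:(?<userid>[^|]+)\|\|
                  Userlogintime:(?<date>\S+)", RegexOptions.IgnorePatternWhitespace))
        // Skip lines that do not match; otherwise DateTime.Parse would throw on an empty string
        .Where(m => m.Success)
        // Create a new object with the user name and date
        .Select(m => new {
            Username = m.Groups["username"].Value,
            Date = DateTime.Parse(m.Groups["date"].Value, CultureInfo.GetCultureInfo("en-us"))
        })
        // Group by itself, that is, collapse all identical objects into the same group
        // (anonymous types compare by value, so equal user name and date land in one group)
        .GroupBy(i => i)
        // Create a new object with user name, date and count
        .Select(g => new {
            Username = g.Key.Username,
            Date = g.Key.Date,
            Count = g.Count()
        });

    // Print one line per (user name, date) pair; the date format follows the current culture
    foreach (var info in loginInfo) {
        Console.WriteLine("{0} {1} {2}", info.Username, info.Date, info.Count);
    }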
For me, this produces the following output on a slightly extended data set:
Rajini 2013-10-19 00:00:00 2
Test 2013-10-19 00:00:00 1
Rajini 2013-10-20 00:00:00 1
Test 2013-10-20 00:00:00 3
Rajini 2013-10-21 00:00:00 1
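
To see what the pattern captures, you can run a single line through it on its own. This is only a minimal sketch: the sample line, its values and the class name `RegexDemo` are made up for illustration, and the line's shape is inferred from the pattern rather than taken from your actual log.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

class RegexDemo
{
    static void Main()
    {
        // Hypothetical sample line; the real log format is an assumption.
        string line = "Username:Rajini||UserId:42||Userlogintime:10/19/2013 09:15:00";

        // Same pattern as above; whitespace inside the pattern is ignored.
        Match m = Regex.Match(line,
            @"Username:(?<username>[^|]+)\|\|
              UserId:(?<userid>[^|]+)\|\|
              Userlogintime:(?<date>\S+)",
            RegexOptions.IgnorePatternWhitespace);

        if (m.Success)
        {
            // The "date" group holds only "10/19/2013", because \S+ stops at the space.
            DateTime date = DateTime.Parse(
                m.Groups["date"].Value,
                CultureInfo.GetCultureInfo("en-US"));

            // Format with the invariant culture so the output does not depend on the machine.
            Console.WriteLine(string.Format(
                CultureInfo.InvariantCulture,
                "{0} {1:yyyy-MM-dd} {2:HH:mm:ss}",
                m.Groups["username"].Value, date, date));
        }
    }
}
```

With that sample line the time of day is dropped before parsing, which is why every entry groups by calendar day and shows 00:00:00.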