I have the variable:
StreamReader DebugInfo = GetDebugInfo();
var text = DebugInfo.ReadToEnd(); // takes 10 seconds!!! because there are a lot of students
The text looks like this:
<student>
<firstName>Antonio</firstName>
<lastName>Namnum</lastName>
</student>
<student>
<firstName>Alicia</firstName>
<lastName>Garcia</lastName>
</student>
<student>
<firstName>Christina</firstName>
<lastName>SomeLattName</lastName>
</student>
... etc
.... many more students
What I am currently doing is:
StreamReader DebugInfo = GetDebugInfo();
var text = DebugInfo.ReadToEnd(); // takes 10 seconds!!!
var mtch = Regex.Match(text, @"(?s)<student>.+?</student>");
// keep parsing the file while there are more students
while (mtch.Success)
{
    AddStudent(mtch.Value); // parse text node into object and add it to corresponding node
    mtch = mtch.NextMatch();
}
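The whole process takes about 25 seconds. Converting the StreamReader to text (var text = DebugInfo.ReadToEnd();) takes 10 seconds. The other part takes about 15 seconds. I was hoping I could do the two parts at the same time...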
EDIT
I would like something like:
const int bufferSize = 1024;
var sb = new StringBuilder();
Task.Factory.StartNew(() =>
{
    Char[] buffer = new Char[bufferSize];
    int count = bufferSize;
    using (StreamReader sr = GetUnparsedDebugInfo())
    {
        while (count > 0)
        {
            count = sr.Read(buffer, 0, bufferSize);
            sb.Append(buffer, 0, count);
        }
    }
    var m = sb.ToString();
});
Thread.Sleep(100);
// while the string is still being built, start adding items
var mtch = Regex.Match(sb.ToString(), @"(?s)<student>.+?</student>");
// keep parsing the file while there are more nodes
while (mtch.Success)
{
    AddStudent(mtch.Value);
    mtch = mtch.NextMatch();
}
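EDIT 2
Summary
Sorry, I forgot to mention that the text is very similar to XML, but it is not XML. That is why I have to use regular expressions... In short, I think I could save time, because what I am doing now is converting the stream to a string and then parsing the string. Why not parse the stream directly with a regex? Or, if that is not possible, why not take a chunk of the stream and parse that chunk in a separate thread?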
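To make the chunked idea concrete, here is a rough sketch of the kind of thing I have in mind. It is only an illustration under assumptions: `StudentStreamParser` and `ReadStudents` are hypothetical names that do not exist in my code, the 4096-character buffer size is arbitrary, and it assumes every record is delimited by `<student>` ... `</student>` exactly as in the sample above. It reads the stream in chunks, runs the same regex over whatever has been buffered so far, hands out each complete block, and keeps the unfinished tail for the next chunk.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original code.
static class StudentStreamParser
{
    // Same pattern as in the question: lazily match one <student> block.
    static readonly Regex StudentBlock =
        new Regex(@"(?s)<student>.+?</student>", RegexOptions.Compiled);

    // Yields each complete <student> block as soon as enough of the
    // stream has been read, instead of waiting for ReadToEnd().
    public static IEnumerable<string> ReadStudents(TextReader reader, int bufferSize = 4096)
    {
        var buffer = new char[bufferSize];
        var pending = new StringBuilder(); // text read but not yet matched
        int count;
        while ((count = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            pending.Append(buffer, 0, count);
            string text = pending.ToString();
            int consumed = 0; // end index of the last complete block found
            for (Match m = StudentBlock.Match(text); m.Success; m = m.NextMatch())
            {
                yield return m.Value;
                consumed = m.Index + m.Length;
            }
            // Drop everything up to the last complete block; a partial
            // block at the end stays buffered until more text arrives.
            if (consumed > 0)
            {
                pending.Remove(0, consumed);
            }
        }
    }
}
```

With something like this, the loop would become `foreach (var s in StudentStreamParser.ReadStudents(GetDebugInfo())) AddStudent(s);`. This by itself only interleaves reading and parsing on one thread; to actually run them in parallel, I assume the yielded blocks could be pushed into a `BlockingCollection<string>` by one task and consumed by another that calls `AddStudent`.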