
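I've run into an interesting problem reading a large file (~400 MB) from a network drive. Originally, I passed the full network path to a FileInfo and used CopyTo to transfer the file to a local temp drive, then read it from there. This works fine: not slow, not fast, just meh. CopyTo keeps the network utilization of the machine running the program consistently above 50%, which is pretty good.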

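To speed things up, I tried reading the network file directly into a MemoryStream, cutting out the middleman, so to speak. When I tried this (using the asynchronous copy pattern described here), it was extremely slow. My network utilization never even went above 2%, almost as if something were throttling me. For reference, I watched my network utilization while copying the same file directly through Windows Explorer, and it reached 80-90%... I'm not sure what is going on here. Below is the asynchronous copy code I used: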

string line;
List<string> results = new List<string>();

Parser parser = new Parser(QuerySettings.SelectedFilters, QuerySettings.SearchTerms,
QuerySettings.ExcludedTerms, QuerySettings.HighlightedTerms);

byte[] ActiveBuffer = new byte[60 * 1024];
byte[] BackBuffer = new byte[60 * 1024];
byte[] WriteBuffer = new byte[60 * 1024];

MemoryStream memStream = new MemoryStream();
FileStream fileStream = new FileStream(fullPath, FileMode.Open, FileSystemRights.Read, FileShare.None, 60 * 1024, FileOptions.SequentialScan);

int Readed = 0;
IAsyncResult ReadResult;
IAsyncResult WriteResult;

ReadResult = fileStream.BeginRead(ActiveBuffer, 0, ActiveBuffer.Length, null, null);
do
{
    Readed = fileStream.EndRead(ReadResult);

    WriteResult = memStream.BeginWrite(ActiveBuffer, 0, Readed, null, null);
    WriteBuffer = ActiveBuffer;

    if (Readed > 0)
    {
        ReadResult = fileStream.BeginRead(BackBuffer, 0, BackBuffer.Length, null, null);
        BackBuffer = Interlocked.Exchange(ref ActiveBuffer, BackBuffer);
    }

    memStream.EndWrite(WriteResult);
}
while (Readed > 0);

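// Rewind first: after the copy, the stream position is at the end,
// so a reader created here would otherwise return no lines.
memStream.Position = 0;
StreamReader streamReader = new StreamReader(memStream);
while ((line = streamReader.ReadLine()) != null)
{
    if (parser.ParseResults(line))
        results.Add(line);
}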

fileStream.Flush();
fileStream.Close();

memStream.Flush();
memStream.Close();

return results;

Update: Based on the comments, I just tried the following. My network utilization is only around 10-15%... why so low?

MemoryStream memStream = new MemoryStream();
FileStream fileStream = File.OpenRead(fullPath);

fileStream.CopyTo(memStream);

memStream.Seek(0, SeekOrigin.Begin);
StreamReader streamReader = new StreamReader(memStream);

Parser parser = new Parser(QuerySettings.SelectedFilters, QuerySettings.SearchTerms,
QuerySettings.ExcludedTerms, QuerySettings.HighlightedTerms);

while ((line = streamReader.ReadLine()) != null)
{
    if (parser.ParseResults(line))
        results.Add(line);
}

3 Answers

4

I'm late to the party, but I recently had the same problem of low network utilization. After trying a lot of different implementations, I finally found that a StreamReader with a large buffer (1 MB in my case) increased network utilization to 99%. None of the other options made a significant difference.
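As a rough illustration of the idea, here is a minimal sketch that reads lines straight from the share through a StreamReader with a 1 MB buffer. The class and method names, the `keep` predicate, and the UTF-8 encoding are assumptions made for the example, not something taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class LargeBufferReader
{
    // Hypothetical helper: reads lines directly from the given path and
    // keeps only those accepted by the caller-supplied predicate.
    public static List<string> ReadMatchingLines(string fullPath, Func<string, bool> keep)
    {
        const int BufferSize = 1024 * 1024; // 1 MB reader buffer
        var results = new List<string>();

        using (var fileStream = File.OpenRead(fullPath))
        // Assumption: the file is UTF-8 (BOM detection is left enabled).
        using (var reader = new StreamReader(fileStream, Encoding.UTF8, true, BufferSize))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (keep(line))
                    results.Add(line);
            }
        }

        return results;
    }
}
```

With the question's parser, a call might look like `LargeBufferReader.ReadMatchingLines(fullPath, l => parser.ParseResults(l))`. Whether 1 MB is the best size for a given network is something to measure rather than assume.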

answered 2014-08-20T20:03:05.127
1

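Copying the whole file and then parsing it makes no sense. Just open the file from the network drive and let the .NET Framework do its best to get the data to you. You may be smarter than the MS developers and able to write a copy method faster than theirs, but that is a real challenge.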
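A minimal sketch of what "just open it and parse" could look like, assuming .NET 4 or later for File.ReadLines. The class name, method name, and `keep` predicate are hypothetical stand-ins for the question's Parser.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class DirectParse
{
    // Hypothetical helper: filters lines while they are being read,
    // without copying the file to disk or into a MemoryStream first.
    public static List<string> ParseFromShare(string fullPath, Func<string, bool> keep)
    {
        // File.ReadLines enumerates lazily, one line at a time.
        return File.ReadLines(fullPath).Where(keep).ToList();
    }
}
```

Only the matching lines are kept in memory, so the 400 MB file never has to be held in full.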

answered 2012-06-01T09:11:34.887
1

Using Reflector, I see that your call:

FileStream fileStream = File.OpenRead(fullPath);

ends up using a buffer of 4096 bytes (0x1000):

public FileStream(string path, FileMode mode, FileAccess access, FileShare share) : this(path, mode, access, share, 0x1000, FileOptions.None, Path.GetFileName(path), false)
{
}

You could try calling one of the FileStream constructors explicitly, specifying a larger buffer size and FileOptions.SequentialScan.

I'm not sure this will help, but it's easy to try.
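Something along these lines, as a sketch only. The helper name and the 1 MB default are arbitrary choices for illustration, and passing the same size to CopyTo is my own assumption rather than anything Reflector showed.

```csharp
using System.IO;

public static class BufferedOpen
{
    // Hypothetical helper: opens the file with an explicit buffer size and
    // a sequential-scan hint, then copies it into memory.
    public static MemoryStream LoadIntoMemory(string fullPath, int bufferSize = 1024 * 1024)
    {
        var memStream = new MemoryStream();

        using (var fileStream = new FileStream(fullPath, FileMode.Open, FileAccess.Read,
                                               FileShare.Read, bufferSize,
                                               FileOptions.SequentialScan))
        {
            // Assumption: use the same chunk size for the copy itself.
            fileStream.CopyTo(memStream, bufferSize);
        }

        memStream.Position = 0; // rewind so a reader starts at the beginning
        return memStream;
    }
}
```

The returned stream can then be wrapped in a StreamReader exactly as in the question's update.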

answered 2012-06-01T20:14:58.217