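I am trying to download a large file from a public URL. It seemed to work fine at first, but roughly 1 in 10 computers times out. My initial attempt used WebClient.DownloadFileAsync, but because it would never complete, I fell back to using WebRequest.Create directly and reading the response stream.

My first version using WebRequest.Create ran into the same problem as WebClient.DownloadFileAsync: the operation timed out and the file did not finish downloading.

My next version added retries when the download times out. This is where it gets weird. The download does eventually finish, using 1 retry to fetch the last 7092 bytes. So the downloaded file has exactly the same size as the source, but it is corrupt and differs from the source file. I would expect the corruption to be in the last 7092 bytes, but that is not the case.

Using BeyondCompare I found that 2 chunks of bytes are missing from the corrupt file, totalling exactly the missing 7092 bytes! The missing bytes are at 1CA49FF0 and 1E31F380, well before the point where the download timed out and was restarted.

What could possibly be going on here? Any hints on how to track this problem down further?

Here is the code in question.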

public void DownloadFile(string sourceUri, string destinationPath)
{
    //roughly based on: http://stackoverflow.com/questions/2269607/how-to-programmatically-download-a-large-file-in-c-sharp
    //not using WebClient.DownloadFileAsync as it seems to stall out on large files rarely for unknown reasons.

    using (var fileStream = File.Open(destinationPath, FileMode.Create, FileAccess.Write, FileShare.Read))
    {
        long totalBytesToReceive = 0;
        long totalBytesReceived = 0;
        int attemptCount = 0;
        bool isFinished = false;

        while (!isFinished)
        {
            attemptCount += 1;

            if (attemptCount > 10)
            {
                throw new InvalidOperationException("Too many attempts to download. Aborting.");
            }

            try
            {
                var request = (HttpWebRequest)WebRequest.Create(sourceUri);

                request.Proxy = null;//http://stackoverflow.com/questions/754333/why-is-this-webrequest-code-slow/935728#935728
                _log.AddInformation("Request #{0}.", attemptCount);

                //continue downloading from last attempt.
                if (totalBytesReceived != 0)
                {
                    _log.AddInformation("Request resuming with range: {0} , {1}", totalBytesReceived, totalBytesToReceive);
                    request.AddRange(totalBytesReceived, totalBytesToReceive);
                }

                using (var response = request.GetResponse())
                {
                    _log.AddInformation("Received response. ContentLength={0} , ContentType={1}", response.ContentLength, response.ContentType);

                    if (totalBytesToReceive == 0)
                    {
                        totalBytesToReceive = response.ContentLength;
                    }

                    using (var responseStream = response.GetResponseStream())
                    {
                        _log.AddInformation("Beginning read of response stream.");
                        var buffer = new byte[4096];
                        int bytesRead = responseStream.Read(buffer, 0, buffer.Length);
                        while (bytesRead > 0)
                        {
                            fileStream.Write(buffer, 0, bytesRead);
                            totalBytesReceived += bytesRead;
                            bytesRead = responseStream.Read(buffer, 0, buffer.Length);
                        }

                        _log.AddInformation("Finished read of response stream.");
                    }
                }

                _log.AddInformation("Finished downloading file.");
                isFinished = true;
            }
            catch (Exception ex)
            {
                _log.AddInformation("Response raised exception ({0}). {1}", ex.GetType(), ex.Message);
            }
        }
    }
}

Here is the log output from a corrupted download:

Request #1.
Received response. ContentLength=939302925 , ContentType=application/zip
Beginning read of response stream.
Response raised exception (System.Net.WebException). The operation has timed out.
Request #2.
Request resuming with range: 939295833 , 939302925
Received response. ContentLength=7092 , ContentType=application/zip
Beginning read of response stream.
Finished read of response stream.
Finished downloading file.
4 Answers

1

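This is the approach I usually use, and so far it has not failed me for the same kind of download you need. Try changing your code along the lines of mine and see if that helps.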

// Make sure the destination folder exists (CreateDirectory does nothing if it already does).
Directory.CreateDirectory(localFolder);

// Build the URL by hand: Path.Combine is meant for file system paths, not URIs.
string fileUri = uri.TrimEnd('/') + "/" + filename;
HttpWebRequest httpRequest = (HttpWebRequest)WebRequest.Create(fileUri);
httpRequest.Method = "GET";

// if the URI doesn't exist, an exception gets thrown here...
using (HttpWebResponse httpResponse = (HttpWebResponse)httpRequest.GetResponse())
using (Stream responseStream = httpResponse.GetResponseStream())
using (FileStream localFileStream =
    new FileStream(Path.Combine(localFolder, filename), FileMode.Create))
{
    var buffer = new byte[4096];
    long totalBytesRead = 0;
    int bytesRead;

    // Read returns 0 once the response stream has ended.
    while ((bytesRead = responseStream.Read(buffer, 0, buffer.Length)) > 0)
    {
        totalBytesRead += bytesRead;
        localFileStream.Write(buffer, 0, bytesRead);
    }
}
answered 2016-05-24T12:00:31.030
0

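Allocate a buffer size larger than the expected file size.

byte[] byteBuffer = new byte[65536];

So if the file is 1 GiB in size, you allocate a 1 GiB buffer and then try to fill the whole buffer in one call. That fill may return fewer bytes, but you have still allocated the whole buffer. Note that the maximum length of a single array in .NET is a 32-bit number, which means this limit applies even if you recompile your program for 64-bit and actually have enough memory available.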

answered 2018-04-13T09:38:05.223
0

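To me, the way you read the file through a buffer looks odd. Maybe the problem is that you do this:

while(bytesRead > 0)

If, for some reason, the stream returns no bytes at some point while the download is still unfinished, the code exits the loop and never comes back. You should get the Content-Length and increase the variable totalBytesReceived by bytesRead. Finally, change the loop to

while(totalBytesReceived < ContentLength)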
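
A minimal sketch of that loop, written as a standalone helper so it can be tried without a network connection. The names DownloadHelper and CopyExactly are hypothetical and not part of the question's code. One caveat worth stating: Stream.Read is documented to return 0 only when the end of the stream has been reached, so the sketch treats a 0 before the expected length as a failed transfer rather than looping again, which would otherwise spin forever.

```csharp
using System;
using System.IO;

public static class DownloadHelper
{
    // Hypothetical helper: copies exactly expectedLength bytes from source to
    // destination and returns the number of bytes copied.
    public static long CopyExactly(Stream source, Stream destination, long expectedLength)
    {
        if (expectedLength < 0)
        {
            // WebResponse.ContentLength is -1 when the server sent no Content-Length.
            throw new ArgumentOutOfRangeException("expectedLength", "Content length is unknown.");
        }

        var buffer = new byte[4096];
        long totalBytesReceived = 0;

        while (totalBytesReceived < expectedLength)
        {
            // Never ask for more than what is still missing.
            int toRead = (int)Math.Min(buffer.Length, expectedLength - totalBytesReceived);
            int bytesRead = source.Read(buffer, 0, toRead);

            if (bytesRead == 0)
            {
                // End of stream before all bytes arrived: report it instead of looping.
                throw new EndOfStreamException(
                    "Stream ended after " + totalBytesReceived + " of " + expectedLength + " bytes.");
            }

            destination.Write(buffer, 0, bytesRead);
            totalBytesReceived += bytesRead;
        }

        return totalBytesReceived;
    }
}
```

In the question's code this would be called roughly as DownloadHelper.CopyExactly(responseStream, fileStream, response.ContentLength), with the exception caught by the existing retry logic.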
answered 2018-02-02T09:36:21.590
0

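You should change the timeout settings. There seem to be two possible timeout problems:

  • Client-side timeout: try changing the timeout in WebClient. I have found that I sometimes need to do this when downloading large files.
  • Server-side timeout: try changing the timeout on the server. You can verify that this is the problem by using another client (e.g. PostMan).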
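
For the client-side case, here is a rough sketch of one way to do it. WebClient does not expose a timeout property directly, so a common approach is to subclass it and adjust the request it creates. The class name TimeoutWebClient and the helper ApplyTimeouts are hypothetical, and the values passed in are only examples. Timeout covers the wait for the response to start, while ReadWriteTimeout covers each read from the response stream.

```csharp
using System;
using System.Net;

// Hypothetical subclass: applies longer timeouts to every request WebClient creates.
public class TimeoutWebClient : WebClient
{
    private readonly int _timeoutMs;
    private readonly int _readWriteTimeoutMs;

    public TimeoutWebClient(int timeoutMs, int readWriteTimeoutMs)
    {
        _timeoutMs = timeoutMs;
        _readWriteTimeoutMs = readWriteTimeoutMs;
    }

    // Also usable on a request created with WebRequest.Create, as in the question.
    public static void ApplyTimeouts(WebRequest request, int timeoutMs, int readWriteTimeoutMs)
    {
        // Time allowed for GetResponse to return.
        request.Timeout = timeoutMs;

        var httpRequest = request as HttpWebRequest;
        if (httpRequest != null)
        {
            // Time allowed for each read or write on the request/response stream.
            httpRequest.ReadWriteTimeout = readWriteTimeoutMs;
        }
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        ApplyTimeouts(request, _timeoutMs, _readWriteTimeoutMs);
        return request;
    }
}
```

With the question's code, the same helper could be applied right after WebRequest.Create, for example TimeoutWebClient.ApplyTimeouts(request, 600000, 600000) for ten minutes each. Whether longer timeouts actually help here is an assumption that would need testing on the affected machines.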
answered 2018-01-01T08:45:32.323