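I want to run about 10,000 concurrent requests from .NET using HttpWebRequest. Not all of the requests go to the same host, and some of them go through a pool of proxies.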

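I currently use threads, which works fine up to about 1,000 concurrent requests (roughly 3% CPU in Task Manager). When I scale to 2,000 or even 5,000 concurrent requests, however, I get a lot of exceptions (see below) and 100% CPU load.
At first I thought this was simply too much for the server, but running, for example, 2 instances with 1,000 concurrent requests each works fine, so I suspect it has something to do with connection management.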

First, the sample code for a single request (I assume it is fine until it runs at scale):

public string HttpGet(string url)
{
    try
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Timeout = 20000; // milliseconds
        request.CookieContainer = Cookie; // a CookieContainer defined elsewhere
        request.AutomaticDecompression = DecompressionMethods.GZip;
        request.KeepAlive = true;
        request.Method = "GET";

        // Dispose the response and reader even if reading throws, so the
        // underlying connection is released instead of leaking.
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
    catch (Exception)
    {
        // Any failure (timeout, connection reset, ...) is swallowed
        // and reported to the caller as an empty string.
        return "";
    }
}

And yes, of course I have (I think) also configured ServicePointManager correctly:

ServicePointManager.DefaultConnectionLimit = 20 * 1000;
ServicePointManager.MaxServicePointIdleTime = 1000 * 60 * 20;//maybe this is an issue?
ServicePointManager.UseNagleAlgorithm = false;
ServicePointManager.Expect100Continue = false;

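The strange thing is that I get timeouts even though the application is not using even half of the connection limit. They are not real timeouts; they look like the timeouts you get when DefaultConnectionLimit is too low and you try to run requests in parallel.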
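
One way to check that claim instead of guessing would be to read the connection count per host from its ServicePoint. This is a minimal sketch, assuming the .NET Framework ServicePointManager behaviour; the `ConnectionDiagnostics` class and `Describe` method are hypothetical names and not part of the original code:

```csharp
using System;
using System.Net;

public static class ConnectionDiagnostics
{
    // Returns a one-line summary of how many connections the ServicePoint
    // for the given URL currently holds, next to its configured limit.
    public static string Describe(string url)
    {
        // FindServicePoint returns the ServicePoint that manages
        // connections for the scheme/host/port of this URI.
        ServicePoint sp = ServicePointManager.FindServicePoint(new Uri(url));
        return string.Format(
            "{0}: {1}/{2} connections",
            sp.Address, sp.CurrentConnections, sp.ConnectionLimit);
    }
}
```

Logging `ConnectionDiagnostics.Describe(url)` periodically for a few busy hosts would show whether any single host is pinned at its limit. For requests sent through a proxy, the `FindServicePoint(Uri, IWebProxy)` overload is probably the one to use, since the connection is then made to the proxy.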

So I decided to try giving each thread its own connections by using ConnectionGroupName:

request.ConnectionGroupName = randomStringPerThreadBasis;
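
The post does not show how `randomStringPerThreadBasis` is produced. For illustration only, a per-thread group name could be generated along these lines; `ThreadLocal<string>`, the GUID naming, and the `PerThreadConnectionGroup` class are assumptions, not the original code:

```csharp
using System;
using System.Net;
using System.Threading;

public static class PerThreadConnectionGroup
{
    // Each thread lazily gets its own random name the first time it asks.
    private static readonly ThreadLocal<string> GroupName =
        new ThreadLocal<string>(() => Guid.NewGuid().ToString("N"));

    // The connection group name belonging to the calling thread.
    public static string Current
    {
        get { return GroupName.Value; }
    }

    // Creates a request whose connections are pooled separately
    // from those of every other thread.
    public static HttpWebRequest Create(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.ConnectionGroupName = Current;
        return request;
    }
}
```

With a scheme like this, connections are never shared between threads, so the number of open sockets grows with the thread count rather than with the number of distinct hosts.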

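This at least increased the number of open TCP connections, so I not only raised the .NET DefaultConnectionLimit but also increased the dynamic TCP port range available under Windows, as described at http://kb.globalscape.com/KnowledgebaseArticle10438.aspx, with this command: netsh int ipv4 set dynamicport tcp start=1025 num=50000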

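The strange thing is that the application now opens far more connections than it needs, yet CPU load is still high and I still get occasional timeouts (far fewer than before, though, so maybe it is related to connection management after all?).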


For completeness: I also occasionally get these exceptions (though not often):

System.Net.WebException: The underlying connection was closed: A connection that was expected to be kept alive was closed by the server. ---> System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. ---> A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
   at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at System.Net.FixedSizeReader.ReadPacket(Byte[] buffer, Int32 offset, Int32 count)
   at System.Net.Security._SslStream.StartFrameHeader(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.Security._SslStream.StartReading(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.Security._SslStream.ProcessRead(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
   at System.Net.TlsStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at System.Net.PooledStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at System.Net.Connection.SyncRead(HttpWebRequest request, Boolean userRetrievedStream, Boolean probeRead)
   at System.Net.HttpWebRequest.GetResponse()

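So my question is: how can I solve this? Am I doing something wrong? Is there a workaround? Can I manage the connections myself? Is there another class or language that is better suited to this scenario?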
