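I want to run roughly 10,000 concurrent requests from .NET using HttpWebRequest. Not all of the requests go to the same host, and some of them go through a pool of proxies.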
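I currently use threads, which works fine up to about 1,000 concurrent requests (roughly 3% CPU in Task Manager). When I scale to 2,000 or even 5,000 concurrent requests, I get a lot of exceptions (see below) and 100% CPU load.
At first I thought this was simply too much for the server, but running, for example, two instances with 1,000 concurrent requests each works fine, so I suspect it has something to do with connection management.
First, the sample code for a request (I assume it is fine, at least until it runs at scale):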
public string HttpGet(string url)
{
    try {
        HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(url);
        request.Timeout = 20000;
        request.CookieContainer = Cookie;
        request.AutomaticDecompression = DecompressionMethods.GZip;
        request.KeepAlive = true;
        request.Method = "GET";
        HttpWebResponse response = (HttpWebResponse)request.GetResponse();
        Stream dataStream = response.GetResponseStream();
        StreamReader reader = new StreamReader(dataStream);
        string tmp = reader.ReadToEnd();
        reader.Close();
        dataStream.Close();
        response.Close();
        return tmp;
    } catch (Exception ex) {
        return "";
    }
}
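For comparison, here is a minimal sketch of what a non-blocking variant of the method above might look like, so that a waiting request does not hold a thread. It assumes .NET 4.5 or later (for `GetResponseAsync` and `async`/`await`). The class name `AsyncFetcher` and the `Cookies` field are hypothetical stand-ins for the surrounding code, which is not shown. This is not a confirmed fix for the problem described here.

```csharp
using System;
using System.IO;
using System.Net;
using System.Threading.Tasks;

public static class AsyncFetcher
{
    // Assumption: a shared cookie container, standing in for the Cookie field above.
    private static readonly CookieContainer Cookies = new CookieContainer();

    public static async Task<string> HttpGetAsync(string url)
    {
        try
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            request.CookieContainer = Cookies;
            request.AutomaticDecompression = DecompressionMethods.GZip;
            request.KeepAlive = true;
            request.Method = "GET";
            // Note: request.Timeout only applies to the synchronous GetResponse(),
            // so a timeout for this path would have to be added separately.
            using (WebResponse response = await request.GetResponseAsync())
            using (Stream dataStream = response.GetResponseStream())
            using (StreamReader reader = new StreamReader(dataStream))
            {
                return await reader.ReadToEndAsync();
            }
        }
        catch (Exception)
        {
            // Mirrors the original method: any failure yields an empty string.
            return "";
        }
    }
}
```

The `using` blocks also release the response and its streams when an exception is thrown partway through, which the explicit `Close()` calls in the original do not.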
And yes, of course I have (I think) configured ServicePointManager correctly:
ServicePointManager.DefaultConnectionLimit = 20 * 1000;
ServicePointManager.MaxServicePointIdleTime = 1000 * 60 * 20;//maybe this is an issue?
ServicePointManager.UseNagleAlgorithm = false;
ServicePointManager.Expect100Continue = false;
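The strange thing is that I get timeouts even though the application is not using even half of the connection limit. These are not real network timeouts; they look like the timeouts you get when DefaultConnectionLimit is too low and you try to issue requests in parallel.
So I decided to give each thread its own connection group using ConnectionGroupName: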
request.ConnectionGroupName = randomStringPerThreadBasis;
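How `randomStringPerThreadBasis` is produced is not shown here. A minimal sketch of one way it could be done, assuming a lazily created GUID per thread, follows; the `ConnectionGroups` class and its members are hypothetical names used only for illustration.

```csharp
using System;
using System.Net;
using System.Threading;

public static class ConnectionGroups
{
    // Hypothetical helper: one random group name per thread, created on first use.
    private static readonly ThreadLocal<string> GroupName =
        new ThreadLocal<string>(() => Guid.NewGuid().ToString("N"));

    // The calling thread's group name; stable for the lifetime of that thread.
    public static string Current
    {
        get { return GroupName.Value; }
    }

    // Creates a request bound to the calling thread's connection group.
    public static HttpWebRequest CreateRequest(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.ConnectionGroupName = Current;
        return request;
    }
}
```

Because connections are pooled per connection group, a distinct name per thread means threads do not share pooled connections with one another, even when they talk to the same host.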
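Using separate connection groups at least increased the number of open TCP connections, so I not only raised the .NET DefaultConnectionLimit but also increased the number of dynamic TCP ports available under Windows, as described at http://kb.globalscape.com/KnowledgebaseArticle10438.aspx, with this command:
netsh int ipv4 set dynamicport tcp start=1025 num=50000
The strange thing now is that the application opens far more connections than it needs, yet CPU load is still high and I still get occasional timeouts (far fewer than before, though, so maybe it really is related to connection management?).
For completeness: I also get these exceptions from time to time (but not often):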
System.Net.WebException: The underlying connection was closed: A connection that was expected to be kept alive was closed by the server. ---> System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. ---> A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
at System.Net.FixedSizeReader.ReadPacket(Byte[] buffer, Int32 offset, Int32 count)
at System.Net.Security._SslStream.StartFrameHeader(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security._SslStream.StartReading(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security._SslStream.ProcessRead(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.TlsStream.Read(Byte[] buffer, Int32 offset, Int32 size)
at System.Net.PooledStream.Read(Byte[] buffer, Int32 offset, Int32 size)
at System.Net.Connection.SyncRead(HttpWebRequest request, Boolean userRetrievedStream, Boolean probeRead)
at System.Net.HttpWebRequest.GetResponse()
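So my question is: how can I solve this? Am I doing something wrong? Is there a workaround? Can I manage the connections myself? Is there another class or language better suited to this scenario, or something along those lines?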