I am using the following code to make HttpWebRequests to a website:
public static HttpWebResponse SendGETRequest(string url, string agent, CookieContainer cookieContainer)
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
    request.UserAgent = agent;
    request.Method = "GET";
    request.ContentType = "text/html";
    request.CookieContainer = cookieContainer;
    return (HttpWebResponse)request.GetResponse();
}
This worked fine for several web pages, until I tried a new page and received only the last part of it. This is the response I received:
<tr>
<td colspan="2" height="5"><spacer type="block" width="100%" height="5"></td>
</tr>
</table>
</td>
</tr>
</table>
</body>
</html>
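The headers are consistent with this: they indicate that only the data I received was sent. Here are the request and response headers:
Request: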
GET /Broker/Ops/FichaContratoJS.asp?nc=815044&IP=5&YY=2012&M=6&St=0&CC=FESX201206 HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.168 Safari/535.19
Content-Type: text/html
Host: www.xxxx.com
Cookie: ASPSESSIONIDACBDCDBT=MGDNMNABOANDMILHBNCIDFCH;Autenticacion=Sid=230fae3d%2De0e2%2D4df1%2D8aa8%2D000fb352eaef&IdUsuarioWeb=xxxx; ASPSESSIONIDACBCCDAT=AFDJMNABAFJDDHABLOLAINDK; ASPSESSIONIDCADCBCAT=CEBJGNABLCALPJLDJFPBMLDE
Response:
HTTP/1.1 200 OK
Date: Wed, 09 May 2012 07:25:03 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Pragma: no-cache
**Content-Length: 155**
Content-Type: text/html
Expires: Wed, 09 May 2012 07:24:02 GMT
Set-Cookie: Autenticacion=Sid=230fae3d%2De0e2%2D4df1%2D8aa8%2D000fb352eaef&IdUsuarioWeb=xxxx; path=/
Cache-control: no-cache
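Requesting the same URL from a web browser works fine and returns a content length of about 4000 bytes.
Any ideas?
PS: Just in case it matters, I make several calls to SendGETRequest from different threads to the same site, but since there are no shared variables I don't think it should make a difference.
EDIT: This is the extension method I use to extract the text from the stream: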
public static string ReadTextResponse(this Stream stream)
{
    int count;
    Encoding encoding = System.Text.Encoding.GetEncoding(1252);
    System.Text.StringBuilder stringBuilder = new StringBuilder();
    byte[] buffer = new byte[1023];
    do
    {
        // Read returns 0 only when the end of the stream is reached.
        count = stream.Read(buffer, 0, buffer.Length);
        if (count != 0)
        {
            string tempString = encoding.GetString(buffer, 0, count);
            stringBuilder.Append(tempString);
        }
    }
    while (count > 0);
    return stringBuilder.ToString();
}
As far as I can tell, this is correct. Also, note that the response headers from the server report the length of the truncated data.