This is the function:
private static HtmlAgilityPack.HtmlDocument getHtmlDocumentWebClient(string url, bool useProxy, string proxyIp, int proxyPort, string usename, string password)
{
    HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
    WebClient client = new WebClient();
    //client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");
    client.Credentials = CredentialCache.DefaultCredentials;
    client.Proxy = WebRequest.DefaultWebProxy;
    if (useProxy)
    {
        //Proxy
        if (!string.IsNullOrEmpty(proxyIp))
        {
            WebProxy p = new WebProxy(proxyIp, proxyPort);
            if (!string.IsNullOrEmpty(usename))
            {
                if (password == null)
                    password = string.Empty;
                NetworkCredential nc = new NetworkCredential(usename, password);
                p.Credentials = nc;
            }
            //Use the configured proxy; without this assignment it is never applied
            client.Proxy = p;
        }
    }
    Stream data = client.OpenRead(url);
    doc.Load(data);
    data.Close();
    return doc;
}
My program fetches a link on each iteration, and after a few iterations the url variable is:
http://appldnld.apple.com/iTunes10/041-7196.20120912.Ber43/iTunesSetup.exe
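If I open this link in Internet Explorer, it tries to download the file. In my program, however, it tries to load the file at this line:
doc.Load(data);
After a while the program freezes, and when I finally force the application to close from Task Manager, it throws this exception: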
StackOverFlowException was unhandled
An unhandled exception of type 'System.StackOverflowException' occurred in HtmlAgilityPack.dll
System.StackOverflowException was unhandled
Message: An unhandled exception of type 'System.StackOverflowException' occurred in HtmlAgilityPack.dll
I set a breakpoint, and the problem occurs at this line:
doc.Load(data);
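The question is how I should handle this kind of link. Should I ignore such links with a try/catch, or should I still treat them as links? And if at some point in the future I want to use this link to download the exe file, wouldn't a try/catch be a bad idea?
Edit:
This is what getHtmlDocumentWebClient looks like now: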
private static HtmlAgilityPack.HtmlDocument getHtmlDocumentWebClient(string url, bool useProxy, string proxyIp, int proxyPort, string usename, string password)
{
    HttpWebRequest myHttpWebRequest = null; //Declare an HTTP-specific implementation of the WebRequest class.
    HttpWebResponse myHttpWebResponse = null; //Declare an HTTP-specific implementation of the WebResponse class
    //Create Request
    myHttpWebRequest = (HttpWebRequest)HttpWebRequest.Create(url);
    myHttpWebRequest.Method = "GET";
    myHttpWebRequest.ContentType = "text/html; encoding='utf-8'";
    //Get Response
    myHttpWebResponse = (HttpWebResponse)myHttpWebRequest.GetResponse();
    HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
    Stream data = myHttpWebResponse.GetResponseStream();//client.OpenRead(url);
    doc.Load(data);
    data.Close();
    return doc;
}
Same problem. What is wrong with the function now, and how do I actually check for text/html content?
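One possible way to do the check, as a rough sketch rather than a definitive fix: HttpWebRequest.ContentType sets the Content-Type header of the request (it describes the request body), so it does not filter what the server sends back. The type of the returned content is reported by HttpWebResponse.ContentType, which can be inspected before the stream is handed to doc.Load. The names HtmlFetcher, IsHtmlContentType and TryLoadHtml below are hypothetical, and returning null for non-HTML links is only an illustrative choice.

```csharp
using System;
using System.IO;
using System.Net;

static class HtmlFetcher
{
    // Hypothetical helper: true when a Content-Type header value denotes HTML.
    public static bool IsHtmlContentType(string contentType)
    {
        if (string.IsNullOrEmpty(contentType))
            return false;

        // Drop parameters such as "; charset=utf-8" before comparing.
        string mediaType = contentType.Split(';')[0].Trim();
        return mediaType.Equals("text/html", StringComparison.OrdinalIgnoreCase)
            || mediaType.Equals("application/xhtml+xml", StringComparison.OrdinalIgnoreCase);
    }

    // Hypothetical wrapper: returns the parsed document,
    // or null when the URL does not report HTML content.
    public static HtmlAgilityPack.HtmlDocument TryLoadHtml(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        request.Accept = "text/html"; // a hint to the server, not a guarantee

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            // Check the headers first; the body is only read if it is HTML.
            if (!IsHtmlContentType(response.ContentType))
                return null;

            using (Stream data = response.GetResponseStream())
            {
                HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
                doc.Load(data);
                return doc;
            }
        }
    }
}
```

With a check like this, a link such as the iTunesSetup.exe one above would presumably come back as null (assuming the server reports a non-HTML type for it), and the caller can skip it or queue it for a separate file download later. This also avoids relying on try/catch: starting with .NET Framework 2.0, a StackOverflowException cannot be caught by a try/catch block and the process is terminated by default.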