I'm just trying to get to grips with HtmlAgilityPack and XPath. I'm attempting to get the list of companies (the HTML links) from the NASDAQ website:
http://www.nasdaq.com/quotes/nasdaq-100-stocks.aspx
I currently have the following code:
// Requires: using System.Diagnostics; using System.IO; using System.Net; using HtmlAgilityPack;
HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
// Create a request for the URL.
WebRequest request = WebRequest.Create("http://www.nasdaq.com/quotes/nasdaq-100-stocks.aspx");
// Get the response, the stream containing the content returned by the server,
// and a StreamReader over it. The using statements close all three on exit.
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
using (Stream dataStream = response.GetResponseStream())
using (StreamReader reader = new StreamReader(dataStream))
{
    // Read the content.
    string responseFromServer = reader.ReadToEnd();
    // Load the HTML into the HtmlAgilityPack document.
    htmlDoc.LoadHtml(responseFromServer);
}
HtmlNodeCollection tl = htmlDoc.DocumentNode.SelectNodes("//*[@id='indu_table']/tbody/tr[*]/td/b/a");
// SelectNodes returns null when nothing matches, so guard before iterating.
if (tl != null)
{
    foreach (HtmlAgilityPack.HtmlNode node in tl)
    {
        Debug.Write(node.InnerText);
    }
}
I used an XPath plugin for Chrome to get the XPath:
//*table[@id='indu_table']/tbody/tr[*]/td/b/a
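When I run my project, I get an unhandled XPath exception saying the expression has an invalid token.
I'm not sure what is wrong with it. I tried putting a number in the tr[*] part above, but I still get the same error.
I've been looking at this for the past hour. Is it something simple?
Thanks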