
I've been trying to get a program working that pulls stock statistics from Google Finance. So far I have not been able to extract any information from span elements. For now I have hardcoded the URL for the Apple stock. Link to Apple stock: https://www.google.com/finance?q=NASDAQ%3AAAPL&ei=NgItWIG1GIftsAHCn4zIAg

What I can't understand is that I get the correct output when I try it in the Chrome console with the following command:

$x("//*[@id=\"appbar\"]//div//div//div//span");

This is my current code in Visual Studio 2015 with Html Agility Pack installed (I suspect the fault is in currDocNodeCompanyName):

class StockDataAccess
{
    HtmlWeb web = new HtmlWeb();
    private List<string> testList;

    public void FindStock()
    {
        var histDoc = web.Load("https://www.google.com/finance/historical?q=NASDAQ%3AAAPL&ei=q9IsWNm4KZXjsAG-4I7oCA.html");
        var histDocNode = histDoc.DocumentNode.SelectNodes("//*[@id=\"prices\"]//table//tr//td");

        var currDoc = web.Load("https://www.google.com/finance?q=NASDAQ%3AAAPL&ei=CdcsWMjNCIe0swGd3oaYBA.html");
        var currDocNodeCurrency = currDoc.DocumentNode.SelectNodes("//*[@id=\"ref_22144_elt\"]//div//div");
        var currDocNodeCompanyName = currDoc.DocumentNode.SelectNodes("//*[@id=\"appbar\"]//div//div//div//span");

        var histDocText = histDocNode.Select(node => node.InnerText);
        var currDocCurrencyText = currDocNodeCurrency.Select(node => node.InnerText);
        var currDocCompanyName = currDocNodeCompanyName.Select(node => node.InnerText);

        List<String> result = new List<string>(histDocText.Take(6));
        result.Add(currDocCurrencyText.First());
        // Add the first two strings themselves; Take(2).ToString() only yields the iterator's type name.
        result.AddRange(currDocCompanyName.Take(2));
        testList = result;
    }

    public List<String> ReturnStock()
    {
        return testList;
    }
}

I have also tried the XPath expression [text], which gives output I can work with in the Chrome console but not in VS. I have also experimented with a foreach loop, which a few people had suggested to others.

class StockDataAccess
{
    HtmlWeb web = new HtmlWeb();
    private List<string> testList;

    public void FindStock()
    {
        ///same as before

        var currDoc = web.Load("https://www.google.com/finance?q=NASDAQ%3AAAPL&ei=CdcsWMjNCIe0swGd3oaYBA.html");
        HtmlNodeCollection currDocNodeCompanyName = currDoc.DocumentNode.SelectNodes("//*[@id=\"appbar\"]//div//div//div//span");

        ///Same as before

        List<string> blaList = new List<string>();
        foreach (HtmlNode x in currDocNodeCompanyName)
        {
            blaList.Add(x.InnerText);
        }

        List<String> result = new List<string>(histDocText.Take(6));
        result.Add(currDocCurrencyText.First());
        result.Add(blaList[1]);
        result.Add(blaList[2]);

        testList = result;
    }

    public List<String> ReturnStock()
    {
        return testList;
    }
}

I would really appreciate it if anyone could point me in the right direction.


1 Answer


If you inspect the contents of currDoc.DocumentNode.InnerHtml, you will notice that there is no element with the id "appbar", so the result is correct: the XPath matches nothing.
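
One way to run that check is sketched below. It assumes the HtmlAgilityPack NuGet package is referenced; the class name AppbarCheck and the helper DescribeAppbar are hypothetical, and the inline HTML string is only a stand-in for whatever web.Load actually returns.

```csharp
using System;
using HtmlAgilityPack;

static class AppbarCheck
{
    // Hypothetical helper: reports whether the parsed document contains
    // an element with id="appbar" and how many spans sit below it.
    public static string DescribeAppbar(HtmlDocument doc)
    {
        HtmlNode appbar = doc.DocumentNode.SelectSingleNode("//*[@id='appbar']");
        if (appbar == null)
        {
            return "no element with id 'appbar' in the downloaded source";
        }

        // SelectNodes may return null rather than an empty collection
        // when nothing matches, so guard before reading Count.
        HtmlNodeCollection spans = appbar.SelectNodes(".//span");
        int count = spans == null ? 0 : spans.Count;
        return "appbar found, spans below it: " + count;
    }

    static void Main()
    {
        // Stand-in for the raw source; in the question this would be web.Load(url).
        var doc = new HtmlDocument();
        doc.LoadHtml("<html><body><div id='prices'><table><tr><td>1</td></tr></table></div></body></html>");
        Console.WriteLine(DescribeAppbar(doc));
    }
}
```

Running the same helper on currDoc from the question should show whether the id is present in the raw source at all, before any of the Select(node => node.InnerText) calls get a chance to fail on a null collection.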

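I suspect the HTML elements you are looking for are generated by a script (e.g. JS). That would explain why you can see them in the browser but not in the HtmlDocument object: HtmlAgilityPack does not execute scripts, it only downloads and parses the raw source.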
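
A small sketch of that behaviour, again assuming the HtmlAgilityPack package; the class ScriptNotRendered, the helper CountSpans, and the sample markup are made up for illustration and are not taken from the Google Finance page.

```csharp
using System;
using HtmlAgilityPack;

static class ScriptNotRendered
{
    // Hypothetical helper: counts the <span> elements present in the parsed tree.
    public static int CountSpans(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNodeCollection spans = doc.DocumentNode.SelectNodes("//span");
        return spans == null ? 0 : spans.Count;
    }

    static void Main()
    {
        // A browser would run the script and append a span under #appbar;
        // the parser only sees the markup exactly as it was delivered.
        string html =
            "<html><body><div id='appbar'></div>" +
            "<script>var s = document.createElement('span');" +
            "s.textContent = 'Apple Inc.';" +
            "document.getElementById('appbar').appendChild(s);</script>" +
            "</body></html>";

        Console.WriteLine(CountSpans(html));
    }
}
```

The span exists only after the script has run, so an XPath that works in the Chrome console can still come back empty here. If that is what happens with the real page, the data has to come from somewhere the raw response already contains it, or from a tool that executes JavaScript before the HTML is parsed.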

Answered 2016-11-18T09:28:52.773