    <div class="logoDesc">


Gnb Road, Chandmari, Guwahati - 781003




    |
                <a href="http://www.justdial.com/Guwahati/Kiran-Mistanna-Bhandar-&lt;near&gt;-Chandmari/9999PX361-X361-1230284509G9V5B2-DC_R3V3YWhhdGkgQmFjaGVsb3IgQ2FrZQ==_BZDET/map">
                    View Map</a><br>
                <p>
                    <span class="Gray">Call: </span><span style="color: #424242; font-size: 12px;">+(91)-9954843180</span>
                    <span style="color: #424242;">|</span> <a href="http://contest.justdial.com/contest/register.php?utm_source=rsbnr&amp;utm_medium=banner&amp;cont_ref=rsbnr"
                        style="font-size: 12px; display: inline-block;" onclick="_ct('Win Ipad2','ltpg');"
                        target="_blank"><b>Win iPad2</b></a>
                </p>
                <p>
                    <span class="Gray">Also See :</span> <b>Cake Shops</b>, <a href="http://www.justdial.com/Guwahati/Bakeries/ct-10033880">
                        Bakeries</a>, <a href="http://www.justdial.com/Guwahati/Confectionery-Retailers/ct-10127628">
                            Confectionery Retailers</a>
                </p>
            </div>

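I am using the HTML Agility Pack. I only want to extract the address [the part between the stars]. What should the syntax be? Please help.

Update: I am using the following code: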

Protected Sub Button1_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles Button1.Click
        Dim webGet = New HtmlWeb()
        Dim document = webGet.Load("http://www.justdial.com/Guwahati/Bachelor-Cake/ct-10070075")

        Dim nodes1 = document.DocumentNode.SelectNodes("//*[@class='logoDesc']")

        For Each node In nodes1
            MsgBox(node.InnerText)
        Next node
    End Sub

With this snippet I get all of the text inside the div, but I only want the address.

2 Answers

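Try this (add "/text()" to the end of the XPath so that only the div's own text nodes are selected, not the text of its child elements):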

Protected Sub Button1_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles Button1.Click
    Dim webGet = New HtmlWeb()
    Dim document = webGet.Load("http://www.justdial.com/Guwahati/Bachelor-Cake/ct-10070075")
    Dim nodes1 = document.DocumentNode.SelectNodes("//*[@class='logoDesc']/text()")
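    ' SelectNodes returns Nothing when the XPath matches no nodes.
    If nodes1 Is Nothing Then Return
    For Each node In nodes1
        ' Skip whitespace-only text nodes; trim the rest, including the trailing "|" separator.
        Dim text = node.InnerText.Trim().TrimEnd("|"c).Trim()
        If text <> "" Then MsgBox(text)
    Next node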
End Sub
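
If only the address is wanted, the XPath can be narrowed further. The following is a minimal sketch, not tested against the live page: it assumes the address is the first non-blank text node directly under the `logoDesc` div, as in the HTML posted in the question. `AddressSketch` and `ExtractAddress` are hypothetical names used only for illustration, and the sample HTML in `Main` is a shortened stand-in for the real page.

```vbnet
Imports System
Imports HtmlAgilityPack

Module AddressSketch
    ' Returns the first non-blank text node directly under the logoDesc div,
    ' with surrounding whitespace and a trailing "|" separator removed.
    ' Returns an empty string when no such node exists.
    Function ExtractAddress(ByVal html As String) As String
        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        ' text()[normalize-space()] keeps only text nodes that contain
        ' something other than whitespace; [1] takes the first of them.
        Dim node = doc.DocumentNode.SelectSingleNode(
            "//div[@class='logoDesc']/text()[normalize-space()][1]")
        If node Is Nothing Then Return String.Empty

        ' Decode HTML entities, then strip whitespace and the "|" separator.
        Return HtmlEntity.DeEntitize(node.InnerText).Trim().TrimEnd("|"c).Trim()
    End Function

    Sub Main()
        ' Shortened stand-in for the markup in the question (assumption).
        Dim html = "<div class=""logoDesc"">" & vbLf &
                   "Gnb Road, Chandmari, Guwahati - 781003" & vbLf &
                   "    |" & vbLf &
                   "<a href=""#"">View Map</a></div>"
        Console.WriteLine(ExtractAddress(html))
    End Sub
End Module
```

In the button handler, the same function could be called on the loaded page with `ExtractAddress(document.DocumentNode.OuterHtml)`, or the XPath could be passed straight to `document.DocumentNode.SelectSingleNode`.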
answered 2012-07-12T13:48:51.227

I don't know the Agility Pack, but here is a straight-up screen scraper:

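    // Assumes the downloaded page text actually contains the "***" markers around the address.
    string page = Methods.GetPage("http://www.yoururl.com");
    const string marker = "***";
    int start = page.IndexOf(marker);
    // Search for the closing marker only after the opening one.
    int end = start < 0 ? -1 : page.IndexOf(marker, start + marker.Length);

    // Take the text between the two markers, skipping the opening stars.
    string address = end < 0
        ? string.Empty
        : page.Substring(start + marker.Length, end - (start + marker.Length)).Trim();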


    // Requires: using System.Net; using System.Text;
    public static string GetPage(string url)
    {
        try
        {
            // Dispose the client once the download has finished.
            using (WebClient webClient = new WebClient())
            {
                byte[] reqHTML = webClient.DownloadData(url);
                return Encoding.UTF8.GetString(reqHTML);
            }
        }
        catch (WebException)
        {
            // The download failed; return an empty page so the caller can check for it.
            return string.Empty;
        }
    }
answered 2012-07-11T23:02:51.817