
How can I get the content of a div with class "in" (or more) using C#?

I have the following HTML code:

<!DOCTYPE html>
<html lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
    <meta charset="utf-8" />
    <title></title>
</head>
<body>
    <div id="xxx">
        <div class="in">
            <a href="/a/show/7184569" class="mm">ВАЗ 2121</a> <span class="for">за</span>
            <span class="price">2 700 $</span>
            <br />
            <span class="year">1990 г.</span><br />
            <div style="margin: 3px 0 3px 0">contentxxx</div>
        </div>
    </div>
</body>
</html>

I want to get the div with class="in", including its content. The result should be:

<div class="in">
     <a href="/a/show/7184569" class="mm">ВАЗ 2121</a> <span class="for">за</span>
     <span class="price">2 700 $</span>
     <br />
     <span class="year">1990 г.</span><br />
     <div style="margin: 3px 0 3px 0">contentxxx</div>
</div>

3 Answers

2
using HtmlAgilityPack;

static void Parse()
{
    // Parse the markup from a string. HtmlWeb is only needed when fetching a page from a URL.
    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(getHTML());

    // SelectNodes returns null (not an empty collection) when nothing matches.
    HtmlNodeCollection nodeCol = doc.DocumentNode.SelectNodes("//div[@class=\"in\"]");
    if (nodeCol == null)
    {
        return;
    }

    // OuterHtml includes the <div class="in"> tag itself, as the question asks;
    // InnerHtml would return only the markup inside it.
    string value = nodeCol[0].OuterHtml;
}

// Returns the HTML from the question as a single string.
static string getHTML()
{
    return "<!DOCTYPE html>"
        + "<html lang=\"en\" xmlns=\"http://www.w3.org/1999/xhtml\">"
        + "<head>"
            + "<meta charset=\"utf-8\" />"
            + "<title></title>"
        + "</head>"
        + "<body>"
            + "<div id=\"xxx\">"
                + "<div class=\"in\">"
                    + "<a href=\"/a/show/7184569\" class=\"mm\">ВАЗ 2121</a> <span class=\"for\">за</span>"
                    + "<span class=\"price\">2 700 $</span>"
                    + "<br />"
                    + "<span class=\"year\">1990 г.</span><br />"
                    + "<div style=\"margin: 3px 0 3px 0\">contentxxx</div>"
                + "</div>"
            + "</div>"
        + "</body>"
        + "</html>";
}

Add a reference to the HtmlAgilityPack library and import its namespace. See: http://htmlagilitypack.codeplex.com/releases/view/90925
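
The XPath `//div[@class="in"]` matches only when the class attribute is exactly `in`. If the page may also contain elements such as `<div class="in featured">` (a hypothetical case, not part of the question), a common XPath idiom is to pad the attribute with spaces and test for the whole token. The following is a minimal sketch under that assumption; `DivFinder` and `GetDivsByClass` are names invented for illustration, and the null check reflects the default behavior of `SelectNodes` as far as I know.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

static class DivFinder
{
    // Hypothetical helper: returns the outer HTML of every <div> whose
    // class list contains className as a whole token.
    // Assumes className contains no quote characters.
    public static List<string> GetDivsByClass(string html, string className)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Padding with spaces keeps "in" from matching "inner" or "main".
        string xpath = "//div[contains(concat(' ', normalize-space(@class), ' '), ' "
                       + className + " ')]";

        var results = new List<string>();
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null)
        {
            return results; // nothing matched
        }

        foreach (HtmlNode node in nodes)
        {
            results.Add(node.OuterHtml);
        }
        return results;
    }
}
```

Calling `DivFinder.GetDivsByClass(getHTML(), "in")` should return one string per matching div, which also covers the case where the page has more than one such element.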

answered 2012-09-15T07:23:14.890
0

You can do this easily with the HTML Agility Pack:

using HtmlAgilityPack;

...
var doc = new HtmlDocument();
doc.Load(@"C:\file.htm"); // See the overloads. You can also use the `LoadHtml` method.

// SelectSingleNode returns null when no element matches.
var node = doc.DocumentNode.SelectSingleNode("//div[@class='in']");

// This is the markup you are looking for, including the <div class="in"> tag itself.
var result = node.OuterHtml;
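
To make the choice between `OuterHtml`, `InnerHtml`, and `InnerText` concrete, here is a small sketch that loads a shortened stand-in for the question's markup from a string and guards against a missing node. The `InDiv` class and its `Find` method are hypothetical names used only for this example.

```csharp
using System;
using HtmlAgilityPack;

static class InDiv
{
    // Hypothetical helper: returns the first <div class="in"> node, or null if there is none.
    public static HtmlNode Find(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return doc.DocumentNode.SelectSingleNode("//div[@class='in']");
    }
}

class Program
{
    static void Main()
    {
        // Shortened stand-in for the HTML in the question.
        string html = "<div id=\"xxx\"><div class=\"in\">"
                    + "<span class=\"price\">2 700 $</span></div></div>";

        HtmlNode node = InDiv.Find(html);
        if (node == null)
        {
            Console.WriteLine("No div with class \"in\" was found.");
            return;
        }

        Console.WriteLine(node.OuterHtml); // the div element together with its children
        Console.WriteLine(node.InnerHtml); // only the markup inside the div
        Console.WriteLine(node.InnerText); // the text with all tags removed
    }
}
```

Since the question asks for the result to include the `<div class="in">` tag, `OuterHtml` is the property that fits; `InnerHtml` would be the right choice if only the children were wanted.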
answered 2012-09-19T03:47:20.573
-2

Use jQuery to get the contents of the div:

<script type="text/javascript">
    // .html() returns the inner HTML of the first matching element.
    var d = $('div.in').html();
</script>

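The code above gets the contents of the div that has the class in. Note that .html() returns only the inner markup, so the <div class="in"> tag itself is not included.

answered 2012-09-19T05:01:43.567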