I am trying to use the Html Agility Pack to get data from an HTML table, but I only get the data from the first table row.

The HTML I am reading looks like this:

<div id="mainDiv">
    <table id="tbl">
        <thead>
            <tr>
                <th class="tbl_col1">UserName</th>
                <th class="tbl_col2">Points</th>
            </tr>
        </thead>
        <tbody>     
          <tr data-source="provider1">
            <td class="tbl_col1">
                <a href="/Users/1090" id="UserLink" target="_blank">UserName1</a>           
            </td>
            <td class="tbl_col2">
                <a href="/UserPoints/1090" id="PointLink" target="_blank">1892 <span class="up_arrow">&nbsp;</span></a>             
            </td>           
          </tr>
          <tr data-source="provider2">
            <td class="tbl_col1">
                <a href="/Users/1090" id="UserLink" target="_blank">UserName2</a>           
            </td>
            <td class="tbl_col2">
                <a href="/UserPoints/1090" id="PointLink" target="_blank">3217 <span class="down_arrow">&nbsp;</span></a>               
            </td>           
         </tr>
        </tbody>
    </table>
</div>  

I am using this code:

var UserTable = htmlDocument.DocumentNode.SelectSingleNode("//div[@id='mainDiv']").SelectSingleNode("//table[@id='tbl']").SelectSingleNode("//tbody").SelectNodes("//tr");
foreach (var row in UserTable)
{
    if (row.Attributes["data-source"] != null)
    {
        string Source = row.Attributes["data-source"].Value;
        string UserName = row.SelectSingleNode("td[@class='tbl_col1']").SelectSingleNode("//a[@id='UserLink']/text()").InnerText;
        string Points = row.SelectSingleNode("td[@class='tbl_col2']").SelectSingleNode("//a[@id='PointLink']/text()").InnerText;
        Console.WriteLine(Source + "\t" + UserName + "\t" + Points);
    }
}

But I keep getting this output:

provider1       UserName1       1892
provider2       UserName1       1892

1 Answer

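You made a wrong assumption. The expressions //a[@id='UserLink']/text() and //a[@id='PointLink']/text() start with //, so they search the whole document rather than the current row, even though they are called on a node inside that row. That is why you always get the values from the first tr node. Just use a relative path: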

string UserName = row.SelectSingleNode("td[@class='tbl_col1']/a[@id='UserLink']/text()").InnerText;
string Points = row.SelectSingleNode("td[@class='tbl_col2']/a[@id='PointLink']/text()").InnerText;
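
To see the difference in isolation, here is a minimal sketch. It makes a few assumptions: it uses System.Xml's XmlDocument instead of the Html Agility Pack so that it runs without any NuGet package, the markup is a simplified, well-formed stand-in for the table in the question, and the names XPathScopeDemo, Sample and ReadRows are made up for illustration. The XPath rule it demonstrates (a leading // restarts at the document root, whatever node the call is made on) is standard XPath 1.0 behaviour.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class XPathScopeDemo
{
    // Simplified, well-formed stand-in for the table in the question (hypothetical data).
    public const string Sample = @"<table id='tbl'><tbody>
  <tr data-source='provider1'><td class='tbl_col1'><a id='UserLink'>UserName1</a></td></tr>
  <tr data-source='provider2'><td class='tbl_col1'><a id='UserLink'>UserName2</a></td></tr>
</tbody></table>";

    // Returns "source:name" for each row, looking up the link with the given XPath.
    public static List<string> ReadRows(string xml, string linkXPath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var result = new List<string>();
        foreach (XmlNode row in doc.SelectNodes("//table[@id='tbl']/tbody/tr"))
        {
            string source = row.Attributes["data-source"].Value;
            // The expression is evaluated with the current row as the context node.
            XmlNode link = row.SelectSingleNode(linkXPath);
            result.Add(source + ":" + link.InnerText);
        }
        return result;
    }

    public static void Main()
    {
        // Absolute path: restarts at the document root on every iteration.
        Console.WriteLine(string.Join(", ", ReadRows(Sample, "//a[@id='UserLink']")));
        // prints: provider1:UserName1, provider2:UserName1

        // Relative path: only looks inside the current row.
        Console.WriteLine(string.Join(", ", ReadRows(Sample, "td[@class='tbl_col1']/a[@id='UserLink']")));
        // prints: provider1:UserName1, provider2:UserName2

        // A leading dot also keeps a descendant search relative to the row.
        Console.WriteLine(string.Join(", ", ReadRows(Sample, ".//a[@id='UserLink']")));
        // prints: provider1:UserName1, provider2:UserName2
    }
}
```

The third call shows an alternative: if you do not want to spell out the td step, .//a[@id='UserLink'] searches the descendants of the current row only.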

You can also simplify the rest of the code considerably:

var UserTable = htmlDocument.DocumentNode.SelectNodes("//div[@id='mainDiv']/table[@id='tbl']/tbody/tr");
foreach (var row in UserTable)
{
    if (row.Attributes["data-source"] != null)
    {
        string Source = row.Attributes["data-source"].Value;
        string UserName = row.SelectSingleNode("td[@class='tbl_col1']/a[@id='UserLink']/text()").InnerText;
        string Points = row.SelectSingleNode("td[@class='tbl_col2']/a[@id='PointLink']/text()").InnerText;
        Console.WriteLine(Source + "\t" + UserName + "\t" + Points);
    }
}
answered 2013-03-15T20:26:10.293