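I'm trying to extract certain parts of a web page, but I'm running into some trouble. I'm new to web parsing, so please assume I know nothing and keep your answer very detailed.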

I have this HTML:

<div id="playerStats">
  <div id="hp"><span class="title">HP:</span>"12213"</div>
  <div id="mp"><span class="title">MP:</span></div>
  <div id="magicResist"><span class="title">Magic Resist</span>"4618"</div>
  <div id="physicalDefend"><span class="title">Physical Defence</span>"1725"</div>
  <div id="phyCriticalReduceRate"><span class="title">Strike Resist</span>"1518"</div>
  <div id="phyCriticalDamageReduce"><span class="title">Strike fortitude</span>"392"</div>
  <div id="physicalRight"><span class="title">Main Hand Attack</span>"201"</div>
  <div id="accuracyRight"><span class="title">Main Hand Accuracy</span>"201"</div>
  <div id="criticalRight"><span class="title">Main Hand Critical</span>"201"</div>
  <div id="physicalLeft"><span class="title">Off Hand Attack</span>"201"</div>
  <div id="accuracyLeft"><span class="title">Off Hand Accuracy</span>"201"</div>
  <div id="criticalLeft"><span class="title">Off Hand Critical</span>"201"</div>
  <div id="attackSpeed"><span class="title">Attack Speed</span>"201"</div>
  <div id="magicalBoost"><span class="title">Magic Boost</span>"201"</div>
  <div id="magicalAccuracy"><span class="title">Magic Accuracy</span>"201"</div>
  <div id="magicalCriticalRight"><span class="title">Crit Spell</span>"201"</div>
  <div id="castingTimeRatio"><span class="title">Casting Speed</span>"201"</div>
  <div id="block"><span class="title">Block</span>"201"</div>
  <div id="dodge"><span class="title">Evasion</span>"201"</div>
</div>

This gives the output:

HP:
MP:
Magic Resist
Physical Defence
Strike Resist
Strike fortitude
Main Hand Attack
Main Hand Accuracy
Main Hand Critical
Off Hand Attack
Off Hand Accuracy
Off Hand Critical
Attack Speed
Magic Boost
Magic Accuracy
Crit Spell
Casting Speed
Block
Evasion
Movement Speed

using this code:

var browser = document.DocumentNode.SelectNodes("//*[@id=\"playerStats\"]");
if (browser != null) {
  foreach (var b in browser) {
    output.AppendLine(b.InnerHtml);
  }
} else {
  output.AppendLine("Oops!  I'm broken!");
}

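However, I also want to include the number "12213", or whatever text sits between

</span>"xxx"</div>

after, say, "HP:".

How can I retrieve this text using the code I already have?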

1 Answer

You can do it like this (shown here as a console application example):

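// Requires the HtmlAgilityPack package (using HtmlAgilityPack;).
HtmlDocument doc = new HtmlDocument();
doc.Load(MyTestFile); // MyTestFile: path to the saved HTML file

// Select every title <span> inside the child <div>s of the playerStats <div>.
foreach(var node in doc.DocumentNode.SelectNodes("//div[@id='playerStats']/div/span"))
{
    // The value is the text node that follows the span; it is missing for an empty stat such as MP.
    Console.WriteLine(node.InnerText + " " + (node.NextSibling != null ? node.NextSibling.InnerText : null));
}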

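NextSibling is the node that immediately follows the given node under the same parent. It is null when the current node is its parent's last child, which is the case for the MP span in your HTML.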
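To make the null case concrete, here is a minimal, self-contained sketch of the same NextSibling approach. It makes a few assumptions that are not in the question: the HTML is inlined as a string instead of being loaded from a file, the class name StatParser is hypothetical, and the surrounding quotes around each number are assumed to be literally present in the markup (so they are trimmed off).

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper, named for illustration only.
public static class StatParser
{
    // Returns (title, value) pairs; value is empty when the span has no following node.
    public static List<KeyValuePair<string, string>> Parse(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var results = new List<KeyValuePair<string, string>>();
        HtmlNodeCollection spans =
            doc.DocumentNode.SelectNodes("//div[@id='playerStats']/div/span");
        if (spans == null)
        {
            return results; // SelectNodes returns null when nothing matches
        }

        foreach (HtmlNode span in spans)
        {
            string title = span.InnerText.Trim();
            HtmlNode next = span.NextSibling; // null when the span is the last child

            // Assumption: the quotes are part of the page text, so strip them.
            string value = next == null
                ? string.Empty
                : HtmlEntity.DeEntitize(next.InnerText).Trim().Trim('"');

            results.Add(new KeyValuePair<string, string>(title, value));
        }
        return results;
    }
}

public static class Program
{
    public static void Main()
    {
        const string html = @"<div id=""playerStats"">
  <div id=""hp""><span class=""title"">HP:</span>""12213""</div>
  <div id=""mp""><span class=""title"">MP:</span></div>
  <div id=""magicResist""><span class=""title"">Magic Resist</span>""4618""</div>
</div>";

        foreach (KeyValuePair<string, string> stat in StatParser.Parse(html))
        {
            Console.WriteLine(stat.Key + " " + stat.Value);
        }
    }
}
```

Returning the pairs from a method rather than printing inside the loop keeps the parsing separate from the output, so the same result could be appended to a StringBuilder like the one in the question.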

Note that I explicitly specified the element type as DIV in the initial selection, because it performs better (* matches any element).
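As a small illustration of that note, the sketch below runs both forms of the expression against a cut-down copy of the HTML and prints the id of the node each one finds. The class and method names (XPathComparison, FirstMatchId) are hypothetical, and the sketch only shows that the two expressions select the same element; it does not measure any speed difference.

```csharp
using System;
using HtmlAgilityPack;

// Hypothetical helper, named for illustration only.
public static class XPathComparison
{
    // Returns the id of the first node matching xpath, or null when nothing matches.
    public static string FirstMatchId(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNode match = doc.DocumentNode.SelectSingleNode(xpath);
        return match == null ? null : match.Id;
    }

    public static void Main()
    {
        const string html =
            "<div id=\"playerStats\"><div id=\"hp\"><span class=\"title\">HP:</span>\"12213\"</div></div>";

        // "*" tests every element for the id attribute.
        Console.WriteLine(FirstMatchId(html, "//*[@id='playerStats']"));

        // "div" restricts the test to <div> elements only.
        Console.WriteLine(FirstMatchId(html, "//div[@id='playerStats']"));
    }
}
```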

Answered 2013-05-05T06:07:32.267