I have an HTML file like the one shown below:
<div>
<div style="margin-left:0.5em;">
<div class="tiny" style="margin-bottom:0.5em;">
<b><span class="h3color tiny">This review is from: </span>You Meet</b>
</div>
If you know Ron Kaufman as I do ...
<br /><br />Whether you're the CEO....
<br /><br />Written in a distinctive, ...
<br /><br />My advice? Don't just get one copy
<div style="padding-top: 10px; clear: both; width: 100%;"></div>
</div>
<div style="margin-left:0.5em;">
<div class="tiny" style="margin-bottom:0.5em;">
<b><span class="h3color tiny">This review is from: </span>My Review</b>
</div>
I became a fan of Ron Kaufman after reading an earlier book of his years ago...
<div style="padding-top: 10px; clear: both; width: 100%;"></div>
</div>
</div>
I want to get the review text without any HTML tags. I am currently using the code below:
foreach (HtmlNode divReview in doc.DocumentNode.SelectNodes(@"//div[@style='margin-left:0.5em;']"))
{
    if (divReview != null)
    {
        review.Add(divReview.Descendants("div")
            .Where(d => d.Attributes.Contains("style") &&
                        d.Attributes["style"].Value.Contains("padding-top: 10px; clear: both; width: 100%;"))
            .Select(d => d.PreviousSibling.InnerText.Trim())
            .SingleOrDefault());
    }
}
For the first review, this returns only "My advice? Don't just get one copy". How can I get the whole text?
Update: even if I remove all the <br /> tags from the HtmlNode, the code above still returns only the "My advice? Don't just get one copy" part. Any ideas?