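I'm trying to parse the following data into CSV. From what I've read, the best approach is to use HAP (Html Agility Pack), since it has a robust parser.
At the moment, the content of the WPF WebBrowser control is accessed like this: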
dynamic doc = this.wbControl.Document;
Content:
<div class="content">
<fieldset>
<ul class="fieldsetr">
<li class="row medium">
<div class="field">
<div class="shell">
<em class="disable">Sender:</em>
</div>
</div>
<div>
<div class="clip">
<em>me@example.com</em>
</div>
</div>
</li>
<li class="row medium alt">
<div class="field">
<div class="shell">
<em class="disable">Recipient:</em>
</div>
</div>
<div>
<div class="clip">
<em>me2@example2.com</em>
</div>
</div>
</li>
<li class="row medium">
<div class="field">
<div class="shell">
<em class="disable">Message ID:</em>
</div>
</div>
<div>
<div class="clip">
<em>2342342345235</em>
</div>
</div>
</li>
<li class="row medium alt">
<div class="field">
<div class="shell">
<em class="disable">Message size:</em>
</div>
</div>
<div>
<div class="clip">
<em>18.74 KB
</em>
</div>
</div>
</li>
<li class="row medium">
<div class="field">
<div class="shell">
<em class="disable">Date and time received:</em>
</div>
</div>
<div>
<div class="clip">
<em>11/27/2012 6:17:22 AM</em>
</div>
</div>
</li>
<li class="row medium alt">
<div class="field">
<div class="shell">
<em class="disable">Date and time filtered:</em>
</div>
</div>
<div>
<div class="clip">
<em>11/27/2012 6:17:22 AM</em>
</div>
</div>
</li>
<li class="row medium">
<!-- Connector Details -->
</li>
<li class="row medium alt">
<div class="field">
<div class="shell">
<em class="disable">First delivery attempt:</em>
</div>
</div>
<div>
<div class="clip">
<em>11/27/2012 6:17:23 AM</em>
</div>
</div>
</li>
<li class="row medium">
<div class="field">
<div class="shell">
<em class="disable">Final delivery attempt:</em>
</div>
</div>
<div>
<div class="clip">
<em>11/27/2012 6:17:23 AM</em>
</div>
</div>
</li>
<li class="row medium alt">
<div class="field">
<div class="shell">
<em class="disable">From IP address:</em>
</div>
</div>
<div>
<div class="clip">
<em>1.2.3.4 &lt;unknown&gt;</em>
</div>
</div>
</li>
<li class="row medium">
<div class="field">
<div class="shell">
<em class="disable">To IP address:</em>
</div>
</div>
<div>
<div class="clip">
<em>4.3.2.1 &lt;mail.example2.com&gt; </em>
</div>
</div>
</li>
<li class="row medium alt">
<div class="field">
<div class="shell">
<em class="disable">Filtering results:</em>
</div>
</div>
<div>
<div class="clip">
<em>Passed Filtering</em>
</div>
</div>
</li>
<li class="row medium">
<div class="field">
<div class="shell">
<em class="disable">Delivery result:</em>
</div>
</div>
<div>
<div class="clip">
<span><em>Delivered: 470 2.4.0 &lt;2342342345235&gt; [InternalId=2321233] Queued mail for delivery</em></span>
</div>
</div>
</li>
</ul>
</fieldset>
</div>
What is the best way to convert this data? This is only one record, but more records will be added.
Edit
I ended up testing with the following code:
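// Load the browser's current HTML into an Html Agility Pack document.
HtmlAgilityPack.HtmlDocument docHAP = new HtmlAgilityPack.HtmlDocument();
docHAP.LoadHtml((string)doc.Body.InnerHtml);
// SelectNodes returns null (not an empty collection) when nothing matches.
HtmlNodeCollection emNodes = docHAP.DocumentNode.SelectNodes("//em");
if (emNodes != null)
{
    foreach (HtmlNode emNode in emNodes)
    {
        MessageBox.Show(emNode.InnerText);
    }
}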
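To get from the flat list of `<em>` nodes to an actual CSV row, one option might be to walk each `<li class="row ...">` and pair its label (`em.disable`) with its value (the `<em>` inside `div.clip`). The following is only a rough sketch under that assumption about the markup; the `RecordCsv` class and its `ExtractFields` and `ToCsvLine` helpers are hypothetical names made up for illustration, and the quoting is a minimal CSV-style escape rather than a full CSV writer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper, not part of Html Agility Pack.
static class RecordCsv
{
    // Quote a field only when it contains a comma, a quote, or a line break.
    static string Escape(string field)
    {
        if (field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0) return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Returns the (label, value) pairs found in one record's HTML.
    public static List<KeyValuePair<string, string>> ExtractFields(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var fields = new List<KeyValuePair<string, string>>();
        var rows = doc.DocumentNode.SelectNodes("//li[contains(@class,'row')]");
        if (rows == null) return fields; // SelectNodes yields null on no match

        foreach (HtmlNode row in rows)
        {
            HtmlNode label = row.SelectSingleNode(".//em[@class='disable']");
            HtmlNode value = row.SelectSingleNode(".//div[@class='clip']//em");
            // Skip rows without both parts, e.g. the comment-only row.
            if (label == null || value == null) continue;

            string name = HtmlEntity.DeEntitize(label.InnerText).Trim().TrimEnd(':');
            string text = HtmlEntity.DeEntitize(value.InnerText).Trim();
            fields.Add(new KeyValuePair<string, string>(name, text));
        }
        return fields;
    }

    // Joins already-extracted values into a single CSV line.
    public static string ToCsvLine(IEnumerable<string> values)
    {
        return string.Join(",", values.Select(Escape));
    }
}
```

With that in place, `RecordCsv.ExtractFields((string)doc.Body.InnerHtml)` would give the pairs for one record; `ToCsvLine(fields.Select(f => f.Key))` could serve as the header line and `ToCsvLine(fields.Select(f => f.Value))` as the data line, written once per record as more records come in.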
If anyone has a more efficient solution, please feel free to let me know.