
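I hope the title is clear enough, but I will do my best to explain.

I am using C# WinForms (.NET 4.5).

The problem is that I create a WebBrowser control and try to load HTML into it through wb.DocumentText. But when I try to iterate over the elements, wb.Document is null.

Here is my code: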

WebBrowser wb = new WebBrowser();
wb.DocumentText = leMessage;

HtmlElementCollection elems = wb.Document.GetElementsByTagName("a");
foreach (HtmlElement elem in elems)
{
    // Do Some Stuff
}

leMessage contains an HTML newsletter message with some tags in it.

I have also tried this: wb.Document.Body.InnerHtml = leMessage; but that does not work either.

What am I missing or doing wrong?


3 Answers


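Setting WebBrowser.DocumentText is asynchronous. You need to handle the DocumentCompleted event before you can access the DOM, and the UI thread has to keep pumping Windows messages in the meantime. Here is a complete web-scraping example that uses async/await to keep a convenient linear code flow. Just change the navigation part: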

// NavigateAsync is the helper from the web-scraping example mentioned above.
await NavigateAsync(ct, () => this.webBrowser.DocumentText = leMessage, timeout);
HtmlElementCollection elems = this.webBrowser.Document.GetElementsByTagName("a");

That way you can do it in a loop. In a nutshell:

using System;
using System.Diagnostics;
using System.Threading.Tasks;
using System.Windows.Forms;

namespace WinformsApp2
{
    public partial class MainForm : Form
    {
        public MainForm()
        {
            InitializeComponent();
        }

        const string leMessage = "<a href='http://example.com'>Go there</a>";

        private async void MainForm_Load(object sender, EventArgs e)
        {
            var wb = new WebBrowser();

            // Completed by the DocumentCompleted handler once the document has loaded.
            TaskCompletionSource<bool> tcs = null;
            WebBrowserDocumentCompletedEventHandler documentCompletedHandler = (sender2, e2) => tcs.TrySetResult(true);

            for (int i = 0; i < 3; i++)
            {
                tcs = new TaskCompletionSource<bool>();
                wb.DocumentCompleted += documentCompletedHandler;
                try {
                    wb.DocumentText = leMessage;
                    await tcs.Task;
                }
                finally {
                    wb.DocumentCompleted -= documentCompletedHandler;
                }
                HtmlElementCollection elems = wb.Document.GetElementsByTagName("a");
                foreach (HtmlElement elem in elems)
                {
                    Debug.Print(elem.OuterHtml);
                }
            }
        }
    }
}
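
The subscribe, await, unsubscribe sequence in the loop above could be factored into a reusable helper. The following is only a minimal sketch: SetDocumentTextAsync and WebBrowserExtensions are hypothetical names, not part of the WinForms API, and the sketch assumes it is called on the UI thread while the message loop is running. It has no timeout, so the returned task never completes if DocumentCompleted never fires.

```csharp
using System;
using System.Threading.Tasks;
using System.Windows.Forms;

// Hypothetical helper class; not part of System.Windows.Forms.
public static class WebBrowserExtensions
{
    // Assigns DocumentText and returns a task that completes the first time
    // DocumentCompleted fires afterwards. Call this on the UI thread.
    public static Task SetDocumentTextAsync(this WebBrowser browser, string html)
    {
        var tcs = new TaskCompletionSource<bool>();
        WebBrowserDocumentCompletedEventHandler handler = null;

        handler = (sender, e) =>
        {
            // Unsubscribe first so the handler runs at most once per call.
            browser.DocumentCompleted -= handler;
            tcs.TrySetResult(true);
        };

        browser.DocumentCompleted += handler;
        try
        {
            browser.DocumentText = html;
        }
        catch
        {
            // Do not leave a dangling subscription if the assignment throws.
            browser.DocumentCompleted -= handler;
            throw;
        }

        return tcs.Task;
    }
}

// Usage inside an async event handler such as MainForm_Load:
//
//     await wb.SetDocumentTextAsync(leMessage);
//     foreach (HtmlElement elem in wb.Document.GetElementsByTagName("a"))
//     {
//         Debug.Print(elem.OuterHtml);
//     }
```

Unsubscribing inside the handler keeps repeated calls on the same control from stacking up handlers, which is the same concern the try/finally block in the example above addresses.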
answered 2013-10-29T08:50:25.670

You need to loop over the elements after the webBrowser1_DocumentCompleted event has fired, so you need to include this in your code:

webBrowser1.DocumentCompleted+=new WebBrowserDocumentCompletedEventHandler(webBrowser1_DocumentCompleted);

private void webBrowser1_DocumentCompleted(object sender,WebBrowserDocumentCompletedEventArgs e)
{
   // Here you can loop over your elements.
}
answered 2013-10-29T08:39:15.367

Try this:

WebBrowser wb;
private void Form1_Load(object sender, EventArgs e)
{
    wb = new WebBrowser();
    wb.DocumentCompleted += wb_DocumentCompleted;
    wb.DocumentText = "<html><body><a href='#'>Test</a></body></html>";
}

void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    HtmlElementCollection elems = ((WebBrowser)sender)
        .Document.GetElementsByTagName("a");
    foreach (HtmlElement elem in elems)
    {
        // Do Some Stuff
    }
}
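
As one possible way to fill in the "Do Some Stuff" placeholder, the sketch below collects the href value of every link in the loaded document. LinkCollector and GetLinkTargets are hypothetical names chosen for illustration. The returned values may be resolved by the underlying browser engine, so they might not match the original markup verbatim.

```csharp
using System.Collections.Generic;
using System.Windows.Forms;

// Hypothetical helper class; not part of System.Windows.Forms.
public static class LinkCollector
{
    // Returns the href attribute of every <a> element in the document.
    // Returns an empty list if the document has not been loaded yet.
    public static List<string> GetLinkTargets(HtmlDocument document)
    {
        var targets = new List<string>();
        if (document == null)
        {
            return targets;
        }

        foreach (HtmlElement elem in document.GetElementsByTagName("a"))
        {
            string href = elem.GetAttribute("href");
            if (!string.IsNullOrEmpty(href))
            {
                targets.Add(href);
            }
        }

        return targets;
    }
}

// Usage inside wb_DocumentCompleted:
//
//     List<string> links = LinkCollector.GetLinkTargets(((WebBrowser)sender).Document);
```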
answered 2013-10-29T08:41:38.560