
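I am trying to browse a website and do some work on its pages programmatically, using the WebBrowser control in Windows Forms. I found this while looking for a way to block my thread until the WebBrowser's DocumentCompleted event fires. Given that, here is my current code: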

public partial class Form1 : Form
{
    private AutoResetEvent autoResetEvent = new AutoResetEvent(false); // starts unsignaled

    public Form1()
    {
        InitializeComponent();
    }

    private void button1_Click(object sender, EventArgs e)
    {
        Thread workerThread = new Thread(new ThreadStart(this.DoWork));
        workerThread.SetApartmentState(ApartmentState.STA);
        workerThread.Start();
    }

    private void DoWork()
    {
        WebBrowser browser = new WebBrowser();
        browser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(browser_DocumentCompleted);
        browser.Navigate(login_page);
        autoResetEvent.WaitOne();
        // log in

        browser.Navigate(page_to_process);
        autoResetEvent.WaitOne();
        // process the page
    }

    private void browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        autoResetEvent.Set();
    }
}

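The thread may look unnecessary, but it will be needed once I extend this code to accept requests over the network (the thread will listen for connections and then process the requests). Also, I can't just put the processing code in the DocumentCompleted handler, because I have to navigate to several different pages and do different things on each of them.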

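Now, as I understand it, the reason this doesn't work is that the DocumentCompleted event is raised on the same thread that calls WaitOne(), so the event cannot fire until WaitOne() returns (which, in this case, is never).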
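To illustrate what I think is going on, here is a toy model, not WebBrowser code. The names PumpModel, Wait and messageQueue are made up for the illustration: the queue stands in for the thread's message queue, and the queued action stands in for the DocumentCompleted callback. It only shows that a thread blocked in WaitOne() cannot run a callback that is waiting on that same thread.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

static class PumpModel
{
    // Toy model only: returns true if the wait was signaled, false if it timed out.
    public static bool Wait(bool pumpBeforeWaiting, TimeSpan timeout)
    {
        using (var signal = new AutoResetEvent(false))
        {
            // Stand-in for the message queue of the thread that owns the control.
            var messageQueue = new Queue<Action>();
            // Stand-in for the pending DocumentCompleted callback.
            messageQueue.Enqueue(() => signal.Set());

            if (pumpBeforeWaiting)
            {
                // Drain the queue so the callback gets a chance to run.
                while (messageQueue.Count > 0)
                    messageQueue.Dequeue()();
            }

            // Without pumping, nothing on this thread can ever call Set().
            return signal.WaitOne(timeout);
        }
    }
}
```

In this model, Wait(false, ...) times out and Wait(true, ...) returns immediately, which matches the behaviour I am seeing with the real control.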

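Interestingly, if I add a WebBrowser control to the form from the toolbox (drag and drop) and navigate with it, this code works perfectly (with no changes other than wrapping the Navigate call in an Invoke call, see below). But if I add the WebBrowser control to the Designer file by hand, it does not work. And I really don't want a visible WebBrowser on my form; I only want to report the results.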

public delegate void NavigateDelegate(string address);
browser.Invoke(new NavigateDelegate(this.browser.Navigate), new string[] { login_page });

So my question is: what is the best way to suspend a thread until the browser's DocumentCompleted event fires?


3 Answers


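Chris,

Here is a possible implementation that solves the problem, but please read the comments in it: they cover what I had to face and fix before everything worked as expected. This is an example of a method that performs some activity on a page in the webBrowser (note that in my case the webBrowser is part of a Form):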

    internal ActionResponse CheckMessages() //ActionResponse is a custom class of mine that stores data coming from pages
    {
        //go to messages
        HtmlDocument doc = WbLink.Document; //WbLink is a reference to a webBrowser instance
        HtmlElement ele = doc.GetElementById("message_alert_box");
        if (ele == null)
            return new ActionResponse(false);

        object obj = ele.DomElement;
        System.Reflection.MethodInfo mi = obj.GetType().GetMethod("click");
        mi.Invoke(obj, new object[0]);

        semaphoreForDocCompletedEvent = WaitForDocumentCompleted();  //This is a WaitOne-like statement (1)
        if (!semaphoreForDocCompletedEvent)
            throw new Exception("Sequencing of DocumentCompleted events failed.");

        //get the list
        doc = WbLink.Document;
        ele = doc.GetElementById("mailz");
        //wait until the tag is available, then fetch it again (the first lookup may have returned null)
        ele.WaitForAvailability("mailz", Program.BrowsingSystem.Document, 10000); //This is a WaitOne-like statement (2)
        ele = doc.GetElementById("mailz");

        //this contains a tbody
        HtmlElement tbody = ele.FirstChild;

        //count how many elements are espionage reports; these elements are inline, so they count double together with the wrappers on top of them.
        int spioCases = 0;
        foreach (HtmlElement trs in tbody.Children)
        {
            if (trs.GetAttribute("id").ToLower().Contains("spio"))
                spioCases++;
        }

        int nMessages = tbody.Children.Count - 2 - spioCases;

        //create an array of messages to store data
        GameMessage[] archive = new GameMessage[nMessages];

        for (int counterOfOpenMessages = 0; counterOfOpenMessages < nMessages; counterOfOpenMessages++)
        {

            //open first element
            WbLink.ScriptErrorsSuppressed = true;
            ele = doc.GetElementById("mailz");
            //this contains a tbody
            tbody = ele.FirstChild;

            HtmlElement mess1 = tbody.Children[1];
            int idMess1 = int.Parse(mess1.GetAttribute("id").Substring(0, mess1.GetAttribute("id").Length - 2));
            //check whether the next element is a spio report; if it is, the element must not be opened.
            HtmlElement mess1Sibling = mess1.NextSibling;
            if (mess1Sibling.GetAttribute("id").ToLower().Contains("spio"))
            {
                //this is a wrapper for spio report
                ReadSpioEntry(archive, counterOfOpenMessages, mess1, mess1Sibling);
                //delete first in line
                DeleteFirstMessageItem(doc, ref ele, ref obj, ref mi, ref tbody);
                semaphoreForDocCompletedEvent = WaitForDocumentCompleted(6); //This is a WaitOne-like statement (3)

            }
            else
            {
                //It's a normal message
                OpenMessageEntry(ref obj, ref mi, tbody, idMess1); //This opens a modal dialog over the page, and it does not generate a DocumentCompleted event in the webBrowser

                //actually, opening a message generates 2 DocumentCompleted events without any Navigating event being raised
                //Application.DoEvents();
                semaphoreForDocCompletedEvent = WaitForDocumentCompleted(6);

                //read element
                ReadMessageEntry(archive, counterOfOpenMessages);

                //close current message
                CloseMessageEntry(ref ele, ref obj, ref mi);  //this closes a modal dialog therefore is not generating a documentCompleted after!
                semaphoreForDocCompletedEvent = WaitForDocumentCompleted(6);
                //delete first in line
                DeleteFirstMessageItem(doc, ref ele, ref obj, ref mi, ref tbody); //this closes a modal dialog therefore is not generating a documentCompleted after!
                semaphoreForDocCompletedEvent = WaitForDocumentCompleted(6);
            }
        }
        return new ActionResponse(true, archive);
    }

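In practice, this method takes a page of an MMORPG and reads the messages that other players sent to the account, storing them in the ActionResponse class through the ReadMessageEntry method.

Apart from the implementation and logic of the code, which are really case-specific (and of no use to you), there are some interesting elements that may be worth noting for your case. I added some comments to the code and highlighted 3 important points [marked (1), (2) and (3)].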

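The algorithm is:

1) Arrive on a page

2) Get the underlying Document from the webBrowser

3) Find an element to click in order to reach the messages page [done with: HtmlElement ele = doc.GetElementById("message_alert_box");]

4) Trigger its click event through a MethodInfo instance and a reflection call [this loads another page, so a DocumentCompleted will arrive sooner or later]

5) Wait for DocumentCompleted to be raised, then continue [done with: semaphoreForDocCompletedEvent = WaitForDocumentCompleted(); at point (1)]

6) Get the new Document from the webBrowser after the page change

7) Find a particular anchor on the page that marks where the messages I want to read are

8) Make sure that such a tag is present in the page (there may be some AJAX delay before the content I want to read is ready) [done with: ele.WaitForAvailability("mailz", Program.BrowsingSystem.Document, 10000), that is, point (2)]

9) Run the whole loop that reads each message. This means opening a modal dialog form that lives on the same page and therefore does not generate a DocumentCompleted, reading it when it is ready, closing it, and looping again. For this particular case I use the overload of (1), called as semaphoreForDocCompletedEvent = WaitForDocumentCompleted(6); at point (3)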
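As a side note on steps 3 and 4: the click does not have to go through reflection. A minimal sketch, assuming the standard HtmlElement.InvokeMember method of Windows Forms is acceptable in your case; the HtmlClick class and the ClickById name are hypothetical and are not part of my code above.

```csharp
using System.Windows.Forms;

static class HtmlClick
{
    // Hypothetical helper: looks up an element by id and invokes its DOM click() method.
    // Returns false when the document or the element is not available.
    public static bool ClickById(HtmlDocument doc, string id)
    {
        if (doc == null)
            return false;

        HtmlElement ele = doc.GetElementById(id);
        if (ele == null)
            return false;

        // InvokeMember calls a method exposed by the underlying DOM element.
        ele.InvokeMember("click");
        return true;
    }
}
```

With something like this, the first part of CheckMessages could be written as if (!HtmlClick.ClickById(WbLink.Document, "message_alert_box")) return new ActionResponse(false); followed by the same wait as at point (1).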

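Now the three methods I use to pause, check and read:

(1) Stop until DocumentCompleted has been raised, without overloading the DocumentCompleted method itself, which can then serve more than one single purpose (as in your case):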

private bool WaitForDocumentCompleted()
        {
            Thread.SpinWait(1000);  //This is dirty but working
            //BrowsingSystem is another reference to the browser, made public in my Form; IsBusy is just a bool
            //set to true when the Navigating event is raised and set to false when DocumentCompleted fires.
            while (Program.BrowsingSystem.IsBusy)
            {
                Application.DoEvents();
                Thread.SpinWait(1000);
            }

            //IsInfoAvailable is just a get property that wraps webBrowser.Document in a lock statement to protect it from concurrent access.
            return Program.BrowsingSystem.IsInfoAvailable;
        }
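The loop above never gives up if IsBusy stays true. A minimal sketch of the same polling idea with a deadline, under the assumption that a short sleep between passes is acceptable; Waiter, WaitUntil and the pump parameter are hypothetical names and are not part of my real code.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class Waiter
{
    // Hypothetical helper: polls 'condition' until it is true or 'timeout' elapses.
    // 'pump' runs on every pass; in a Windows Forms app it could be Application.DoEvents.
    public static bool WaitUntil(Func<bool> condition, TimeSpan timeout, Action pump)
    {
        Stopwatch clock = Stopwatch.StartNew();
        while (!condition())
        {
            if (clock.Elapsed >= timeout)
                return false;      // gave up: the condition never became true

            pump();                // let queued events (such as DocumentCompleted) run
            Thread.Sleep(10);      // yield the CPU instead of spinning
        }
        return true;
    }
}
```

A call such as Waiter.WaitUntil(() => !Program.BrowsingSystem.IsBusy, TimeSpan.FromSeconds(30), Application.DoEvents) would then play the role of the loop in (1). It still relies on DoEvents, so it has the same re-entrancy caveats.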

(2) Wait for a specific tag to become available in the page:

public static bool WaitForAvailability(this HtmlElement tag, string id, HtmlDocument documentToExtractFrom, long maxCycles)
        {
            bool cond = true;
            long counter = 0;
            while (cond)
            {
                Application.DoEvents(); //VERIFY trovare un modo per rimuovere questa porcheria
                tag = documentToExtractFrom.GetElementById(id);
                if (tag != null)
                    cond = false;
                Thread.Yield();
                Thread.SpinWait(100000);
                counter++;
                if (counter > maxCycles)
                    return false;
            }
            return true;
        }

(3) The dirty trick for waiting for a DocumentCompleted to arrive when the page does not need to reload any frame!

private bool WaitForDocumentCompleted(int seconds)
    {
        int counter = 0;
        while (Program.BrowsingSystem.IsBusy)
        {
            Application.DoEvents();
            Thread.Sleep(1000);
            if (counter == seconds)
            {
                return true;
            }
            counter++;
        }
        return true;
    }

I am also passing you the DocumentCompleted and Navigating methods, to give you the full picture of how I use them.

private void webBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            if (Program.BrowsingSystem.BrowserLink.ReadyState == WebBrowserReadyState.Complete)
            {
                lock (Program.BrowsingSystem.BrowserLocker)
                {
                    Program.BrowsingSystem.ActualPosition = Program.BrowsingSystem.UpdatePosition(Program.BrowsingSystem.Document);
                    Program.BrowsingSystem.CheckContentAvailability();
                    Program.BrowsingSystem.IsBusy = false;
                }
            }
        }

private void webBrowser_Navigating(object sender, WebBrowserNavigatingEventArgs e)
        {
            lock (Program.BrowsingSystem.BrowserLocker)
            {
                Program.BrowsingSystem.ActualPosition.PageName = OgamePages.OnChange;
                Program.BrowsingSystem.IsBusy = true;
            }
        }

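If you now understand the details behind the implementation presented here, have a look here to understand the mess behind DoEvents() (I hope that linking to other sites from S.Overflow is not a problem).

One last point: when you call the Navigate method from a Form instance, you need to wrap the call in Invoke. The reason is that any method that works on the webBrowser (or even just brings it into scope as a reference variable) must run on the same thread as the webBrowser itself!

Moreover, if the WB is a child of some Form container, the thread that instantiates it must also be the thread that created the Form, and, by transitivity, every method that needs to work on the WB must be called on the Form's thread (in your case, Invoke relocates your call onto the Form's native thread). I hope this is useful to you (I left the //VERIFY comment in the code in my native language to let you know what I think of Application.DoEvents()).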
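A minimal sketch of that marshaling, assuming the control's window handle has already been created (before that, InvokeRequired returns false no matter which thread asks). UiThread and Run are hypothetical names, not part of my code.

```csharp
using System;
using System.Windows.Forms;

static class UiThread
{
    // Hypothetical helper: runs 'action' on the thread that owns 'control'.
    public static void Run(Control control, Action action)
    {
        if (control.InvokeRequired)
            control.Invoke(action);   // marshal the call to the owning thread and wait for it
        else
            action();                 // already on the owning thread: call directly
    }
}
```

From a worker thread, the call in your question would then look like UiThread.Run(browser, () => browser.Navigate(login_page)); which does the same job as your NavigateDelegate without declaring a delegate type.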

Kind regards, Alex

answered 2012-07-06T19:42:07.450

Hah! I had the same question. You can do this with event handling. If you stop a thread midway through the page, it will need to wait until the page finishes loading. You can easily do this by attaching a handler:

 Page.LoadComplete += new EventHandler(triggerFunction);

In the triggerFunction you can do this

private void triggerFunction(object sender, EventArgs e)
{
    // Set() (not Reset()) is what releases a thread blocked in WaitOne()
    autoResetEvent.Set();
}

Let me know if this works. I ended up not using threads in mine and instead just put the logic into triggerFunction. Some syntax might not be 100% correct because I am answering off the top of my head.

answered 2012-07-03T19:04:23.833

Edit

Register the handler like this in the InitializeComponent method, instead of in the same method.

WebBrowser browser = new WebBrowser();
browser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(webBrowser_DocumentCompleted);

Checking ReadyState inside the DocumentCompleted event tells you how far the document load has progressed.

void webBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    if (browser.ReadyState == WebBrowserReadyState.Complete)
    {

    }
}
answered 2012-07-03T19:07:24.380