I'm receiving multiple loadFinished signals when I attempt to load a QWebPage, and I'm not sure what's causing the issue. A couple of other questions seemed to describe the same problem, but their solutions didn't work for me.

In the first question, the answer was to connect signals to slots only once, but I already do that. The answer to the second question suggests connecting to the frame's loadFinished signal instead, but when I do that I simply don't get the data I need.

I attempt to load multiple pages:

#include <QApplication>
#include <QList>
#include <QUrl>
#include <QWebPage>
#include <QWebFrame>

int main(int argc, char *argv[])
{
    QApplication app(argc, argv);    

    QList<QUrl> urls;
    urls.append(QUrl("http://www.useragentstring.com/pages/Chrome/"));
    urls.append(QUrl("http://www.useragentstring.com/pages/Firefox/"));
    urls.append(QUrl("http://www.useragentstring.com/pages/Opera/"));
    urls.append(QUrl("http://www.useragentstring.com/pages/Internet Explorer/"));
    urls.append(QUrl("http://www.useragentstring.com/pages/Safari/"));

    foreach(QUrl url, urls)
    {
        UA* ua = new UA();
        QWebPage* page = new QWebPage();
        //QObject::connect(page, SIGNAL(loadFinished(bool)), ua, SLOT(pageLoadFinished(bool)));
        QObject::connect(page->mainFrame(), SIGNAL(loadFinished(bool)), ua, SLOT(frameLoadFinished(bool)));
        // Load the page
        page->mainFrame()->load(url);
    }

    return app.exec();
}

The class that processes the signals looks like this:

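#include <QObject>
#include <QDebug>
#include <QWebPage>
#include <QWebFrame>
#include <QWebElement>

class UA : public QObject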
{
    Q_OBJECT
private:
    int _numPageLoadSignals;
    int _numFrameLoadSignals;
public:
    UA()
    {
        _numPageLoadSignals = 0;
        _numFrameLoadSignals = 0;
    }
    ~UA(){}
public slots:
    void pageLoadFinished(bool ok)
    {
        _numPageLoadSignals++;

        QWebPage * page = qobject_cast<QWebPage *>(sender());
        if(ok && page)
        {    
            qDebug() << _numPageLoadSignals << " loads " 
                << page->mainFrame()->documentElement().findAll("div#liste ul li a").count()
                << " elements found on: " << page->mainFrame()->requestedUrl().toString();
        }
    }

    void frameLoadFinished(bool ok)
    {
        _numFrameLoadSignals++;
        QWebFrame * frame = qobject_cast<QWebFrame *>(sender());
        if(ok && frame)
        {
            qDebug() << _numFrameLoadSignals << " loads " 
                <<  frame->documentElement().findAll("div#liste ul li a").count()
                << " elements found on: " << frame->requestedUrl().toString();
        }
    }
};

Here is the result of only connecting to the frame's loadFinished signal:

1  loads  0  elements found on:  "http://www.useragentstring.com/pages/Safari/"
1  loads  0  elements found on:  "http://www.useragentstring.com/pages/Chrome/"
1  loads  0  elements found on:  "http://www.useragentstring.com/pages/Opera/"
1  loads  0  elements found on:  "http://www.useragentstring.com/pages/Firefox/"
1  loads  241  elements found on:  "http://www.useragentstring.com/pages/Internet Explorer/"

Here are the results when I connect to the page's loadFinished signal:

1  loads  0  elements found on:  "http://www.useragentstring.com/pages/Safari/"
1  loads  0  elements found on:  "http://www.useragentstring.com/pages/Chrome/"
1  loads  0  elements found on:  "http://www.useragentstring.com/pages/Firefox/"
1  loads  0  elements found on:  "http://www.useragentstring.com/pages/Internet Explorer/"
2  loads  576  elements found on:  "http://www.useragentstring.com/pages/Safari/"
2  loads  782  elements found on:  "http://www.useragentstring.com/pages/Chrome/"
2  loads  241  elements found on:  "http://www.useragentstring.com/pages/Internet Explorer/"
2  loads  1946  elements found on:  "http://www.useragentstring.com/pages/Firefox/"
3  loads  241  elements found on:  "http://www.useragentstring.com/pages/Internet Explorer/"
3  loads  1946  elements found on:  "http://www.useragentstring.com/pages/Firefox/"
3  loads  782  elements found on:  "http://www.useragentstring.com/pages/Chrome/"
1  loads  964  elements found on:  "http://www.useragentstring.com/pages/Opera/"
3  loads  576  elements found on:  "http://www.useragentstring.com/pages/Safari/"

I don't understand this behavior: why do I sometimes get the relevant content and other times not? If I connect to the page's loadFinished signal, I eventually get the content, but I don't know when that will happen. How do I know when my page has actually finished loading?

Update

I'm assuming that most of my content will arrive in less than 3 seconds, so I've come up with a workaround: I set a timer event to signal the UA::loadFinished 3 seconds after the first loadFinished signal is received from the QWebPage. That's not very pretty, nor is it efficient, but it works for this situation.
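
For illustration, the bookkeeping behind that workaround can be sketched in plain C++, independent of Qt. The class name LoadDeadline and its methods are hypothetical, not part of my actual code or of any Qt API; in the real program the same effect would more likely come from a single-shot QTimer started in the loadFinished slot.

```cpp
#include <chrono>

// Hypothetical helper (not a Qt class): arms a deadline when the FIRST
// load signal arrives and reports once that deadline has passed.
class LoadDeadline {
public:
    using Clock = std::chrono::steady_clock;

    explicit LoadDeadline(std::chrono::milliseconds delay)
        : delay_(std::chrono::duration_cast<Clock::duration>(delay)) {}

    // Call from the loadFinished slot. Only the first call arms the
    // deadline; later signals are ignored, matching the workaround above.
    void notify(Clock::time_point now) {
        if (!armed_) {
            armed_ = true;
            deadline_ = now + delay_;
        }
    }

    // True once a signal has been seen and the delay has elapsed.
    bool expired(Clock::time_point now) const {
        return armed_ && now >= deadline_;
    }

private:
    Clock::duration delay_;
    Clock::time_point deadline_{};
    bool armed_ = false;
};
```

The time is passed in explicitly so the logic can be checked without actually waiting; the 3-second value is just my guess for this particular site.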

2 Answers

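Quoting the QWebPage documentation:

Finally, the loadFinished() signal is emitted when the page contents are loaded completely, independent of script execution or page rendering.

The key is the last part of that sentence. Some people in the following thread point out what I believe is the problem:

Why is QWebView.loadFinished called multiple times on some websites (e.g. YouTube)?

I have been struggling to write a crawler that deals with pages that load content with JavaScript behind the scenes. Multiple loadFinished signals are a problem (I would like it to fire once everything has settled), but I noticed that the underlying issue is that even after the last loadFinished has activated a slot, the page content may still not be rendered or ready.

So I experimented with many of the QWebPage class's signals to see whether any of them is consistently emitted after the loadFinished signal.

I found one: repaintRequested(QRect)

I don't know whether this always works. However, if anything affects the appearance of the page, I believe this signal must be emitted before the page can be assumed complete. I neither displayed the page nor used a view widget, yet the signal was always emitted. The only problem is that it fires many times (far more often than loadFinished), so you need to check whether mainFrame->requestedUrl() is the same as mainFrame->url() and whether a keyword from the content you are interested in is present. (This matters especially if you reuse the webPage as I do: a subsequent request changes requestedUrl while the previously loaded mainFrame content is still there. There is some persistence.)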
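
As a rough sketch of that check, here is the condition written as a plain C++ function, separate from Qt. The name contentReady and its parameters are hypothetical; I am assuming the strings would come from something like mainFrame->requestedUrl().toString(), mainFrame->url().toString() and mainFrame->toHtml(), and the marker is whatever keyword identifies the content you are after.

```cpp
#include <string>

// Hypothetical helper (not a Qt API): decides inside a repaintRequested
// slot whether the frame already holds the content we asked for.
bool contentReady(const std::string& requestedUrl,
                  const std::string& currentUrl,
                  const std::string& html,
                  const std::string& marker)
{
    // A reused page can still hold the previous document, so require
    // that the frame has switched to the URL that was requested.
    if (requestedUrl != currentUrl) {
        return false;
    }
    // The page counts as ready only once the keyword of interest
    // is present in the markup.
    return html.find(marker) != std::string::npos;
}
```

Comparing the URLs as exact strings is a simplification; redirects or trailing-slash differences would need looser matching.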

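A trick to reduce the number of signals you have to check might be to connect repaintRequested only after receiving the loadFinished signal from the QWebPage (and possibly after checking additional conditions).

This may not solve infinitely nested loads, since there is no way to know whether any given signal is the last one. But if you are searching for specific content, a signal is bound to fire after that content has been loaded (I mean integrated into the DOM :)

answered 2015-02-22T16:06:42.547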

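I solved this problem by specifying the memory cache capacities for dead objects. In other words, I simply disabled the QtWebKit memory cache with:

QWebSettings::setObjectCacheCapacities(0, 0, 0);

For more information, see the documentation here:

http://qt-project.org/doc/qt-4.8/qwebsettings.html#setObjectCacheCapacities

answered 2014-11-27T16:08:59.257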