I am using puppeteer (the Python port, pyppeteer) for some lightweight crawling of ~2K pages, but I keep seeing this error recur:

  File "/env/local/lib/python3.7/site-packages/pyppeteer/execution_context.py", line 106, in evaluateHandle
    'userGesture': True,
pyppeteer.errors.NetworkError: Protocol error (Runtime.callFunctionOn): Cannot find context with specified id

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
...
  File "/user_code/main.py", line 434, in main_program
    crawl_data = asyncio.get_event_loop().run_until_complete(crawl(browser, url))
  File "/opt/python3.7/lib/python3.7/asyncio/base_events.py", line 573, in run_until_complete
    return future.result()
  File "/user_code/main.py", line 394, in crawl
    title = await page.title()
  File "/env/local/lib/python3.7/site-packages/pyppeteer/page.py", line 1437, in title
    return await frame.title()
  File "/env/local/lib/python3.7/site-packages/pyppeteer/frame_manager.py", line 752, in title
    return await self.evaluate('() => document.title')
  File "/env/local/lib/python3.7/site-packages/pyppeteer/frame_manager.py", line 295, in evaluate
    pageFunction, *args, force_expr=force_expr)
  File "/env/local/lib/python3.7/site-packages/pyppeteer/execution_context.py", line 55, in evaluate
    pageFunction, *args, force_expr=force_expr)
  File "/env/local/lib/python3.7/site-packages/pyppeteer/execution_context.py", line 109, in evaluateHandle
    _rewriteError(e)
  File "/env/local/lib/python3.7/site-packages/pyppeteer/execution_context.py", line 238, in _rewriteError
    raise type(error)(msg)
pyppeteer.errors.NetworkError: Execution context was destroyed, most likely because of a navigation.

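I don't understand how this triggers an error involving frame.title(), because my code only looks up the actual page title, not anything inside its frames.

Also, it asks for the page title before it touches any frame content: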

    try:
        # max timeout of 12 seconds (12000 ms)
        response = await page.goto(
            url,
            {'timeout': 12000}
        )
        if response.status != 200:
            await page.close()
            return False
    except TimeoutError:
        return False
    except Exception as e:
        print(e)
        return False

    # had this in before, but it was causing too many timeouts.  Error still persists
    # await page.waitForNavigation()

    try:
        source_code = await page.content()
    except Exception:
        return False

    # title (this is the call that raises in the traceback above)
    title = await page.title()
    title = title[:1000]

    # get the content of all the frames
    frames = page.frames
    content = ""
    for frame in frames:
        content_new = await frame.content()
        content += content_new

    await page.close()

What could be causing this recurring error?
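
For reference, here is a workaround I am considering. It is only a rough sketch and rests on an unconfirmed guess: that some pages redirect themselves (meta refresh or JavaScript) after `goto` has already returned, which would destroy the execution context while `page.title()` is still being evaluated. `retry_on_navigation` is a hypothetical helper of my own, not part of pyppeteer, and the matched strings are simply copied from the two error messages in the traceback.

```python
import asyncio

# Substrings copied from the two error messages in the traceback.
NAVIGATION_ERRORS = (
    "Execution context was destroyed",
    "Cannot find context with specified id",
)


async def retry_on_navigation(make_call, attempts=3, delay=0.5):
    """Await make_call(), retrying if the page seems to have navigated mid-call.

    make_call is a zero-argument callable that returns a fresh awaitable
    each time it is called, for example the bound method page.title.
    """
    for attempt in range(attempts):
        try:
            return await make_call()
        except Exception as error:
            navigated = any(text in str(error) for text in NAVIGATION_ERRORS)
            # Re-raise unrelated errors, and give up after the last attempt.
            if not navigated or attempt == attempts - 1:
                raise
            # Give the assumed redirect a moment to settle before retrying.
            await asyncio.sleep(delay)
```

In the crawl function this would replace the direct call, as in `title = await retry_on_navigation(page.title)`. It would only paper over the symptom, so I would still like to understand the actual cause.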
