I have been trying to use requests-html inside a venv (Python 3.7.0, macOS 10.15.1), but I am running into certificate errors (I am not behind any proxy or firewall):

The main call is:

from requests_html import HTMLSession
sessao = HTMLSession()
r1 = sessao.get(url=url_inicio)

Running the GET method raises the following exception:

/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/bin/python "/Users/ricardobarroslourenco/Library/Application Support/JetBrains/Toolbox/apps/PyCharm-P/ch-0/192.6817.19/PyCharm.app/Contents/helpers/pydev/pydevd.py" --multiproc --qt-support=auto --client 127.0.0.1 --port 50377 --file /Users/ricardobarroslourenco/PycharmProjects/zarc/zarc_scraper/main.py
pydev debugger: process 9369 is connecting

Connected to pydev debugger (build 192.6817.19)
[W:pyppeteer.chromium_downloader] start chromium download.
Download may take a few minutes.
Traceback (most recent call last):
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 672, in urlopen
    chunked=chunked,
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 376, in _make_request
    self._validate_conn(conn)
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 994, in _validate_conn
    conn.connect()
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/connection.py", line 394, in connect
    ssl_context=context,
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/util/ssl_.py", line 370, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 412, in wrap_socket
    session=session
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 850, in _create
    self.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 1108, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1045)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/requests_html.py", line 714, in browser
    self._browser = await pyppeteer.launch(ignoreHTTPSErrors=not(self.verify), headless=True, args=self.__browser_args)
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/pyppeteer/launcher.py", line 311, in launch
    return await Launcher(options, **kwargs).launch()
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/pyppeteer/launcher.py", line 125, in __init__
    download_chromium()
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/pyppeteer/chromium_downloader.py", line 136, in download_chromium
    extract_zip(download_zip(get_url()), DOWNLOADS_FOLDER / REVISION)
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/pyppeteer/chromium_downloader.py", line 78, in download_zip
    data = http.request('GET', url, preload_content=False)
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/request.py", line 76, in request
    method, url, fields=fields, headers=headers, **urlopen_kw
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/request.py", line 97, in request_encode_url
    return self.urlopen(method, url, **extra_kw)
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/poolmanager.py", line 330, in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 760, in urlopen
    **response_kw
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 760, in urlopen
    **response_kw
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 760, in urlopen
    **response_kw
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/connectionpool.py", line 720, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
  File "/Users/ricardobarroslourenco/PycharmProjects/zarc/venv/lib/python3.7/site-packages/urllib3/util/retry.py", line 436, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='storage.googleapis.com', port=443): Max retries exceeded with url: /chromium-browser-snapshots/Mac/575458/chrome-mac.zip (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1045)')))

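Any hints on how to solve this? The idea is to scrape some websites that use JavaScript to generate cookies, and requests-html supposedly handles the rendering problem (which occurs with the regular requests package).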
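
For diagnosis, it may help to see which CA certificates this interpreter can find, since "unable to get local issuer certificate" usually means no trusted CA bundle was located. The sketch below uses only the standard `ssl` module plus `certifi` if it happens to be installed. The helper name `describe_ca_sources` is hypothetical. One possible cause, not confirmed by the output above, is that the python.org macOS installer ships without a default CA bundle until its "Install Certificates.command" script is run.

```python
import ssl


def describe_ca_sources():
    """Report where this interpreter looks for trusted CA certificates."""
    paths = ssl.get_default_verify_paths()
    info = {
        # cafile/capath are None when the expected file or directory is missing.
        "cafile": paths.cafile,
        "capath": paths.capath,
        # Compiled-in OpenSSL default location, reported even if it does not exist.
        "openssl_cafile": paths.openssl_cafile,
        "default_cafile_found": paths.cafile is not None,
    }
    try:
        # certifi is optional here; it is normally installed alongside requests.
        import certifi
        info["certifi_bundle"] = certifi.where()
    except ImportError:
        info["certifi_bundle"] = None
    return info


if __name__ == "__main__":
    for key, value in describe_ca_sources().items():
        print(f"{key}: {value}")
```

If `default_cafile_found` is false and `capath` is empty, the interpreter has no default trust store, which would be consistent with the failure during the Chromium download in the traceback.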
