I am using the fsspec package to read files over HTTPS.

import socket
import aiohttp
import fsspec

# pwd (the password) is defined elsewhere
_hostname = socket.gethostname()
proxy_auth = aiohttp.BasicAuth(_hostname, pwd)
of = fsspec.filesystem("https", client_kwargs={"trust_env": True, "auth": proxy_auth})
http_urls = ["https://stackoverflow.com/"]
print(of.cat(http_urls))

The code above does not fetch the content of the URL. It fails with the following stack trace:

File "C:\Anaconda3\envs\mvision\lib\site-packages\fsspec\asyn.py", line 91, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "C:\Anaconda3\envs\mvision\lib\site-packages\fsspec\asyn.py", line 71, in sync
    raise return_result
  File "C:\Anaconda3\envs\mvision\lib\site-packages\fsspec\asyn.py", line 25, in _runner
    result[0] = await coro
  File "C:\Anaconda3\envs\mvision\lib\site-packages\fsspec\asyn.py", line 347, in _cat
    raise ex
  File "C:\Anaconda3\envs\mvision\lib\site-packages\fsspec\implementations\http.py", line 230, in _cat_file
    async with session.get(url, **kw) as r:
  File "C:\Anaconda3\envs\mvision\lib\site-packages\aiohttp\client.py", line 1117, in __aenter__
    self._resp = await self._coro
  File "C:\Anaconda3\envs\mvision\lib\site-packages\aiohttp\client.py", line 521, in _request
    req, traces=traces, timeout=real_timeout
  File "C:\Anaconda3\envs\mvision\lib\site-packages\aiohttp\connector.py", line 535, in connect
    proto = await self._create_connection(req, traces, timeout)
  File "C:\Anaconda3\envs\mvision\lib\site-packages\aiohttp\connector.py", line 892, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
  File "C:\Anaconda3\envs\mvision\lib\site-packages\aiohttp\connector.py", line 1011, in _create_direct_connection
    raise ClientConnectorError(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host stackoverflow.com:443 ssl:default [getaddrinfo failed]

However, if I wrap it in a cache, I do get the content.

_hostname = socket.gethostname()
proxy_auth = aiohttp.BasicAuth(_hostname, pwd)
of = fsspec.filesystem(
    "simplecache",
    target_protocol="https",
    target_options={"client_kwargs": {"trust_env": True, "auth": proxy_auth}},
    cache_storage="/tmp/files",
)
http_urls = ["https://stackoverflow.com/"]
print(of.cat(http_urls))

Why does the first snippet fail to fetch the content? Am I doing something wrong?
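
In case it helps with reproducing this, here is a minimal sketch for checking which proxy settings Python can see from the environment. It assumes that trust_env makes aiohttp take its proxy from the HTTP_PROXY / HTTPS_PROXY variables; visible_proxies is a hypothetical helper name for illustration, not part of fsspec or aiohttp.

```python
import os
from urllib.request import getproxies


def visible_proxies():
    """Return the http/https proxy settings discoverable by the standard library."""
    found = getproxies()
    # Keep only the schemes relevant to fetching http(s) URLs.
    return {scheme: url for scheme, url in found.items() if scheme in ("http", "https")}


if __name__ == "__main__":
    # Show the raw environment variables first (either case may be set).
    for name in ("HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY"):
        print(name, "=", os.environ.get(name) or os.environ.get(name.lower()))
    print("discovered:", visible_proxies())
```

If nothing is listed for https, the "getaddrinfo failed" error would be consistent with a direct connection attempt that bypasses the proxy, though this is only a guess.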
