import csv
import os
from datetime import timedelta, datetime

from odps import ODPS
from odps import options

# Global PyODPS settings: SQL extension, tunnel mode, timeouts (seconds), retries.
options.sql.use_odps2_extension = True
options.tunnel.use_instance_tunnel = True
options.connect_timeout = 60
options.read_timeout = 130
options.retry_times = 7
options.chunk_size = 8192 * 2

odps = ODPS('id', 'secret', 'project', endpoint='endpointUrl')
table = odps.get_table('eventTable')

def uploadFile(file):
    # Leaving the with block closes the writer and commits the upload,
    # so no explicit writer.close() is needed afterwards.
    with table.open_writer(partition=None) as writer:
        with open(file, 'rt', newline='') as csvfile:
            rows = csv.reader(csvfile, delimiter='~')
            for final in rows:
                writer.write(final)

uploadFile('xyz.csv')

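I connect to Alibaba Cloud from Python to migrate data into a MaxCompute table, passing the files in a directory to uploadFile one by one. When I run this code, the service stops after it has been working for a long time, or during the night. It reports a read timeout error at the writer.write(final) line.

Error: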

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/response.py", line 226, in _error_catcher
    yield
  File "/usr/lib/python3/dist-packages/urllib3/response.py", line 301, in read
    data = self._fp.read(amt)
  File "/usr/lib/python3.5/http/client.py", line 448, in read
    n = self.readinto(b)
  File "/usr/lib/python3.5/http/client.py", line 488, in readinto
    n = self.fp.readinto(b)
  File "/usr/lib/python3.5/socket.py", line 575, in readinto
    return self._sock.recv_into(b)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/requests/models.py", line 660, in generate
    for chunk in self.raw.stream(chunk_size, decode_content=True):
  File "/usr/lib/python3/dist-packages/urllib3/response.py", line 344, in stream
    data = self.read(amt=amt, decode_content=decode_content)
  File "/usr/lib/python3/dist-packages/urllib3/response.py", line 311, in read
    flush_decoder = True
  File "/usr/lib/python3.5/contextlib.py", line 77, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/lib/python3/dist-packages/urllib3/response.py", line 231, in _error_catcher
    raise ReadTimeoutError(self._pool, None, 'Read timed out.')
requests.packages.urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='dt.odps.aliyun.com', port=80): Read timed out.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/dataUploader.py", line 66, in <module>
    uploadFile('xyz.csv')       
  File "/dataUploader.py", line 53, in uploadFile
    writer.write(final)
  File "/usr/local/lib/python3.5/dist-packages/odps/models/table.py", line 643, in __exit__
    self.close()
  File "/usr/local/lib/python3.5/dist-packages/odps/models/table.py", line 631, in close
    upload_session.commit(written_blocks)
  File "/usr/local/lib/python3.5/dist-packages/odps/tunnel/tabletunnel.py", line 308, in commit
    in self.get_block_list()])
  File "/usr/local/lib/python3.5/dist-packages/odps/tunnel/tabletunnel.py", line 298, in get_block_list
    self.reload()
  File "/usr/local/lib/python3.5/dist-packages/odps/tunnel/tabletunnel.py", line 238, in reload
    resp = self._client.get(url, params=params, headers=headers)
  File "/usr/local/lib/python3.5/dist-packages/odps/rest.py", line 138, in get
    return self.request(url, 'get', stream=stream, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/odps/rest.py", line 125, in request
    proxies=self._proxy)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 608, in send
    r.content
  File "/usr/lib/python3/dist-packages/requests/models.py", line 737, in content
    self._content = bytes().join(self.iter_content(CONTENT_CHUNK_SIZE)) or bytes()
  File "/usr/lib/python3/dist-packages/requests/models.py", line 667, in generate
    raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='dt.odps.aliyun.com', port=80): Read timed out.

packet_write_wait: Connection to 122.121.122.121 port 22: Broken pipe

This is the error I get. Can you suggest what the problem is?

2 Answers

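The read timeout is the time limit for waiting on data to be read. Specifically, a read timeout error is raised if the server fails to send a byte within the timeout period (in seconds) after the last byte it sent.

This happens here because the server could not finish handling the file within the specified timeout period.

The read timeout is set to 130 seconds, which is too short if your file is very large.

Increase the read timeout from 130 seconds to 500 seconds by changing options.read_timeout=130 to options.read_timeout=500.

That should resolve the issue. Also reduce the retry count from 7 to 3 by changing options.retry_times=7 to options.retry_times=3.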
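
If an occasional timeout still gets through after raising the limit, the call to uploadFile could also be retried at the application level. The following is only a minimal sketch: call_with_retries and its parameters are hypothetical names, not part of PyODPS, and it uses nothing but the standard library.

```python
import time


def call_with_retries(func, attempts=3, base_delay=1.0, sleep=time.sleep,
                      retry_on=(ConnectionError, TimeoutError)):
    """Call func(); on a listed exception wait and retry, up to `attempts` calls.

    Hypothetical helper for illustration only, not a PyODPS API.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

With the code from the question this might be called as call_with_retries(lambda: uploadFile('xyz.csv'), retry_on=(requests.exceptions.ConnectionError,)), since the exception in the traceback comes from requests and is not the built-in ConnectionError. Retrying a whole file can write rows twice if the failed attempt had actually committed, so it is worth checking the table before re-running.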

answered 2018-10-18T08:58:33.110

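This error is usually caused by a network problem.

Run curl against the endpoint URL in a terminal. If it returns immediately with something like the following: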

<?xml version="1.0" encoding="UTF-8"?>
<Error>
    <Code>NoSuchObject</Code>
    <Message><![CDATA[Unknown http request location: /]]></Message>
    <RequestId>5E5CC9526283FEC94F19DAAE</RequestId>
    <HostId>localhost</HostId>
</Error>

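then the endpoint URL is reachable. If it hangs instead, check that you are using the correct endpoint URL.

MaxCompute (ODPS) has both public and private endpoints, so this can sometimes be confusing.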
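
The same reachability check can be done from Python before starting a long upload. This is a rough sketch using only the standard library; endpoint_reachable is a hypothetical helper name, and the 10-second default timeout is an arbitrary assumption.

```python
import socket
import urllib.error
import urllib.request


def endpoint_reachable(url, timeout=10):
    """Return True if the server at `url` sends any HTTP response within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # An HTTP error status (like the NoSuchObject reply above) still
        # proves that the server answered.
        return True
    except (urllib.error.URLError, socket.timeout, ConnectionError):
        # Connection refused, DNS failure, or no answer before the timeout.
        return False
```

Calling endpoint_reachable('endpointUrl') with the real endpoint in place of the placeholder returns False where curl would hang or fail to connect, which points to a wrong or unreachable endpoint rather than a problem in the upload code.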

answered 2020-03-02T08:57:27.940