
I'm trying to migrate a large amount of data from GCS into App Engine using a task queue and 20 backend instances. The problem is that the new Cloud Storage client library doesn't seem to respect the urlfetch timeout, or something else is going on.

import cloudstorage as gcs

# Allow each urlfetch up to 60 seconds and keep retrying for up to 5 minutes.
gcs.set_default_retry_params(gcs.RetryParams(urlfetch_timeout=60,
                                             max_retry_period=300))
...
with gcs.open(fn, 'r') as fp:
    raw_gcs_file = fp.read()

This works fine when the queue is paused and I run one task at a time, but as soon as I try to run 20 concurrent tasks against the 20 backends, the following starts happening:

I 2013-07-20 00:18:16.418 Got exception while contacting GCS. Will retry in 0.2 seconds.
I 2013-07-20 00:18:16.418 Unable to fetch URL: https://storage.googleapis.com/<removed>
I 2013-07-20 00:18:21.553 Got exception while contacting GCS. Will retry in 0.4 seconds.
I 2013-07-20 00:18:21.554 Unable to fetch URL: https://storage.googleapis.com/<removed>
I 2013-07-20 00:18:25.728 Got exception while contacting GCS. Will retry in 0.8 seconds.
I 2013-07-20 00:18:25.728 Unable to fetch URL: https://storage.googleapis.com/<removed>
I 2013-07-20 00:18:31.428 Got exception while contacting GCS. Will retry in 1.6 seconds.
I 2013-07-20 00:18:31.428 Unable to fetch URL: https://storage.googleapis.com/<removed>
I 2013-07-20 00:18:34.301 Got exception while contacting GCS. Will retry in -1 seconds.
I 2013-07-20 00:18:34.301 Unable to fetch URL: https://storage.googleapis.com/<removed>
I 2013-07-20 00:18:34.301 Urlfetch retry 5 failed after 22.8741798401 seconds total

How can it fail after only 22 seconds? With max_retry_period set to 300 it shouldn't give up retrying that early; it doesn't seem to be using the retry params at all.


1 Answer


This is a bug in the GCS client library. It will be fixed soon. Thanks!

Your hack will work. But if it still times out frequently, you can try doing fp.read(size=some_size). If your file is large, a single 32 MB response (the URLfetch response size limit) against a 90-second deadline assumes a transfer rate of about 364 KB/s.
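For example, a minimal sketch of that chunked-read approach, assuming the file still fits in memory; CHUNK_SIZE and read_in_chunks are illustrative names, not part of the cloudstorage API:

import cloudstorage as gcs

CHUNK_SIZE = 1024 * 1024  # 1 MB per urlfetch round trip (illustrative value)

def read_in_chunks(filename):
    # Read the object piece by piece so each urlfetch stays small and quick.
    parts = []
    with gcs.open(filename, 'r') as fp:
        while True:
            chunk = fp.read(CHUNK_SIZE)  # returns '' at end of file
            if not chunk:
                break
            parts.append(chunk)
    return ''.join(parts)

Each individual read then finishes comfortably inside the urlfetch deadline even at a few hundred KB/s.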

answered 2013-07-22T18:46:20.193