
I'm trying to use the bulkloader functionality to upload a moderately sized csv file to Google App Engine, and it seems to die partway through, with the following output:

[INFO    ] Logging to bulkloader-log-20110328.181531
[INFO    ] Throttling transfers:
[INFO    ] Bandwidth: 250000 bytes/second
[INFO    ] HTTP connections: 8/second
[INFO    ] Entities inserted/fetched/modified: 20/second
[INFO    ] Batch Size: 10
[INFO    ] Opening database: bulkloader-progress-20110328.181531.sql3
[INFO    ] Connecting to notmyrealappname.appspot.com/_ah/remote_api
[INFO    ] Starting import; maximum 10 entities per post
...............................................................[INFO    ] Unexpected thread death: WorkerThread-7
[INFO    ] An error occurred. Shutting down...
.........[ERROR   ] Error in WorkerThread-7: <urlopen error [Errno -2] Name or service not known>

[INFO    ] 1740 entites total, 0 previously transferred
[INFO    ] 720 entities (472133 bytes) transferred in 32.3 seconds
[INFO    ] Some entities not successfully transferred

It uploaded roughly 700 of the 19k entries I'm trying to upload, and I'd like to know why it fails. I've checked the csv file for errors, such as extra commas that could trip up the python csv reader, and non-ascii characters have already been stripped out.
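For reference, a minimal sketch of the kind of pre-upload sanity check described above, written for Python 2 (the runtime appcfg.py used at the time); the filename data.csv and the column count are hypothetical placeholders:

    import csv

    EXPECTED_COLUMNS = 12  # hypothetical: set to your schema's column count

    with open('data.csv', 'rb') as f:  # 'data.csv' is a placeholder filename
        reader = csv.reader(f)
        for lineno, row in enumerate(reader, start=1):
            # A stray comma shows up as an unexpected field count.
            if len(row) != EXPECTED_COLUMNS:
                print 'line %d: expected %d fields, got %d' % (
                    lineno, EXPECTED_COLUMNS, len(row))
            # Catch any non-ascii bytes that survived the cleanup pass.
            for field in row:
                try:
                    field.decode('ascii')
                except UnicodeDecodeError:
                    print 'line %d: non-ascii bytes in field %r' % (lineno, field)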


1 Answer


Raising the batch limit (batch_size) and the rps limit (rps_limit) worked; I used a batch size of 1000 and an rps limit of 500:

appcfg.py upload_data --url= --application= --filename=  --email= --batch_size=1000 --rps_limit=500
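For anyone reading later, here is the same command with example values filled in; the url is the remote_api endpoint from the question's log, while the application id, filename, and email are hypothetical placeholders. Presumably the larger batches mean far fewer HTTP posts, which avoids tripping the intermittent name-resolution error shown in the log:

    appcfg.py upload_data --url=http://notmyrealappname.appspot.com/_ah/remote_api --application=notmyrealappname --filename=data.csv --email=you@example.com --batch_size=1000 --rps_limit=500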
Answered 2011-11-02T02:05:39.403