python - How to time out gracefully when downloading with Python

Question

I am downloading a large number of files in a loop, using the following code:

try:
    urllib.urlretrieve(url2download, destination_on_local_filesystem)
except KeyboardInterrupt:
    break
except:
    print "Timed-out or got some other exception: "+url2download

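If the server at url2download times out right when the connection is being established, the last exception handler deals with it correctly. But sometimes the server responds and the download starts, yet the server is so slow that even a single file takes hours, and eventually it returns something like this: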

Enter username for Clients Only at albrightandomalley.com:
Enter password for  in Clients Only at albrightandomalley.com:

and then it just hangs there (even though no username/password is requested when the same link is downloaded through a browser).

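What I want in this situation is to skip the file and move on to the next one. The question is how to do that. Is there a way in Python to specify how long the download of a single file is allowed to run and, if it has already taken longer than that, to abort it and continue?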

score 8 · Accepted Answer

Try:

import socket

socket.setdefaulttimeout(30)
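
To show where this call fits in the download loop from the question, here is a minimal sketch. It is written for Python 3, where urlretrieve lives in urllib.request (the question's code is Python 2). The function name download_all, the jobs argument, and the fetch parameter are hypothetical names introduced for illustration; fetch exists only so the loop can be exercised without a network.

```python
import socket
import urllib.request


def download_all(jobs, timeout=30, fetch=urllib.request.urlretrieve):
    """Download each (url, destination) pair in jobs; return the URLs that failed.

    download_all, jobs and fetch are illustrative names, not part of the
    original question or answer.
    """
    # Applies to every socket created from now on, in the whole process.
    socket.setdefaulttimeout(timeout)
    failed = []
    for url, destination in jobs:
        try:
            fetch(url, destination)
        except KeyboardInterrupt:
            break
        except Exception as exc:
            # Timeouts surface here as socket.timeout (an OSError subclass).
            print("Timed out or got some other exception:", url, exc)
            failed.append(url)
    return failed
```

Note that the socket timeout limits how long a single blocking socket operation may wait, not the total duration of a download. A server that keeps sending data slowly but steadily will not trigger it.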

Answered 2014-10-20T02:36:10.557

score 4 · Accepted Answer

If you are not limited to what ships with Python out of the box, the urlgrabber module might come in handy:

import urlgrabber
urlgrabber.urlgrab(url2download, destination_on_local_filesystem,
                   timeout=30.0)

score 3 · Accepted Answer

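There is a discussion of this here. Caveats (beyond the ones they mention): I haven't tried it, and they are using urllib2 rather than urllib (would that be a problem for you?). (Actually, now that I think about it, the technique would probably work for urllib as well.)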
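
The linked discussion is not reproduced here, so the following is not necessarily the technique it describes. It is only a rough sketch of one way to combine a per-read socket timeout with an overall time budget, using Python 3's urllib.request (the successor to urllib2). The names download_with_deadline, DownloadTooSlow, and the opener parameter are hypothetical; opener is there so the function can be tried without a real server.

```python
import time
import urllib.request


class DownloadTooSlow(Exception):
    """Raised when a download exceeds its overall time budget (illustrative name)."""


def download_with_deadline(url, dest, max_seconds, socket_timeout=30,
                           chunk_size=64 * 1024,
                           opener=urllib.request.urlopen):
    """Copy url to dest in chunks, giving up once max_seconds have elapsed."""
    deadline = time.monotonic() + max_seconds
    # socket_timeout bounds each blocking read; the deadline bounds the total.
    with opener(url, timeout=socket_timeout) as response, open(dest, "wb") as out:
        while True:
            if time.monotonic() > deadline:
                raise DownloadTooSlow(url)
            chunk = response.read(chunk_size)
            if not chunk:
                break  # end of the response body
            out.write(chunk)
```

Because the deadline is only checked between reads, a download can overrun max_seconds by up to roughly socket_timeout. A partially written file is left at dest when the exception is raised, so the caller may want to delete it before moving on to the next URL.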

score 2 · Accepted Answer

This question is a special case of the more general problem of timing out a function call: How to limit execution time of a function call in Python

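Using the method described in my answer there, I wrote a wait-for-text function that times out while attempting an automatic login. If you want similar functionality, you can refer to the code here: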

http://code.google.com/p/psftplib/source/browse/trunk/psftplib.py
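
For readers who cannot follow the links, here is a minimal sketch of one common way to limit the execution time of an arbitrary block, based on the signal module. It is an assumption that this resembles the approach referred to above; the linked answer and code are not reproduced here. The names time_limit and TimeLimitExceeded are hypothetical. The sketch is written for Python 3 and works only on Unix-like systems and only in the main thread.

```python
import signal
from contextlib import contextmanager


class TimeLimitExceeded(Exception):
    """Raised inside the guarded block when the time limit expires (illustrative name)."""


@contextmanager
def time_limit(seconds):
    """Interrupt the enclosed block after the given number of seconds."""
    def _on_alarm(signum, frame):
        raise TimeLimitExceeded("timed out after %s seconds" % seconds)

    # Install our handler and remember the previous one so it can be restored.
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)
```

In the download loop, the call would be wrapped as `with time_limit(600): urllib.request.urlretrieve(url, destination)`, and TimeLimitExceeded would be handled like any other failure so that the loop skips to the next file. The 600-second limit is an arbitrary example value.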
