
I'm trying to save an image from a web page using Mechanize. I use this code:



    @current_agent.get(image_url).save(save_path)

The error (I think something is wrong with my timeout settings):



    I, [2013-03-25T14:42:13.924694 #31865]  INFO -- : Net::HTTP::Get: /i?path=b0312211141_img_id282431557272802821.jpg
    D, [2013-03-25T14:42:13.924757 #31865] DEBUG -- : request-header: accept => */*
    D, [2013-03-25T14:42:13.924828 #31865] DEBUG -- : request-header: user-agent => Mozilla/5.0 (iPad; CPU OS 5_1_1 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9B206 Safari/7534.48.3
    D, [2013-03-25T14:42:13.924858 #31865] DEBUG -- : request-header: accept-encoding => gzip,deflate,identity
    D, [2013-03-25T14:42:13.924884 #31865] DEBUG -- : request-header: accept-charset => ISO-8859-1,utf-8;q=0.7,*;q=0.7
    D, [2013-03-25T14:42:13.924915 #31865] DEBUG -- : request-header: accept-language => en-us,en;q=0.5
    D, [2013-03-25T14:42:13.924942 #31865] DEBUG -- : request-header: host => mdata.yandex.net
    I, [2013-03-25T14:42:14.151810 #31865]  INFO -- : status: Net::HTTPOK 1.0 200 OK
    D, [2013-03-25T14:42:14.151890 #31865] DEBUG -- : response-header: server => nginx/1.2.1
    D, [2013-03-25T14:42:14.151919 #31865] DEBUG -- : response-header: date => Mon, 25 Mar 2013 13:43:54 GMT
    D, [2013-03-25T14:42:14.151943 #31865] DEBUG -- : response-header: content-type => image/jpeg
    D, [2013-03-25T14:42:14.151967 #31865] DEBUG -- : response-header: content-length => 212187
    D, [2013-03-25T14:42:14.151991 #31865] DEBUG -- : response-header: last-modified => Tue, 12 Mar 2013 18:11:41 GMT
    D, [2013-03-25T14:42:14.152015 #31865] DEBUG -- : response-header: expires => Wed, 24 Apr 2013 13:43:54 GMT
    D, [2013-03-25T14:42:14.152039 #31865] DEBUG -- : response-header: cache-control => max-age=2592000
    D, [2013-03-25T14:42:14.152062 #31865] DEBUG -- : response-header: x-original-host => mdata.somesite.ru
    D, [2013-03-25T14:42:14.152086 #31865] DEBUG -- : response-header: accept-ranges => bytes
    D, [2013-03-25T14:42:14.152109 #31865] DEBUG -- : response-header: x-cache => MISS from parser.myapp.com.ua
    D, [2013-03-25T14:42:14.152133 #31865] DEBUG -- : response-header: x-cache-lookup => MISS from parser.notus.com.ua:1221
    D, [2013-03-25T14:42:14.152157 #31865] DEBUG -- : response-header: via => 1.0 parser.myapp.com.ua (squid/3.1.10)
    D, [2013-03-25T14:42:14.152180 #31865] DEBUG -- : response-header: connection => keep-alive
    D, [2013-03-25T14:42:14.152464 #31865] DEBUG -- : Read 2521 bytes (2521 total)
    D, [2013-03-25T14:42:14.152509 #31865] DEBUG -- : Read 598 bytes (3119 total)
    D, [2013-03-25T14:42:14.199787 #31865] DEBUG -- : Read 1448 bytes (6613 total)
    D, [2013-03-25T14:42:14.199887 #31865] DEBUG -- : Read 2648 bytes (9261 total)
    D, [2013-03-25T14:42:14.200125 #31865] DEBUG -- : Read 2896 bytes (12157 total)
    D, [2013-03-25T14:42:14.200286 #31865] DEBUG -- : Read 1200 bytes (13357 total)
    D, [2013-03-25T14:42:14.248204 #31865] DEBUG -- : Read 2896 bytes (16253 total)
    D, [2013-03-25T14:42:14.248436 #31865] DEBUG -- : Read 1200 bytes (17453 total)
    D, [2013-03-25T14:42:14.248510 #31865] DEBUG -- : Read 1448 bytes (18901 total)
    D, [2013-03-25T14:42:14.248609 #31865] DEBUG -- : Read 2648 bytes (21549 total)
    D, [2013-03-25T14:42:14.248864 #31865] DEBUG -- : Read 2896 bytes (24445 total)
    D, [2013-03-25T14:42:14.248985 #31865] DEBUG -- : Read 1200 bytes (25645 total)
    D, [2013-03-25T14:42:14.249174 #31865] DEBUG -- : Read 1448 bytes (27093 total)
    D, [2013-03-25T14:42:14.249354 #31865] DEBUG -- : Read 2648 bytes (29741 total)
    D, [2013-03-25T14:42:14.296443 #31865] DEBUG -- : Read 1448 bytes (31189 total)
    D, [2013-03-25T14:42:14.296583 #31865] DEBUG -- : Read 2648 bytes (33837 total)
    D, [2013-03-25T14:42:14.296756 #31865] DEBUG -- : Read 1448 bytes (35285 total)

I can see that it starts fetching the image ("Read 2896 bytes (12157 total)") and then it freezes! The image is only partially downloaded and is never saved :(

How can I solve this?


1 Answer


You can set the agent's timeouts (read and connect) like this:

    @current_agent.open_timeout = 10 # in seconds
    @current_agent.read_timeout = 10 # in seconds
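
With the timeouts set, a stalled download should raise an error instead of hanging, so it can be retried. Here is a minimal sketch of that idea. The helper name `save_with_timeouts`, the retry count, and the `retry_on` list are hypothetical and not part of Mechanize. The exception class that actually reaches your code can differ between Mechanize and Ruby versions (it may be wrapped), which is why the list is a parameter.

```ruby
require 'timeout'

# Hypothetical helper (not a Mechanize API): sets both timeouts on the agent,
# then tries the download up to `attempts` times.
# `agent` only needs open_timeout=, read_timeout= and get(url), as a Mechanize
# agent provides. `retry_on` is an assumption: adjust it to the exception
# class your Mechanize version actually raises on a stalled read.
def save_with_timeouts(agent, url, path, timeout: 10, attempts: 3,
                       retry_on: [Timeout::Error])
  agent.open_timeout = timeout # seconds allowed to open the connection
  agent.read_timeout = timeout # seconds allowed for each read to return data

  tries = 0
  begin
    tries += 1
    agent.get(url).save(path)
    true
  rescue *retry_on
    # Net::OpenTimeout and Net::ReadTimeout are subclasses of Timeout::Error.
    retry if tries < attempts
    false # every attempt failed
  end
end
```

Usage would look something like `save_with_timeouts(@current_agent, image_url, save_path)`, which returns `true` once the file is saved and `false` if every attempt failed.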
answered 2013-03-25T14:31:33.203