
I'm a Ruby beginner, so please bear with me.

I'm using the selenium-webdriver and rb-appscript gems to do some web scraping. Navigation to a site appears to be driven by a Net::HTTP object, which has an rbuf_fill method.

Running the following code:

sites = File.open("sites.txt", "r") if File::exists?( "sites.txt" )

    if sites != nil
       while (line = sites.gets)

              driver.switch_to.default_content

          begin
                 driver.navigate.to line

          rescue Exception
                 line = line.split.join("\n")
                 puts line + " caused a timeout."
          end

       end

...

produces this error:

/opt/local/lib/ruby1.9/1.9.1/net/protocol.rb:140:in `rescue in rbuf_fill': Timeout::Error (Timeout::Error)
from /opt/local/lib/ruby1.9/1.9.1/net/protocol.rb:134:in `rbuf_fill'
from /opt/local/lib/ruby1.9/1.9.1/net/protocol.rb:116:in `readuntil'
from /opt/local/lib/ruby1.9/1.9.1/net/protocol.rb:126:in `readline'
from /opt/local/lib/ruby1.9/1.9.1/net/http.rb:2219:in `read_status_line'
from /opt/local/lib/ruby1.9/1.9.1/net/http.rb:2208:in `read_new'
from /opt/local/lib/ruby1.9/1.9.1/net/http.rb:1191:in `transport_request'
from /opt/local/lib/ruby1.9/1.9.1/net/http.rb:1177:in `request'
from /opt/local/lib/ruby1.9/1.9.1/net/http.rb:1170:in `block in request'
from /opt/local/lib/ruby1.9/1.9.1/net/http.rb:627:in `start'
from /opt/local/lib/ruby1.9/1.9.1/net/http.rb:1168:in `request'
from /opt/local/lib/ruby1.9/gems/1.9.1/gems/selenium-webdriver-2.2.0/lib/selenium/webdriver/remote/http/default.rb:73:in `response_for'
from /opt/local/lib/ruby1.9/gems/1.9.1/gems/selenium-webdriver-2.2.0/lib/selenium/webdriver/remote/http/default.rb:41:in `request'
from /opt/local/lib/ruby1.9/gems/1.9.1/gems/selenium-webdriver-2.2.0/lib/selenium/webdriver/remote/http/common.rb:34:in `call'
from /opt/local/lib/ruby1.9/gems/1.9.1/gems/selenium-webdriver-2.2.0/lib/selenium/webdriver/remote/bridge.rb:406:in `raw_execute'
from /opt/local/lib/ruby1.9/gems/1.9.1/gems/selenium-webdriver-2.2.0/lib/selenium/webdriver/remote/bridge.rb:384:in `execute'
from /opt/local/lib/ruby1.9/gems/1.9.1/gems/selenium-webdriver-2.2.0/lib/selenium/webdriver/remote/bridge.rb:171:in `switchToDefaultContent'
from /opt/local/lib/ruby1.9/gems/1.9.1/gems/selenium-webdriver-2.2.0/lib/selenium/webdriver/common/target_locator.rb:68:in `default_content'
from auto.rb:25:in `<main>'

I don't understand why I can't catch this exception. Using rescue Exception should catch everything, yet as you can see my script still crashes.

I also found sources saying that timeouts have to be rescued explicitly, so I tried this as well:

rescue Timeout::Error

with no luck.

Any help with this would be greatly appreciated.

Ruby version: ruby 1.9.2p290 (2011-07-09 revision 32553)

OS: Mac OS X Snow Leopard 10.6.8, 64-bit

Selenium WebDriver version: 2.2.0


1 Answer


The file 'timeout.rb' in the Ruby standard library defines:

module Timeout
  # Raised by Timeout#timeout when the block times out.
  class Error < RuntimeError

So what you need to rescue is not Timeout::Exception but Timeout::Error, or more generally RuntimeError. Then it should work.
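One more thing may be worth checking. Judging by the last frames of the backtrace in the question, the timeout seems to have been raised from default_content (auto.rb:25), and in the posted code that call sits before begin, so no rescue clause would cover it. Below is a minimal sketch of the pattern, assuming that is the cause. slow_driver_call and visit are hypothetical names standing in for the WebDriver calls; they are not part of selenium-webdriver.

```ruby
require 'timeout'

# Hypothetical stand-in for a WebDriver call such as
# driver.switch_to.default_content or driver.navigate.to(url).
# It always raises, to simulate a request that timed out.
def slow_driver_call
  raise Timeout::Error, "execution expired"
end

# Returns :ok, or :timeout if a call inside the begin block timed out.
def visit(url)
  begin
    slow_driver_call   # every call that may time out sits inside begin
    :ok
  rescue Timeout::Error
    puts "#{url.strip} caused a timeout."
    :timeout
  end
end

visit("http://example.com\n")
```

Because Timeout::Error descends from RuntimeError, a plain rescue (which handles StandardError) or rescue Exception would also catch it here, as long as the raising call is inside the begin block.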

Answered 2011-08-01T16:13:57.037