I was trying to do something fun, like this:

require 'net/http'

http = Net::HTTP.new("t66y.com", 80)
request = Net::HTTP::Get.new("http://t66y.com/")
response = http.request(request)
puts response.inspect

It works fine and gives me #<Net::HTTPOK 200 OK readbody=true>. However, after I change the URL to something like http://t66y.com/thread0806.php?fid=16, it keeps throwing an EOFError at me. The full log is:

/Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/protocol.rb:141:in `read_nonblock': end of file reached (EOFError)
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/protocol.rb:141:in `rbuf_fill'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/protocol.rb:92:in `read'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:2779:in `ensure in read_chunked'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:2779:in `read_chunked'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:2750:in `read_body_0'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:2710:in `read_body'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:2735:in `body'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:2672:in `reading_body'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:1321:in `block in transport_request'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:1316:in `catch'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:1316:in `transport_request'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:1293:in `request'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:1286:in `block in request'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:745:in `start'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:1284:in `request'
    from /Users/lei/workspace/Dadiaosi/scraper.rb:18:in `<top (required)>'
    from -e:1:in `load'
    from -e:1:in `<main>'

Does anyone have a clue what is going on?

2 Answers

Both of these work:

In the terminal:

$ curl -v 'http://t66y.com/thread0806.php?fid=16'

In Ruby:

require 'open-uri'
# On Ruby 2.5+ call URI.open; Kernel#open stopped accepting URLs in Ruby 3.0.
response = open("http://t66y.com/thread0806.php?fid=16")
html = response.read

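In the curl response I can see the headers: Content-Length is missing and the charset is a Chinese encoding. If you are on an older version of Ruby, that may be what trips up Ruby's Net::HTTP library.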
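The `read_chunked` frames in the trace suggest the body was sent with chunked transfer encoding, where there is no Content-Length and the client reads until a zero-size chunk arrives. Here is a rough sketch of why a connection closed too early surfaces as EOFError. It is an illustration only, not Net::HTTP's actual code, and `read_chunked_body` is a hypothetical helper name.

```ruby
require 'stringio'

# Minimal chunked-body reader for illustration (hypothetical helper,
# not the implementation used by Net::HTTP).
def read_chunked_body(io)
  body = ''.dup
  loop do
    size_line = io.gets("\r\n")
    raise EOFError, 'end of file reached' if size_line.nil?

    size = size_line.to_i(16) # chunk size is written in hexadecimal
    break if size.zero?       # a zero-size chunk terminates the body

    chunk = io.read(size)
    raise EOFError, 'end of file reached' if chunk.nil? || chunk.bytesize < size

    body << chunk
    io.read(2) # discard the CRLF that follows each chunk
  end
  body
end

complete = StringIO.new("5\r\nhello\r\n0\r\n\r\n")
puts read_chunked_body(complete) # prints "hello"

# Same stream, but cut off before the terminating zero-size chunk.
truncated = StringIO.new("5\r\nhello\r\n")
begin
  read_chunked_body(truncated)
rescue EOFError => e
  puts "EOFError: #{e.message}"
end
```

If the server (or something in between) drops the connection before the final chunk, the reader has no way to know the body was complete, so raising is the only safe option.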

You can easily swap in open-uri to fetch the HTML, as shown above.
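If you would rather keep Net::HTTP as the first choice and only fall back to open-uri when the connection is cut short, something along these lines might do. This is a sketch under assumptions: `first_successful_fetch` is a hypothetical helper, and whether open-uri succeeds where Net::HTTP fails depends on your Ruby version and the server.

```ruby
require 'net/http'
require 'open-uri'

# Hypothetical helper: call each fetcher in turn and return the first result
# that arrives without the connection being cut short (EOFError).
def first_successful_fetch(fetchers)
  last_error = nil
  fetchers.each do |fetcher|
    begin
      return fetcher.call
    rescue EOFError => e
      last_error = e # remember the failure and try the next fetcher
    end
  end
  raise(last_error || ArgumentError.new('no fetchers given'))
end

url = 'http://t66y.com/thread0806.php?fid=16'
fetchers = [
  -> { Net::HTTP.get(URI(url)) }, # plain Net::HTTP
  -> { URI.open(url).read }       # open-uri (URI.open needs Ruby 2.5+)
]

# Needs network access, so it is left commented out here:
# html = first_successful_fetch(fetchers)
```

Only EOFError is rescued on purpose, so timeouts, DNS failures and the like still surface immediately.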

answered 2014-02-12T15:36:12.147

It should be

require 'net/http'
uri = URI('http://t66y.com/thread0806.php?fid=16')
response = Net::HTTP.get(uri) # returns the body as a String
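
Since Net::HTTP.get only hands back the body, here is a sketch for the case where you want the response object (status, headers) as in the question. It builds the request from the parsed URI and passes only the path and query to Net::HTTP::Get.new via `uri.request_uri`. I have not confirmed that this alone avoids the EOFError; it just shows the conventional way to wire the pieces together.

```ruby
require 'net/http'
require 'uri'

uri = URI('http://t66y.com/thread0806.php?fid=16')

# Host and port come from the URI; no connection is opened yet.
http = Net::HTTP.new(uri.host, uri.port)

# request_uri is the path plus query string: "/thread0806.php?fid=16"
request = Net::HTTP::Get.new(uri.request_uri)

# Sending the request needs network access, so it is left commented out:
# response = http.request(request)
# puts response.code
# puts response.body
```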
answered 2014-02-12T15:39:43.480