ruby - The most basic Nokogiri program fails: documentation problem or bug?

Question

I decided to give Nokogiri a try and copied the following program directly from http://nokogiri.rubyforge.org/nokogiri/Nokogiri.html (adding only require 'rubygems' and the I_KNOW_I_AM_USING_AN_OLD_AND_BUGGY_VERSION_OF_LIBXML2 constant):

require 'rubygems'
I_KNOW_I_AM_USING_AN_OLD_AND_BUGGY_VERSION_OF_LIBXML2 = 1
require 'nokogiri'
require 'open-uri'

# Get a Nokogiri::HTML::Document for the page we're interested in...

doc = Nokogiri::HTML(open('http://www.google.com/search?q=tenderlove'))

# Do funky things with it using Nokogiri::XML::Node methods...

####
# Search for nodes by css
doc.css('h3.r a.l').each do |link|
  puts link.content
end

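It returned no results. But when I changed

    doc = Nokogiri::HTML(open('http://www.google.com/search?q=tenderlove'))

to

    doc = Nokogiri::HTML(open('http://www.google.com/search?q=tenderlove').read)

the program worked as expected. Note that the only difference is the .read added at the end of the line. I would never have figured this out on my own, because almost every piece of sample code leaves out the .read. Ironically, one place that does include it is a post by one of the Nokogiri developers (at http://tenderlovemaking.com/2008/11/18/underpant-free-excitement). Did something in the API change? What am I missing?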

I am using Nokogiri 1.3.2.

Thank you.

score 0 · Accepted Answer

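I copied and pasted your (original) code into a Ruby file and ran it on my system (ruby 1.8.6p369, Nokogiri 1.3.2), and it worked fine. Is there something else in your environment that could be causing the problem? Leaving Nokogiri aside, what does open('http://www.google.com/search?q=tenderlove') return for you?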

score 0 · Accepted Answer

I'm not sure what your problem is, but the open call comes from open-uri, not nokogiri. So try a few things with nokogiri taken out of the picture.

$ irb
>> require 'open-uri'
=> true
>> f = open('http://www.google.com/search?q=tenderlove')
=> #<File:/var/folders/LA/LACsuKOVHtaEgmBzsJcGAE+++TI/-Tmp-/open-uri.7455.0>
>> f.read
=> "<!doctype html><head><title>tenderlove - Google Search</title>...
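The session above shows that open returns an IO-like object while .read returns a String. As a minimal sketch of the workaround from the question, the snippet below normalizes either kind of input to a String before it would be handed to a parser. The helper name html_source is hypothetical (it is not part of Nokogiri or open-uri), and StringIO stands in for the object returned by open so the sketch runs without network access.

```ruby
require 'stringio'

# Hypothetical helper (not part of Nokogiri or open-uri): return the markup
# as a String whether the caller passes an IO-like object or a String.
def html_source(source)
  source.respond_to?(:read) ? source.read : source.to_s
end

MARKUP = '<html><head><title>tenderlove</title></head></html>'

# StringIO is an assumption standing in for the result of open(...).
io = StringIO.new(MARKUP)

from_io     = html_source(io)      # IO-like input: contents are read
from_string = html_source(MARKUP)  # String input: passed through

puts from_io.class
puts from_io == from_string
```

With Nokogiri installed, the result could then be passed along as Nokogiri::HTML(html_source(open(url))). On Ruby 3.0 and later, open-uri no longer extends Kernel#open, so that call would be written URI.open(url).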

score 0 · Accepted Answer

I upgraded to Nokogiri 1.3.3 and upgraded libxml2 to 2.7.3. I no longer need the ridiculous I_KNOW_I_AM_USING_AN_OLD_AND_BUGGY_VERSION_OF_LIBXML2 = 1 statement to avoid the error message, and the program runs without the superfluous .read.

score 0 · Accepted Answer

It's always good to check that your Nokogiri and libxml versions are up to date.

As of today (September 22, 2009), this is the latest on MacOS:

nokogiri -v
--- 
nokogiri: 1.3.3
warnings: [ ]

libxml: 
  compiled: 2.7.4
  loaded: 2.7.4
  binding: extension

(I put a space in the empty warnings array to keep it from looking like a box.)
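The version report above can also be checked in a script. The following is only a sketch: it parses the sample report text from this answer (rather than invoking nokogiri itself) and compares the libxml versions with Gem::Version. The ASSUMED_OK threshold of 2.7.3 is an assumption taken from the version the asker reported as working, not a documented minimum.

```ruby
require 'yaml'
require 'rubygems' # provides Gem::Version

# Sample text copied from the `nokogiri -v` report in this answer.
report = <<~REPORT
  ---
  nokogiri: 1.3.3
  warnings: [ ]

  libxml:
    compiled: 2.7.4
    loaded: 2.7.4
    binding: extension
REPORT

info     = YAML.safe_load(report)
compiled = Gem::Version.new(info['libxml']['compiled'])
loaded   = Gem::Version.new(info['libxml']['loaded'])

# Assumption: 2.7.3 is the version the asker reported working with,
# used here only as an illustrative threshold.
ASSUMED_OK = Gem::Version.new('2.7.3')

puts "compiled and loaded libxml match: #{compiled == loaded}"
puts "loaded libxml is at least #{ASSUMED_OK}: #{loaded >= ASSUMED_OK}"
```

Gem::Version compares version segments numerically, which avoids the pitfalls of comparing strings such as '2.7.10' and '2.7.4'.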
