I'm trying to use libxml's SAX parser (as shown here), but I'm getting an undefined method error.
My code is:
    $domain_topics = Hash.new { |h,d| h[d] = [] }
    parser = LibXML::XML::SaxParser.io(
      File.open("content.rdf.u8", "r:UTF-8")
    )
    class Callbacks
      include LibXML::XML::SaxParser::Callbacks
      def initialize
        @state = :top
        @topics = nil
      end
      def on_start_element(element, attributes)
        case @state
        when :top
          return unless element == 'ExternalPage'
          @state = :ExternalPage
          domain = attributes['about'].sub(%r!^\w+://([^"/]*)(?:/[^"]*)?$!, '\1')
          @topics = $domain_topics[domain]
        when :ExternalPage
          return unless element == 'topic'
          @state = :topic
        end
      end
      def on_characters(characters)
        if @state == :topic and @topics
          @topics << characters
        end
      end
      def on_end_element(element)
        case @state
        when :ExternalPage
          @state = :top
          @topics = nil
        when :topic
          @state = :ExternalPage
        end
      end
    end
    parser.callbacks = Callbacks
    parser.parse
When I run it:

    % ./my_awesome_code.rb
    ./my_awesome_code.rb:1337:in `parse': undefined method `on_start_document' for Callbacks:Class (NoMethodError)
What am I doing wrong here? Shouldn't including LibXML::XML::SaxParser::Callbacks give me a default definition of on_start_document?
irb seems to confirm my intuition:
    1.9.3p194 :009 > Callbacks.instance_methods.include? :on_start_document
     => true
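
One detail worth noting: the error names Callbacks:Class as the receiver, while the irb check only looks at instance methods. The sketch below reproduces that distinction without libxml. DefaultCallbacks and Handler are hypothetical stand-ins, not part of libxml-ruby, so this only illustrates how include behaves in plain Ruby.

```ruby
# Hypothetical stand-in for a module that supplies default callback methods.
# It is NOT LibXML::XML::SaxParser::Callbacks; it only mimics the shape.
module DefaultCallbacks
  def on_start_document
    :default_handler
  end
end

# Hypothetical handler class that mixes the defaults in.
class Handler
  include DefaultCallbacks
end

# include adds the module's methods as instance methods of the class ...
puts "instance_methods includes it: " \
     "#{Handler.instance_methods.include?(:on_start_document)}"

# ... so an instance responds to the message ...
puts "Handler.new responds:         " \
     "#{Handler.new.respond_to?(:on_start_document)}"

# ... but the class object itself does not.
puts "Handler (the class) responds: " \
     "#{Handler.respond_to?(:on_start_document)}"

# Sending the message to the class raises NoMethodError.
begin
  Handler.on_start_document
rescue NoMethodError => e
  puts "NoMethodError: #{e.message}"
end
```

If the same thing is happening in my code, the parser is being handed the class where it expects an object that responds to the callbacks. In that case assigning an instance (parser.callbacks = Callbacks.new) would presumably be the thing to try, though that is an assumption I have not checked against the libxml-ruby documentation.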