Suppose I have this sample:
require 'nokogiri'

page = "<html><body><h1 class='foo'></h1><p class='foo'>hello people<a href='http://'>hello world</a></p></body></html>"
@nodes = []
# Visit every node in the parsed document and keep those with class "foo".
Nokogiri::HTML(page).traverse do |n|
  if n[:class] == "foo"
    @nodes << { :name => n.name, :xpath => n.path, :text => n.text }
  end
end
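For the p element, the resulting n.text is "hello peoplehello world". I'd like to get the parent's text and its children's text separately, with each piece associated with the tag it belongs to.
So the result would look something like this: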
@nodes[0][:text] = ""
@nodes[1][:text] = [{:Elementtext1 => "hello people", :ElementObject1 => elementObject}, {:Elementtext2 => "hello world", :ElementObject2 => elementObject}]