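What is the smartest way to have Nokogiri select all content between a start and a stop element (including the start and stop elements themselves)?
The example code below shows what I am looking for: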
require 'rubygems'
require 'nokogiri'
value = Nokogiri::HTML.parse(<<-HTML_END)
<html>
<body>
<p id='para-1'>A</p>
<div class='block' id='X1'>
<p class="this">Foo</p>
<p id='para-2'>B</p>
</div>
<p id='para-3'>C</p>
<p class="that">Bar</p>
<p id='para-4'>D</p>
<p id='para-5'>E</p>
<div class='block' id='X2'>
<p id='para-6'>F</p>
</div>
<p id='para-7'>F</p>
<p id='para-8'>G</p>
</body>
</html>
HTML_END
parent = value.css('body').first
# START element
@start_element = parent.at('p#para-3')
# STOP element
@end_element = parent.at('p#para-7')
The result (return value) should look like this:
<p id='para-3'>C</p>
<p class="that">Bar</p>
<p id='para-4'>D</p>
<p id='para-5'>E</p>
<div class='block' id='X2'>
<p id='para-6'>F</p>
</div>
<p id='para-7'>F</p>
Update: this is my current solution, but I think there must be something smarter:
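@my_content = ""
@selected_node = true

# Recursively append each sibling's HTML, from _start up to and including @end_element.
def collect_content(_start)
  # Stop if we ran out of siblings before reaching @end_element.
  return if _start.nil?

  if _start == @end_element
    @my_content << _start.to_html
    @selected_node = false
  end
  if @selected_node == true
    @my_content << _start.to_html
    collect_content(_start.next)
  end
end

collect_content(@start_element)
puts @my_content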
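For comparison, here is a rough iterative sketch of the same idea without the shared instance variables. `nodes_between` is a hypothetical helper name, not something Nokogiri provides, and it assumes the start and stop elements are siblings with the start coming first. It relies only on `next_sibling` and `==`, both of which `Nokogiri::XML::Node` supports.

```ruby
# Hypothetical helper (not part of Nokogiri): returns start_node, end_node,
# and every sibling between them, found by following next_sibling.
# Assumption: both nodes share the same parent and start_node comes first.
def nodes_between(start_node, end_node)
  collected = []
  node = start_node
  while node
    collected << node
    # Include the stop element, then finish.
    return collected if node == end_node
    node = node.next_sibling
  end
  # Ran out of siblings without meeting end_node: report nothing
  # rather than everything after start_node (a design choice).
  []
end
```

With the document above, `puts nodes_between(@start_element, @end_element).map(&:to_html).join` should print the same markup as the recursive version, including the whitespace text nodes that sit between the elements. Swapping `next_sibling` for `next_element` would skip those text nodes.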