require 'nokogiri'
doc = Nokogiri::HTML::Document.parse <<-_HTML_
<p>
This is just an example, how to remove the next sentence using nokogiri in Ruby.
Thank you for your help.
<strong> XXXX </strong>
<br />
I want to remove all the HTML after the strong XXXX
<br />
<br />
<strong> YYY </strong>
</p>
_HTML_

# Take the first text node inside <p> and trim the surrounding whitespace.
puts doc.at('//p/text()[1]').to_s.strip
# >> This is just an example, how to remove the next sentence using nokogiri in Ruby.
# >> Thank you for your help.
Now, if you want to remove the unwanted HTML content from the source HTML itself, you can try the following:
require 'nokogiri'
doc = Nokogiri::HTML::Document.parse <<-_HTML_
<p>
This is just an example, how to remove the next sentence using nokogiri in Ruby.
Thank you for your help.
<strong> XXXX </strong>
<br />
I want to remove all the HTML after the strong XXXX
<br />
<br />
<strong> YYY </strong>
</p>
_HTML_

doc.xpath('//p/* | //p/text()').count # => 10
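# Select every child of <p> except the first text node...
ndst = doc.search('//p/* | //p/text()')[1..-1]
# ...and detach those nodes from the document.
ndst.remove
puts doc.to_html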
# >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
# >> <html><body><p>
# >> This is just an example, how to remove the next sentence using nokogiri in Ruby.
# >> Thank you for your help.
# >> </p></body></html>