
Given this HTML:

<li><strong><a href="http://www.ukasta.org.uk/">United Kingdom Agricultural Supply Trade Association</a> (UKASTA)</strong></li>

I want to get the string "United Kingdom Agricultural Supply Trade Association (UKASTA)".

Using Nokogiri, I wrote:

linklist=link.parent.parent.css('li strong a')
linklist.each do |f|
  puts f.text
end

f.text is "United Kingdom Agricultural Supply Trade Association", but how do I get "(UKASTA)"?


2 Answers


You're digging too deep. The trailing text belongs to the <strong> node, not to the <a> inside it, so I'd use:

require 'nokogiri'

html = '<li><strong><a href="http://www.ukasta.org.uk/">United Kingdom Agricultural Supply Trade Association</a> (UKASTA)</strong></li>'
doc = Nokogiri::HTML(html)
doc.at('strong').text

Which returns:

"United Kingdom Agricultural Supply Trade Association (UKASTA)"

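If you have to locate the <a> node first, you can reach "(UKASTA)" through its next sibling, which is the text node that follows the link: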

a_node = doc.at('a')
a_node.text
=> "United Kingdom Agricultural Supply Trade Association"
a_node.next_sibling.text
=> " (UKASTA)"
answered 2013-04-26T00:48:16.687

You can use the children method and then pick out the data by position:

require 'nokogiri'

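# Single quotes outside so the double quotes in the href attribute stay inside the string
html_doc = Nokogiri::HTML('<html><li><strong><a href="">United Kingdom Agricultural Supply Trade Association</a>(UKASTA)</strong></li></html>')

html_doc.css('li strong').children[0].text
=> "United Kingdom Agricultural Supply Trade Association"
# children[1] is a text node; call text to get the string itself
html_doc.css('li strong').children[1].text
=> "(UKASTA)"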
answered 2013-04-25T17:37:27.670