ruby - Nokogiri returns nothing when performing an XPath search

Question

I need to parse a table from a web page. I have done this before with Ruby and Nokogiri, but this time my approach is not working. This is what I am doing:

response = RestClient.get "http://www.webpage.com?page=0"
doc = Nokogiri::HTML(response.body,nil,'utf-8')
doc.remove_namespaces!
table = doc.xpath(".//*[@id='contsinderecha']/form/table/tbody/tr[4]/td/table/tbody/tr[5]/td/table")

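table is just an empty array. The response itself is fine: if I run puts response.body, I get the body of the web page.

Also, I am using Firebug to get the XPath.

Any idea what might be happening?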

score 6 · Accepted Answer

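The solution to your problem is to remove the tbody steps from the XPath, as suggested in "Why does this Nokogiri XPath have a null return?".

Firefox generates the tbody elements for you when it builds the DOM, which is why they appear in the XPath that Firefox reports, but they are not part of the original page source that Nokogiri parses.

Try the following: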

response = RestClient.get "http://www.buenosaires.gob.ar/areas/seguridad_justicia/seguridad_urbana/estaciones_servicio/buscador.php?&pag=0"
doc = Nokogiri::HTML(response.body,nil,'utf-8')
doc.remove_namespaces!
table = doc.xpath(".//*[@id='contsinderecha']/form/table/tr[4]/td/table/tr[5]/td/table")

score 3 · Accepted Answer

The correct way to get to that table is:

doc.at('table.contenido')

Answered 2013-04-15T01:50:50.603
