I have a document obtained with the command page.css("table.vc_result span a"), and I am not able to get the second and third span elements of the document:
Document
<table border="0" bgcolor="#FFFFFF" onmouseout="resDef(this)" onmouseover="resEmp(this)" class="vc_result">
<tbody>
<tr>
<td width="260" valign="top">
<table>
<tbody>
<tr>
<td width="40%" valign="top"><span><a class="cAddName" href="/USA/Illinois/Chicago/Yellow+Page+Advertising+And+Telephone+Directory+Publica/gateway-megatech_13478733">
Gateway Megatech</a></span><br>
<span class="cAddText">P.O. BOX 99682, Chicago IL 60696</span></td>
</tr>
<tr>
<td><span class="cAddText">Cook County Illinois</span></td>
</tr>
<tr>
<td><span class="cAddCategory">Yellow Page Advertising And Telephone
Directory Publica Chicago</span></td>
</tr>
</tbody>
</table>
</td>
<td width="260">
<table align="center">
<tbody>
<tr>
<td>
<table>
<tbody>
<tr>
<td>
<div style=
"background: url('images/listings.png');background-position: -0px -0px; width: 16px; height: 16px">
</div>
</td>
<td><font style="font-weight:bold">847-506-7800</font></td>
</tr>
</tbody>
</table>
</td>
</tr>
<tr>
<td>
<table>
<tbody>
<tr>
<td>
<div style=
"background: url('images/listings.png');background-position: -0px -78px; width: 16px; height: 16px">
</div>
</td>
<td><a href=
"/USA/Illinois/Chicago/Yellow+Page+Advertising+And+Telephone+Directory+Publica/gateway-megatech_13478733"
class="cAddNearby">Businesses near 60696</a></td>
</tr>
</tbody>
</table>
</td>
</tr>
<tr>
<td>
<table>
<tbody>
<tr>
<td></td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
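...this is not the complete document; there are more span entries in it.
The code I'm using is able to find the exact text, but I can't relate it to the text of the spans that follow the nested span > a element.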
require 'rubygems'
require 'nokogiri'
require 'open-uri'

# Search parameters
name  = "yellow"
city  = "Chicago"
state = "IL"
burl  = "http://www.sitename.com/"
url   = "#{burl}Business_Listings.php?name=#{name}&city=#{city}&state=#{state}&current=1&Submit=Search"

# URI.open is the open-uri entry point on current Ruby versions
page = Nokogiri::HTML(URI.open(url))
rows = page.css("table.vc_result span a")

# Look for the listing by its link text, then try to read the span after it
rows.each do |arow|
  if arow.text == "Gateway Megatech"
    puts(arow.next_element.text)
    puts("Capturing the next span text")
    found = "Got it"
    break
  else
    puts("Found nothing")
    found = "None"
  end
end