我正在尝试使用 Mechanize Gem 解析网站。到目前为止,这就是我所拥有的:
page = agent.get("http://www.greatgiftsformen.com/price-range-under-c-131_142.html?page=all")
page.parser.xpath('//tr[(((count(preceding-sibling::*) + 1) = 2) and parent::*)]//*[contains(concat( " ", @class, " " ), concat( " ", "productListing-data", " " ))]')[5]
我得到了这个产品的元素:
=> #<Nokogiri::XML::Element:0x81c175ec name="td" attributes=[#<Nokogiri::XML::Attr:0x81c17d58 name="valign" value="top">, #<Nokogiri::XML::Attr:0x81c17eac name="align" value="center">, #<Nokogiri::XML::Attr:0x81c17ec0 name="class" value="productListing-data">] children=[#<Nokogiri::XML::Element:0x805fa174 name="a" attributes=[#<Nokogiri::XML::Attr:0x81c13794 name="href" value="http://www.greatgiftsformen.com/gas-pump-retro-liquor-dispenser-p-249.html?osCsid=05f5dbb816874ece6db883c2c48d7ae1">] children=[#<Nokogiri::XML::Element:0x8068e270 name="img" attributes=[#<Nokogiri::XML::Attr:0x81c115ac name="src" value="product_thumb.php?img=images/prod/liquordisp-gas.jpg&w=160&h=160">, #<Nokogiri::XML::Attr:0x81c115c0 name="width" value="160">, #<Nokogiri::XML::Attr:0x81c115d4 name="height" value="160">, #<Nokogiri::XML::Attr:0x81c11714 name="border" value="0">, #<Nokogiri::XML::Attr:0x81c11728 name="alt" value="Gas Pump Retro Liquor Dispenser">, #<Nokogiri::XML::Attr:0x81c11750 name="title" value="Gas Pump Retro Liquor Dispenser">, #<Nokogiri::XML::Attr:0x81c11764 name="class" value="fotgal">]>]>]>
但是,当我尝试获取 href 时,我返回 nil:
url = item.attributes['href']
=> nil