
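How can I use Nokogiri to scrape the product names and prices from the page http://www.tradus.com/t-shirts-tees-reebok-puma-fifa-teesort/t/7682?Type=polo+neck, and how can I scrape all the products in that category, given that the listing is paginated? Below is the code I have so far. It gets the price, but wrapped in HTML tags, and it only works for one page.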

require 'nokogiri'
require 'open-uri'

url = "http://www.tradus.com/t-shirts-tees-reebok-puma-fifa-teesort/t/7682?Type=polo+neck"
doc = Nokogiri::HTML(URI.open(url)) # URI.open needs Ruby 2.5+; older versions use open(url)
doc.css(".prodListing-item").each do |dv|
  product_name = dv.at_css('.prod-name').text unless dv.at_css(".prod-name").nil?
  # to_s serializes the node as HTML markup, which is why tags show up in the output
  product_price = dv.at_css('.price-info span span:nth-child(2)').to_s
  puts "#{product_name}#{product_price}"
end

1 Answer

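The following code resolved the issue. It requests successive pages by appending a page parameter to the URL and stops when a page returns no products. Calling .text instead of .to_s returns the price without the surrounding HTML tags.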
require 'nokogiri'
require 'open-uri'

number = 1
while true
  # Request each results page in turn through the page query parameter
  url = "http://www.tradus.com/t-shirts-tees-reebok-puma-fifa-teesort/t/7682?Type=polo+neck&page=#{number}"
  doc = Nokogiri::HTML(URI.open(url)) # URI.open needs Ruby 2.5+; older versions use open(url)
  products = doc.css(".prodListing-item")
  # An empty page means we have gone past the last page
  break if products.empty?

  products.each do |item|
    # &. yields nil instead of raising when a node is missing (Ruby 2.3+)
    product_name = item.at_css('.prod-name')&.text
    product_price = item.at_css('.price-info span span:nth-child(2)')&.text
    puts "#{product_name}<==========>#{product_price}"
  end
  puts "page#{number}"
  number += 1
end
puts "exit of the while loop"
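
The same loop can also be sketched with the URL building and the stop condition pulled into small methods, which makes the pagination logic testable without hitting the network. This is only a sketch under some assumptions: the page parameter name comes from the URL above, while fetch_page, scrape_all, page_url and max_pages are hypothetical names introduced here for illustration.

```ruby
require 'uri'

BASE_URL = "http://www.tradus.com/t-shirts-tees-reebok-puma-fifa-teesort/t/7682"

# Build the listing URL for a given page. encode_www_form turns the space
# in "polo neck" into "+", so no stray whitespace ends up in the URL.
def page_url(number)
  query = URI.encode_www_form("Type" => "polo neck", "page" => number)
  "#{BASE_URL}?#{query}"
end

# Collect items from successive pages until a page comes back empty.
# fetch_page is a hypothetical callable that takes a URL and returns an
# array of items; max_pages is a safety cap against an endless loop.
def scrape_all(fetch_page, max_pages: 100)
  results = []
  (1..max_pages).each do |number|
    items = fetch_page.call(page_url(number))
    break if items.empty?
    results.concat(items)
  end
  results
end
```

In real use, fetch_page would presumably be a lambda that parses the page with Nokogiri and returns doc.css(".prodListing-item").to_a. The cap is a design choice: if the site ever returned the last page again for out-of-range page numbers, the loop would otherwise never end.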
answered 2013-09-26T18:28:44.637