When I scrape data from this web page, I get blank values instead of the actual data. Here is the code:

require 'nokogiri'
require 'open-uri'

number = 1
url = "http://www.jabong.com/109F/"

# URI.open comes from open-uri; on Ruby 3.0+ a plain open(url) no longer fetches URLs
doc = Nokogiri::HTML(URI.open(url))
puts doc.at_css("title").text
products = doc.css('.narrow')
products.each do |item|
  product_name = item.at_css('.itm-title').text unless item.at_css('.itm-title').nil?
  product_price = item.at_css('.itm-priceBox').text unless item.at_css('.itm-priceBox').nil?
  puts product_name
  puts product_price
  puts number
  number += 1
end
puts "it is the end of code"
1 Answer

I think you should use XPath instead of CSS selectors. XPath is very expressive and makes it easier to navigate the DOM to the data you want to scrape.

For example, this gets the names of all the clothing items at the URL you provided and returns them as an array:

doc.xpath("//ul[@id = 'productsCatalog']//li//a//span[@class='itm-title']").children.map{|x| x.text.gsub("\r\n                                                                                    ","").strip}

 => ["Ruffle Sleeves Self Pattern Green Top", "Cap Sleeve Solid Red Top", "3/4Th Sleeve Embroidered Blue Tunic", "Ruffle Sleeves Embroidered Beige Tunic", "3/4Th Sleeve Stripe Blue Tunic", "Ruffle Sleeves Printed Beige/Pink Tunic", "Short Sleeve Embroidered Black Tunic", "Sleeve Less Solid Black Top", "Short Sleeve Solid Black Top", "Puffed Sleeve Embroidered Black Top", "Sleeve Less Printed Cream Top", "Sleeve Less Solid Yellow Top", "3/4Th Sleeve Embroidered Black Top", "Sleeve Less Solid Off White Top", "Sleeve Less Solid Fuschia Top", "Mega Sleeves Embroidered Green T-Shirt", "Sleeve Less Solid Orange Tunic", "Puffed Sleeve Embroidered Black Top", "Mega Sleeves Printed Beige Top", "Mega Sleeves Printed Beige T-Shirt", "Puffed Sleeve Stripe Red Dress", "Puffed Sleeve Stripe Blue Top", "Puffed Sleeve Printed Yellow Top", "Cap Sleeves Printed Multi Dress", "Short Sleeve Solid Mustard Yellow Tunic", "Mega Sleeves Embroidered Red Tunic", "Short Sleeve Stripe Red Dress", "Puffed Sleeve Embroidered Beige Top", "Short Sleeve Solid Black Top", "Butterfly Sleeve Check Beige Top", "Short Sleeve Solid Black tunic", "Sleeve Less Embroidered Green Top", "Mega Sleeves Printed Cream Tunic", "Mega Sleeves Embroidered Black Tunic", "Ruffle Sleeves Printed Orange Tunic", "Mega Sleeves Solid Pink Top", "Ruffle Sleeves Solid Rust Dress", "Sleeve Less Solid Navy Blue Dress", "Sleeve Less Self Pattern White Top", "Puffed Sleeve Stripe Red Top", "Mega Sleeves Solid Pink Tunic", "Roll Up Sleeve Solid Off White Tunic", "Sleeve Less Pintucks White Top", "Puffed Sleeve Embroidered Off White Top", "Puffed Sleeve Printed Pink Top", "Raglan Sleeve Solid Black Top", "Mega Sleeves Printed Beige Tunic", "Sleeve Less Solid Yellow Top", "Puffed Sleeve Self Pattern Beige Top", "Cap Sleeve Printed Yellow Dress", "Sleeve Less Solid Pink Dress"] 
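The `gsub` call above hardcodes one exact run of spaces, so it would stop matching if the page's indentation changed. As a rough sketch (not part of the original answer), a small helper could collapse any run of whitespace instead and return `nil` for text that is only whitespace, which may be where the blank values come from. The name `clean_text` and the sample strings are made up for illustration. Note that it replaces each run with a single space rather than deleting it.

```ruby
# Hypothetical helper: collapse every run of whitespace (spaces, tabs, \r, \n)
# into a single space, then trim both ends.
# Returns nil when the input is nil or contains nothing but whitespace.
def clean_text(raw)
  return nil if raw.nil?
  cleaned = raw.gsub(/\s+/, " ").strip
  cleaned.empty? ? nil : cleaned
end

# Made-up samples imitating what Nokogiri's #text can return for padded markup.
raw_titles = [
  "\r\n        Cap Sleeve Solid Red Top\r\n    ",
  "   \r\n   ",
  nil,
  "Sleeve Less\r\n   Solid Black Top"
]

# Drop the entries that turned out to be blank.
names = raw_titles.map { |title| clean_text(title) }.compact
puts names.inspect
```

With a helper like this, the block passed to `map` in the XPath one-liner could become `{ |x| clean_text(x.text) }` followed by `.compact`, assuming blank entries should simply be skipped.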
answered 2013-09-26T12:20:20.487