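Hey,
So I built a working scraper and added the file to my app. I'm now trying to take the information from the scraper and put it into my database. I'm trying to use the find_or_create method, but I keep getting the following error.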
breads_scraper.rb:49:in `block in summary': uninitialized constant Scraper::Bread (NameError)
from /Users/Cameron/.rvm/gems/ruby-1.9.3-p392/gems/nokogiri-1.5.9/lib/nokogiri/xml/node_set.rb:239:in `block in each'
from /Users/Cameron/.rvm/gems/ruby-1.9.3-p392/gems/nokogiri-1.5.9/lib/nokogiri/xml/node_set.rb:238:in `upto'
from /Users/Cameron/.rvm/gems/ruby-1.9.3-p392/gems/nokogiri-1.5.9/lib/nokogiri/xml/node_set.rb:238:in `each'
from breads_scraper.rb:24:in `map'
from breads_scraper.rb:24:in `summary'
from breads_scraper.rb:57:in `<class:Scraper>'
from breads_scraper.rb:9:in `<main>'
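To narrow it down, here is a stripped-down sketch of what I think the trace above is saying. It is not the scraper itself: MiniScraper is a hypothetical stand-in for my Scraper class, and the empty Bread class at the end is only a placeholder for the real model. I'm assuming Bread is an ActiveRecord model that lives in the Rails app, so a script run on its own with plain `ruby` never loads it.

```ruby
# Stripped-down reproduction in plain Ruby (no Rails, no Nokogiri).
# MiniScraper is a hypothetical stand-in for the Scraper class.
class MiniScraper
  def summary
    # Bread has not been defined or required at this point, so Ruby's
    # constant lookup fails with a NameError, as in the trace above.
    Bread
  end
end

begin
  MiniScraper.new.summary
  error_message = nil
rescue NameError => e
  error_message = e.message
end

puts error_message

# Once the constant exists, the same call succeeds. This empty class is only
# a placeholder for the real model, which the Rails environment would load.
class Bread; end

resolved = MiniScraper.new.summary
puts resolved
```

If that assumption holds, the script needs the Rails environment loaded before it touches Bread, for example by running it with `rails runner breads_scraper.rb` from the app root, or by requiring the app's `config/environment` file at the top of the script.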
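My code is shown below. My theory is that either I'm using find_or_create incorrectly, or the file doesn't know how to reach the bread method and controller.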
require 'rubygems'
require 'nokogiri'
require 'open-uri'
require 'uri'
require 'json'
url = Nokogiri::HTML(open("http://en.wikipedia.org/wiki/List_of_breads"))
class Scraper
  def initialize
    @url = "http://en.wikipedia.org/wiki/List_of_breads"
    @nodes = Nokogiri::HTML(open(@url))
  end
  def summary
    bread_data = @nodes
    breads = bread_data.css('div.mw-content-ltr table.wikitable tr')
    bread_data.search('sup').remove
    bread_hashes = breads.map {|x|
      if content = x.css('td')[0]
        name = content.text
      end
      if content = x.css('td a.image').map {|link| link['href']}
        image = content[0]
      end
      if content = x.css('td')[2]
        type = content.text
      end
      if content = x.css('td')[3]
        country = content.text
      end
      if content = x.css('td')[4]
        description = content.text
      end
      {
        :name => name,
        :image => image,
        :type => type,
        :country => country,
        :description => description,
      }
      Bread.find_or_create(:title => name, :description => description, :image_url => image, :country_origin => country, :type => type)
    }
  end
  bready = Scraper.new
  bready.summary
  puts "atta boy"
end
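On the find_or_create side, as far as I know ActiveRecord has no method with that exact name. The finders I'm aware of are find_or_create_by in Rails 4 and later, and the dynamic find_or_create_by_<attribute> form in Rails 3. Here is a rough plain-Ruby sketch of the behavior I'm after, so it is clear what I expect the call to do. BreadStore is a hypothetical in-memory stand-in, not ActiveRecord, and the bread values are made-up sample data.

```ruby
# Plain-Ruby sketch of find-or-create semantics. BreadStore is hypothetical
# and in-memory; it only mimics the "look up first, create if missing" idea.
class BreadStore
  attr_reader :records

  def initialize
    @records = []
  end

  # Find a record matching every key/value in criteria; create it only if
  # no match exists. The optional block fills in extra attributes on a
  # newly created record and is skipped when a match is found.
  def find_or_create_by(criteria)
    existing = @records.find { |r| criteria.all? { |k, v| r[k] == v } }
    return existing if existing

    record = criteria.dup
    yield record if block_given?
    @records << record
    record
  end
end

store = BreadStore.new

# Sample data for illustration only.
first  = store.find_or_create_by(:title => "Bagel") { |b| b[:country_origin] = "Poland" }
second = store.find_or_create_by(:title => "Bagel") { |b| b[:country_origin] = "ignored" }

puts store.records.length
puts second[:country_origin]
```

In other words, I'd want to look each bread up by its title and only set the other columns when a new row is created. I also suspect that passing :type could be a problem, since ActiveRecord reserves a column named type for single-table inheritance by default.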
Thanks!