ruby-on-rails - Inputting scraped data into the database

Question

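Hey,

So I built a working scraper and added the file to my app. I'm now trying to take the information from the scraper and put it into my database. I'm trying to use the find_or_create method, but I keep getting the following error.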

breads_scraper.rb:49:in `block in summary': uninitialized constant Scraper::Bread (NameError)
from /Users/Cameron/.rvm/gems/ruby-1.9.3-p392/gems/nokogiri-1.5.9/lib/nokogiri/xml/node_set.rb:239:in `block in each'
from /Users/Cameron/.rvm/gems/ruby-1.9.3-p392/gems/nokogiri-1.5.9/lib/nokogiri/xml/node_set.rb:238:in `upto'
from /Users/Cameron/.rvm/gems/ruby-1.9.3-p392/gems/nokogiri-1.5.9/lib/nokogiri/xml/node_set.rb:238:in `each'
from breads_scraper.rb:24:in `map'
from breads_scraper.rb:24:in `summary'
from breads_scraper.rb:57:in `<class:Scraper>'
from breads_scraper.rb:9:in `<main>'

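My code is shown below. My theory is that either I'm using find_or_create incorrectly, or the file doesn't know how to reach the bread method and controller.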

require 'rubygems'
require 'nokogiri'
require 'open-uri'
require 'uri'
require 'json'

url = Nokogiri::HTML(open("http://en.wikipedia.org/wiki/List_of_breads"))

class Scraper

  def initialize
    @url = "http://en.wikipedia.org/wiki/List_of_breads"
    @nodes = Nokogiri::HTML(open(@url))

  end

  def summary

    bread_data = @nodes

    breads = bread_data.css('div.mw-content-ltr table.wikitable tr')
    bread_data.search('sup').remove

    bread_hashes = breads.map {|x|

      if content = x.css('td')[0]
        name = content.text
      end
      if content = x.css('td a.image').map {|link| link['href']}
        image = content[0]
      end
      if content = x.css('td')[2]
        type = content.text
      end
      if content = x.css('td')[3]
        country = content.text
      end
      if content = x.css('td')[4]
        description = content.text
      end

      {
        :name => name,
        :image => image,
        :type => type,
        :country => country,
        :description => description,
      }
      Bread.find_or_create(:title => name, :description => description, :image_url => image, :country_origin => country, :type => type)

    }

  end


  bready = Scraper.new
  bready.summary
  puts "atta boy"
end

Thanks!

score 2 · Accepted Answer

Call the scraper from a rake task.

lib/tasks/scraper.rake

  namespace :app do
    desc "Scrape breads"
    task :scrape_breads => :environment do
      Scraper.new.summary
    end
  end

Now you can run the rake task as follows:

rake app:scrape_breads
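
A note on why this likely helps: the `:environment` prerequisite loads the Rails application before the task body runs, so model constants such as Bread can resolve. For the task to find Scraper, the class presumably also has to live in a file the app loads, and the lines at the bottom of the class that instantiate and run it would no longer be needed. Separately, as far as I know ActiveRecord has no plain `find_or_create`; Rails 4 and later offer `find_or_create_by`, and Rails 3 has dynamic finders such as `find_or_create_by_title`. The sketch below shows one way the persistence step could look, assuming Rails 4+ semantics. `FakeBread` and `save_breads` are hypothetical names, and `FakeBread` is only an in-memory stand-in so the snippet runs outside Rails.

```ruby
# Hypothetical in-memory stand-in for the Bread model, so this runs without Rails.
# In the app, the real Bread class (available once :environment has loaded) is passed instead.
class FakeBread
  Record = Struct.new(:title, :description, :image_url, :country_origin)

  def self.records
    @records ||= []
  end

  # Mimics the shape of ActiveRecord's find_or_create_by (Rails 4+):
  # return the existing record, or build one, yield it to the block, and store it.
  def self.find_or_create_by(attrs)
    found = records.find { |record| record.title == attrs[:title] }
    return found if found

    record = Record.new(attrs[:title])
    yield record if block_given?
    records << record
    record
  end
end

# Hypothetical helper: saves scraped row hashes through whichever model is passed in.
# Rows without a name (for example table header rows with no <td>) are skipped.
def save_breads(rows, model)
  rows.reject { |row| row[:name].nil? }.map do |row|
    model.find_or_create_by(:title => row[:name]) do |bread|
      bread.description    = row[:description]
      bread.image_url      = row[:image]
      bread.country_origin = row[:country]
    end
  end
end
```

Inside the rake task, the same helper could be called with the real model, e.g. `save_breads(rows, Bread)`, where `rows` is the array of hashes built in `summary`.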

score 0 · Accepted Answer


It looks like the Bread class is not loaded.
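
To illustrate, here is a minimal sketch in plain Ruby (no Rails needed) that reproduces the same NameError when the constant is missing and shows it disappearing once a class with that name exists. The empty `Bread` class here is only a hypothetical stand-in; in the real app the model becomes available when the Rails environment is loaded, for instance through the `:environment` rake prerequisite or by requiring `config/environment` from the script.

```ruby
class Scraper
  def summary
    Bread # stands in for the Bread.find_or_create call in the question
  end
end

# No Bread constant exists yet, as when the script runs without the app loaded.
begin
  Scraper.new.summary
rescue NameError => e
  error_message = e.message
end

# Hypothetical stand-in for the model; loading the Rails environment
# is what would define the real Bread class.
class Bread; end

# The same method now resolves the constant without raising.
resolved = Scraper.new.summary
```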

Answered 2013-04-26T01:19:40.767
