elasticsearch - Importing from one index into a new index using a persistence model

Question

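I have an application with a Nutch crawler that sends its results directly to an ElasticSearch index created by a Tire persistence model.

I am looking for the best way to change the index without having to delete it, recreate it, and repopulate it, because the index is the primary data source. I have been trying to get the approach working where your index is an alias: you associate the new index with the alias and then import from the main index into the new one.

I have been trying to get the rake environment tire:import CLASS='Applicant' INDEX='index_new' command to do the job with this approach, without any success. It first failed on import with an undefined method 'paginate'; then, after I defined a 'paginate' method on my model, it failed with an undefined method 'count', raised at tire 0.60.0/lib/tire/model/import.rb:102.

I have been searching for the right approach for a few days now, but at this point I am not convinced I am on the right path at all. I have included my model below for reference. I am using WillPaginate for pagination.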

class Applicant
  include Tire::Model::Persistence
  include Tire::Model::Search
  include Tire::Model::Callbacks  

  require 'will_paginate'
  require 'will_paginate-bootstrap'
  require 'will_paginate/array'

  index_name 'index'
  document_type 'doc'

  mapping do
    indexes :boost, type: 'string'
    indexes :content, type: 'string'
    indexes :digest, type: 'string'
    indexes :id, type: 'string'
    indexes :skill, type: 'string'
    indexes :title, type: 'string'
    indexes :tstamp, type: 'date', format: 'dateOptionalTime'
    indexes :url, type: 'string'
    indexes :domain, type: 'string'
  end

 property :boost
 property :content
 property :digest  
 property :id
 property :skill 
 property :title    
 property :tstamp  
 property :url
 property :domain

  def self.search(params)
    tire.search(page: params[:page], per_page: 20) do
      query { string params[:query], default_operator: "AND" } if params[:query].present?
      filter :term, domain: params[:domain_selected]  if params[:domain_selected].present?
      filter :term, skill: params[:skill_selected]  if params[:skill_selected].present?
      facet "domains" do
        terms :domain
      end 
      facet "skills" do
        terms :skill
      end
    end
  end 

  def self.paginate(params)
   @page_results = WillPaginate::Collection.create(params[:page], per_page, total_entries) do |pager|
     pager.replace(@self.to_array)
   end
   @page_results = @self.paginate(params[:current_page], params[:per_page])
  end
end

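As an aside, and a lower priority for me: I have been digging through the code to understand why the import needs pagination, but it is not clear to me.

Thanks in advance.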

score 0 · Accepted Answer

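So, after 2 weeks of searching, I found the solution I was looking for. I got essentially the result I was after by calling Article.create_elasticsearch_index and then Tire.index('original-index-name').reindex 'new-index-name'. A tweet from Karmi pointed me to the right solution.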

https://twitter.com/karmiq/status/185811361069142016
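As a rough sketch of the copy step, here is a small wrapper around that reindex call. The names copy_index, index_for and RecordingIndex are made up for illustration and are not part of Tire. RecordingIndex is only a stand-in so the snippet runs without a cluster. With Tire loaded and Elasticsearch running, I would expect to pass Tire.method(:index) as the last argument instead, but that path is not exercised here.

```ruby
# Stand-in for an index object (hypothetical, not part of Tire).
# It only records which target index a copy was requested into.
class RecordingIndex
  attr_reader :name, :copies

  def initialize(name)
    @name = name
    @copies = []
  end

  def reindex(target_name)
    @copies << target_name
    true
  end
end

# Copies one index into another by delegating to the index object's
# `reindex` method. `index_for` is any callable that maps an index name
# to such an object.
def copy_index(source_name, target_name, index_for)
  index_for.call(source_name).reindex(target_name)
end

# Assumed real-world usage (needs the tire gem and a running cluster):
#   require 'tire'
#   copy_index('index', 'index_new', Tire.method(:index))

fake = RecordingIndex.new('index')
copy_index('index', 'index_new', ->(_name) { fake })
puts fake.copies.inspect
```

The new index should already exist with the mapping you want before the copy runs, which is what the create_elasticsearch_index call is for.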

I am also working on adapting jarosan's work here to my situation and will post it soon.

https://gist.github.com/3124884

Thanks, Michelle and Karel.

score 0 · Accepted Answer

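Well, the reason you are getting that error is, I am guessing, that your view refers to the pagination gem.

The first thing to do is check your view and strip the pagination from the view and the controller or, if you need pagination, run the following simple test:

Your app should load the will_paginate gem. To see whether the library is loaded, open your app's console and try the following lines:

defined? WillPaginate
ActiveRecord::Base.respond_to? :paginate

If either of these lines returns nil/false, then will_paginate has not loaded properly in your app.

(From https://github.com/mislav/will_paginate/wiki/Troubleshooting)

If that fails, make sure you have the following two lines in your Gemfile: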

gem 'will_paginate', '~> 3.0.3'
gem 'bootstrap-will_paginate', '~> 0.0.6'

If this does not work for you, let me know and we will dig deeper.
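On the side question of why an import would want pagination at all: a bulk importer usually walks the model one page at a time so that it never loads every record at once, and it asks for a total count only to report progress. The sketch below illustrates that general pattern in plain Ruby. It is not Tire's actual code, and FakeModel and import_in_batches are hypothetical names used only for this example.

```ruby
# Hypothetical in-memory stand-in for a model; not part of Tire.
class FakeModel
  RECORDS = (1..45).map { |i| { id: i.to_s, title: "doc #{i}" } }.freeze

  # Total number of records, used only to report progress.
  def self.count
    RECORDS.size
  end

  # Returns one page of records (1-based page number), or [] past the end.
  def self.paginate(options = {})
    page     = (options[:page] || 1).to_i
    per_page = (options[:per_page] || 20).to_i
    RECORDS.each_slice(per_page).to_a[page - 1] || []
  end
end

# Walks the model page by page and hands each batch to the block.
# A real importer would send one bulk request per batch at that point.
def import_in_batches(model, per_page = 20)
  imported = 0
  page = 1
  loop do
    batch = model.paginate(page: page, per_page: per_page)
    break if batch.empty?
    yield batch
    imported += batch.size
    page += 1
  end
  imported
end

batch_sizes = []
total = import_in_batches(FakeModel, 20) { |batch| batch_sizes << batch.size }
puts "imported #{total} of #{FakeModel.count} in batches of #{batch_sizes.inspect}"
```

A model that answers both paginate and count in this way satisfies the two calls the question's error messages mention, which is likely why the import fails on a persistence model that defines neither.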
