
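I am trying to search a field that contains a URL using an Elasticsearch term query. I am using the elasticsearch-rails ActiveRecord persistence pattern. This is how I am trying to do it.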

total_views = UserAction.search :query => {
  :filtered => {
    :filter => {
      :term => { action_path: "http://0.0.0.0:3000/tshirt/test" }
    }
  }
}

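It works when the value contains no '/' or ':' characters, for example when action_path is just "tshirt". The other fields are declared as not analyzed, and they also work as long as the value contains no characters such as '/' or ':'. So Elasticsearch is evidently analyzing the values, even though it should not, because the mapping is already defined.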

Here is my UserAction class:

class UserAction
  include Elasticsearch::Persistence::Model
  extend Calculations
  include Styles

  attribute :user_id, Integer
  attribute :user_referrer, String, mapping: { index: 'not_analyzed' }
  attribute :user_ip, String, mapping: { index: 'not_analyzed' }
  attribute :user_country, String, mapping: { index: 'not_analyzed' }
  attribute :user_city, String, mapping: { index: 'not_analyzed' }
  attribute :user_device, String, mapping: { index: 'not_analyzed' }
  attribute :user_agent, String, mapping: { index: 'not_analyzed' }
  attribute :user_platform
  attribute :user_visitid, Integer
  attribute :action_type, String, mapping: { index: 'not_analyzed' }
  attribute :action_css, String, mapping: { index: 'not_analyzed' }
  attribute :action_text, String, mapping: { index: 'not_analyzed' }
  attribute :action_path, String, mapping: { index: 'not_analyzed' }
  attribute :share_url, String, mapping: { index: 'not_analyzed' }
  attribute :tag
  attribute :date
end

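I also tried defining the mapping with a "mapping do ..." block and then calling "create_index!", but the result is the same. A mapping does get created, since one is defined.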

Here is my Gemfile:

gem "elasticsearch-model", git: "git://github.com/elasticsearch/elasticsearch-rails.git", require: "elasticsearch/model"
gem "elasticsearch-persistence", git: "git://github.com/elasticsearch/elasticsearch-rails.git", require: "elasticsearch/persistence/model"
gem "elasticsearch-rails"

When I run the search, the output also shows those fields as not_analyzed:

       :reload_on_failure=>false,
         :randomize_hosts=>false,
         :transport_options=>{}},
       @protocol="http",
       @reload_after=10000,
       @resurrect_after=60,
       @serializer=
        #<Elasticsearch::Transport::Transport::Serializer::MultiJson:0x007fc4bf9e0e18
         @transport=#<Elasticsearch::Transport::Transport::HTTP::Faraday:0x007fc4bf9b35a8 ...>>,
       @sniffer=
        #<Elasticsearch::Transport::Transport::Sniffer:0x007fc4bf9e0dc8
         @timeout=1,
         @transport=#<Elasticsearch::Transport::Transport::HTTP::Faraday:0x007fc4bf9b35a8 ...>>,
       @tracer=nil>>,
   @document_type="user_action",
   @index_name="useraction",
   @klass=UserAction,
   @mapping=
    #<Elasticsearch::Model::Indexing::Mappings:0x007fc4bfab18d8
     @mapping=
      {:created_at=>{:type=>"date"},
       :updated_at=>{:type=>"date"},
       :user_id=>{:type=>"integer"},
       :user_referrer=>{:type=>"string"},
       :user_ip=>{:type=>"string"},
       :user_country=>{:type=>"string", :index=>"not_analyzed"},
       :user_city=>{:type=>"string", :index=>"not_analyzed"},
       :user_device=>{:type=>"string", :index=>"not_analyzed"},
       :user_agent=>{:type=>"string", :index=>"not_analyzed"},
       :user_platform=>{:type=>"string"},
       :user_visitid=>{:type=>"integer"},
       :action_type=>{:type=>"string", :index=>"not_analyzed"},
       :action_css=>{:type=>"string", :index=>"not_analyzed"},
       :action_text=>{:type=>"string", :index=>"not_analyzed"},
       :action_path=>{:type=>"string", :index=>"not_analyzed"}},
     @options={},
     @type="user_action">,
   @options={:host=>UserAction}>,
 @response={"took"=>1, "timed_out"=>false, "_shards"=>{"total"=>4, "successful"=>4, "failed"=>0}, "hits"=>{"total"=>0, "max_score"=>nil, "hits"=>[]}}>
(END) 

The initializer file contains nothing except the elastichq connection URL.

The data is visible in elastichq, so I should get results, but I get nothing.

    user_action 1   AUzH9xKDueQ8OtBQuyQC    http://example.org/api/analytics/track
user_actions    user_action 1   AUzIAUsvueQ8OtBQuyQg    http://0.0.0.0:3000/tshirt/funnel_test2
user_actions    user_action 1   AUzH7ay5ueQ8OtBQuyP2    http://example.org/api/analytics/track
user_actions    user_action 1   AUzH-HAdueQ8OtBQuyQU    http://0.0.0.0:3000/tshirt/test
user_actions    user_action 1   AUzIJbCGueQ8OtBQuyQ4    http://example.org/api/analytics/track
user_actions    user_action 1   AUzIJbCjueQ8OtBQuyQ5    http://example.org/api/analytics/track

The curl result from Elastichq:

curl -XGET "https://YYYYY:XXXXX@xxxx.qbox.io/user_actions/_mapping"
{
  "user_actions": {
    "mappings": {
      "user_action": {
        "properties": {
          "action_css": { "type": "string" },
          "action_path": { "type": "string" },
          "action_text": { "type": "string" },
          "action_type": { "type": "string" },
          "created_at": { "format": "dateOptionalTime", "type": "date" },
          "date": { "type": "string" },
          "share_url": { "type": "string" },
          "tag": { "type": "string" },
          "updated_at": { "format": "dateOptionalTime", "type": "date" },
          "user_agent": { "type": "string" },
          "user_city": { "type": "string" },
          "user_country": { "type": "string" },
          "user_device": { "type": "string" },
          "user_id": { "type": "long" },
          "user_ip": { "type": "string" },
          "user_referrer": { "type": "string" },
          "user_visitid": { "type": "long" }
        }
      }
    }
  }
}

Can anyone help me get the URL term search working?


4 Answers


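Judging by the curl output at the end of your question, your fields are analyzed in Elasticsearch (there is no not_analyzed flag in the mapping). Try rebuilding your index with the mapping you want.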
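
One way to confirm this is to scan the `_mapping` response for string fields that lack `"index": "not_analyzed"`. The sketch below does that in plain Ruby on a shortened copy of the mapping from the question. The helper name `analyzed_string_fields` is made up for illustration, and it simply treats every string field without that flag as analyzed.

```ruby
require "json"

# Shortened copy of the `_mapping` response shown in the question.
mapping_json = <<~JSON
  {
    "user_actions": {
      "mappings": {
        "user_action": {
          "properties": {
            "action_path": { "type": "string" },
            "user_id": { "type": "long" },
            "created_at": { "format": "dateOptionalTime", "type": "date" }
          }
        }
      }
    }
  }
JSON

# Hypothetical helper: names of string fields that do not carry
# "index": "not_analyzed" and are therefore assumed to be analyzed.
def analyzed_string_fields(mapping, index:, type:)
  properties = mapping.fetch(index).fetch("mappings").fetch(type).fetch("properties")
  analyzed = properties.select do |_name, spec|
    spec["type"] == "string" && spec["index"] != "not_analyzed"
  end
  analyzed.keys.sort
end

mapping = JSON.parse(mapping_json)
fields = analyzed_string_fields(mapping, index: "user_actions", type: "user_action")
puts fields.inspect
```

If `action_path` shows up in that list, the live index was not created with the mapping declared in the model.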

Answered 2015-04-23T15:02:49.463

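I ended up doing what I did not want to do. I created the index and its mapping manually with the following POST request, so that elasticsearch-rails cannot create it incorrectly. Now everything works.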

curl -XPOST https://xxxxxx.qbox.io/user_actions -d '{
    "settings" : {
        "number_of_shards" : 1
    },
    "mappings" : {
        "user_action" : {
            "_source" : { "enabled" : false },
            "properties" : {
                "action_path" : { "type" : "string", "index" : "not_analyzed" }
            }
        }
    }
}'
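
The request above only maps `action_path`. If the other fields that the model declares as `not_analyzed` need the same treatment, the body can be generated instead of typed by hand. This is a minimal sketch, assuming the field list from the class in the question; `build_index_body` is a hypothetical helper, the result still has to be sent with curl or an HTTP client, and unlike the request above it leaves `_source` at its default.

```ruby
require "json"

# Fields declared with index: 'not_analyzed' in the question's UserAction class.
NOT_ANALYZED_FIELDS = %w[
  user_referrer user_ip user_country user_city user_device user_agent
  action_type action_css action_text action_path share_url
].freeze

# Hypothetical helper: builds an index-creation body that maps every
# given field as a not_analyzed string under the given document type.
def build_index_body(type, fields)
  properties = fields.each_with_object({}) do |name, props|
    props[name] = { "type" => "string", "index" => "not_analyzed" }
  end
  {
    "settings" => { "number_of_shards" => 1 },
    "mappings" => { type => { "properties" => properties } }
  }
end

body = build_index_body("user_action", NOT_ANALYZED_FIELDS)
puts JSON.pretty_generate(body)
```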
Answered 2015-04-26T17:18:10.913

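As a rule of thumb, if you want to search on a field, you should not leave it not_analyzed.

In this case in particular, you should definitely try the Keyword Analyzer by setting the analyzer in the relevant field's mapping to keyword.

As long as you search for the full string, i.e. "http://0.0.0.0:3000/tshirt/test", the keyword analyzer has a good chance of solving the problem.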
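
To see why a single-token field matters here: a term filter compares the query value against the indexed tokens exactly. The sketch below is plain Ruby, not Elasticsearch. `rough_analyze` is a crude stand-in that splits on non-alphanumeric characters (the real standard analyzer follows Unicode segmentation rules and produces somewhat different tokens), and `keyword_analyze` mimics the keyword analyzer by emitting the input as one token. Both names are made up for illustration.

```ruby
# Crude stand-in for a word-splitting analyzer (an assumption for
# illustration; Elasticsearch's standard analyzer differs in detail).
def rough_analyze(text)
  text.downcase.scan(/[a-z0-9]+/)
end

# The keyword analyzer emits the whole input as a single token.
def keyword_analyze(text)
  [text]
end

# A term filter matches only if the exact term is among the indexed tokens.
def term_match?(tokens, term)
  tokens.include?(term)
end

url = "http://0.0.0.0:3000/tshirt/test"

puts term_match?(rough_analyze(url), url)       # full URL against split tokens
puts term_match?(rough_analyze(url), "tshirt")  # single word against split tokens
puts term_match?(keyword_analyze(url), url)     # full URL against one whole token
```

This mirrors the behaviour described in the question: a plain word such as "tshirt" is found, while the full URL is found only when the field keeps the value as one token.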

Answered 2015-04-24T16:57:03.597

Try querying the raw sub-field:

total_views = UserAction.search :query=> {
    :filtered=> {
        :filter=> {
            :term=> { "action_path.raw" => "http://0.0.0.0:3000/tshirt/test" } 
        }
    }
}  
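
Note that `action_path.raw` only exists if the mapping defines a `raw` sub-field, which the mapping shown in the question does not. Below is a sketch of such a multi-field mapping as a Ruby hash, assuming the Elasticsearch 1.x string type used elsewhere in this thread. The sub-field name `raw` is a convention, not something created automatically.

```ruby
require "json"

# Multi-field mapping: `action_path` stays analyzed for full-text search,
# while `action_path.raw` keeps the exact value for term filters.
action_path_mapping = {
  "action_path" => {
    "type"   => "string",
    "fields" => {
      "raw" => { "type" => "string", "index" => "not_analyzed" }
    }
  }
}

puts JSON.pretty_generate({ "properties" => action_path_mapping })
```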
Answered 2015-04-26T03:16:09.620