I'm starting to look at Elasticsearch and I wonder whether it can be used for the following (I searched a bit, but I admit I don't know what to look for).

I have contact data like these two records:

{
  "id"     : "id1",
  "name"   : "Roger",
  "phone1" : "123",
  "phone2" : "",
  "phone3" : "980"
}

{
  "id"     : "id2",
  "name"   : "Lucas",
  "phone1" : "789",
  "phone2" : "123",
  "phone3" : ""
}

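I'd like to know whether Elasticsearch can help me find duplicate phone numbers even when they sit in different phone fields ("123" is present in both records). I've seen that I can search for a string across multiple fields, so if I search for 123 I get both records back. However, I'd like to be able to issue a single request that returns something like this: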

{
  "phones" : {
    "123" : ["id1", "id2"],
    "980" : ["id1"],
    "789" : ["id2"]
  }
}

Even this would be useful (the number of contacts per phone number):

{
  "phones" : {
    "123" : 2,
    "980" : 1,
    "789" : 1
  }
}

Any idea whether this is possible? It would be awesome if it could be done.

2 Answers

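I agree with DrTech's suggestion to change your data structure. However, if for some reason you prefer to keep it as it is, you can achieve the same result with a multi-field terms facet: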

curl "localhost:9200/phonefacet/_search?pretty=true&search_type=count" -d '{
    "query" : {
        "match_all" : {  }
    },
    "facets" : {
        "tag" : {
            "terms" : {
                "fields" : ["phone1", "phone2", "phone3"],
                "size" : 10
            }
        }
    }
}'

The result will look like this:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "facets" : {
    "tag" : {
      "_type" : "terms",
      "missing" : 2,
      "total" : 4,
      "other" : 0,
      "terms" : [ {
        "term" : "123",
        "count" : 2
      }, {
        "term" : "980",
        "count" : 1
      }, {
        "term" : "789",
        "count" : 1
      } ]
    }
  }
}
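
To get from this facet response to the `{"phones": {...}}` shape asked for in the question, a little client-side reshaping is enough. The following is a minimal Python sketch, not part of the original answer: the helper name `facet_counts` is hypothetical, and the `response` dict is a hand-trimmed copy of the output above rather than a live query result.

```python
def facet_counts(response, facet_name="tag"):
    """Map each facet term to its count, skipping empty-string terms."""
    terms = response["facets"][facet_name]["terms"]
    return {entry["term"]: entry["count"] for entry in terms if entry["term"]}


# Trimmed copy of the facet response shown above (assumed, not fetched live).
response = {
    "facets": {
        "tag": {
            "_type": "terms",
            "missing": 2,
            "total": 4,
            "other": 0,
            "terms": [
                {"term": "123", "count": 2},
                {"term": "980", "count": 1},
                {"term": "789", "count": 1},
            ],
        }
    }
}

phones = {"phones": facet_counts(response)}
print(phones)
```

Any term with a count above 1 is a number shared by several contacts. Keep in mind that the facet only returns the top `size` terms, so the request's `size` would need to be large enough to cover every number of interest.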
answered 2012-07-19T14:25:00.340

You can get there using a terms facet, but you would have to change your data structure to hold all phone numbers in a single field:

Create the index:

curl -XPUT 'http://127.0.0.1:9200/test/?pretty=1' 

Index your data:

curl -XPOST 'http://127.0.0.1:9200/test/test?pretty=1'  -d '
{
   "name" : "Roger",
   "id" : "id1",
   "phone" : [
      "123",
      "980"
   ]
}
'

curl -XPOST 'http://127.0.0.1:9200/test/test?pretty=1'  -d '
{
   "name" : "Lucas",
   "id" : "id2",
   "phone" : [
      "789",
      "123"
   ]
}
'
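
If the existing contacts still carry `phone1`, `phone2` and `phone3`, they can be reshaped into the single-field form before indexing. Below is a minimal Python sketch of that conversion. The helper name `merge_phone_fields` is hypothetical, and it assumes every key starting with `phone` holds a phone number and that empty strings mean "no number".

```python
def merge_phone_fields(contact, prefix="phone"):
    """Collapse phone1, phone2, ... into one 'phone' list, dropping empties."""
    # Keep every field that is not one of the numbered phone fields.
    merged = {key: value for key, value in contact.items()
              if not key.startswith(prefix)}
    # Collect non-empty numbers in field-name order (phone1, phone2, phone3).
    merged["phone"] = [contact[key] for key in sorted(contact)
                       if key.startswith(prefix) and contact[key]]
    return merged


roger = {"id": "id1", "name": "Roger", "phone1": "123", "phone2": "", "phone3": "980"}
lucas = {"id": "id2", "name": "Lucas", "phone1": "789", "phone2": "123", "phone3": ""}

print(merge_phone_fields(roger))
print(merge_phone_fields(lucas))
```

The resulting dictionaries match the two documents indexed above and can be serialized with `json.dumps` as the request bodies.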

Search all documents, returning a count of the terms in the phone field:

curl -XGET 'http://127.0.0.1:9200/test/test/_search?pretty=1'  -d '
{
   "facets" : {
      "phone" : {
         "terms" : {
            "field" : "phone"
         }
      }
   }
}
'

# {
#    "hits" : {
#       "hits" : [
#          {
#             "_source" : {
#                "name" : "Roger",
#                "id" : "id1",
#                "phone" : [
#                   "123",
#                   "980"
#                ]
#             },
#             "_score" : 1,
#             "_index" : "test",
#             "_id" : "StaJK9A5Tc6AR7zXsEKmGA",
#             "_type" : "test"
#          },
#          {
#             "_source" : {
#                "name" : "Lucas",
#                "id" : "id2",
#                "phone" : [
#                   "789",
#                   "123"
#                ]
#             },
#             "_score" : 1,
#             "_index" : "test",
#             "_id" : "x8w39F-DR9SZOQoHpJw2FQ",
#             "_type" : "test"
#          }
#       ],
#       "max_score" : 1,
#       "total" : 2
#    },
#    "timed_out" : false,
#    "_shards" : {
#       "failed" : 0,
#       "successful" : 5,
#       "total" : 5
#    },
#    "facets" : {
#       "phone" : {
#          "other" : 0,
#          "terms" : [
#             {
#                "count" : 2,
#                "term" : "123"
#             },
#             {
#                "count" : 1,
#                "term" : "980"
#             },
#             {
#                "count" : 1,
#                "term" : "789"
#             }
#          ],
#          "missing" : 0,
#          "_type" : "terms",
#          "total" : 4
#       }
#    },
#    "took" : 5
# }
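
The facet gives the per-number counts. The first shape asked for in the question (number to list of contact ids) can be built on the client from the hits in the same response. This is a minimal Python sketch under stated assumptions: the helper name `ids_by_phone` is hypothetical, the `response` dict is a hand-trimmed copy of the output above, and only the hits actually returned in the page are grouped, so a larger data set would need paging through all results.

```python
from collections import defaultdict


def ids_by_phone(response):
    """Group the contact ids found in the search hits by phone number."""
    groups = defaultdict(list)
    for hit in response["hits"]["hits"]:
        source = hit["_source"]
        for number in source.get("phone", []):
            groups[number].append(source["id"])
    return dict(groups)


# Trimmed copy of the search response shown above (assumed, not fetched live).
response = {
    "hits": {
        "hits": [
            {"_source": {"name": "Roger", "id": "id1", "phone": ["123", "980"]}},
            {"_source": {"name": "Lucas", "id": "id2", "phone": ["789", "123"]}},
        ],
        "total": 2,
    }
}

phones = {"phones": ids_by_phone(response)}
print(phones)
```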
answered 2012-07-19T13:20:42.893