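We are running Elasticsearch 1.7 (an upgrade is planned soon), and I am trying to use the Analyze API to understand what the different analyzers do, but the results Elasticsearch returns are not what I expect.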

If I run the following query against our Elasticsearch instance:

GET _analyze
{
  "analyzer": "stop", 
  "text": "Extremely good food! We had the happiest waiter and the crowd's always flowing!"
}

I get this result:

{
   "tokens": [
  {
     "token": "analyzer",
     "start_offset": 6,
     "end_offset": 14,
     "type": "<ALPHANUM>",
     "position": 1
  },
  {
     "token": "stop",
     "start_offset": 18,
     "end_offset": 22,
     "type": "<ALPHANUM>",
     "position": 2
  },
  {
     "token": "text",
     "start_offset": 30,
     "end_offset": 34,
     "type": "<ALPHANUM>",
     "position": 3
  },
  {
     "token": "extremely",
     "start_offset": 38,
     "end_offset": 47,
     "type": "<ALPHANUM>",
     "position": 4
  },
  {
     "token": "good",
     "start_offset": 48,
     "end_offset": 52,
     "type": "<ALPHANUM>",
     "position": 5
  },
  {
     "token": "food",
     "start_offset": 53,
     "end_offset": 57,
     "type": "<ALPHANUM>",
     "position": 6
  },
  {
     "token": "we",
     "start_offset": 59,
     "end_offset": 61,
     "type": "<ALPHANUM>",
     "position": 7
  },
  {
     "token": "had",
     "start_offset": 62,
     "end_offset": 65,
     "type": "<ALPHANUM>",
     "position": 8
  },
  {
     "token": "the",
     "start_offset": 66,
     "end_offset": 69,
     "type": "<ALPHANUM>",
     "position": 9
  },
  {
     "token": "happiest",
     "start_offset": 70,
     "end_offset": 78,
     "type": "<ALPHANUM>",
     "position": 10
  },
  {
     "token": "waiter",
     "start_offset": 79,
     "end_offset": 85,
     "type": "<ALPHANUM>",
     "position": 11
  },
  {
     "token": "and",
     "start_offset": 86,
     "end_offset": 89,
     "type": "<ALPHANUM>",
     "position": 12
  },
  {
     "token": "the",
     "start_offset": 90,
     "end_offset": 93,
     "type": "<ALPHANUM>",
     "position": 13
  },
  {
     "token": "crowd's",
     "start_offset": 94,
     "end_offset": 101,
     "type": "<ALPHANUM>",
     "position": 14
  },
  {
     "token": "always",
     "start_offset": 102,
     "end_offset": 108,
     "type": "<ALPHANUM>",
     "position": 15
  },
  {
     "token": "flowing",
     "start_offset": 109,
     "end_offset": 116,
     "type": "<ALPHANUM>",
     "position": 16
  }
   ]
}

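This makes no sense to me. I am using the stop analyzer, so why do the words "and" and "the" appear in the result? I tried changing the analyzer from stop to whitespace and to standard, but I get exactly the same result as above; there is no difference between them. However, if I run exactly the same query against an Elasticsearch 5.x instance, the result no longer contains "and" and "the" and looks much more like what I expected.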

Is this because we are on 1.7, or is something in our Elasticsearch setup causing the problem?

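Edit: I am running the queries with the Sense plugin in Chrome, which does not support GET with a request body, so it changes the request to POST. The Analyze API in Elasticsearch 1.7 does not seem to support POST requests :( If I change the query to GET _analyze?analyzer=stop&text=THIS+is+a+test&pretty it works.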
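For anyone scripting this instead of using Sense, here is a minimal sketch of building the same query-string form of the request in Python. The helper name `analyze_url` and the host `http://localhost:9200` are placeholders made up for illustration, and `pretty=true` is used in place of the bare `pretty` flag.

```python
from urllib.parse import urlencode


def analyze_url(base_url, analyzer, text):
    """Build a 1.x-style _analyze URL with all parameters in the query string.

    analyze_url is a hypothetical helper, not part of any Elasticsearch client.
    """
    # urlencode escapes spaces as "+", matching the hand-written URL above.
    query = urlencode({"analyzer": analyzer, "text": text, "pretty": "true"})
    return f"{base_url}/_analyze?{query}"


# The host is an assumption; point it at your own cluster.
url = analyze_url("http://localhost:9200", "stop", "THIS is a test")
print(url)
```

Keeping everything in the query string avoids depending on how the client handles a GET request body.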

1 Answer

In 1.x the syntax is different from 2.x and 5.x. According to the 1.x documentation, you should use the _analyze API like this:

GET _analyze?analyzer=stop
{
  "text": "Extremely good food! We had the happiest waiter and the crowd's always flowing!"
}
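
The tokens "analyzer", "stop" and "text" in the question's output suggest that 1.7 analyzed the whole JSON body as plain text with the default analyzer, rather than reading it as parameters. The sketch below checks that reading against the reported offsets. It rests on two assumptions: the regex is only a rough stand-in for the standard analyzer, and the body was sent with CRLF line endings (which is what makes the offsets line up).

```python
import re

# Request body as the client appears to have sent it. CRLF line endings and
# the trailing space after the comma are assumptions that match the offsets
# reported in the question.
body = (
    '{\r\n'
    '  "analyzer": "stop", \r\n'
    '  "text": "Extremely good food! We had the happiest waiter '
    'and the crowd\'s always flowing!"\r\n'
    '}'
)


def rough_standard_tokens(text):
    """Rough approximation of the standard analyzer (not the real thing):
    lowercase alphanumeric runs, keeping apostrophes inside a word."""
    pattern = r"[A-Za-z0-9]+(?:'[A-Za-z0-9]+)*"
    return [(m.group().lower(), m.start(), m.end())
            for m in re.finditer(pattern, text)]


tokens = rough_standard_tokens(body)
for token, start, end in tokens[:4]:
    print(token, start, end)
```

Under these assumptions the first tokens come out as "analyzer" at 6-14, "stop" at 18-22, "text" at 30-34 and "extremely" at 38-47, the same offsets as in the question. That would also explain why switching the analyzer name in the body changed nothing.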
Answered 2017-04-05T11:04:45.420