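This is driving me crazy; I have tried everything I can think of. Here is the situation. I need to:

  1. Convert all Russian letters that look like English letters into their English counterparts (both at index time and when analyzing the search request)
  2. Remove every character that is not a letter or a digit
  3. Do a case-insensitive ngram search, so that a token can match at any position in the string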
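
A minimal sketch of what those three steps amount to, in plain Ruby rather than Elasticsearch. The names `LOOKALIKES`, `normalize_sku` and `ngrams` are hypothetical and exist only for this illustration; the letter table is copied from the `russian_transliteration` mapping in the config, and the sketch assumes lowercasing happens before the letter mapping.

```ruby
# Hypothetical plain-Ruby model of the three requirements.
# It is not the Elasticsearch analyzer, only an illustration of the intent.
LOOKALIKES = {
  "а" => "a", "в" => "b", "е" => "e", "к" => "k", "м" => "m", "н" => "h",
  "о" => "o", "р" => "p", "с" => "c", "т" => "t", "у" => "u", "х" => "x"
}.freeze

# Steps 1 and 2: lowercase, map look-alike Cyrillic letters to Latin,
# then drop everything that is neither a letter nor a digit.
def normalize_sku(text)
  mapped = text.downcase.gsub(/[авекмнорстух]/) { |ch| LOOKALIKES[ch] }
  mapped.gsub(/[^\p{L}\p{N}]/, "")
end

# Step 3: every substring whose length is between min and max,
# so a token can start at any position in the string.
def ngrams(text, min, max)
  (min..max).flat_map do |size|
    (0..text.length - size).map { |start| text[start, size] }
  end
end

tokens = ngrams(normalize_sku("ALK-8009"), 4, 15)
puts tokens.inspect
```

With `min_gram: 4`, a search string shorter than four characters after normalization produces no tokens at all in this model.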

For example, when I search for 8009 and I have the SKUs ALK-8009 and ALK-8022, I don't understand why ALK-8022 is ranked higher than ALK-8009.

      index_options: {
        settings: {
          index: {
            analysis: {
              char_filter: {
                russian_transliteration: {
                  type: "mapping",
                  mappings: ["а=>a",
                             "в=>b",
                             "е=>e",
                             "к=>k",
                             "м=>m",
                             "н=>h",
                             "о=>o",
                             "р=>p",
                             "с=>c",
                             "т=>t",
                             "у=>u",
                             "х=>x"]
                },
                pattern_replace_char_filter: {
                  type: "pattern_replace",
                  pattern: "(\\s|[-_\/\.\(\)])*",
                  replacement: ""
                }
              },
              tokenizer: {
                sku_tokenizer: {
                  type: "nGram",
                  min_gram: 4,
                  max_gram: 15
                },
                sku_search_tokenizer: {
                  type: "edgeNGram",
                  min_gram: 4,
                  max_gram: 15
                }
              },
              analyzer: {
                sku_analyzer: {
                  type: "custom",
                  tokenizer: "sku_tokenizer",
                  char_filter: ["russian_transliteration","pattern_replace_char_filter"],
                  filter: ['lowercase']
                },
                sku_search_analyzer: {
                  type: "custom",
                  tokenizer: "sku_search_tokenizer",
                  char_filter: ["russian_transliteration","pattern_replace_char_filter"],
                  filter: ['lowercase']
                }
              }
            }
          }
        }
      },
      index_mappings: {
         sku: {
              type: 'string',
              analyzer: 'sku_analyzer',
              fields: {
                search: {type: 'string', analyzer: 'sku_search_analyzer'},
                suggest:  {type: 'completion'}
              }
          }
      }

Here is my search query:

{query: 
  {bool: 
    {should: 
      [{prefix: {sku: {value: "SEARCH-STRING", boost: 2}}}, 
       {match: {sku: {query: "SEARCH-STRING", boost: 1, fuzziness: 0}}}]
}}}

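What I expect is only those results whose SKU contains the complete search string, not just part of it.

For example, ALK-80 would be converted to alk80, and only results containing alk80 are the ones I need.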
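
To pin down the expected behaviour, here is a small plain-Ruby sketch of the filter described above. It is only a model of the desired result set, not an Elasticsearch query; `canonical` and `skus_containing` are hypothetical helper names, the sample SKU `ALK-7001` is made up for the illustration, and for brevity the sketch assumes ASCII-only input (no transliteration step).

```ruby
# Hypothetical model of the expected matching rule:
# a SKU matches only if its normalized form contains the
# whole normalized search string.

# Lowercase and keep only ASCII letters and digits (assumption: ASCII input).
def canonical(text)
  text.downcase.gsub(/[^a-z0-9]/, "")
end

# Return the SKUs whose canonical form contains the canonical search string.
def skus_containing(skus, search)
  needle = canonical(search)
  return [] if needle.empty? # nothing left to search for after cleanup
  skus.select { |sku| canonical(sku).include?(needle) }
end

skus = ["ALK-8009", "ALK-8022", "ALK-7001"]
puts skus_containing(skus, "ALK-80").inspect
puts skus_containing(skus, "8009").inspect
```

Under this rule a search for ALK-80 keeps both ALK-8009 and ALK-8022, while a search for 8009 keeps only ALK-8009.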
