I've built a small application with Play 2 and Elasticsearch that will provide autocomplete for my other applications. Now it's time to put my Elasticsearch instance into production.

Mapping:

curl -XPUT 'http://127.0.0.1:9200/auto_complete/?pretty=1' -d '
{
    "mappings": {
        "search_word": {
            "_all": {
                "enabled": false
            },
            "properties": {
                "id": {
                    "type": "string"
                },
                "word": {
                    "fields": {
                        "ngrams": {
                            "type": "string",
                            "analyzer": "custom_ngram"
                        },
                        "full": {
                            "type": "string",
                            "search_analyzer": "custom_full",
                            "index_analyzer": "custom_full"
                        }
                    },
                    "type": "multi_field"
                },
                "word_type": {
                    "type": "string"
                }
            }
        }
    },
    "settings": {
        "analysis": {
            "filter": {
                "customnGram": {
                    "max_gram": 50,
                    "min_gram": 2,
                    "type": "edgeNGram"
                }
            },
            "analyzer": {
                "custom_ngram": {
                    "filter": [
                        "standard",
                        "lowercase",
                        "customnGram"
                    ],
                    "type": "custom",
                    "tokenizer": "standard"
                },
                "custom_full": {
                    "filter": [
                        "standard",
                        "lowercase"
                    ],
                    "type": "custom",
                    "tokenizer": "standard"
                }
            }
        }
    }
}
'
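
To make the effect of the customnGram filter above concrete, here is a minimal sketch of what an edge n-gram filter with min_gram 2 and max_gram 50 emits for a single token. The function name edge_ngrams is hypothetical and the script does not call Elasticsearch; it only imitates the filter for ASCII tokens that have already been lowercased.

```shell
# Print the edge n-grams of one token, shortest first.
# Illustration only: imitates an edgeNGram filter with the given
# min_gram/max_gram. Assumes an ASCII, already-lowercased token.
edge_ngrams() {
    token=$1
    min=${2:-2}     # min_gram from the settings above
    max=${3:-50}    # max_gram from the settings above
    len=${#token}
    # A gram can never be longer than the token itself.
    if [ "$len" -lt "$max" ]; then
        max=$len
    fi
    n=$min
    while [ "$n" -le "$max" ]; do
        printf '%s\n' "$token" | cut -c "1-$n"
        n=$((n + 1))
    done
}

edge_ngrams vvs
```

For the token vvs this prints vv and vvs, which is why a two-letter prefix is already enough to match against the ngrams sub-field, while a single letter matches nothing.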

Some test data for you:

curl -XPOST 'http://127.0.0.1:9200/_bulk?pretty=1' -d '
{"index" : {"_index" : "auto_complete", "_type" : "search_word"}}
{"word" : "vvs", "word_type":"STRONG_SEARCH_WORD"}
{"index" : {"_index" : "auto_complete", "_type" : "search_word"}}
{"word" : "och VVS ab", "word_type":"WEAK_SEARCH_WORD"}
{"index" : {"_index" : "auto_complete", "_type" : "search_word"}}
{"word" : "vvs och rörjouren", "word_type":"NAME"}
{"index" : {"_index" : "auto_complete", "_type" : "search_word"}}
{"word" : "vvs & rörjouren", "word_type":"NAME"}
{"index" : {"_index" : "auto_complete", "_type" : "search_word"}}
{"word" : "rot och vvs", "word_type":"NAME"}
{"index" : {"_index" : "auto_complete", "_type" : "search_word"}}
{"word" : "vvsjouren", "word_type":"NAME"}
{"index" : {"_index" : "auto_complete", "_type" : "search_word"}}
{"word" : "vvs-jouren", "word_type":"NAME"}
'

A test query for you:

curl -XGET 'http://127.0.0.1:9200/auto_complete/search_word/_search?pretty=1' -d ' 
{
    "query": {
        "bool": {
            "should": [
                {
                    "text": {
                        "word.ngrams": {
                            "operator": "and",
                            "query": "vvs"
                        }
                    }
                },
                {
                    "text": {
                        "word.full": {
                            "boost": 1,
                            "query": "vvs"
                        }
                    }
                }
            ]
        }
    }
}
'

While testing I've just been running the instance with its default settings. I currently have about 1 million documents.

If I run:

curl http://127.0.0.1:9200/auto_complete/_stats?pretty=1

I get:

{
    "auto_complete": {
        "primaries": {
            "docs": {
                "count": 971133,
                "deleted": 0
            },
            "store": {
                "size": "224.6mb",
                "size_in_bytes": 235552784,
                "throttle_time": "0s",
                "throttle_time_in_millis": 0
            },
            "indexing": {
                "index_total": 971126,
                "index_time": "4m",
                "index_time_in_millis": 242450,
                "index_current": 0,
                "delete_total": 0,
                "delete_time": "0s",
                "delete_time_in_millis": 0,
                "delete_current": 0
            },
            "get": {
                "total": 0,
                "time": "0s",
                "time_in_millis": 0,
                "exists_total": 0,
                "exists_time": "0s",
                "exists_time_in_millis": 0,
                "missing_total": 0,
                "missing_time": "0s",
                "missing_time_in_millis": 0,
                "current": 0
            },
            "search": {
                "query_total": 45,
                "query_time": "1.1s",
                "query_time_in_millis": 1152,
                "query_current": 0,
                "fetch_total": 35,
                "fetch_time": "50ms",
                "fetch_time_in_millis": 50,
                "fetch_current": 0
            }
        },
        "total": {
            "docs": {
                "count": 971133,
                "deleted": 0
            },
            "store": {
                "size": "224.6mb",
                "size_in_bytes": 235552784,
                "throttle_time": "0s",
                "throttle_time_in_millis": 0
            },
            "indexing": {
                "index_total": 971126,
                "index_time": "4m",
                "index_time_in_millis": 242450,
                "index_current": 0,
                "delete_total": 0,
                "delete_time": "0s",
                "delete_time_in_millis": 0,
                "delete_current": 0
            },
            "get": {
                "total": 0,
                "time": "0s",
                "time_in_millis": 0,
                "exists_total": 0,
                "exists_time": "0s",
                "exists_time_in_millis": 0,
                "missing_total": 0,
                "missing_time": "0s",
                "missing_time_in_millis": 0,
                "current": 0
            },
            "search": {
                "query_total": 45,
                "query_time": "1.1s",
                "query_time_in_millis": 1152,
                "query_current": 0,
                "fetch_total": 35,
                "fetch_time": "50ms",
                "fetch_time_in_millis": 50,
                "fetch_current": 0
            }
        }
    }
}
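
The counters above are cumulative, so per-operation figures have to be derived. A small sketch, assuming awk is available; the helper name avg is hypothetical and the numbers are copied by hand from the stats output rather than fetched from the node:

```shell
# Divide a cumulative total by an operation count, to one decimal place.
# Prints "n/a" when the count is zero to avoid a division by zero.
avg() {
    awk -v total="$1" -v count="$2" 'BEGIN {
        if (count == 0) { print "n/a"; exit }
        printf "%.1f\n", total / count
    }'
}

# Values taken from the "primaries" section of the stats output.
echo "avg query time (ms): $(avg 1152 45)"           # query_time_in_millis / query_total
echo "avg fetch time (ms): $(avg 50 35)"             # fetch_time_in_millis / fetch_total
echo "avg bytes per doc:   $(avg 235552784 971133)"  # size_in_bytes / docs.count
```

With only 45 queries recorded, the averages include cold-cache requests, so they are a rough baseline rather than a production estimate.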

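I've read through the configuration docs, but what I'm really after is some kind of checklist, like:

  1. Change the log file path
  2. Since your index looks like X, you should set -Xmx and -Xms to X and Y
  3. Since your index looks like X, you should use X nodes and Y replicas
  4. Remove pretty from all the queries
  5. Warm up your most frequently used queries
  6. If you don't use the _all field, set "_all": {"enabled": false}
  7. ?

So what I'm looking for here is: your stories from moving to production, and what kind of configuration you did to get your index running smoothly. Do you have any advice for me, or for anyone else who is heading into production?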

1 Answer

You can find an "Elasticsearch Pre-Flight Checklist" in this blog post:

http://asquera.de/opensource/2012/11/25/elasticsearch-pre-flight-checklist/

It covers basic configuration, memory settings, name resolution, and more.
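
As one illustration of the memory-settings item, here is a rough sketch of a heap-size calculation. The rule of thumb it encodes (half of the machine's RAM, capped below 32 GB so the JVM can keep using compressed object pointers) is commonly cited guidance and an assumption here, not something quoted from the post; the function name suggest_heap_mb is hypothetical. ES_HEAP_SIZE is the environment variable read by the startup script in Elasticsearch releases of that period, and it sets -Xms and -Xmx to the same value.

```shell
# Suggest a JVM heap size in MB for a machine with the given RAM in MB.
# Assumed rule of thumb: half of RAM, capped at 31 GB.
suggest_heap_mb() {
    total_mb=$1
    cap_mb=$((31 * 1024))
    heap_mb=$((total_mb / 2))
    if [ "$heap_mb" -gt "$cap_mb" ]; then
        heap_mb=$cap_mb
    fi
    printf '%s\n' "$heap_mb"
}

# Example for an 8 GB machine; export before starting the node.
ES_HEAP_SIZE="$(suggest_heap_mb 8192)m"
export ES_HEAP_SIZE
echo "ES_HEAP_SIZE=$ES_HEAP_SIZE"
```

For an index of roughly 225 MB like the one in the question, far less heap would probably do; the calculation gives an upper bound to start from, and the right value should be confirmed by watching heap usage under real load.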

Answered 2013-01-04T17:32:57.157