
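My cluster shows a lot of I/O wait (around 50%).

I do a lot of indexing and re-indexing.

I suspect Lucene's re-indexing may be what is causing the heavy I/O. I am considering raising refresh_interval or the index.translog options. Is that the right approach?

My main problem is that I don't know how to find out what my current settings are.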

http://www.elasticsearch.org/guide/reference/api/admin-indices-update-settings/ lists a lot of options, none of which show up when I run:

curl -XGET 'http://localhost:9200/my_index/_settings'

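It does not return values that are still at their defaults (according to kimchy's answer in this post).

I only get the settings I set explicitly: the number of shards and replicas. The elasticsearch.yml file doesn't say what the defaults are. How can I tell that my changes took effect, and what the values are now?

Any help is much appreciated, since I couldn't find documentation on this.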

Running hot_threads, I get:

> curl -XGET 'http://localhost:9200/_nodes/hot_threads?threads=5'
::: [Gardener][CR0qQbtBRyeU94hltnnE7A][inet[/10.154.148.151:9300]]{aws_availability_zone=us-east-1d}

   50.6% (253.2ms out of 500ms) cpu usage by thread 'elasticsearch[Gardener][search][T#20]'
     10/10 snapshots sharing following 8 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
       java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
       java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:722)

   32.9% (164.5ms out of 500ms) cpu usage by thread 'elasticsearch[Gardener][search][T#12]'
     10/10 snapshots sharing following 8 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
       java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
       java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:722)

   29.1% (145.5ms out of 500ms) cpu usage by thread 'elasticsearch[Gardener][search][T#8]'
     2/10 snapshots sharing following 20 elements
       org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:111)
       org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:131)
       org.apache.lucene.search.FilteredQuery$RandomAccessFilterStrategy.filteredScorer(FilteredQuery.java:533)
       org.apache.lucene.search.FilteredQuery$1.scorer(FilteredQuery.java:133)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:609)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:161)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:572)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:524)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:501)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:345)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:127)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:239)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:141)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:206)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:193)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:179)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:722)
     8/10 snapshots sharing following 2 elements
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:722)

   26.5% (132.7ms out of 500ms) cpu usage by thread 'elasticsearch[Gardener][search][T#11]'
     2/10 snapshots sharing following 15 elements
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:161)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:572)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:524)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:501)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:345)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:127)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:239)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:141)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:206)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:193)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:179)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:722)
     8/10 snapshots sharing following 8 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
       java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
       java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:722)

    4.2% (21.1ms out of 500ms) cpu usage by thread 'elasticsearch[Gardener][bulk][T#4]'
     10/10 snapshots sharing following 9 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
       org.elasticsearch.common.util.concurrent.jsr166y.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:706)
       org.elasticsearch.common.util.concurrent.jsr166y.LinkedTransferQueue.xfer(LinkedTransferQueue.java:615)
       org.elasticsearch.common.util.concurrent.jsr166y.LinkedTransferQueue.take(LinkedTransferQueue.java:1109)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:722)

Running with type=wait, and then with type=block:

> curl -XGET 'http://localhost:9200/_nodes/hot_threads?threads=3&type=wait'
::: [Gardener][CR0qQbtBRyeU94hltnnE7A][inet[/10.154.148.151:9300]]{aws_availability_zone=us-east-1d}

    0.0% (0s out of 500ms) wait usage by thread 'Reference Handler'
     10/10 snapshots sharing following 3 elements
       java.lang.Object.wait(Native Method)
       java.lang.Object.wait(Object.java:503)
       java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)

    0.0% (0s out of 500ms) wait usage by thread 'Finalizer'
     10/10 snapshots sharing following 4 elements
       java.lang.Object.wait(Native Method)
       java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
       java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
       java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:189)

    0.0% (0s out of 500ms) wait usage by thread 'Signal Dispatcher'
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot

> curl -XGET 'http://localhost:9200/_nodes/hot_threads?threads=3&type=block'
::: [Gardener][CR0qQbtBRyeU94hltnnE7A][inet[/10.154.148.151:9300]]{aws_availability_zone=us-east-1d}

    0.0% (0s out of 500ms) block usage by thread 'Reference Handler'
     10/10 snapshots sharing following 3 elements
       java.lang.Object.wait(Native Method)
       java.lang.Object.wait(Object.java:503)
       java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)

    0.0% (0s out of 500ms) block usage by thread 'Finalizer'
     10/10 snapshots sharing following 4 elements
       java.lang.Object.wait(Native Method)
       java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
       java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
       java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:189)

    0.0% (0s out of 500ms) block usage by thread 'Signal Dispatcher'
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot

1 Answer


By default, index.refresh_interval is set to 1s. You can increase this interval, or disable automatic refresh altogether by setting it to -1.

curl -XPUT 'localhost:9200/my_index/_settings' -d '
{
    "index" : {
        "refresh_interval" : -1
    }
}
'
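If refresh is only turned off for the duration of a bulk re-index, it has to be turned back on afterwards. Here is a minimal Python sketch of that pattern, sending the same body as the curl call above. The helper names `put_settings` and `refresh_paused` are made up for illustration, the `send` parameter exists only so the HTTP call can be swapped out, and restoring to "1s" assumes the index was on the default before.

```python
import json
from contextlib import contextmanager
from urllib.request import Request, urlopen


def put_settings(base_url, index, settings, send=urlopen):
    """PUT {"index": settings} to /<index>/_settings, like the curl call above."""
    body = json.dumps({"index": settings}).encode("utf-8")
    request = Request(
        f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
    return send(request)


@contextmanager
def refresh_paused(base_url, index, restore_to="1s", send=urlopen):
    """Disable automatic refresh inside the with-block, then restore it."""
    put_settings(base_url, index, {"refresh_interval": -1}, send=send)
    try:
        yield
    finally:
        # Assumption: restore_to is the value the index had before ("1s" is the default).
        put_settings(base_url, index, {"refresh_interval": restore_to}, send=send)
```

Usage would look like `with refresh_paused("http://localhost:9200", "my_index"): ...` around the bulk indexing code. The `finally` clause means the interval is restored even if indexing raises.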

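However, before you start messing with the settings, I would suggest finding out what is actually causing the high I/O. Run a hot_threads request and check where the threads spend most of their time.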
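The hot_threads output is long, so it can help to boil it down to one entry per thread. The following is a rough Python sketch that parses the text format shown in the question. The function name `summarize_hot_threads` is made up, and the regular expressions assume the exact layout seen above, which may differ in other Elasticsearch versions.

```python
import re

# Thread header, e.g.
#   29.1% (145.5ms out of 500ms) cpu usage by thread 'elasticsearch[Gardener][search][T#8]'
HEADER = re.compile(
    r"^\s*(?P<pct>\d+(?:\.\d+)?)% \((?P<used>\S+) out of (?P<interval>\S+)\) "
    r"(?P<kind>\w+) usage by thread '(?P<name>.+)'\s*$"
)
# Snapshot group, e.g. "2/10 snapshots sharing following 20 elements"
GROUP = re.compile(
    r"^\s*(?P<count>\d+)/(?P<total>\d+) snapshots sharing following \d+ elements\s*$"
)


def summarize_hot_threads(text):
    """Return one dict per thread: name, kind, percent, and the top frame of each snapshot group."""
    threads = []
    expect_top_frame = False
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            threads.append({
                "name": header["name"],
                "kind": header["kind"],
                "percent": float(header["pct"]),
                "groups": [],
            })
            expect_top_frame = False
            continue
        group = GROUP.match(line)
        if group and threads:
            threads[-1]["groups"].append(
                {"snapshots": int(group["count"]), "top_frame": None}
            )
            expect_top_frame = True
            continue
        if expect_top_frame and line.strip():
            # The first frame after a group line is the top of the shared stack.
            threads[-1]["groups"][-1]["top_frame"] = line.strip()
            expect_top_frame = False
    return threads
```

A thread whose groups all have `sun.misc.Unsafe.park(Native Method)` as the top frame is parked in its pool's queue waiting for work. The groups with other top frames are the ones worth reading in full.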

answered 2013-07-03T20:58:38.983