有一个调度器任务(每十分钟一次)将更改的数据从 Oracle 同步到 ElasticSearch,但偶尔会出现 SocketTimeOutException
java.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-113 [ACTIVE]
at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:139)
Caused by: java.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-113 [ACTIVE]
并且从 indexing_slowlog 可以找到一些文档需要很长时间
[2020-03-07T13:30:48,768][WARN ][i.i.s.index ] [node-1] [person_company_post/7YNrrUTmRw-oEBTXIm9NWg] took[19.8s], took_millis[19857], type[_doc], id[30131539074-201942321], routing[], source[{...}]
[2020-03-07T14:42:37,713][WARN ][i.i.s.index ] [node-1] [company/KTngnM6ASD-_KdU0FFAWRA] took[59.9s], took_millis[59942], type[_doc], id[20019054258], routing[], source[{...}]
[2020-03-07T15:12:19,197][WARN ][i.i.s.index ] [node-1] [person_company_post/7YNrrUTmRw-oEBTXIm9NWg] took[44.4s], took_millis[44435], type[_doc], id[30134190855-20023574217], routing[], source[{"createtime":1562918929747,"post":"shareholder","personId":30134190855,"comId":20023574217}]
但它是批量更新,为什么只有一个批次的文档很长(每批次 100 个)?
BulkResponse bulkResponse = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);
以及什么原因导致仅索引一个文档需要这么长时间。
PS。
$ curl 'http://localhost:9200/_cat/indices?v'
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open company KTngnM6ASD-_KdU0FFAWRA 1 0 75070188 17704899 18.9gb 18.9gb
green open person_company_post 7YNrrUTmRw-oEBTXIm9NWg 2 0 148998126 9928098 14.3gb 14.3gb
15:12:19 期间的 gc 日志
[2020-03-05T15:12:14.801+0000][22539][safepoint ] Total time for which application threads were stopped: 0.0003225 seconds, Stopping threads took: 0.0000807 seconds
[2020-03-05T15:12:15.802+0000][22539][safepoint ] Application time: 1.0001466 seconds
[2020-03-05T15:12:15.802+0000][22539][safepoint ] Entering safepoint region: Cleanup
[2020-03-05T15:12:15.802+0000][22539][safepoint ] Leaving safepoint region
[2020-03-05T15:12:15.802+0000][22539][safepoint ] Total time for which application threads were stopped: 0.0006109 seconds, Stopping threads took: 0.0002480 seconds
[2020-03-05T15:12:15.880+0000][22539][safepoint ] Application time: 0.0776929 seconds
[2020-03-05T15:12:15.880+0000][22539][safepoint ] Entering safepoint region: GenCollectForAllocation
[2020-03-05T15:12:15.881+0000][22539][gc,start ] GC(345928) Pause Young (Allocation Failure)
[2020-03-05T15:12:15.881+0000][22539][gc,task ] GC(345928) Using 8 workers of 8 for evacuation
[2020-03-05T15:12:15.928+0000][22539][gc,age ] GC(345928) Desired survivor size 34865152 bytes, new threshold 3 (max threshold 6)
[2020-03-05T15:12:15.928+0000][22539][gc,age ] GC(345928) Age table with threshold 3 (max threshold 6)
[2020-03-05T15:12:15.928+0000][22539][gc,age ] GC(345928) - age 1: 24454792 bytes, 24454792 total
[2020-03-05T15:12:15.928+0000][22539][gc,age ] GC(345928) - age 2: 3261472 bytes, 27716264 total
[2020-03-05T15:12:15.928+0000][22539][gc,age ] GC(345928) - age 3: 8625680 bytes, 36341944 total
[2020-03-05T15:12:15.928+0000][22539][gc,age ] GC(345928) - age 4: 394608 bytes, 36736552 total
[2020-03-05T15:12:15.928+0000][22539][gc,age ] GC(345928) - age 5: 2261016 bytes, 38997568 total
[2020-03-05T15:12:15.928+0000][22539][gc,heap ] GC(345928) ParNew: 579714K->44446K(613440K)
[2020-03-05T15:12:15.928+0000][22539][gc,heap ] GC(345928) CMS: 7812027K->7812027K(16095680K)
[2020-03-05T15:12:15.928+0000][22539][gc,metaspace ] GC(345928) Metaspace: 103567K->103567K(1144832K)
[2020-03-05T15:12:15.928+0000][22539][gc ] GC(345928) Pause Young (Allocation Failure) 8195M->7672M(16317M) 47.638ms
[2020-03-05T15:12:15.928+0000][22539][gc,cpu ] GC(345928) User=0.29s Sys=0.00s Real=0.04s
[2020-03-05T15:12:15.928+0000][22539][safepoint ] Leaving safepoint region
[2020-03-05T15:12:15.928+0000][22539][safepoint ] Total time for which application threads were stopped: 0.0482442 seconds, Stopping threads took: 0.0001708 seconds
[2020-03-05T15:12:16.928+0000][22539][safepoint ] Application time: 1.0001231 seconds
[2020-03-05T15:12:16.929+0000][22539][safepoint ] Entering safepoint region: Cleanup
[2020-03-05T15:12:16.929+0000][22539][safepoint ] Leaving safepoint region
[2020-03-05T15:12:16.929+0000][22539][safepoint ] Total time for which application threads were stopped: 0.0004899 seconds, Stopping threads took: 0.0002200 seconds
[2020-03-05T15:12:17.929+0000][22539][safepoint ] Application time: 1.0001154 seconds
[2020-03-05T15:12:17.929+0000][22539][safepoint ] Entering safepoint region: Cleanup
[2020-03-05T15:12:17.929+0000][22539][safepoint ] Leaving safepoint region
[2020-03-05T15:12:17.930+0000][22539][safepoint ] Total time for which application threads were stopped: 0.0003597 seconds, Stopping threads took: 0.0000686 seconds
[2020-03-05T15:12:18.930+0000][22539][safepoint ] Application time: 1.0001200 seconds
[2020-03-05T15:12:18.930+0000][22539][safepoint ] Entering safepoint region: Cleanup
[2020-03-05T15:12:18.930+0000][22539][safepoint ] Leaving safepoint region
[2020-03-05T15:12:18.930+0000][22539][safepoint ] Total time for which application threads were stopped: 0.0003355 seconds, Stopping threads took: 0.0001001 seconds
[2020-03-05T15:12:19.930+0000][22539][safepoint ] Application time: 1.0001240 seconds
[2020-03-05T15:12:19.930+0000][22539][safepoint ] Entering safepoint region: Cleanup
[2020-03-05T15:12:19.931+0000][22539][safepoint ] Leaving safepoint region
[2020-03-05T15:12:19.931+0000][22539][safepoint ] Total time for which application threads were stopped: 0.0003592 seconds, Stopping threads took: 0.0000931 seconds
[2020-03-05T15:12:20.931+0000][22539][safepoint ] Application time: 1.0001112 seconds