I have an embedded Solr server that I use together with Spring Data Solr. It holds roughly 600k documents taking up about 3 GB on disk. During startup, Solr needs several minutes before it can serve the first query. Using VisualVM I was able to locate the bottleneck: it appears to be the LZ4 decompression of the first documents read from disk, which takes a very long time. The trace looks like this:
searcherExecutor-5-thread-1
java.lang.Thread.run()
java.util.concurrent.ThreadPoolExecutor$Worker.run()
java.util.concurrent.ThreadPoolExecutor.runWorker()
java.util.concurrent.FutureTask.run()
java.util.concurrent.FutureTask$Sync.innerRun()
org.apache.solr.core.SolrCore$5.call()
org.apache.solr.handler.component.SuggestComponent$SuggesterListener.newSearcher()
org.apache.solr.spelling.suggest.SolrSuggester.reload()
org.apache.solr.spelling.suggest.SolrSuggester.build()
org.apache.lucene.search.suggest.Lookup.build()
org.apache.lucene.search.suggest.analyzing.AnalyzingSuggester.build()
org.apache.lucene.search.suggest.DocumentDictionary$DocumentInputIterator.next()
org.apache.lucene.index.IndexReader.document()
org.apache.lucene.index.BaseCompositeReader.document()
org.apache.lucene.index.SegmentReader.document()
org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument()
org.apache.lucene.codecs.compressing.CompressionMode$4.decompress()
org.apache.lucene.codecs.compressing.LZ4.decompress()
org.apache.lucene.store.BufferedIndexInput.readBytes()
org.apache.lucene.store.BufferedIndexInput.readBytes()
org.apache.lucene.store.BufferedIndexInput.refill()
org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.readInternal()
java.io.RandomAccessFile.seek[native]()
I need the stored fields for object mapping. What I don't understand is why so much decompression is needed to load a single document; it is as if the decompression lookup table were huge. Any hints/suggestions?
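For context, the embedded server is wired up roughly like this (a minimal sketch, not my exact code; the solr home path and core name are placeholders):

```java
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.core.CoreContainer;
import org.springframework.data.solr.core.SolrTemplate;

public class EmbeddedSolrConfig {

    public SolrTemplate solrTemplate() {
        // Load the cores found under the solr home directory.
        CoreContainer container = new CoreContainer("/path/to/solr-home");
        container.load();

        // Wrap the embedded server for use with Spring Data Solr.
        EmbeddedSolrServer server = new EmbeddedSolrServer(container, "collection1");
        return new SolrTemplate(server);
    }
}
```

The slow first query happens against this embedded instance right after the container is loaded.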