tomcat7 - SOLR 4.1 Out Of Memory error after committing a few thousand Solr documents

Question

We are testing Solr 4.1 running inside Tomcat 7 on Java 7 with the following options:

JAVA_OPTS="-Xms256m -Xmx2048m -XX:MaxPermSize=1024m -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+ParallelRefProcEnabled -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/ubuntu/OOM_HeapDump"

Our source code looks like this:

/**** START *****/
int noOfSolrDocumentsInBatch = 0;
for(int i=0 ; i<5000 ; i++) {
    SolrInputDocument solrInputDocument = getNextSolrInputDocument();
    server.add(solrInputDocument);
    noOfSolrDocumentsInBatch += 1;
    if(noOfSolrDocumentsInBatch == 10) {
        server.commit();
        noOfSolrDocumentsInBatch = 0;
    }
}
/**** END *****/
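
For reference, the loop above requests one explicit commit per 10 added documents. A minimal sketch of that arithmetic follows; the `CommitCount` class and its `explicitCommits` method are hypothetical helpers written for illustration and are not part of SolrJ.

```java
// Hypothetical helper, not part of SolrJ: counts the explicit commits
// requested by a loop that commits once every `batchSize` added documents.
class CommitCount {
    static int explicitCommits(int documentsAdded, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        // Integer division: a partially filled batch has not been committed yet.
        return documentsAdded / batchSize;
    }

    public static void main(String[] args) {
        System.out.println(explicitCommits(5000, 10)); // full run: 500
        System.out.println(explicitCommits(3990, 10)); // near the failure point: 399
    }
}
```

So roughly 399 explicit commits have been requested by the time the error appears, in addition to whatever the configured autoCommit triggers.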

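The method "getNextSolrInputDocument()" generates a Solr document with 100 fields (on average). Around 50 of the fields are of "text_general" type. Some of the "text_general" fields consist of approximately 1000 words; the rest consist of a few words. Of the total fields, around 35-40 are multivalued fields (not of type "text_general").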

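We are indexing all the fields but storing only 8 of them. Of these 8 fields, two are of string type, five are long, and one is boolean. As a result, our index size is only 394 MB. But the RAM occupied at the time of the OOM is around 2.5 GB. Why is the memory usage so high even though the index is small? What is being stored in memory? Our understanding is that after every commit the documents are flushed to disk, so nothing should remain in RAM after a commit.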
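
One way to check this would be to log the JVM's own view of the heap after each commit. The sketch below uses only the standard `java.lang.Runtime` API; the `HeapLog` class, its method names, and the idea of calling it from the indexing code are assumptions made for illustration. It reports the heap of the JVM it runs in, so it only reflects Solr's heap if it runs inside the same process as Solr.

```java
// Hypothetical diagnostic helper built on the standard java.lang.Runtime API.
class HeapLog {
    private static final long MB = 1024L * 1024L;

    // Heap currently in use: memory reserved by the JVM minus the free part of it.
    static long usedHeapMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MB;
    }

    // Upper bound the heap may grow to (governed by -Xmx).
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    // One-line summary suitable for a log statement.
    static String snapshot(String label) {
        return label + ": used=" + usedHeapMb() + " MB, max=" + maxHeapMb() + " MB";
    }

    public static void main(String[] args) {
        System.out.println(snapshot("after commit"));
    }
}
```

Note that these figures cover the Java heap only. The RAM reported for the whole process also includes non-heap areas such as the permanent generation, which the options above allow to grow to 1024 MB.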

We are using the following settings:

server.commit() is called with waitForSearcher=true and waitForFlush=true
solrConfig.xml has the following properties set:
directoryFactory = solr.MMapDirectoryFactory
maxWarmingSearchers = 1
text_general data type is being used as supplied in the schema.xml with the solr setup.
maxIndexingThreads = 8 (default)
<autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
</autoCommit>

We get a Java heap Out Of Memory Error after committing around 3990 Solr documents. Some snapshots of the memory dump from the profiler are uploaded at the following links.
http://s9.postimage.org/w7589t9e7/memorydump1.png
http://s7.postimage.org/p3abs6nuj/memorydump2.png

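Can somebody please suggest what we should do to minimize/optimize the memory consumption in our case, and explain the reasons? Please also suggest optimal values, with reasons, for the solrConfig.xml parameters.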
