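We are currently running ColdFusion 8 but plan to migrate to ColdFusion 10 soon. One of the biggest concerns with this move is that one of our most important applications includes a full-text document search that is currently built on Verity collections. It essentially lets users search the text content of hundreds of PDF documents.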
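I just created a new Solr collection in my development ColdFusion 9 instance and tried to populate it using our existing indexing logic, which runs daily and updates the collection with the PDF documents stored on the local server at F:\PDFS\[documentId].PDF: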
<cfsetting requesttimeout="3600" />
<cfquery name="getDocs" datasource="myDB">
    SELECT DISTINCT
        itemNo,
        edition,
        description,
        status,
        'F:\PDFs\' CONCAT documentId CONCAT '.PDF' AS document_file
    FROM SKU_ATTRIBUTES
</cfquery>
<cfindex
    query="getDocs"
    collection="mysolrcollection"
    action="refresh"
    type="file"
    key="document_file"
    title="description"
    custom1="itemNo"
    custom2="status"
    custom3="edition" />
It ran for about 10 minutes and then bombed out with the following exception:
Java_heap_space__javalangOutOfMemoryError_Java_heap_space___at_orgapacheluceneutilUnicodeUtilUTF16toUTF8UnicodeUtiljava236___at_orgapachelucenestoreIndexOutputwriteStringIndexOutputjava102___at_orgapacheluceneindexFieldsWriterwriteFieldFieldsWriterjava232___at_orgapacheluceneindexStoredFieldsWriterPerFieldprocessFieldsStoredFieldsWriterPerFieldjava56___at_orgapacheluceneindexDocFieldConsumersPerFieldprocessFieldsDocFieldConsumersPerFieldjava37___at_orgapacheluceneindexDocFieldProcessorPerThreadprocessDocumentDocFieldProcessorPerThreadjava234___at_orgapacheluceneindexDocumentsWriterupdateDocumentDocumentsWriterjava762___at_orgapacheluceneindexDocumentsWriterupdateDocumentDocumentsWriterjava745___at_orgapacheluceneindexIndexWriterupdateDocumentIndexWriterjava2215___at_orgapacheluceneindexIndexWriterupdateDocumentIndexWriterjava2187___at_orgapachesolrupdateDirectUpdateHandler2addDocDirectUpdateHandler2java238___at_orgapachesolrupdateprocessorRunUpdateProcessorprocessAddRunUpdateProcessorFactoryjava60___at_orgapachesolrhandlerXMLLoaderprocessUpdateXMLLoaderjava140___at_orgapachesolrhandlerXMLLoaderloadXMLLoaderjava69___at_orgapachesolrhandlerContentStreamHandlerBasehandleRequestBodyContentStreamHandlerBasejava54___at_orgapachesolrhandlerRequestHandlerBasehandleRequestRequestHandlerBasejava131___at_orgapachesolrcoreSolrCoreexecuteSolrCorejava1333___at_orgapachesolrservletSolrDispatchFilterexecuteSolrDispatchFilterjava303___at_orgapachesolrservletSolrDispatchFilterdoFilterSolrDispatchFilterjava232___at_orgmortbayjettyservletServletHandler$CachedChaindoFilterServletHandlerjava1089___at_orgmortbayjettyservletServletHandlerhandleServletHandlerjava365___at_orgmortbayjettysecuritySecurityHandlerhandleSecurityHandlerjava216___at_orgmortbayjettyservletSessionHandlerhandleSessionHandlerjava181___at_orgmortbayjettyhan
Java_heap_space__javalangOutOfMemoryError_Java_heap_space___at_orgapacheluceneutilUnicodeUtilUTF16toUTF8UnicodeUtiljava236___at_orgapachelucenestoreIndexOutputwriteStringIndexOutputjava102___at_orgapacheluceneindexFieldsWriterwriteFieldFieldsWriterjava232___at_orgapacheluceneindexStoredFieldsWriterPerFieldprocessFieldsStoredFieldsWriterPerFieldjava56___at_orgapacheluceneindexDocFieldConsumersPerFieldprocessFieldsDocFieldConsumersPerFieldjava37___at_orgapacheluceneindexDocFieldProcessorPerThreadprocessDocumentDocFieldProcessorPerThreadjava234___at_orgapacheluceneindexDocumentsWriterupdateDocumentDocumentsWriterjava762___at_orgapacheluceneindexDocumentsWriterupdateDocumentDocumentsWriterjava745___at_orgapacheluceneindexIndexWriterupdateDocumentIndexWriterjava2215___at_orgapacheluceneindexIndexWriterupdateDocumentIndexWriterjava2187___at_orgapachesolrupdateDirectUpdateHandler2addDocDirectUpdateHandler2java238___at_orgapachesolrupdateprocessorRunUpdateProcessorprocessAddRunUpdateProcessorFactoryjava60___at_orgapachesolrhandlerXMLLoaderprocessUpdateXMLLoaderjava140___at_orgapachesolrhandlerXMLLoaderloadXMLLoaderjava69___at_orgapachesolrhandlerContentStreamHandlerBasehandleRequestBodyContentStreamHandlerBasejava54___at_orgapachesolrhandlerRequestHandlerBasehandleRequestRequestHandlerBasejava131___at_orgapachesolrcoreSolrCoreexecuteSolrCorejava1333___at_orgapachesolrservletSolrDispatchFilterexecuteSolrDispatchFilterjava303___at_orgapachesolrservletSolrDispatchFilterdoFilterSolrDispatchFilterjava232___at_orgmortbayjettyservletServletHandler$CachedChaindoFilterServletHandlerjava1089___at_orgmortbayjettyservletServletHandlerhandleServletHandlerjava365___at_orgmortbayjettysecuritySecurityHandlerhandleSecurityHandlerjava216___at_orgmortbayjettyservletSessionHandlerhandleSessionHandlerjava181___at_orgmortbayjettyhan request: http://localhost:8983/solr/mysolrcollection/update?commit=true&waitFlush=false&waitSearcher=false&wt=javabin
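When I look at the Solr collection in the ColdFusion Administrator, it is far larger than the original Verity collection: the existing Verity collection is about 84-85MB and contains 9000+ documents, whereas this one is already 1.3GB with only 847 documents.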
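To put those sizes in perspective, here is a rough per-document calculation using the figures above. It is only a back-of-the-envelope sketch: I am assuming 84.5MB as the midpoint of the Verity size and 9000 as a lower bound on its document count, and the variable names are just for illustration.

```cfml
<cfscript>
// Figures quoted in this post. Assumptions: 84.5 MB is the midpoint of "84-85MB",
// and 9000 is a lower bound for "9000+" documents.
verityKbPerDoc = (84.5 * 1024) / 9000;

// 1.3 GB expressed in KB, spread over the 847 documents indexed before the failure.
solrKbPerDoc = (1.3 * 1024 * 1024) / 847;

writeOutput("Verity: about " & round(verityKbPerDoc) & " KB per document<br />");
writeOutput("Solr: about " & round(solrKbPerDoc) & " KB per document<br />");
writeOutput("Ratio: about " & round(solrKbPerDoc / verityKbPerDoc) & "x");
</cfscript>
```

That works out to roughly 10KB per document in Verity against roughly 1.6MB per document in Solr, so the difference is far too large to be explained by a few unusually big PDFs.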
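This search feature is critical to the application, and I am worried that if the move to Solr does not work out, we will have to postpone the upgrade to CF10.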