
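We are currently running ColdFusion 8 but plan to migrate to ColdFusion 10 soon. One of the biggest obstacles to this move is that one of our most important applications includes a full-text document search that is currently built on Verity collections. It essentially lets users search the text content of several hundred PDF documents.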

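I just created a new Solr collection on my development ColdFusion 9 instance and tried to populate it using our existing indexing logic, which runs daily and updates the collection from PDF documents stored on the local server at F:\PDFS\[documentId].PDF: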

<cfsetting requesttimeout="3600" />

<cfquery name="getDocs" datasource="myDB">
    SELECT DISTINCT
        itemNo,
        edition,
        description,
        status,
        'F:\PDFs\'
            CONCAT documentId
            CONCAT '.PDF'   AS  document_file
    FROM    SKU_ATTRIBUTES
</cfquery>

<cfindex
    query="getDocs"
    collection="mysolrcollection"
    action="refresh"
    type="file"
    key="document_file"
    title="description"
    custom1="itemNo"
    custom2="status"
    custom3="edition" />

It ran for about 10 minutes and then bombed out with the following exception:

Java_heap_space__javalangOutOfMemoryError_Java_heap_space___at_orgapacheluceneutilUnicodeUtilUTF16toUTF8UnicodeUtiljava236___at_orgapachelucenestoreIndexOutputwriteStringIndexOutputjava102___at_orgapacheluceneindexFieldsWriterwriteFieldFieldsWriterjava232___at_orgapacheluceneindexStoredFieldsWriterPerFieldprocessFieldsStoredFieldsWriterPerFieldjava56___at_orgapacheluceneindexDocFieldConsumersPerFieldprocessFieldsDocFieldConsumersPerFieldjava37___at_orgapacheluceneindexDocFieldProcessorPerThreadprocessDocumentDocFieldProcessorPerThreadjava234___at_orgapacheluceneindexDocumentsWriterupdateDocumentDocumentsWriterjava762___at_orgapacheluceneindexDocumentsWriterupdateDocumentDocumentsWriterjava745___at_orgapacheluceneindexIndexWriterupdateDocumentIndexWriterjava2215___at_orgapacheluceneindexIndexWriterupdateDocumentIndexWriterjava2187___at_orgapachesolrupdateDirectUpdateHandler2addDocDirectUpdateHandler2java238___at_orgapachesolrupdateprocessorRunUpdateProcessorprocessAddRunUpdateProcessorFactoryjava60___at_orgapachesolrhandlerXMLLoaderprocessUpdateXMLLoaderjava140___at_orgapachesolrhandlerXMLLoaderloadXMLLoaderjava69___at_orgapachesolrhandlerContentStreamHandlerBasehandleRequestBodyContentStreamHandlerBasejava54___at_orgapachesolrhandlerRequestHandlerBasehandleRequestRequestHandlerBasejava131___at_orgapachesolrcoreSolrCoreexecuteSolrCorejava1333___at_orgapachesolrservletSolrDispatchFilterexecuteSolrDispatchFilterjava303___at_orgapachesolrservletSolrDispatchFilterdoFilterSolrDispatchFilterjava232___at_orgmortbayjettyservletServletHandler$CachedChaindoFilterServletHandlerjava1089___at_orgmortbayjettyservletServletHandlerhandleServletHandlerjava365___at_orgmortbayjettysecuritySecurityHandlerhandleSecurityHandlerjava216___at_orgmortbayjettyservletSessionHandlerhandleSessionHandlerjava181___at_orgmortbayjettyhan

request: http://localhost:8983/solr/mysolrcollection/update?commit=true&waitFlush=false&waitSearcher=false&wt=javabin

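When I look at the Solr collection in the ColdFusion Administrator, it is far larger than the original Verity collection: the existing Verity collection is about 84-85 MB and holds more than 9,000 documents, whereas this one is 1.3 GB with only 847 documents.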

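This search feature is critical to the application, and I am worried that if the move to Solr does not work out, we will have to postpone the upgrade to CF10.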


2 Answers


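This sounds like a one-off import process. Have you tried batching the results into, say, 500 documents per iteration? In my experience, ColdFusion does not cope well when a page runs for more than a minute.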
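A minimal sketch of what batching could look like, assuming the getDocs query from the question has already run on the same page. The names batchSize, columns, and batch are illustrative and not from the original code. Because action="refresh" empties the collection before re-adding documents, the sketch purges once up front and then uses action="update" for each batch. Whether this actually avoids the heap error is untested.

```cfml
<!--- Sketch only: batchSize, columns and batch are hypothetical names. --->
<cfset batchSize = 500 />
<cfset columns = "itemNo,edition,description,status,document_file" />

<!--- Empty the collection once; "refresh" inside the loop would wipe earlier batches. --->
<cfindex collection="mysolrcollection" action="purge" />

<cfloop from="1" to="#getDocs.recordCount#" step="#batchSize#" index="startRow">
    <cfset endRow = min(startRow + batchSize - 1, getDocs.recordCount) />

    <!--- Copy rows startRow..endRow of getDocs into a small query. --->
    <cfset batch = queryNew(columns) />
    <cfloop from="#startRow#" to="#endRow#" index="rowIndex">
        <cfset queryAddRow(batch) />
        <cfloop list="#columns#" index="columnName">
            <cfset querySetCell(batch, columnName, getDocs[columnName][rowIndex]) />
        </cfloop>
    </cfloop>

    <!--- Index only this batch, with the same field mapping as the original call. --->
    <cfindex
        query="batch"
        collection="mysolrcollection"
        action="update"
        type="file"
        key="document_file"
        title="description"
        custom1="itemNo"
        custom2="status"
        custom3="edition" />
</cfloop>
```

Note that between the purge and the last batch the collection is only partially populated, so this is better suited to a maintenance window or a one-off import than to a job that runs while users are searching.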

answered 2012-12-05T06:52:21.003

Make sure you have ColdFusion Hotfix 2 for ColdFusion 9.0.1 installed.

Cumulative Hotfix 2 | ColdFusion 9.0.1

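The hotfix includes several major bug fixes for Solr, particularly for indexing .PDF files. Alternatively, install ColdFusion 9.0.2, but note that it no longer supports Verity, so you would not be able to switch between Verity and Solr.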

answered 2012-12-05T08:09:36.810