pdf - ColdFusion PDF file search with cfsearch and Solr is extremely slow

Question

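I have a working Adobe ColdFusion application that indexes about 2,000 PDF files through Solr and returns the expected results, but each search query against the collection typically takes 25-30 seconds.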

This is how I index the 2,000 PDF files into Solr:

<!--- query database files --->
<cfset getfiles = application.file.getfiles()>

<!--- create solr query set --->
<cfset filesQuery = QueryNew("
    fileUID
    , filepath
    , title
    , description
    , fileext
    , added
")>

<!--- create new file query with key path and download url --->
<cfoutput query="getfiles">
<cfset ext = trim(getfiles.fileext)>
<cfset path = expandpath('/docs/#fileUID#.#ext#')>

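<cfscript>
    // add one row per file; unqualified names resolve to the current getfiles row
    newRow = QueryAddRow(filesQuery);
    QuerySetCell(filesQuery, "fileUID","#fileUID#" );
    QuerySetCell(filesQuery, "filepath","#path#" );
    QuerySetCell(filesQuery, "title","#filename#" );
    QuerySetCell(filesQuery, "description","#description#" );
    // fileext is used as custom1 in cfindex below, so it has to be populated here
    QuerySetCell(filesQuery, "fileext","#ext#" );
    QuerySetCell(filesQuery, "added","#added#" );
</cfscript>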

</cfoutput>

<!--- index the bunch --->
<cfindex  
    query = "filesQuery" 
    collection = "resumes" 
    action = "update" 
    type = "file" 
    key = "filepath"     
    title = "title" 
    body = "title, description"
    custom1 = "fileext"
    custom2 = "added"
    category= "file"
    status = "filestatus">

This is how the files are searched, and where the 25-30 second Solr search happens:

<!--- imagine form with (form.search) for terms --->

<cfsearch name = "results" 
    collection = "resumes" 
    criteria = "#form.search#"
    contextPassages = "1"
    contextBytes = "300"
    maxrows = "100"
    contextHighlightBegin = "<strong>"
    contextHighlightEnd = " </strong>">

<!--- show (results) query --->

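Some additional information about the project: every file is less than one page long, so no characters are truncated when the index results are built in Solr. I have experimented with the Solr buffer limit in the ColdFusion Administrator with no noticeable change in timing (it is currently 40). I am on a development VM running MS Server 2003 on a 1.86 GHz Xeon, with Adobe ColdFusion 9.0.1 and 1 GB of RAM. The JVM is Sun Microsystems (14.3-b01). Almost nothing else is running on the server, so performance should not be affected by outside factors.

It returns exactly the results I expect, just not in a timely manner.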
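
To narrow down where the 25-30 seconds go, one option is to time the same search with and without context passages. The sketch below is only a diagnostic, not a fix. It reuses the `resumes` collection and `form.search` from above; the result names `withContext` and `withoutContext` and the timer variables are made up for illustration. Whether highlighting is actually the expensive part on this setup is an assumption that the timings would confirm or rule out.

```cfm
<!--- Diagnostic sketch: compare cfsearch timings with and without context passages --->
<cfset t0 = getTickCount()>

<!--- same search as above, with context passages requested --->
<cfsearch name = "withContext"
    collection = "resumes"
    criteria = "#form.search#"
    contextPassages = "1"
    contextBytes = "300"
    maxrows = "100">

<cfset t1 = getTickCount()>

<!--- identical criteria, but no context passages requested --->
<cfsearch name = "withoutContext"
    collection = "resumes"
    criteria = "#form.search#"
    maxrows = "100">

<cfset t2 = getTickCount()>

<cfoutput>
    With context: #t1 - t0# ms (#withContext.recordCount# rows)<br>
    Without context: #t2 - t1# ms (#withoutContext.recordCount# rows)
</cfoutput>
```

It is worth running this several times and in both orders, since Solr's caches may favor whichever query runs second.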

score 2 · Accepted Answer

You could try CFSolrLib. It uses the Solr API directly. You may get a performance boost by bypassing <cfsearch>.
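
Before switching libraries, it may help to check how fast Solr itself answers when <cfsearch> is out of the picture. The sketch below does not use CFSolrLib; it is a plain cfhttp request to Solr's standard `select` handler. The URL is an assumption: it supposes the Solr instance bundled with ColdFusion 9 is listening on localhost port 8983 and exposes the `resumes` collection as a core of the same name, so adjust it to match the actual installation. The variable names `solrUrl`, `solrResponse`, `started`, and `elapsed` are illustrative.

```cfm
<!--- Assumed endpoint: bundled Solr on localhost:8983, core named after the collection --->
<cfset solrUrl = "http://localhost:8983/solr/resumes/select">

<cfset started = getTickCount()>

<!--- Query Solr over HTTP, bypassing cfsearch entirely --->
<cfhttp url="#solrUrl#" method="get" result="solrResponse" timeout="60">
    <cfhttpparam type="url" name="q" value="#form.search#">
    <cfhttpparam type="url" name="rows" value="100">
    <cfhttpparam type="url" name="wt" value="xml">
</cfhttp>

<cfset elapsed = getTickCount() - started>

<cfoutput>
    Direct Solr query took #elapsed# ms (HTTP status: #solrResponse.statusCode#)
</cfoutput>
```

If the direct request comes back quickly while <cfsearch> still takes 25-30 seconds, the overhead is likely on the ColdFusion side, and bypassing the tag is more likely to pay off. If the direct request is just as slow, the bottleneck is probably in Solr or the VM itself.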
