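I urgently need help. I am trying to search for a text string across a folder containing more than 5,000 PDFs. The code below has been tested and works with fewer than 100 PDFs, but at the full folder size it takes more than 5 to 10 minutes to return results. Any help is greatly appreciated: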

<%
'Search Text
Dim strtextToSearch
strtextToSearch = Request("TextToSearch")

'Now, we want to search all of the files
Dim fso

'Constant to read
Const ForReading = 1
Set fso = Server.CreateObject("Scripting.FileSystemObject")

'Specify the folder path to search.
Dim FolderToSearch
FolderToSearch = "C:\inetpub\site\Files\allpdfs\"

'Proceed if folder exists
if fso.FolderExists(FolderToSearch) then

    Dim objFolder
    Set objFolder = fso.GetFolder(FolderToSearch)

    Dim objFile, objTextStream, strFileContents, bolFileFound
    bolFileFound = False

    Dim FilesCounter
    FilesCounter = 0 'Total files found

    For Each objFile in objFolder.Files
        Set objTextStream = fso.OpenTextFile(objFile.Path,ForReading)
        'Read the content
        strFileContents = objTextStream.ReadAll
        If InStr(1,strFileContents,strtextToSearch,1) then
           'Write the whole link in one statement; HTML-encode the displayed name
           Response.Write "<a href=""http://go.to.mysite.com/files/allpdfs/" & objFile.Name & _
                          """ target=""_blank"">" & Server.HTMLEncode(objFile.Name) & "</a><br>"
           FilesCounter = FilesCounter + 1
        End If
        objTextStream.Close
    Next

    if FilesCounter = 0 then
        Response.Write "Sorry, No matches found."
    else
        Response.Write "Total files found : " & FilesCounter
    end if

    'Destroy the objects
    Set objTextStream = Nothing
    Set objFolder = Nothing
else
    Response.Write "Sorry, invalid folder name"
end if
Set fso = Nothing
%>
1 Answer

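Doing a full scan of every file on each request will always take a long time. You are better off using an indexer such as Solr, which keeps an index of the documents, as a search engine does, and returns results quickly.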

This is a good place to start: http://wiki.apache.org/solr/
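As a rough illustration of what the page could look like once an index exists, here is a minimal sketch that asks Solr for matches over HTTP instead of opening each PDF. Several things here are assumptions and not part of the original post: Solr listening on `localhost:8983`, a core named `pdfs`, an `id` field that stores each file name as a string, and a default search field that covers the extracted PDF text. Adjust all of these to your own setup.

```vbscript
<%
' Sketch only: query an existing Solr index instead of scanning every PDF.
' ASSUMPTIONS (hypothetical, not from the original post):
'   - Solr is reachable at http://localhost:8983
'   - the core is named "pdfs"
'   - each indexed document has a string field "id" holding the PDF file name
Dim strTextToSearch, strUrl, objHttp, objXml, objNodes, objNode

strTextToSearch = Request("TextToSearch")

' Build the select URL; the user input is URL-encoded before it is appended
strUrl = "http://localhost:8983/solr/pdfs/select" & _
         "?q=" & Server.URLEncode(strTextToSearch) & _
         "&fl=id&rows=50&wt=xml"

Set objHttp = Server.CreateObject("MSXML2.ServerXMLHTTP.6.0")
objHttp.open "GET", strUrl, False   'False = synchronous request
objHttp.send

If objHttp.status = 200 Then
    Set objXml = Server.CreateObject("MSXML2.DOMDocument.6.0")
    objXml.async = False
    If objXml.loadXML(objHttp.responseText) Then
        'Each hit is a <doc> element; pick out its "id" field
        Set objNodes = objXml.selectNodes("/response/result/doc/str[@name='id']")
        For Each objNode In objNodes
            Response.Write Server.HTMLEncode(objNode.text) & "<br>"
        Next
        Response.Write "Files shown : " & objNodes.length
    Else
        Response.Write "Could not parse the Solr response."
    End If
Else
    Response.Write "Solr returned HTTP status " & objHttp.status
End If

'Destroy the objects
Set objNodes = Nothing
Set objXml = Nothing
Set objHttp = Nothing
%>
```

The PDFs have to be indexed ahead of time for this to return anything, for example through Solr's extracting request handler (Solr Cell), which pulls the text out of each file. The page then makes one HTTP request per search, so the response time no longer grows with the number of files read per request. `rows=50` caps the number of hits returned; the full match count is available in the `numFound` attribute of the `result` element.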

answered 2015-10-29T16:35:48.897