I am indexing with Zend_Search_Lucene using the code below. I have changed the default analyzer so that numeric values are searchable.

public function executeIndexIT() {

   $path = '/home/project/mgh/lib/';
   set_include_path(get_include_path() . PATH_SEPARATOR . $path);       
   require_once '/home/project/mgh/lib/Zend/Search/Lucene.php';

   Zend_Search_Lucene_Analysis_Analyzer::setDefault(new Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum_CaseInsensitive());

   $index = new Zend_Search_Lucene('/home/project/mgh/data/search_file/lucene.customer.index',true);

   $filenames1='/home/project/mgh/web/cvcollection/data8/ASBABranches10546.pdf';
   $filenames2='/home/project/mgh/web/cvcollection/data2/manoj_new10550.pdf';

   $fc1=htmlentities("'".$this->ConvertPDF($filenames1)."'");       
   $fc2=htmlentities("'".$this->ConvertPDF($filenames2)."'");

   $doc = new Zend_Search_Lucene_Document();
   $doc->addField(Zend_Search_Lucene_Field::unIndexed('URL', $filenames1));
   $doc->addField(Zend_Search_Lucene_Field::text('contents',$fc1));     
   $index->addDocument($doc);

   $doc = new Zend_Search_Lucene_Document();
   $doc->addField(Zend_Search_Lucene_Field::unIndexed('URL', $filenames2));
   $doc->addField(Zend_Search_Lucene_Field::text('contents',$fc2));     
   $index->addDocument($doc);

   $index->commit();
   exit;
}

After building the index, I search it with the following code:

public function executeSearchLucene() {

    $path = '/home/project/mgh/lib/';
    set_include_path(get_include_path() . PATH_SEPARATOR . $path);
    require_once('Zend/Search/Lucene.php');

    Zend_Search_Lucene_Analysis_Analyzer::setDefault(new Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum_CaseInsensitive());

    $hits = array();
    $txtSearch='@';
    try {
        $query = Zend_Search_Lucene_Search_QueryParser::parse($txtSearch);
    } catch (Zend_Search_Lucene_Search_QueryParserException $e) {
        echo "Query syntax error: " . $e->getMessage() . "\n";
    }

    $index = new Zend_Search_Lucene('/home/project/mgh/data/search_file/lucene.customer.index');

    //**added on 29 may**/      
    $results = $index->find($query);
    echo count($results);
    foreach ( $results as $result ) {
        echo "<pre>";
        var_dump($result->URL); 
   }
   exit;
}

Here $fc2 contains several email addresses, and I need to search for them. But I get 0 hits.

How can I search for characters like @ and ! using Zend_Search_Lucene?

1 Answer

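This only works with keyword fields, which are not tokenized. So you need to make sure the email address (or any other text with special characters) is fed in as a separate piece of data, as in the example further down. You also cannot use the query parser, because it turns the input into a Zend_Search_Lucene_Search_Query_Preprocessing_Term object: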

echo('<pre>');
var_dump(Zend_Search_Lucene_Search_QueryParser::parse("*@*"));
var_dump(Zend_Search_Lucene_Search_QueryParser::parse("@"));
echo('</pre>');
die();

According to the documentation, this type of query:

is not actually involved in query execution

So the working code looks like this:

$index = Zend_Search_Lucene::create('/tmp/index');

$doc1 = new Zend_Search_Lucene_Document;
$doc1->addField(Zend_Search_Lucene_Field::text('title', 'Some Title Here'))
    ->addField(Zend_Search_Lucene_Field::keyword('content', 'test@test.com'));
$index->addDocument($doc1);

$doc2 = new Zend_Search_Lucene_Document;
$doc2->addField(Zend_Search_Lucene_Field::text('title', 'Another title Here'))
    ->addField(Zend_Search_Lucene_Field::keyword('content', 'test!test.com'));
$index->addDocument($doc2);

$index->commit();

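// Allow patterns that begin with a wildcard (the default minimum prefix length is 3).
Zend_Search_Lucene_Search_Query_Wildcard::setMinPrefixLength(0);
// Build the wildcard query by hand instead of going through the query parser.
$term  = new Zend_Search_Lucene_Index_Term("*@*");
$query = new Zend_Search_Lucene_Search_Query_Wildcard($term);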

$hits = $index->find($query);
echo('<pre>');
var_dump(count($hits));
foreach($hits as $hit) {
    var_dump($hit->title);
    var_dump($hit->content);
}
echo('</pre>');

// Same search for "!". The minimum prefix length is a static setting, so the earlier call still applies.
$term  = new Zend_Search_Lucene_Index_Term("*!*");
$query = new Zend_Search_Lucene_Search_Query_Wildcard($term);

$hits = $index->find($query);
echo('<pre>');
var_dump(count($hits));
foreach($hits as $hit) {
    var_dump($hit->title);
    var_dump($hit->content);
}
echo('</pre>');

die();
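In the question the addresses are buried inside text extracted from a PDF, so they would first have to be pulled out before they can be fed in as separate keyword data. A minimal sketch of that step in plain PHP follows. The helper name extractEmails, the simplified pattern, and the lower-casing are all assumptions made for illustration. None of them come from Zend_Search_Lucene.

```php
<?php
// Hypothetical helper (not part of Zend_Search_Lucene): collects the email
// addresses found in a block of plain text.
function extractEmails(string $text): array
{
    // Deliberately simple pattern; an assumption, not a full RFC 5322 matcher.
    $pattern = '/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/';
    preg_match_all($pattern, $text, $matches);

    // Keyword fields are not tokenized, so normalise case here (an assumption)
    // and drop duplicates before indexing.
    $emails = array_map('strtolower', $matches[0]);
    return array_values(array_unique($emails));
}

// Sample text standing in for the output of the PDF conversion.
$text = "Contact: test@test.com, Test@Test.com or sales@example.org.";
foreach (extractEmails($text) as $email) {
    echo $email, "\n";
}
```

Each returned address could then be passed to Zend_Search_Lucene_Field::keyword() in place of the hard-coded values above, for example one document per address.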

I hope this is clear now. The Zend Lucene implementation has a lot of limitations.

Answered 2013-05-07T08:02:50.747