lucene - Lucene 4.2 analyzer when indexing fields

Question

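I am trying to index a set of documents using Lucene 4.2. I created a custom analyzer that does not tokenize the input but lowercases the terms, using the following code: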

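    import java.io.Reader;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.core.KeywordTokenizer;
    import org.apache.lucene.analysis.core.LowerCaseFilter;
    import org.apache.lucene.util.Version;

    public class NoTokenAnalyzer extends Analyzer {
        public Version matchVersion;

        public NoTokenAnalyzer(Version matchVersion) {
            this.matchVersion = matchVersion;
        }

        @Override
        protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
            // KeywordTokenizer emits the whole input as a single token (no tokenization).
            //final Tokenizer source = new NoTokenTokenizer(matchVersion, reader);
            final KeywordTokenizer source = new KeywordTokenizer(reader);
            // LowerCaseFilter lowercases that single token.
            TokenStream result = new LowerCaseFilter(matchVersion, source);
            return new TokenStreamComponents(source, result);
        }
    }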
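
To make the expected result concrete, here is a minimal plain-Java sketch of what the chain above should emit for a field value. It uses no Lucene classes: `KeywordLowerCaseModel` and its methods are hypothetical names made up for this illustration, and the sketch only models the documented behavior of `KeywordTokenizer` (whole input as one token) and `LowerCaseFilter` (lowercasing via `Character.toLowerCase`). It is not how Lucene implements them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Plain-Java model (no Lucene classes) of the analysis chain in NoTokenAnalyzer.
// All names here are hypothetical and exist only for illustration.
final class KeywordLowerCaseModel {

    // Models KeywordTokenizer: the entire input becomes exactly one token.
    static List<String> keywordTokenize(String input) {
        return Collections.singletonList(input);
    }

    // Models LowerCaseFilter: lowercases a token one code point at a time.
    static String lowerCase(String token) {
        StringBuilder sb = new StringBuilder(token.length());
        token.codePoints().forEach(cp -> sb.appendCodePoint(Character.toLowerCase(cp)));
        return sb.toString();
    }

    // Tokenizer followed by filter, in the same order as createComponents.
    static List<String> analyze(String input) {
        List<String> terms = new ArrayList<>();
        for (String token : keywordTokenize(input)) {
            terms.add(lowerCase(token));
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(analyze("Hello World")); // prints [hello world]
    }
}
```

If the analyzer is applied, the index should therefore hold a single term such as `hello world` for the value `Hello World`, rather than the two terms `hello` and `world`.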

I use the analyzer to build the index (inspired by the code provided in the Lucene documentation):

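    import java.io.File;
    import java.io.IOException;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.index.IndexWriterConfig.OpenMode;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;

    public static void IndexFile(Analyzer analyzer) throws IOException {
        boolean create = true;

        String directoryPath = "path";
        File folderToIndex = new File(directoryPath);
        File[] filesToIndex = folderToIndex.listFiles();

        Directory directory = FSDirectory.open(new File("index path"));

        IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_42, analyzer);

        if (create) {
            // Create a new index in the directory, removing any
            // previously indexed documents:
            iwc.setOpenMode(OpenMode.CREATE);
        } else {
            // Add new documents to an existing index:
            iwc.setOpenMode(OpenMode.CREATE_OR_APPEND);
        }

        IndexWriter writer = new IndexWriter(directory, iwc);
        for (final File singleFile : filesToIndex) {
            //process files in the directory and extract strings to index
            //..........
            String field1; // value comes from the extraction step elided above
            String field2; // value comes from the extraction step elided above

            //index fields
            Document doc = new Document();

            Field f1Field = new Field("f1", field1, TextField.TYPE_STORED);
            doc.add(f1Field);
            doc.add(new Field("f2", field2, TextField.TYPE_STORED));

            // Hand the document to the writer so its fields are analyzed and indexed.
            writer.addDocument(doc);
        }
        writer.close();
    }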

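The problem with this code is that the indexed fields are not tokenized, but they are not lowercased either; that is, the analyzer does not seem to be applied during indexing. I can't figure out what is wrong. How can I make the analyzer work?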

score 1 · Accepted Answer

The code works correctly. So it may help someone who wants to create a custom analyzer in Lucene 4.2 and use it for both indexing and searching.
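
As a rough illustration of why the same analysis should be used on both sides, the sketch below is a toy in-memory index in plain Java, not Lucene. `SharedAnalysisModel` and its methods are hypothetical names invented here, and `analyze` merely stands in for the keyword-plus-lowercase analyzer. The sketch also keeps the original value separately from the analyzed term, in the way a stored field returns the text as it was supplied while matching happens on the analyzed term.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory index (not Lucene) in which indexing and search share one analysis step.
// All names here are hypothetical and exist only for illustration.
final class SharedAnalysisModel {
    // Original values, kept verbatim (models a stored field).
    private final List<String> stored = new ArrayList<>();
    // Analyzed term -> ids of the documents containing it (models the indexed terms).
    private final Map<String, List<Integer>> postings = new HashMap<>();

    // Stand-in for the analyzer: the whole value is one term, lowercased per code point.
    static String analyze(String value) {
        StringBuilder sb = new StringBuilder(value.length());
        value.codePoints().forEach(cp -> sb.appendCodePoint(Character.toLowerCase(cp)));
        return sb.toString();
    }

    // Index a value: store it unchanged and record its analyzed term.
    int add(String value) {
        int docId = stored.size();
        stored.add(value);
        postings.computeIfAbsent(analyze(value), k -> new ArrayList<>()).add(docId);
        return docId;
    }

    // Search: analyze the query the same way, then look up the exact term.
    List<Integer> search(String query) {
        return postings.getOrDefault(analyze(query), new ArrayList<>());
    }

    String storedValue(int docId) {
        return stored.get(docId);
    }

    public static void main(String[] args) {
        SharedAnalysisModel index = new SharedAnalysisModel();
        int id = index.add("Hello World");
        System.out.println(index.search("HELLO WORLD")); // prints [0]
        System.out.println(index.search("hello"));       // prints []
        System.out.println(index.storedValue(id));       // prints Hello World
    }
}
```

In this model a query matches only when it is normalized the same way as the indexed value, and a partial value such as `hello` finds nothing because the whole field is a single term.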
