4

I am trying to delete documents from a Lucene index. I want to delete only one specified file from the index.

The program below deletes only entries that can be searched with KeywordAnalyzer, but the filename I need can be searched only with StandardAnalyzer. Is there a way to apply StandardAnalyzer to my Term, or, instead of a Term, how can I use QueryParser to delete the documents from the Lucene index?

    try {
        File INDEX_DIR = new File("D:\\merge lucene\\abc\\");

        Directory directory = FSDirectory.open(INDEX_DIR);

        IndexReader indexReader = IndexReader.open(directory, false);
        Term term = new Term("path", "fileindex23005.htm");
        int l = indexReader.deleteDocuments(term);
        indexReader.close();

        System.out.println("documents deleted");
    } catch (Exception x) {
        x.printStackTrace();
    }

3 Answers

11

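I assume you are using Lucene 3.6 or earlier; otherwise IndexReader.deleteDocuments no longer exists. Either way, you should be using an IndexWriter instead.

If you can only find the documents with a query parser, just run a normal query, then iterate over the returned documents and delete them by docnum, like this: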

Query query = queryParser.parse("My Query!");
ScoreDoc[] docs = searcher.search(query, 100).scoreDocs;
for (ScoreDoc doc : docs) {
    indexReader.deleteDocument(doc.doc);
}

Or better (simpler, and using functionality that is neither defunct nor deprecated), just use an IndexWriter and pass the query directly to it:

Query query = queryParser.parse("My Query!");
writer.deleteDocuments(query);
answered 2013-09-19T16:37:50.937
1

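Adding this for future reference for people like me: document deletion lives on IndexWriter, where you can use

indexWriter.deleteDocuments(Term... terms)

instead of the deleteDocuments(query) method; this is less hassle if you only need to match on one field. Be aware that if multiple terms are passed, this method treats them as an OR condition, so it matches any of the terms and deletes all matching records. The code below matches state=Tx in the stored documents and deletes the matching records.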

  indexWriter.deleteDocuments(
        new Term("STATE", "Tx")
      );

To combine different fields with an AND condition, we can use the following code:

 BooleanQuery.Builder builder = new BooleanQuery.Builder();

// Note: year is stored as an int, not as a String, when the document is created.
// A Term here would need "2016" as a String, which would not match documents that stored year as an int.
 Query yearQuery = IntPoint.newExactQuery("year", 2016);
 Query stateQuery = new TermQuery(new Term("STATE", "TX"));
 Query cityQuery = new TermQuery(new Term("CITY", "CITY NAME"));

 builder.add(yearQuery, BooleanClause.Occur.MUST);
 builder.add(stateQuery, BooleanClause.Occur.MUST);
 builder.add(cityQuery, BooleanClause.Occur.MUST);

 indexWriter.deleteDocuments(builder.build());
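
To make the OR versus AND difference concrete, here is a minimal sketch that uses no Lucene classes at all. It is a toy in-memory model in which each document is a plain map of field names to values; the class name, the method names, and the sample data are all hypothetical and invented for this illustration. It only mimics the matching rule described above (any term for the Term... overload, all clauses for a BooleanQuery built with MUST) and says nothing about how Lucene implements deletion internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy in-memory model only: no Lucene classes are used here.
public class DeleteSemanticsSketch {

    // Removes every document matching ANY field/value pair (OR); returns how many were removed.
    public static int deleteMatchingAny(List<Map<String, String>> docs, Map<String, String> terms) {
        int before = docs.size();
        docs.removeIf(doc -> terms.entrySet().stream()
                .anyMatch(t -> t.getValue().equals(doc.get(t.getKey()))));
        return before - docs.size();
    }

    // Removes only documents matching ALL field/value pairs (AND); returns how many were removed.
    public static int deleteMatchingAll(List<Map<String, String>> docs, Map<String, String> terms) {
        int before = docs.size();
        docs.removeIf(doc -> terms.entrySet().stream()
                .allMatch(t -> t.getValue().equals(doc.get(t.getKey()))));
        return before - docs.size();
    }

    // Hypothetical sample data, invented for this illustration.
    public static List<Map<String, String>> sampleDocs() {
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(Map.of("STATE", "TX", "CITY", "Austin"));
        docs.add(Map.of("STATE", "TX", "CITY", "Dallas"));
        docs.add(Map.of("STATE", "CA", "CITY", "Austin"));
        docs.add(Map.of("STATE", "CA", "CITY", "Fresno"));
        return docs;
    }

    public static void main(String[] args) {
        Map<String, String> terms = Map.of("STATE", "TX", "CITY", "Austin");
        System.out.println("OR deleted:  " + deleteMatchingAny(sampleDocs(), terms));
        System.out.println("AND deleted: " + deleteMatchingAll(sampleDocs(), terms));
    }
}
```

With this sample data the OR variant removes three of the four documents, while the AND variant removes only the one that is both in TX and in Austin, which is why passing several terms to the Term... overload can delete far more than intended.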
answered 2017-01-18T02:21:52.947
0

As @dillippattnaik pointed out, multiple terms result in an OR. I have updated his code to use a BooleanQuery:

BooleanQuery query = new BooleanQuery
{
   { new TermQuery( new Term( "year", "2016" ) ), Occur.MUST },
   { new TermQuery( new Term( "STATE", "TX" ) ), Occur.MUST },
   { new TermQuery( new Term( "CITY", "CITY NAME" ) ), Occur.MUST }
};

indexWriter.DeleteDocuments( query );
answered 2018-08-31T17:30:16.440