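I am upgrading to Solr 4.1 and cannot retrieve position and offset information with the new API. My index contains a single document with one field holding the string "a quick brown fox jumps over a lazy dog". I query the index for "a" and try to retrieve the positions and offsets that correspond to "a".

Here is the code snippet: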

Terms terms = reader.getTermVector(docId, fieldName);
TermsEnum termsEnum = terms.iterator(TermsEnum.EMPTY);
BytesRef term;
while ((term = termsEnum.next()) != null) {
    String docTerm = term.utf8ToString();
    DocsAndPositionsEnum docPosEnum = termsEnum.docsAndPositions(null, null, DocsAndPositionsEnum.FLAG_OFFSETS);
    // Check if the current term is the same as the query term and, if so,
    // retrieve all positions (a term can occur multiple times in a field) corresponding to the term
    if (queryTerms.contains(docTerm)) {
        int position;
        while ((position = docPosEnum.nextPosition()) != -1) {
            int start = docPosEnum.startOffset();
            int end = docPosEnum.endOffset();
            // Store start, end and position in a list
        }
    }
}

The inner while loop is not correct. Any pointers on how to iterate over all the positions in a DocsAndPositionsEnum would be greatly appreciated.

2 Answers

This worked for me:

Terms terms = reader.getTermVector(docId, fieldName);
TermsEnum termsEnum = terms.iterator(TermsEnum.EMPTY);
BytesRef term;
while ((term = termsEnum.next()) != null) {
    String docTerm = term.utf8ToString();
    // Check if the current term is the same as the query term and, if so,
    // retrieve all positions (a term can occur multiple times in a field) corresponding to the term
    if (queryTerms.contains(docTerm)) {
        DocsAndPositionsEnum docPosEnum = termsEnum.docsAndPositions(null, null, DocsAndPositionsEnum.FLAG_OFFSETS);
        // Move the enum onto the document before reading freq() or positions
        docPosEnum.nextDoc();
        // Retrieve the term frequency in the current document
        int freq = docPosEnum.freq();
        for (int i = 0; i < freq; i++) {
            int position = docPosEnum.nextPosition();
            int start = docPosEnum.startOffset();
            int end = docPosEnum.endOffset();
            // Store start, end and position in a list
        }
    }
}
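
The reason the loop has to be bounded by freq() can be seen without an index at hand. The sketch below uses only the Java standard library: ToyPositions is a hypothetical stand-in written for this illustration, not a Lucene class, and it only mimics the two rules that matter here (the enum is unpositioned until nextDoc() is called, and nextPosition() gives no end marker such as -1). The sample positions and offsets assume plain whitespace tokenization of the sentence from the question, with no stop-word removal.

```java
import java.util.ArrayList;
import java.util.List;

class PositionLoopSketch {

    /** Hypothetical stand-in for one term's postings in a single document (NOT a Lucene class). */
    static final class ToyPositions {
        private final int[] positions;
        private final int[] starts;
        private final int[] ends;
        private boolean onDoc = false; // set once nextDoc() has been called
        private int cursor = -1;       // index of the current occurrence

        ToyPositions(int[] positions, int[] starts, int[] ends) {
            this.positions = positions;
            this.starts = starts;
            this.ends = ends;
        }

        int nextDoc() {
            onDoc = true;
            return 0;
        }

        int freq() {
            requireDoc();
            return positions.length;
        }

        int nextPosition() {
            requireDoc();
            // No sentinel value: the toy fails loudly when called too often.
            if (cursor + 1 >= positions.length) {
                throw new IllegalStateException("nextPosition() called more than freq() times");
            }
            return positions[++cursor];
        }

        int startOffset() {
            return starts[cursor];
        }

        int endOffset() {
            return ends[cursor];
        }

        private void requireDoc() {
            if (!onDoc) {
                throw new IllegalStateException("call nextDoc() first");
            }
        }
    }

    /** Collects {position, startOffset, endOffset} for every occurrence of the term. */
    static List<int[]> collect(ToyPositions postings) {
        List<int[]> occurrences = new ArrayList<>();
        postings.nextDoc();               // position the enum on the document
        int freq = postings.freq();       // number of occurrences in that document
        for (int i = 0; i < freq; i++) {  // bounded loop, never tests for -1
            int position = postings.nextPosition();
            occurrences.add(new int[] {position, postings.startOffset(), postings.endOffset()});
        }
        return occurrences;
    }

    public static void main(String[] args) {
        // "a quick brown fox jumps over a lazy dog": "a" is token 0 and token 6
        ToyPositions a = new ToyPositions(new int[] {0, 6}, new int[] {0, 29}, new int[] {1, 30});
        for (int[] occ : collect(a)) {
            System.out.println("position=" + occ[0] + " start=" + occ[1] + " end=" + occ[2]);
        }
    }
}
```

With the real DocsAndPositionsEnum the failure is less obvious than in this toy: as far as I know, the Lucene 4.x Javadoc only says that calling nextPosition() more than freq() times is undefined, so a `!= -1` test has nothing reliable to stop on.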
answered 2013-03-13T14:23:18.870

You are not iterating to the Document in your DocsAndPositionsEnum.

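    if (queryTerms.contains(docTerm)) {
        // Position the enum on the document before reading freq() or positions
        docPosEnum.advance(docId);
        int freq = docPosEnum.freq();
        for (int i = 0; i < freq; i++) {
            int position = docPosEnum.nextPosition();
            int start = docPosEnum.startOffset();
            int end = docPosEnum.endOffset();
            // Store start, end and position in a list
        }
    }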

I would guess you might also want to store the docid returned from docPosEnum.nextDoc().

answered 2013-03-12T20:57:40.297