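We are having a problem with time range searches/filters in Lucene 4.0.0. We have indexed some tweets, and now we want to collect the tweets that a specific user sent within a specific time range. When we run the query with the filter we created, we get tweets from outside the specified time range. In the example below, we did not expect "exp tweet" to be returned, because its timeStamp is lower than lowerBound.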

Could you give us any advice on how to perform this task, or tell us what is wrong with our code?

Regards

Relevant code:

// time range, format "yyyyMMddHHmmss"
String upperBoundStr = "20110126024422";
String lowerBoundStr = "20110126021422";
String tweetTimeStr = "20110126022922";

//create filter
Filter lowerFilter = new QueryWrapperFilter( TermRangeQuery.newStringRange("creationTime",lowerBoundStr,tweetTimeStr,true,false));      
Filter upperFilter = new QueryWrapperFilter( TermRangeQuery.newStringRange("creationTime",tweetTimeStr,upperBoundStr,false,true));
Filter[] filters = new Filter[2];
filters[0] = lowerFilter;
filters[1] = upperFilter;
Filter chainFilter = new ChainedFilter(filters, ChainedFilter.OR);

// search
Query luceneQuery = new TermQuery(new Term("username", "userName1"));
SimpleFSDirectory index = new SimpleFSDirectory(new File("lucene_index"));
IndexReader reader = DirectoryReader.open(index);
IndexSearcher searcher = new IndexSearcher(reader);
ScoreDoc[] hits = searchFilteredQuery(luceneQuery, searcher,chainFilter,maxNumberOfNewTweets);
List<RankResult> filteredtweets = convertHitsToRankResults(hits, searcher);

Sample output (format: date dateIn("yyyyMMddHHmmss") userName):

base tweet: Wed Jan 26 02:29:22 VET 2011 20110126022922 userName1
exp tweet: Tue Jan 25 20:05:02 VET 2011 20110125200502 userName1

1 Answer

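Do you only want the tweets whose timestamps fall between lowerBoundStr and upperBoundStr? If so, you should change Filter chainFilter = new ChainedFilter(filters, ChainedFilter.OR); to Filter chainFilter = new ChainedFilter(filters, ChainedFilter.AND);. OR means that tweets with a timestamp greater than lowerBoundStr and tweets with a timestamp less than upperBoundStr are both put into the search results.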
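
To see what each combination admits, the range logic can be checked outside Lucene. The sketch below is only a model, not the Lucene implementation: it assumes creationTime is indexed as a single untokenized term, so that the term ordering used by TermRangeQuery agrees with String.compareTo for these fixed-width digit strings. The class TimeRangeCheck, its methods, and the timestamp 20110126023000 are hypothetical and used for illustration only. Note that the two ranges as defined in the question share no value (tweetTimeStr is excluded from both), so it may be simpler to express the window as one range from lowerBoundStr to upperBoundStr.

```java
// Hypothetical helper for illustration; it models the filters, it does not call Lucene.
class TimeRangeCheck {
    // Bounds copied from the question, format "yyyyMMddHHmmss".
    static final String LOWER = "20110126021422";
    static final String TWEET = "20110126022922";
    static final String UPPER = "20110126024422";

    // Lexicographic range test; for fixed-width digit strings this is chronological order.
    static boolean inRange(String ts, String lo, String hi, boolean incLo, boolean incHi) {
        int cLo = ts.compareTo(lo);
        int cHi = ts.compareTo(hi);
        return (incLo ? cLo >= 0 : cLo > 0) && (incHi ? cHi <= 0 : cHi < 0);
    }

    // Models lowerFilter: [LOWER, TWEET)
    static boolean lowerHalf(String ts) { return inRange(ts, LOWER, TWEET, true, false); }

    // Models upperFilter: (TWEET, UPPER]
    static boolean upperHalf(String ts) { return inRange(ts, TWEET, UPPER, false, true); }

    // Union and intersection of the two half-ranges.
    static boolean matchesOr(String ts)  { return lowerHalf(ts) || upperHalf(ts); }
    static boolean matchesAnd(String ts) { return lowerHalf(ts) && upperHalf(ts); }

    // Alternative: one inclusive range covering the whole window.
    static boolean singleRange(String ts) { return inRange(ts, LOWER, UPPER, true, true); }

    public static void main(String[] args) {
        // First value is the "exp tweet" from the question; second is a made-up in-window value.
        for (String ts : new String[] {"20110125200502", "20110126023000"}) {
            System.out.println(ts + " OR=" + matchesOr(ts)
                    + " AND=" + matchesAnd(ts)
                    + " single=" + singleRange(ts));
        }
    }
}
```

Under this model the "exp tweet" timestamp 20110125200502 falls outside both half-ranges, so if it still shows up in the results it may be worth checking how the field was indexed and whether searchFilteredQuery actually applies the filter.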

answered 2013-09-05T03:21:35.890