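I am trying to use ModeShape for full-text search. I am particularly interested in ranking results by relevance using the Lucene index. This is my repository configuration: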
    "indexProviders": {
        "lucene": {
            "classname": "lucene",
            "directory": "${user.home}/repository/indexes"
        }
    },
    "indexes": {
        "textFromFiles": {
            "kind": "text",
            "provider": "lucene",
            "nodeType": "nt:resource",
            "columns": "jcr:data(BINARY)"
        }
    },
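I can see that a Lucene index is created at the specified location. I added 10-15 files to the repository, each containing a different number of occurrences of the search term, and tried searching for a few terms. I print the score of each row like this: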
    QueryManager querymgr = session.getWorkspace().getQueryManager();
    String query = "SELECT file.* FROM [nt:hierarchyNode] AS file "
            + "LEFT JOIN [nt:resource] AS data ON ISCHILDNODE(data, file) WHERE "
            + "contains(data.*, '" + searchText + "')";
    Query createQuery = querymgr.createQuery(query, Query.JCR_SQL2);
    QueryResult result = createQuery.execute();
    RowIterator rows = result.getRows();
    while (rows.hasNext()) {
        Row nextRow = rows.nextRow();
        LOGGER.info("score : {}", nextRow.getScore());
    }
However, the score is always 1.0 for every result. I also tried a simpler query without the join:
SELECT data.* FROM [nt:resource] as data WHERE contains(data.*, 'searchterm')
That made no difference either.
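
For reference, here is a minimal sketch of the variant I would expect to produce ranked results. It builds the simpler query with an explicit ORDER BY SCORE(data) DESC, using the SCORE() operand from JCR-SQL2, and doubles single quotes so the search text cannot break out of the string literal. The class and method names (ScoreQueryBuilder, escapeLiteral, buildQuery) are hypothetical helpers of mine, not part of ModeShape or the JCR API, and the escaping covers only the SQL literal, not the full-text search syntax itself. I do not know whether ModeShape's Lucene provider returns a non-constant score for this query.

```java
public class ScoreQueryBuilder {

    // Hypothetical helper: escape a value for a JCR-SQL2 single-quoted
    // string literal by doubling any embedded single quotes.
    static String escapeLiteral(String text) {
        return text.replace("'", "''");
    }

    // Hypothetical helper: build the simple full-text query and ask for the
    // rows to be ordered by the full-text score, highest first.
    static String buildQuery(String searchText) {
        return "SELECT data.* FROM [nt:resource] AS data"
                + " WHERE CONTAINS(data.*, '" + escapeLiteral(searchText) + "')"
                + " ORDER BY SCORE(data) DESC";
    }

    public static void main(String[] args) {
        // Print the generated statement so it can be inspected before use.
        System.out.println(buildQuery("o'reilly"));
    }
}
```

The resulting string would be passed to querymgr.createQuery(..., Query.JCR_SQL2) exactly as in my code above.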