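In Solr 4.10, I have 170,000,000 documents spread across 11 sharded cores. Each document represents one visit to my website since 2008, and each of the 11 cores represents one year.
I need to find the number of visits for a list of items, so I run a query like the one below: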
using facet.field, "QTime": 10557
(after clearing the caches by reloading the cores)
q=(owningItem:178350+OR+owningItem:51760+OR+owningItem:71585)+AND+statistics_type:view&shards=localhost:8080/solr//statistics-2014,localhost:8080/solr//statistics-2017,localhost:8080/solr//statistics-2016,localhost:8080/solr//statistics-2008,localhost:8080/solr//statistics-2011,localhost:8080/solr//statistics-2012,localhost:8080/solr//statistics-2010,localhost:8080/solr//statistics-2013,localhost:8080/solr//statistics-2009,localhost:8080/solr//statistics-2015,localhost:8080/solr//statistics&facet.limit=4&facet.field=owningItem&facet.mincount=1
Result:
"facet_counts": {
"facet_queries": {},
"facet_fields": {
"owningItem": [
"51760",
3502,
"71585",
1860
]
},
"facet_dates": {},
"facet_ranges": {},
"facet_intervals": {}
},
When I debug this query, I can see that each core returns facet.field values that do not belong to the query result:
response={numFound=953,start=0,maxScore=1.9732983,docs=[]},sort_values={},facet_counts={facet_queries={},facet_fields={owningItem={51760=556,71585=397,**1=0,10=0,100=0,1000=0,10000=0,100000=0,100001=0,100002=0,100003=0,100004=0,100005=0,100007=0,100008=0,10001=0**}},facet_dates={},facet_ranges={},facet_intervals={}}
So I tried using facet.query instead of facet.field:
using facet.query, "QTime": 1346
q=(owningItem:178350+OR+owningItem:51760+OR+owningItem:71585)+AND+statistics_type:view&shards=localhost:8080/solr//statistics-2014,localhost:8080/solr//statistics-2017,localhost:8080/solr//statistics-2016,localhost:8080/solr//statistics-2008,localhost:8080/solr//statistics-2011,localhost:8080/solr//statistics-2012,localhost:8080/solr//statistics-2010,localhost:8080/solr//statistics-2013,localhost:8080/solr//statistics-2009,localhost:8080/solr//statistics-2015,localhost:8080/solr//statistics&facet.limit=4&facet.query=owningItem:178350&facet.query=owningItem:51760&facet.query=owningItem:71585&facet.mincount=1
"facet_counts": {
"facet_queries": {
"owningItem:178350": 0,
"owningItem:51760": 3502,
"owningItem:71585": 1860
},
"facet_fields": {},
"facet_dates": {},
"facet_ranges": {},
"facet_intervals": {}
},
And in the debug output, only the items that belong to the result are used:
response={numFound=953,start=0,maxScore=1.9732983,docs=[]},sort_values={},facet_counts={facet_queries={owningItem:178350=0,owningItem:51760=556,owningItem:71585=397},facet_fields={},facet_dates={},facet_ranges={},facet_intervals={}}
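For reference, here is a minimal sketch of how the facet.query variant above could be generated for an arbitrary list of items. The function name `build_facet_query_string` is hypothetical (it is not part of Solr or any client library); the parameters simply mirror the ones in my query, and I assume `facet=true` is already set elsewhere (for example in the request handler defaults), since my query does not pass it explicitly.

```python
from urllib.parse import urlencode


def build_facet_query_string(item_ids, shards):
    """Build the URL query string for the facet.query variant.

    Hypothetical helper for illustration only. It emits one facet.query
    parameter per item instead of a single facet.field=owningItem.
    """
    # Same main query as in the post: any of the items, views only.
    item_clause = " OR ".join(f"owningItem:{item_id}" for item_id in item_ids)
    params = [
        ("q", f"({item_clause}) AND statistics_type:view"),
        ("shards", ",".join(shards)),
        ("facet.limit", "4"),
        ("facet.mincount", "1"),
    ]
    # One facet.query per item; repeated keys are allowed in a list of tuples.
    params.extend(("facet.query", f"owningItem:{item_id}") for item_id in item_ids)
    # Assumption: facet=true comes from the handler defaults, as in the post.
    return urlencode(params)


if __name__ == "__main__":
    shards = [
        "localhost:8080/solr//statistics-2014",
        "localhost:8080/solr//statistics",
    ]
    print(build_facet_query_string([178350, 51760, 71585], shards))
```

The output is percent-encoded, so it will not look character-for-character like the query I pasted, but it should decode to the same parameters.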
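My conclusion is that facet.field is computed over more than just the results of the Solr query. However, I don't think this conclusion is right.
My questions:
Why is facet.query faster than facet.field?
Does Solr really compute facet.field over documents that do not belong to the query result?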