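I have a working cURL request that searches an Elasticsearch index using an aggregation query. As expected, the response includes the list of values for the specified aggregation field and the count of documents matching each of those values. For example, I aggregate contacts by zip code, and the response includes 50 zip codes along with the number of contacts in each. Great.
Now I have also written a Java function that executes the same aggregation query. How do I parse the data nested inside the aggregation response? In particular, I want to extract the key and docCount values of each bucket. I could not find an example of this online or in the Elastic docs.
Here is what I have so far...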
@GET
@Path("{indexName}")
public void searchResults(@PathParam("indexName") String indexName) throws IOException {
    RestHighLevelClient client = createHighLevelRestClient();
    int numberOfSearchHitsToReturn = 100; // defaults to 10
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.size(numberOfSearchHitsToReturn);
    GlobalAggregationBuilder aggregation = AggregationBuilders.global("agg")
            .subAggregation(AggregationBuilders.terms("home_zip_aggregation").field("home_zip.keyword"));
    sourceBuilder.aggregation(aggregation);
    SearchRequest searchRequest = new SearchRequest(indexName).source(sourceBuilder);
    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    Aggregations aggregations = searchResponse.getAggregations();
    Terms byZipAggregation = aggregations.get("home_zip");
    System.out.print(byZipAggregation);
    System.out.print(searchResponse);
    client.close();
}
searchResponse does contain the list of aggregations. However, byZipAggregation is null. How do I get the home_zip aggregation data as an object? I am using this Elastic documentation...
https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/_bucket_aggregations.html
Here is the value of searchResponse:
{"took":7,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":51,"relation":"eq"},"max_score":1.0,"hits":[{"_index":"contacts_6_cjluhmdki6","_type":"_doc","_id":"2093","_score":1.0,"_source":{"list_id":"6","contact_id":"2093","firstname":"DANIEL","middlename":"C","lastname":"BRYANT","email":"","home_address1":"602 STONE CIRCLE CT APT 2","home_city":"SCHAUMBURG","home_state":"IL","home_zip":"60194","home_phone":"","latitude":"42.030346","longitude":"-88.06422","location_point":"0101000020E6100000F2EF332E1C0456C03FC8B260E2034540","date_of_birth":"10/26/1991","sex":"M","registered_party":"0","created":"2019-11-13 21:24:55.825672","imported":"2019-11-13 15:24:51.006805","fulltext":"'2':8 '60194':10 '602':3 'apt':7 'bryant':2 'circle':5 'ct':6 'daniel':1 'schaumburg':9 'stone':4","home_house_num":"602","home_street_name":"STONE CIRCLE","home_street_type":"CT","home_unit_num":"APT 2","fake_col":"0.414"} ...
More document data was here. I removed it to simplify this example.
}}]},"aggregations":{"global#agg":{"doc_count":51,"sterms#home_zip_aggregation":{"doc_count_error_upper_bound":0,"sum_other_doc_count":38,"buckets":[{"key":"60462","doc_count":2},{"key":"60506","doc_count":2},{"key":"60005","doc_count":1},{"key":"60030","doc_count":1},{"key":"60061","doc_count":1},{"key":"60098","doc_count":1},{"key":"60102","doc_count":1},{"key":"60126","doc_count":1},{"key":"60137","doc_count":1},{"key":"60187","doc_count":1}]}}}}
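I realize that I could pass the entire Aggregations object to our client code, which is written in JavaScript, and parse out the needed fields there. However, we would like to do all of this parsing in the Java server code so that we do not pass unnecessary data to the client. In addition, some responses appear to be too large for our server to pass to the client. So, how do I parse out the bucket keys and docCounts in Java?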
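One likely reason for the null: in the printed response the terms aggregation is not at the top level. It sits inside the global aggregation "agg" and is named "home_zip_aggregation", so a top-level lookup of "home_zip" has nothing to return. The sketch below walks that same nesting (global, then terms, then buckets) and collects each bucket's key and doc_count. To keep it runnable without a cluster, it works on plain nested Maps that stand in for the parsed "aggregations" JSON. The class and method names (ZipBucketExtractor, bucketCounts, sampleAggregations) are hypothetical, and the typed-client calls in the leading comment are an untested outline of how the same walk might look with the client's Global and Terms types.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// With the typed client objects, the same walk would look roughly like this
// (an outline only, not run against a cluster):
//   Global global = searchResponse.getAggregations().get("agg");
//   Terms byZip = global.getAggregations().get("home_zip_aggregation");
//   for (Terms.Bucket bucket : byZip.getBuckets()) {
//       String key = bucket.getKeyAsString();
//       long docCount = bucket.getDocCount();
//   }
public class ZipBucketExtractor {

    // Keys exactly as they appear in the printed response: "<type>#<name>".
    static final String GLOBAL_KEY = "global#agg";
    static final String TERMS_KEY = "sterms#home_zip_aggregation";

    /**
     * Walks aggregations -> global -> terms -> buckets and returns
     * bucket key -> doc_count in response order.
     * Returns an empty map if any level of the nesting is missing.
     */
    @SuppressWarnings("unchecked")
    static Map<String, Long> bucketCounts(Map<String, Object> aggregations) {
        Map<String, Long> counts = new LinkedHashMap<>();
        Object global = aggregations.get(GLOBAL_KEY);
        if (!(global instanceof Map)) {
            return counts;
        }
        Object terms = ((Map<String, Object>) global).get(TERMS_KEY);
        if (!(terms instanceof Map)) {
            return counts;
        }
        Object buckets = ((Map<String, Object>) terms).get("buckets");
        if (!(buckets instanceof List)) {
            return counts;
        }
        for (Object item : (List<Object>) buckets) {
            Map<String, Object> bucket = (Map<String, Object>) item;
            String key = String.valueOf(bucket.get("key"));
            long docCount = ((Number) bucket.get("doc_count")).longValue();
            counts.put(key, docCount);
        }
        return counts;
    }

    /** Hypothetical stand-in for the parsed "aggregations" object, using the first three buckets of the response. */
    static Map<String, Object> sampleAggregations() {
        List<Map<String, Object>> buckets = List.of(
                Map.<String, Object>of("key", "60462", "doc_count", 2),
                Map.<String, Object>of("key", "60506", "doc_count", 2),
                Map.<String, Object>of("key", "60005", "doc_count", 1));
        Map<String, Object> terms =
                Map.<String, Object>of("sum_other_doc_count", 38, "buckets", buckets);
        Map<String, Object> global =
                Map.<String, Object>of("doc_count", 51, TERMS_KEY, terms);
        return Map.<String, Object>of(GLOBAL_KEY, global);
    }

    public static void main(String[] args) {
        Map<String, Long> counts = bucketCounts(sampleAggregations());
        counts.forEach((zip, n) -> System.out.println(zip + " -> " + n));
    }
}
```

Returning a small Map of zip code to count means only the keys and counts need to leave the server, rather than the full search response.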