I have the following data:
{
    "_index": "user_log",
    "_type": "logs",
    "_id": "gdUJpXIBAoADuwvHTK29",
    "_score": 1,
    "_source": {
        "user_id": 105,
        "user_name": "prathameshsalap@gmail.com",
        "working_hours": "2019-10-21 09:00:01",
        "date": "2019-10-21",
        "working_minutes": 540
    }
}
{
    "_index": "user_log",
    "_type": "logs",
    "_id": "gtUJpXIBAoADuwvHTK29",
    "_version": 1,
    "_score": 0,
    "_source": {
        "user_id": 106,
        "user_name": "vaishusawant143@gmail.com",
        "working_hours": "2019-10-21 09:15:01",
        "date": "2019-10-21",
        "working_minutes": 555
    }
}
Each document has several fields: user_id, user_name, working_hours, date, and working_minutes. I only want to select user_id, user_name, and avg_hours (the average of working_minutes computed for each user).
import csv
from elasticsearch import Elasticsearch

body = {
    "query": {"match_all": {}},
    "aggs": {
        "users": {
            "terms": {
                "field": "user_name.keyword",
                "order": {"avg_hours": "desc"}
            },
            "aggs": {
                "avg_hours": {
                    "avg": {"field": "working_minutes"}
                }
            }
        }
    }
}

es_obj = Elasticsearch()
response = es_obj.search(index='user_log', body=body)

# Save into CSV (path is the output file path, defined elsewhere)
with open(path, 'w', newline='') as files:
    header_present = False
    for doc in response['hits']['hits']:
        my_dict = doc['_source']
        if not header_present:
            # Write the header once, using the keys of the first document
            w = csv.DictWriter(files, my_dict.keys())
            w.writeheader()
            header_present = True
        w.writerow(my_dict)
It returns the following output:
user_id | date       | user_name                 | working_hours       | working_minutes
--------|------------|---------------------------|---------------------|----------------
105     | 2019-10-21 | prathameshsalap@gmail.com | 2019-10-21 09:00:01 | 540
106     | 2019-10-21 | vaishusawant143@gmail.com | 2019-10-21 09:15:01 | 555
This returns every field: user_id, user_name, working_hours, date, and working_minutes. I don't want all of these fields. How can I select only user_id, user_name, and avg_hours in this query?
Expected output:
user_id | user_name                 | Avg_hour
--------|---------------------------|---------
105     | prathameshsalap@gmail.com | 450.55
106     | vaishusawant143@gmail.com | 350
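
One possible direction, as a rough sketch rather than a definitive answer: the per-user averages live under `response['aggregations']`, not under `response['hits']['hits']`, so the CSV rows could be built from the aggregation buckets instead. The sketch below assumes each user_name maps to exactly one user_id. The `user_info` sub-aggregation name and the helper functions `buckets_to_rows` and `write_rows` are hypothetical names introduced here for illustration; `top_hits` is used only to recover user_id from one sample document per bucket.

```python
import csv

# Variant of the query body: "size": 0 skips the raw hits, and a top_hits
# sub-aggregation (named "user_info" here, a hypothetical name) fetches one
# document per user so that user_id can be read from it.
body = {
    "size": 0,
    "query": {"match_all": {}},
    "aggs": {
        "users": {
            "terms": {
                "field": "user_name.keyword",
                "order": {"avg_hours": "desc"},
            },
            "aggs": {
                "avg_hours": {"avg": {"field": "working_minutes"}},
                "user_info": {
                    "top_hits": {"size": 1, "_source": ["user_id"]}
                },
            },
        }
    },
}


def buckets_to_rows(response):
    """Flatten the 'users' terms buckets into one dict per user."""
    rows = []
    for bucket in response["aggregations"]["users"]["buckets"]:
        # Assumption: every document in a bucket shares the same user_id.
        sample = bucket["user_info"]["hits"]["hits"][0]["_source"]
        rows.append({
            "user_id": sample["user_id"],
            "user_name": bucket["key"],
            "avg_hours": bucket["avg_hours"]["value"],
        })
    return rows


def write_rows(rows, file_obj):
    """Write the flattened rows as CSV with a fixed three-column header."""
    w = csv.DictWriter(file_obj, fieldnames=["user_id", "user_name", "avg_hours"])
    w.writeheader()
    w.writerows(rows)
```

With the client from the question, this would presumably be used as `response = es_obj.search(index='user_log', body=body)` followed by `write_rows(buckets_to_rows(response), files)` inside the `with open(path, 'w', newline='')` block. Note that a terms aggregation returns only the top 10 buckets by default, so a larger `"size"` inside `"terms"` may be needed when there are more users.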