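I have millions of records in Elasticsearch. Today I realized that some of them are duplicated. Is there any way to remove these duplicate records?
This is my query.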
{
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "must": [
            {"match": {"sensorId": "14FA084408"}},
            {"match": {"variableName": "FORWARD_FLOW"}}
          ]
        }
      },
      "filter": {
        "range": {
          "timestamp": {
            "gt": "2015-07-04",
            "lt": "2015-07-06"
          }
        }
      }
    }
  }
}
This is what I get back from it.
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 21,
    "max_score": 8.272615,
    "hits": [
      {
        "_index": "iotsens-summarizedmeasures",
        "_type": "summarizedmeasure",
        "_id": "AU5isxVcMpd7AZtvmZcK",
        "_score": 8.272615,
        "_source": {
          "id": null,
          "sensorId": "14FA084408",
          "variableName": "FORWARD_FLOW",
          "rawValue": "0.2",
          "value": "0.2",
          "timestamp": 1436047200000,
          "summaryTimeUnit": "DAYS"
        }
      },
      {
        "_index": "iotsens-summarizedmeasures",
        "_type": "summarizedmeasure",
        "_id": "AU5isxVnMpd7AZtvmZcL",
        "_score": 8.272615,
        "_source": {
          "id": null,
          "sensorId": "14FA084408",
          "variableName": "FORWARD_FLOW",
          "rawValue": "0.2",
          "value": "0.2",
          "timestamp": 1436047200000,
          "summaryTimeUnit": "DAYS"
        }
      },
      {
        "_index": "iotsens-summarizedmeasures",
        "_type": "summarizedmeasure",
        "_id": "AU5isxV6Mpd7AZtvmZcN",
        "_score": 8.0957,
        "_source": {
          "id": null,
          "sensorId": "14FA084408",
          "variableName": "FORWARD_FLOW",
          "rawValue": "0.2",
          "value": "0.2",
          "timestamp": 1436047200000,
          "summaryTimeUnit": "DAYS"
        }
      },
      {
        "_index": "iotsens-summarizedmeasures",
        "_type": "summarizedmeasure",
        "_id": "AU5isxWOMpd7AZtvmZcP",
        "_score": 8.0957,
        "_source": {
          "id": null,
          "sensorId": "14FA084408",
          "variableName": "FORWARD_FLOW",
          "rawValue": "0.2",
          "value": "0.2",
          "timestamp": 1436047200000,
          "summaryTimeUnit": "DAYS"
        }
      },
      {
        "_index": "iotsens-summarizedmeasures",
        "_type": "summarizedmeasure",
        "_id": "AU5isxW8Mpd7AZtvmZcT",
        "_score": 8.0957,
        "_source": {
          "id": null,
          "sensorId": "14FA084408",
          "variableName": "FORWARD_FLOW",
          "rawValue": "0.2",
          "value": "0.2",
          "timestamp": 1436047200000,
          "summaryTimeUnit": "DAYS"
        }
      },
      {
        "_index": "iotsens-summarizedmeasures",
        "_type": "summarizedmeasure",
        "_id": "AU5isxXFMpd7AZtvmZcU",
        "_score": 8.0957,
        "_source": {
          "id": null,
          "sensorId": "14FA084408",
          "variableName": "FORWARD_FLOW",
          "rawValue": "0.2",
          "value": "0.2",
          "timestamp": 1436047200000,
          "summaryTimeUnit": "DAYS"
        }
      },
      {
        "_index": "iotsens-summarizedmeasures",
        "_type": "summarizedmeasure",
        "_id": "AU5isxXbMpd7AZtvmZcW",
        "_score": 8.0957,
        "_source": {
          "id": null,
          "sensorId": "14FA084408",
          "variableName": "FORWARD_FLOW",
          "rawValue": "0.2",
          "value": "0.2",
          "timestamp": 1436047200000,
          "summaryTimeUnit": "DAYS"
        }
      },
      {
        "_index": "iotsens-summarizedmeasures",
        "_type": "summarizedmeasure",
        "_id": "AU5isxUtMpd7AZtvmZcG",
        "_score": 8.077545,
        "_source": {
          "id": null,
          "sensorId": "14FA084408",
          "variableName": "FORWARD_FLOW",
          "rawValue": "0.2",
          "value": "0.2",
          "timestamp": 1436047200000,
          "summaryTimeUnit": "DAYS"
        }
      },
      {
        "_index": "iotsens-summarizedmeasures",
        "_type": "summarizedmeasure",
        "_id": "AU5isxXPMpd7AZtvmZcV",
        "_score": 8.077545,
        "_source": {
          "id": null,
          "sensorId": "14FA084408",
          "variableName": "FORWARD_FLOW",
          "rawValue": "0.2",
          "value": "0.2",
          "timestamp": 1436047200000,
          "summaryTimeUnit": "DAYS"
        }
      },
      {
        "_index": "iotsens-summarizedmeasures",
        "_type": "summarizedmeasure",
        "_id": "AU5isxUZMpd7AZtvmZcE",
        "_score": 7.9553676,
        "_source": {
          "id": null,
          "sensorId": "14FA084408",
          "variableName": "FORWARD_FLOW",
          "rawValue": "0.2",
          "value": "0.2",
          "timestamp": 1436047200000,
          "summaryTimeUnit": "DAYS"
        }
      }
    ]
  }
}
As you can see, I have 21 duplicate records for the same day. How can I remove the duplicates and keep only one record per day? Thanks.
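To make the goal concrete, here is a minimal Python sketch of the rule I have in mind, applied to hits shaped like the response shown earlier. It is only an illustration under assumptions: treating `sensorId`, `variableName`, `timestamp` and `summaryTimeUnit` together as the identity of a record is my guess at a sensible key, the function names `split_duplicates` and `bulk_delete_body` are made up for this example, and the third sample hit is hypothetical. The sketch does not talk to the cluster; it only decides which `_id` values would be deleted and formats them as delete actions for the `_bulk` endpoint.

```python
import json

# Fields that together define "the same record". This choice is an assumption.
KEY_FIELDS = ("sensorId", "variableName", "timestamp", "summaryTimeUnit")


def split_duplicates(hits, key_fields=KEY_FIELDS):
    """Return (kept, duplicates): the first hit seen per key is kept."""
    seen = set()
    kept, duplicates = [], []
    for hit in hits:
        key = tuple(hit["_source"].get(field) for field in key_fields)
        if key in seen:
            duplicates.append(hit)
        else:
            seen.add(key)
            kept.append(hit)
    return kept, duplicates


def bulk_delete_body(duplicates):
    """Build a newline-delimited body of delete actions for the _bulk endpoint."""
    lines = [
        json.dumps({"delete": {"_index": h["_index"],
                               "_type": h["_type"],
                               "_id": h["_id"]}})
        for h in duplicates
    ]
    return "\n".join(lines) + "\n" if lines else ""


def make_hit(doc_id, timestamp):
    """Helper for the sample data below; mirrors the shape of a search hit."""
    return {
        "_index": "iotsens-summarizedmeasures",
        "_type": "summarizedmeasure",
        "_id": doc_id,
        "_source": {
            "sensorId": "14FA084408",
            "variableName": "FORWARD_FLOW",
            "timestamp": timestamp,
            "summaryTimeUnit": "DAYS",
        },
    }


sample_hits = [
    make_hit("AU5isxVcMpd7AZtvmZcK", 1436047200000),  # from the response
    make_hit("AU5isxVnMpd7AZtvmZcL", 1436047200000),  # from the response
    make_hit("hypothetical-id", 1436133600000),       # hypothetical next day
]

kept, duplicates = split_duplicates(sample_hits)
print("keep:", [h["_id"] for h in kept])
print(bulk_delete_body(duplicates), end="")
```

On millions of records the hits would have to be read in pages rather than in one response, so this only shows the selection rule, not a complete cleanup procedure.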