On our site we allow users to filter Elasticsearch results with a set of filters divided into categories:
A
  A1
  A2
  ...
B
  B1
  B2
  B3
  ...
These are just matching on literal tags that can appear in a field of the document, like:
{ tags: ["A1", "B1", "B2"] }
Our existing query joins all the filter terms under AND, so if the user selects A1, B1, and B2, we filter by (A1 AND B1 AND B2).
We want to change this to "OR within each filter category" and "AND across categories", so that you'd get: (A1) AND (B1 OR B2).
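To make the target filter concrete, here is a minimal sketch of how the request body could be built in Python. It assumes "tags" is mapped as a keyword (not analyzed) field and that the selection arrives as a dict of category to selected tags; the function name `build_filter_query` and that input shape are hypothetical, not our actual code. It relies on a `terms` query matching any of its listed values (OR) and on the clauses of `bool.filter` all having to match (AND).

```python
def build_filter_query(selected, tag_field="tags"):
    """Build an Elasticsearch request body: OR within a category, AND across.

    `selected` maps a category name to the list of tags chosen in it,
    e.g. {"A": ["A1"], "B": ["B1", "B2"]} (hypothetical input shape).
    """
    clauses = [
        # One `terms` clause per category: matches docs having ANY of the tags.
        {"terms": {tag_field: sorted(tags)}}
        for _, tags in sorted(selected.items())
        if tags  # categories with nothing selected add no constraint
    ]
    # Every clause under bool.filter must match, which gives AND across categories.
    return {"query": {"bool": {"filter": clauses}}}


query = build_filter_query({"A": ["A1"], "B": ["B1", "B2"]})
print(query)
```

For the selection above this produces two `terms` clauses, one for ["A1"] and one for ["B1", "B2"], which corresponds to (A1) AND (B1 OR B2).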
Now, the wrinkle: we also use a "terms" aggregation on the "tags" field to predict how many items would come back from applying the next filter. On our UI this looks like:
A
  A1 12 # If the user adds the A1 filter, there'll be 12 results.
  A2 3 # etc.
  ...
B
  B1 5
  B2 0
  B3 2
  ...
Here, changing the filter logic to AND/OR breaks the counts that come back from the "terms" aggregation, because the aggregation is still predicting A1 AND B1 AND B2. Adding B3 would get us A1 AND B1 AND B2 AND B3 and thus narrow the counts from the aggregation, whereas under the new logic it would actually widen the scope of the results (we'd want (A1) AND (B1 OR B2 OR B3)).
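To pin down the counts we are after, here is a small in-memory sketch in plain Python. It is not an Elasticsearch solution, only a reference for the expected numbers. The sample documents and the helper names `matches` and `expected_counts` are made up for illustration. It assumes the number shown next to an unselected tag should be the total result count after adding that tag to the current selection.

```python
def matches(doc_tags, selected):
    """True if the doc has at least one selected tag in every category with a selection."""
    return all(
        any(tag in doc_tags for tag in tags)
        for tags in selected.values()
        if tags
    )


def expected_counts(docs, selected, categories):
    """For each unselected tag, count the results if that tag were added."""
    counts = {}
    for category, tags in categories.items():
        for tag in tags:
            if tag in selected.get(category, []):
                continue  # already part of the current selection
            # Copy the selection and add the candidate tag to its own category.
            trial = {c: list(t) for c, t in selected.items()}
            trial.setdefault(category, []).append(tag)
            counts[tag] = sum(matches(set(d["tags"]), trial) for d in docs)
    return counts


# Made-up sample data, only to illustrate the semantics.
docs = [
    {"tags": ["A1", "B1"]},
    {"tags": ["A1", "B2"]},
    {"tags": ["A1", "B3"]},
    {"tags": ["A2", "B3"]},
]
categories = {"A": ["A1", "A2"], "B": ["B1", "B2", "B3"]}
selected = {"A": ["A1"], "B": ["B1", "B2"]}

counts = expected_counts(docs, selected, categories)
print(counts)  # {'A2': 2, 'B3': 3}
```

With this data the current selection returns 2 documents. Adding B3 should show 3, since (A1) AND (B1 OR B2 OR B3) is wider, whereas a plain "terms" aggregation under the old AND logic would report 0 for B3 because no document carries B1, B2, and B3 together.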
Is there a way to express this in aggregations so that the filtering logic and the aggregation counts match?