I have a scenario in analytics where I want to calculate least performing 20 outlets out of 1000+ outlets where performance = transactionCount/ VisitCount
per month at an outlet.
Mappings are,
{
"CustomerVisit": {
"properties": {
"outlet": {
"type": "string",
"index": "not_analyzed"
},
"customerName": {
"type": "string",
"index": "not_analyzed"
},
"visitMonth": {
"type": "Date"
},
"visit": {
"type": "nested",
"properties": {
"visitStatus": {
"type": "long"
},
"transactionStatus": {
"type": "long"
},
"remarks": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
Now, What I want is aggregate in such a way that gives me 20 outlets out of 1000+ sorted by ratio of aggregation of visitStatus to aggregation of transactionStatus.
I couldn't find a way to solve it using aggs
, I don't know if script could be implemented here. The rough solution I was about to implement was
- aggregate by least transactionCount, get 20 buckets(with outlet names/counts) [one query executed]
aggregate by visitCount for above 20 outlets [another query]
divide count1 by count2 to get ratio
But this would never give me right result, because I'm sorting by least transactions not ratio of transactions/visits.