I need to do get some data grouped by day of week and hour, for example
curl -XGET http://localhost:9200/testing/hello/_search?pretty=true -d '
{
"size": 0,
"aggs": {
"articles_over_time" : {
"date_histogram" : {
"field" : "date",
"interval" : "hour",
"format": "E - k"
}
}
}
}
'
Gives me this:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 2857,
"max_score" : 0.0,
"hits" : [ ]
},
"aggregations" : {
"articles_over_time" : {
"buckets" : [ {
"key_as_string" : "Fri - 17",
"key" : 1391792400000,
"doc_count" : 6
},
...
{
"key_as_string" : "Wed - 22",
"key" : 1411596000000,
"doc_count" : 1
}, {
"key_as_string" : "Wed - 22",
"key" : 1411632000000,
"doc_count" : 1
} ]
}
}
}
Now I need to summarize doc counts by this value "Wed - 22", how can I do this? Maybe some another approach?
This is because you are using an interval of 'hour', but, the date format is 'day' (E - k).
Change your interval to be 'day', and you'll no longer get separate buckets for 'Weds - 22'.
Or, if you do want per hour, then change your format to include the hour field.
You can try doing terms aggregation on "key_as_string" field from the aggregation results using sub aggregation.
Hope that helps.
The same kind of problem has been solved in this thread.
Adapting the solution to your problem, we need to make a script to convert the date into the hour of day and day of week:
And use it in a query: