Here is my mongo collection 'sales':
{"title":"Foo", "hash": 17, "num_sold": 49,
"place": "ABC"}
{"title":"Bar", "hash": 18, "num_sold": 55,
"place": "CDF"}
{"title":"Baz", "hash": 17, "num_sold": 55,
"place": "JKN"}
{"title":"Spam", "hash": 17, "num_sold": 20,
"place": "ZSD"}
{"title":"Eggs", "hash": 18, "num_sold": 20,
"place": "ZDF"}
I would like to group by hash and return the document with the greatest "num_sold" in each group. So as output I would like to see:
{"title":"Baz", "hash": 17, "num_sold": 55,
"place": "JKN"}
{"title":"Bar", "hash": 18, "num_sold": 55,
"place": "CDF"}
I know the basics of the aggregate operator, and here is how I would group and get the maximum of num_sold, but I need the whole document corresponding to the maximum, not just the value.
db.getCollection('sales').aggregate([
{$group: {_id: "$hash", max_sold : {$max: '$num_sold'}}}
])
In SQL I would have done this with a join, but I am not sure how to do it in mongo. I also read that in mongo group and sort do not work well together.
You can use the $redact stage to accomplish this. It avoids the usage of $sort and then again doing a $group or an $unwind.
$group by hash (which becomes the group's _id) and get the maximum max_num_sold for each group; accumulate all the documents in the group using the $push operator.
$redact then descends into the sub-documents of each group, keeping only those whose num_sold equals the group's max_num_sold.
sample code:
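db.getCollection('sales').aggregate([
// Group by hash, record the group maximum and keep every document of the group.
{$group:{"_id":"$hash",
"max_num_sold":{$max:"$num_sold"},
"records":{$push:"$$ROOT"}}},
// The top-level group document has no num_sold, so $ifNull substitutes
// max_num_sold and the group itself is always kept ($$DESCEND).
// Each record survives only if its num_sold equals the group maximum.
{$redact:{$cond:[{$eq:[{$ifNull:["$num_sold","$$ROOT.max_num_sold"]},
"$$ROOT.max_num_sold"]},
"$$DESCEND","$$PRUNE"]}}
])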
test data:
db.getCollection('sales').insert([
{"title":"Foo","hash":17,"num_sold":49,"place":"ABC"},
{"title":"Bar","hash":18,"num_sold":55,"place":"CDF"},
{"title":"Baz","hash":17,"num_sold":55,"place":"JKN"},
{"title":"Spam","hash":17,"num_sold":20,"place":"ZSD"},
{"title":"Eggs","hash":18,"num_sold":20,"place":"ZDF"}
])
test result:
{
"_id" : 18,
"max_num_sold" : 55,
"records" : [
{
"_id" : ObjectId("567874f2b506fc2193a22696"),
"title" : "Bar",
"hash" : 18,
"num_sold" : 55,
"place" : "CDF"
}
]
}
{
"_id" : 17,
"max_num_sold" : 55,
"records" : [
{
"_id" : ObjectId("567874f2b506fc2193a22697"),
"title" : "Baz",
"hash" : 17,
"num_sold" : 55,
"place" : "JKN"
}
]
}
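To make the two stages easier to follow, here is a minimal sketch in plain JavaScript that mimics what the pipeline computes on an in-memory array instead of a collection. It does not use MongoDB at all; the helper name maxPerGroup and its parameters are made up for this illustration.

```javascript
// In-memory stand-in for the 'sales' collection (same test data as above).
const sales = [
  {title: "Foo",  hash: 17, num_sold: 49, place: "ABC"},
  {title: "Bar",  hash: 18, num_sold: 55, place: "CDF"},
  {title: "Baz",  hash: 17, num_sold: 55, place: "JKN"},
  {title: "Spam", hash: 17, num_sold: 20, place: "ZSD"},
  {title: "Eggs", hash: 18, num_sold: 20, place: "ZDF"}
];

// Hypothetical helper: imitates $group ($max + $push) followed by the $redact filter.
function maxPerGroup(docs, key, field) {
  const groups = new Map();
  for (const doc of docs) {
    const k = doc[key];
    if (!groups.has(k)) {
      groups.set(k, {_id: k, max: -Infinity, records: []});
    }
    const g = groups.get(k);
    g.max = Math.max(g.max, doc[field]); // plays the role of $max
    g.records.push(doc);                 // plays the role of $push: "$$ROOT"
  }
  // Plays the role of $redact: keep only records that reach the group maximum.
  return [...groups.values()].map(g => ({
    _id: g._id,
    max_num_sold: g.max,
    records: g.records.filter(d => d[field] === g.max)
  }));
}

const result = maxPerGroup(sales, "hash", "num_sold");
console.log(JSON.stringify(result, null, 2));
```

Note that if several documents in a group tie for the maximum, all of them stay in records, just as they would with the $redact pipeline.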
It looks like $group in MongoDB does not disturb the order of the documents it receives from a preceding $sort, so something like the approach described in "mongodb, how to aggregate with group by and sort correctly" is possible.
In particular, for the above example we can get the following:
db.getCollection('sales').aggregate([
{$sort: {"num_sold":-1}},
{$group:{"_id": "$hash",
"max_num_sold" : {$first:"$num_sold"},
"title":{$first: "$title"},
"place":{$first:"$place"}
}}
])
Here is the output:
{
"result" : [
{
"_id" : 17.0000000000000000,
"max_num_sold" : 55.0000000000000000,
"title" : "Baz",
"place" : "JKN"
},
{
"_id" : 18.0000000000000000,
"max_num_sold" : 55.0000000000000000,
"title" : "Bar",
"place" : "CDF"
}
],
"ok" : 1.0000000000000000
}
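Since the question asks for the whole document, a possible variation is to take $first of $$ROOT instead of listing each field. This is only a sketch: it assumes a MongoDB version that supports $$ROOT (2.6 or later), and the field name doc is an arbitrary choice for this example.

```javascript
// Sort descending so the best seller of each hash comes first,
// then keep the entire first document of each group via $$ROOT.
const pipeline = [
  {$sort: {num_sold: -1}},
  {$group: {_id: "$hash", doc: {$first: "$$ROOT"}}}
];

// In the mongo shell (where `db` is defined) this would be run as:
// db.getCollection('sales').aggregate(pipeline)
```

Each result should then carry the full original document under doc, alongside the hash in _id. If two documents in a group tie for the maximum, only one of them is returned.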