How to group and select document corresponding to

Posted 2019-07-11 02:50

Question:

Here is my mongo collection 'sales':

{"title":"Foo", "hash": 17, "num_sold": 49, 
"place": "ABC"}

{"title":"Bar", "hash": 18, "num_sold": 55, 
"place": "CDF"}

{"title":"Baz", "hash": 17, "num_sold": 55,
"place": "JKN"}

{"title":"Spam", "hash": 17, "num_sold": 20,
"place": "ZSD"}

{"title":"Eggs", "hash": 18, "num_sold": 20, 
"place": "ZDF"}

I would like to group by hash and return the document with the greatest "num_sold" in each group. As output I would like to see:

{"title":"Baz", "hash": 17, "num_sold": 55,
    "place": "JKN"}

 {"title":"Bar", "hash": 18, "num_sold": 55, 
    "place": "CDF"}

I know the basics of the aggregate operator, and here is how I would group and get the maximum of num_sold. However, I need the whole document corresponding to the maximum, not just the value.

db.getCollection('sales').aggregate([
{$group: {_id: "$hash", max_sold : {$max: '$num_sold'}}}
])

In SQL I would have done it with a join, but I do not know the equivalent in mongo. I have also read that in mongo group and sort do not work well together.

Answer 1:

You can use the $redact stage to accomplish this. It avoids a $sort followed by another $group or an $unwind.

  • $group by hash and compute the maximum num_sold of each group as max_num_sold, accumulating all the documents of the group into records with the $push operator.
  • $redact the sub-documents of each group, keeping only those whose num_sold equals the group's max_num_sold.

sample code:

db.getCollection('sales').aggregate([
{$group:{"_id":"$hash",
         "max_num_sold":{$max:"$num_sold"},
         "records":{$push:"$$ROOT"}}},
{$redact:{$cond:[{$eq:[{$ifNull:["$num_sold","$$ROOT.max_num_sold"]},
                       "$$ROOT.max_num_sold"]},
                "$$DESCEND","$$PRUNE"]}},
])

test data:

db.getCollection('sales').insert([
{"title":"Foo","hash":17,"num_sold":49,"place":"ABC"},
{"title":"Bar","hash":18,"num_sold":55,"place":"CDF"},
{"title":"Baz","hash":17,"num_sold":55,"place":"JKN"},
{"title":"Spam","hash":17,"num_sold":20,"place":"ZSD"},
{"title":"Eggs","hash":18,"num_sold":20,"place":"ZDF"}
])

test result:

{
        "_id" : 18,
        "max_num_sold" : 55,
        "records" : [
                {
                        "_id" : ObjectId("567874f2b506fc2193a22696"),
                        "title" : "Bar",
                        "hash" : 18,
                        "num_sold" : 55,
                        "place" : "CDF"
                }
        ]
}
{
        "_id" : 17,
        "max_num_sold" : 55,
        "records" : [
                {
                        "_id" : ObjectId("567874f2b506fc2193a22697"),
                        "title" : "Baz",
                        "hash" : 17,
                        "num_sold" : 55,
                        "place" : "JKN"
                }
        ]
}
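
The result above still wraps each winning document in a records array, while the question asked for the documents themselves. A minimal sketch of one way to flatten it, assuming MongoDB 3.4 or later (where $replaceRoot is available): the last two stages and the variable name flattenPipeline are additions for illustration, not part of the original answer.

```javascript
// Same $group + $redact stages as the sample code, followed by two extra
// stages (an assumption, not part of the original answer) that turn each
// surviving sub-document back into a top-level document.
var flattenPipeline = [
  { $group: { _id: "$hash",
              max_num_sold: { $max: "$num_sold" },
              records: { $push: "$$ROOT" } } },
  { $redact: { $cond: [ { $eq: [ { $ifNull: ["$num_sold", "$$ROOT.max_num_sold"] },
                                 "$$ROOT.max_num_sold" ] },
                        "$$DESCEND", "$$PRUNE" ] } },
  // One output document per element left in records.
  { $unwind: "$records" },
  // Promote the embedded sales document to the root (MongoDB 3.4+).
  { $replaceRoot: { newRoot: "$records" } }
];

// Only executes inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  db.getCollection("sales").aggregate(flattenPipeline).forEach(printjson);
}
```

If several documents in a group tie for the maximum num_sold, all of them survive the $redact stage, so this sketch would return every tied document for that hash.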


Answer 2:

It looks like $group in mongodb processes documents in the order produced by a preceding $sort, so $first picks the top-sorted document of each group. This is discussed in:

mongodb, how to aggregate with group by and sort correctly.

In particular, for the above example we can get the following:

db.getCollection('sales').aggregate([
{$sort: {"num_sold":-1}},
{$group:{"_id": "$hash",
         "max_num_sold" : {$first:"$num_sold"},
         "title":{$first: "$title"},
         "place":{$first:"$place"}
         }}
])

Here is the output:

{
    "result" : [ 
        {
            "_id" : 17.0000000000000000,
            "max_num_sold" : 55.0000000000000000,
            "title" : "Baz",
            "place" : "JKN"
        }, 
        {
            "_id" : 18.0000000000000000,
            "max_num_sold" : 55.0000000000000000,
            "title" : "Bar",
            "place" : "CDF"
        }
    ],
    "ok" : 1.0000000000000000
}
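
The pipeline above lists each field by hand. A minimal sketch of a variant that keeps the whole document by taking $first of $$ROOT instead: the field name doc, the variable name wholeDocPipeline, the _id tie-breaker in $sort, and the final $replaceRoot stage (MongoDB 3.4+) are assumptions added for illustration.

```javascript
// Sort + group as in the answer, but keep the entire top-sorted document
// per hash instead of naming each field.
var wholeDocPipeline = [
  // Highest num_sold first; _id is a hypothetical tie-breaker so that the
  // document chosen by $first is deterministic when num_sold values tie.
  { $sort: { num_sold: -1, _id: 1 } },
  // $$ROOT is the full input document; $first takes the first one per group.
  { $group: { _id: "$hash", doc: { $first: "$$ROOT" } } },
  // Promote the kept document back to the root (MongoDB 3.4+).
  { $replaceRoot: { newRoot: "$doc" } }
];

// Only executes inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  db.getCollection("sales").aggregate(wholeDocPipeline).forEach(printjson);
}
```

Unlike the $redact approach, this returns exactly one document per hash even when several documents share the maximum num_sold.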