Mongo aggregate nested array

2019-04-02 17:11发布

问题:

I have a mongo collection with following structure

{
    "userId" : ObjectId("XXX"),
    "itemId" : ObjectId("YYY"),
    "resourceId" : 1,
    "_id" : ObjectId("528455229486ca3606004ec9"),
    "parameter" : [
        {
            "name" : "name1",
            "value" : 150,
            "_id" : ObjectId("528455359486ca3606004eed")
        },
        {
            "name" : "name2",
            "value" : 0,
            "_id" : ObjectId("528455359486ca3606004eec")
        },
        {
            "name" : "name3",
            "value" : 2,
            "_id" : ObjectId("528455359486ca3606004eeb")
        }
    ]
}

There can be multiple documents with the same 'useId' with different 'itemId' but the parameter will have same key/value pairs in all of them.

What I am trying to accomplish is return aggregated parameters "name1", "name2" and "name3" for each unique "userId" disregard the 'itemId'. so final results would look like for each user :

{
  "userId" : ObjectId("use1ID"),
  "name1" : (aggregatedValue),
  "name2" : (aggregatedValue),
  "name3" : (aggregatedVAlue) 
},
{
  "userId" : ObjectId("use2ID"),
  "name1" : (aggregatedValue),
  "name2" : (aggregatedValue),
  "name3" : (aggregatedVAlue) 
}

Is it possible to accomplish this using the aggregated methods of mongoDB ? Could you please help me to build the proper query to accomplish that ?

回答1:

The simplest form of this is to keep things keyed by the "parameter" "name":

db.collection.aggregate(
   // Unwind the array
   { "$unwind": "$parameter"},

   // Group on the "_id" and "name" and $sum "value"
   { "$group": {
       "_id": { 
           "userId": "$userId",
           "name": "$parameter.name"
       }, 
       "value": { "$sum": "$parameter.value" } 
   }},

   // Put things into an array for "nice" processing
   { "$group": {
       "_id": "$_id.userId",
       "values": { "$push": { 
           "name": "$_id.name",
           "value": "$value"
       }}
   }}
)

If you really need to have the "values" of names as the field values, you can do the the following. But since you are "projecting" the fields/properties then you must specify them all in your code. You cannot be "dynamic" anymore and you are coding/generating each one:

db.collection.aggregate([
    // Unwind the array
    { "$unwind": "$parameter"},

   // Group on the "_id" and "name" and $sum "value"
    { "$group": {
        "_id": {
             "userId": "$userId",
             "name": "$parameter.name"
        },
        "value": { "$sum": "$parameter.value"}
    }},

    // Project out discrete "field" names with $cond
    { "$project": {
        "name1": { "$cond": [
            { "$eq": [ "$_id.name", "name1" ] },
            "$value",
            0
        ]},
        "name2": { "$cond": [
            { "$eq": [ "$_id.name", "name2" ] },
            "$value",
            0
        ]},
        "name3": { "$cond": [
            { "$eq": [ "$_id.name", "name3" ] },
            "$value",
            0
        ]},
    }},

    // The $cond put "0" values in there. So clean up with $group and $sum
    { "$group": { 
        _id: "$_id.userId",
        "name1": { "$sum": "$name1" },
        "name2": { "$sum": "$name2" },
        "name3": { "$sum": "$name3" }
    }}

])

So while the extra steps give you the result that you want ( well with a final project to change the _id to userId ), for my mind the short version is workable enough, unless you really do need it. Consider the output from there as well:

{
    "_id" : ObjectId("53245016ea402b31d77b0372"),
    "values" : [
        {
            "name" : "name3",
            "value" : 2
        },
        {
            "name" : "name2",
            "value" : 0
        },
        {
            "name" : "name1",
            "value" : 150
        }
    ]
}

So that would be what I would use, personally. But your choice.



回答2:

Not sure if I got your question but if the name field can contain only "name1", "name2", "name3" or at least you are only interested in this values, one of the possible queries could be this one:

db.aggTest.aggregate(
    {$unwind:"$parameter"}, 
    {$project: {"userId":1, "parameter.name":1, 
       "name1" : {"$cond": [{$eq : ["$parameter.name", "name1"]}, "$parameter.value", 0]},
       "name2" : {"$cond": [{$eq : ["$parameter.name", "name2"]}, "$parameter.value", 0]}, 
       "name3" : {"$cond": [{$eq : ["$parameter.name", "name3"]}, "$parameter.value", 0]}}}, 
    {$group : {_id : {userId:"$userId"}, 
       name1 : {$sum:"$name1"}, 
       name2 : {$sum:"$name2"}, 
       name3 : {$sum:"$name3"}}})

It firsts unwinds the parameter array, then separates name1, name2 and name3 values into different columns. There's a simple conditional statement for that. After that we can easily aggreagate by the new columns.

Hope it helps!