$sum from documents and subdocuments group by “$au

2019-05-23 06:23发布

问题:

This is my collection:

{
        "_id" : 10926400,
        "votes": 131,
        "author": "Jesse",
        "comments" : [
                {
                        "id" : 1,
                        "votes": 31,
                        "author": "Mirek"
                },
                {
                        "id": 2,
                        "votes": 13,
                        "author": "Leszke"
                }
        ]
},
{
        "_id" : 10926401,
        "votes": 75,
        "author": "Mirek",
        "comments" : [
                {
                        "id" : 1,
                        "votes": 17,
                        "author": "Jesse"
                },
                {
                        "id": 2,
                        "votes": 29,
                        "author": "Mirek"
                }
        ]
}

And I want $sum values of votes and comments.votes of each author

expected output(sort $votes: -1):

"Mirek" total votes: 31 + 75 + 29 = 135

"Jesse" total votes: 131 + 17 = 148

"Leszke total votes: 13

回答1:

Not immediately visible but possible. What you need to do here is combine your top level document with the array of comments without duplicating it. Here's an approach to first join the content as two arrays into a singular array, then $unwind to group the content:

db.collection.aggregate([
    { "$group": {
        "_id": "$_id",
        "author": { 
            "$addToSet": {
                "id": "$_id",
                "author": "$author",
                "votes": "$votes"
            }
        },
        "comments": { "$first": "$comments" }
    }},
    { "$project": {
        "combined": { "$setUnion": [ "$author", "$comments" ] }
    }},
    { "$unwind": "$combined" },
    { "$group": {
        "_id": "$combined.author",
        "votes": { "$sum": "$combined.votes" }
    }},
    { "$sort": { "votes": -1 } }
])

Which gives the output:

{ "_id" : "Jesse", "votes" : 148 }
{ "_id" : "Mirek", "votes" : 135 }
{ "_id" : "Leszke", "votes" : 13 }

Even as skipping the first $group stage and making a combined array a different way:

db.collection.aggregate([
    { "$project": {
        "combined": { 
            "$setUnion": [
                { "$map": {
                    "input": { "$literal": ["A"] },
                    "as": "el",
                    "in": { 
                        "author": "$author",
                        "votes": "$votes"
                    }
                }},
                "$comments"
            ] 
        }
    }},
    { "$unwind": "$combined" },
    { "$group": {
        "_id": "$combined.author",
        "votes": { "$sum": "$combined.votes" }
    }},
    { "$sort": { "votes": -1 } }
])

Those use operators such as $setUnion and even $map which were introduced as of MongoDB 2.6. This makes it simplier, but it can still be done in earlier versions lacking those operators, following much the same principles:

db.collection.aggregate([
    { "$project": {
        "author": 1,
        "votes": 1,
        "comments": 1,
        "type": { "$const": ["A","B"] }
    }},
    { "$unwind": "$type" },
    { "$unwind": "$comments" },
    { "$group": { 
        "_id": {
          "$cond": [
              { "$eq": [ "$type", "A" ] },
              { 
                  "id": "$_id", 
                  "author": "$author",
                  "votes": "$votes"
              },
              "$comments"
          ]
        }
    }},
    { "$group": {
        "_id": "$_id.author",
        "votes": { "$sum": "$_id.votes" }
    }},
    { "$sort": { "votes": -1 } }
])

The $const is undocumented but present in all versions of MongoDB where the aggregation framework is present ( from 2.2 ). MongoDB 2.6 Introduced $literal which essentially links to the same underlying code. It's been used in two cases here to either provide a template element for an array, or as introducing an array to unwind in order to provide a "binary choice" between two actions.



回答2:

You could aggregate the results as below:

  • Unwind the comments array.
  • Group the records together to first calculate the sum of the votes received by each author in his comments. Meanwhile keep the original posts in tact.
  • Unwind by the original post array.
  • Now project the sum for each author.
  • Sort by name and votes of the author.
  • Select the first record from each group to eliminate duplicates.

Code:

db.collection.aggregate([
{$unwind:"$comments"},
{$group:{"_id":null,
         "comments":{$push:"$comments"},
         "post":{$addToSet:{"author":"$author",
                            "votes":"$votes"}}}},
{$unwind:"$comments"},
{$group:{"_id":"$comments.author",
         "votes":{$sum:"$comments.votes"},
         "post":{$first:"$post"}}},
{$unwind:"$post"},
{$project:{"_id":1,
           "votes":{$cond:[{$eq:["$_id","$post.author"]},
                           {$add:["$votes","$post.votes"]},
                           "$votes"]}}},
{$sort:{"_id":-1,"votes":-1}},
{$group:{"_id":"$_id","votes":{$first:"$votes"}}}
])

Sample o/p:

{ "_id" : "Leszke", "votes" : 13 }
{ "_id" : "Jesse", "votes" : 148 }
{ "_id" : "Mirek", "votes" : 135 }