Aggregate $sum values from two arrays

Posted 2019-04-09 06:03

Question:

I have a collection with documents like this:

{
    "_id" : ObjectId("54f46f18c36dcc206d0cec38"),
    "project" : 23123,
    "title" : "Change of windows",
    "description": "Change to better windows on building A",
    "costs":[
      {
        category: 'Produktionskostnad',
        value: 3000
      },
      {
        category: 'Projekteringskostnad',
        value: 2000
      },
      {
        category: 'Overhead',
        value: 1000
      }
    ],
    "energySaving" : [ 
        {
            "energy" : "electricity",
            "type" : "lighting",
            "value" : 24324
        }, 
        {
            "energy" : "electricity",
            "type" : "equipment",
            "value" : 24324
        }, 
        {
            "energy" : "electricity",
            "type" : "fans",
            "value" : 24324
        }, 
        {
            "energy" : "electricity",
            "type" : "distribution",
            "value" : 24324
        }, 
        {
            "energy" : "electricity",
            "type" : "chiller",
            "value" : 24324
        }, 
        {
            "energy" : "electricity",
            "type" : "other",
            "value" : 24324
        }
    ]
}

I need an aggregation that calculates the total cost and the total energySaving.

To get the saving I have this query:

db.collection.aggregate( [
    { $unwind: "$energySaving" },
    { 
       $group: {
          _id: {
             title: '$title',
             description: '$description' 
          },
          totalEnergySaving: { $sum: '$energySaving.value' } 
       } 
    }
]);

But how do I calculate the total cost in the same query? I can't add an $unwind on costs to the same query. Can I "reset" the $group somehow and run the query again?

Answer 1:

TL;DR

In modern MongoDB releases (3.2 and later) a single $group is enough, because $sum can be nested: the inner $sum adds up the values of the array within each document, and the outer $sum acts as the accumulator across the grouped documents:

db.collection.aggregate([
    { "$group": {
        "_id": {
            "title": "$title",
            "description": "$description"
        },
        "totalCosts": { "$sum": { "$sum": "$costs.value" } },
        "totalEnergySaving": { "$sum": { "$sum": "$energySaving.value" } }
    }}
])
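
As a rough illustration of what the nested $sum computes, here is a plain JavaScript sketch that does not talk to MongoDB at all. The helper names sumValues and groupTotals are made up for this example, and the second sample document is invented so that the group has more than one member.

```javascript
// Plain JavaScript emulation of the nested $sum; no MongoDB driver involved.
// The first document is trimmed from the question, the second is invented.
const docs = [
  {
    title: "Change of windows",
    description: "Change to better windows on building A",
    costs: [{ value: 3000 }, { value: 2000 }, { value: 1000 }],
    energySaving: [{ value: 24324 }, { value: 24324 }]
  },
  {
    title: "Change of windows",
    description: "Change to better windows on building A",
    costs: [{ value: 500 }],
    energySaving: [{ value: 100 }]
  }
];

// Inner $sum: add up the "value" field of every element in one array.
function sumValues(items) {
  return items.reduce((total, item) => total + item.value, 0);
}

// Outer $sum: accumulate the per-document sums for each group key.
function groupTotals(documents) {
  const groups = new Map();
  for (const doc of documents) {
    const key = JSON.stringify([doc.title, doc.description]);
    const entry = groups.get(key) || { totalCosts: 0, totalEnergySaving: 0 };
    entry.totalCosts += sumValues(doc.costs);
    entry.totalEnergySaving += sumValues(doc.energySaving);
    groups.set(key, entry);
  }
  return [...groups.values()];
}

const totals = groupTotals(docs);
console.log(totals);
```

Each document contributes one number per array, so no document is ever duplicated and nothing gets counted twice.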

Original 2015 answer

This takes a bit of juggling to do correctly, but the best way to describe it is "deal with grouping per document first" and then "group the totals later":

db.collection.aggregate([
    // Do cost per document
    { "$unwind": "$costs" },
    { "$group": {
        "_id": "$_id",
        "title": { "$first": "$title" },
        "description": { "$first": "$description" },
        "totalCosts": { "$sum": "$costs.value" },
        "energySaving": { "$first": "$energySaving" }
    }},

    // Do energy saving per document
    { "$unwind": "$energySaving" },
    { "$group": {
        "_id": "$_id",
        "title": { "$first": "$title" },
        "description": { "$first": "$description" },
        "totalCosts": { "$first": "$totalCosts" },
        "totalEnergySaving": { "$sum": "$energySaving.value" }
    }},

    // Now sum the real grouping
    { "$group": {
        "_id": {
            "title": "$title",
            "description": "$description"
        },
        "totalCosts": { "$sum": "$totalCosts" },
        "totalEnergySaving": { "$sum": "$totalEnergySaving" }
    }}
])

By reducing each array to a single value per document, and by unwinding and grouping "one array at a time" to avoid replicating items per array member, you form a base for the single grouping that you actually want.

When you $unwind an array you get multiple copies of the document, with each array member now represented as a singular value in its own copy. What you do not want to do here is $unwind another array while one is already unwound, because that multiplies the copies again: every existing copy is repeated once per member of the second array, and any $sum over either array is then inflated.
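
The multiplication effect can be sketched in plain JavaScript. The unwind function below is a hypothetical stand-in for the $unwind stage, not a MongoDB API, and the document is a trimmed copy of the one in the question.

```javascript
// Stand-in for $unwind: one copy of the document per array member.
function unwind(documents, field) {
  return documents.flatMap(doc =>
    doc[field].map(member => ({ ...doc, [field]: member }))
  );
}

const doc = {
  _id: 1,
  costs: [{ value: 3000 }, { value: 2000 }, { value: 1000 }],
  energySaving: [{ value: 24324 }, { value: 24324 }]
};

// Unwinding costs alone gives one copy per cost entry.
const afterCosts = unwind([doc], "costs");

// Unwinding energySaving on top repeats every copy once per saving entry.
const afterBoth = unwind(afterCosts, "energySaving");

// Summing costs over the doubly unwound copies counts each cost twice.
const inflatedCosts = afterBoth.reduce((total, d) => total + d.costs.value, 0);

console.log(afterCosts.length, afterBoth.length, inflatedCosts);
```

With three cost entries and two saving entries the second unwind yields six copies, so the cost total comes out at twice the real 6000.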

Using $group back to the document _id value at this point ensures we are only working with the original parts of the document that was initially unwound. Normal grouping operators like $sum still apply, but $first can be used to pull up just one of the copied field values from outside the array. That pretty much returns the document to its original form for the fields you want to keep, along with anything you intentionally aggregated from the array contents.
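
A minimal plain JavaScript sketch of that "unwind, then group back on _id" step follows. Both unwind and groupCostsById are hypothetical helpers written for this illustration; they only imitate the $unwind and $group stages.

```javascript
// Stand-in for $unwind: one copy of the document per array member.
function unwind(documents, field) {
  return documents.flatMap(doc =>
    doc[field].map(member => ({ ...doc, [field]: member }))
  );
}

// Stand-in for grouping on _id with $first for kept fields and $sum for costs.
function groupCostsById(rows) {
  const byId = new Map();
  for (const row of rows) {
    if (!byId.has(row._id)) {
      // $first: take the field values from the first copy seen for this _id.
      byId.set(row._id, {
        _id: row._id,
        title: row.title,
        energySaving: row.energySaving,
        totalCosts: 0
      });
    }
    // $sum: add the unwound cost value to the running total.
    byId.get(row._id).totalCosts += row.costs.value;
  }
  return [...byId.values()];
}

const doc = {
  _id: 1,
  title: "Change of windows",
  costs: [{ value: 3000 }, { value: 2000 }, { value: 1000 }],
  energySaving: [{ value: 24324 }, { value: 24324 }]
};

const restored = groupCostsById(unwind([doc], "costs"));
console.log(restored);
```

The result is a single document again, with the energySaving array intact and ready to be unwound in the next step.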

Repeat this for each array you want to total, then move on to a final $group stage that this time adds up more than one document at a time, using the singular summed values you made earlier.

That's the process for adding up multiple array items at any level of grouping. Of course, if the only grouping was being done at the document level anyway, you could stop after grouping each array, or indeed accept that it was probably better done in client-side code.
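
For the per-document case, the client-side alternative might look something like the sketch below. It assumes the document has already been fetched into a plain object; withTotals is a hypothetical helper, and the _id is shown as a string because ObjectId is not available outside the shell or driver.

```javascript
// Compute both totals for one already-fetched document on the client.
function withTotals(doc) {
  // Treat a missing array as empty so the total falls back to 0.
  const sum = items => (items || []).reduce((total, item) => total + item.value, 0);
  return {
    _id: doc._id,
    title: doc.title,
    totalCosts: sum(doc.costs),
    totalEnergySaving: sum(doc.energySaving)
  };
}

// The document from the question, reduced to the fields that matter here.
const fetched = {
  _id: "54f46f18c36dcc206d0cec38",
  title: "Change of windows",
  costs: [{ value: 3000 }, { value: 2000 }, { value: 1000 }],
  energySaving: [
    { type: "lighting", value: 24324 },
    { type: "equipment", value: 24324 },
    { type: "fans", value: 24324 },
    { type: "distribution", value: 24324 },
    { type: "chiller", value: 24324 },
    { type: "other", value: 24324 }
  ]
};

const result = withTotals(fetched);
console.log(result);
```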