Calculate the count of nested objects with C#

2019-01-29 10:25发布

问题:

I am developing a software using ASP.Net core 2.0 web api. I need to calculate count of some fields in my collection. I have data in my collection in MongoDB as below. I need to find how many Tags and how many sensors do I have in my collection. A specific endpoint has multi tags and each tag has multi sensors.

{
    "_id" : ObjectId("5aef51dfaf42ea1b70d0c4db"),    
    "EndpointId" : "89799bcc-e86f-4c8a-b340-8b5ed53caf83",    
    "DateTime" : ISODate("2018-05-06T19:05:02.666Z"),
    "Url" : "test",
    "Tags" : [ 
        {
            "Uid" : "C1:3D:CA:D4:45:11",
            "Type" : 1,
            "DateTime" : ISODate("2018-05-06T19:05:02.666Z"),
            "Sensors" : [ 
                {
                    "Type" : 1,
                    "Value" : NumberDecimal("-95")
                }, 
                {
                    "Type" : 2,
                    "Value" : NumberDecimal("-59")
                }, 
                {
                    "Type" : 3,
                    "Value" : NumberDecimal("11.029802536740132")
                }, 
                {
                    "Type" : 4,
                    "Value" : NumberDecimal("27.25")
                }, 
                {
                    "Type" : 6,
                    "Value" : NumberDecimal("2924")
                }
            ]
        },         
        {
            "Uid" : "C1:3D:CA:D4:45:11",
            "Type" : 1,
            "DateTime" : ISODate("2018-05-06T19:05:02.666Z"),
            "Sensors" : [ 
                {
                    "Type" : 1,
                    "Value" : NumberDecimal("-95")
                }, 
                {
                    "Type" : 2,
                    "Value" : NumberDecimal("-59")
                }, 
                {
                    "Type" : 3,
                    "Value" : NumberDecimal("11.413037961112279")
                }, 
                {
                    "Type" : 4,
                    "Value" : NumberDecimal("27.25")
                }, 
                {
                    "Type" : 6,
                    "Value" : NumberDecimal("2924")
                }
            ]
        },          
        {
            "Uid" : "E5:FA:2A:35:AF:DD",
            "Type" : 1,
            "DateTime" : ISODate("2018-05-06T19:05:02.666Z"),
            "Sensors" : [ 
                {
                    "Type" : 1,
                    "Value" : NumberDecimal("-97")
                }, 
                {
                    "Type" : 2,
                    "Value" : NumberDecimal("-58")
                }, 
                {
                    "Type" : 3,
                    "Value" : NumberDecimal("10.171658037099185")
                }
            ]
        }
    ]
}

/* 2 */
{
    "_id" : ObjectId("5aef51e0af42ea1b70d0c4dc"),    
    "EndpointId" : "89799bcc-e86f-4c8a-b340-8b5ed53caf83",    
    "Url" : "test",
    "Tags" : [ 
        {
            "Uid" : "E2:02:00:18:DA:40",
            "Type" : 1,
            "DateTime" : ISODate("2018-05-06T19:05:04.574Z"),
            "Sensors" : [ 
                {
                    "Type" : 1,
                    "Value" : NumberDecimal("-98")
                }, 
                {
                    "Type" : 2,
                    "Value" : NumberDecimal("-65")
                }, 
                {
                    "Type" : 3,
                    "Value" : NumberDecimal("7.845424441900629")
                }, 
                {
                    "Type" : 4,
                    "Value" : NumberDecimal("0.0")
                }, 
                {
                    "Type" : 6,
                    "Value" : NumberDecimal("3012")
                }
            ]
        }, 
        {
            "Uid" : "12:3B:6A:1A:B7:F9",
            "Type" : 1,
            "DateTime" : ISODate("2018-05-06T19:05:04.574Z"),
            "Sensors" : [ 
                {
                    "Type" : 1,
                    "Value" : NumberDecimal("-95")
                }, 
                {
                    "Type" : 2,
                    "Value" : NumberDecimal("-59")
                }, 
                {
                    "Type" : 3,
                    "Value" : NumberDecimal("12.939770381907275")
                }
            ]
        }
    ]
}

I want to calculate the count of Tags and Sensors related to specific EndpointId. How can I write that query in mongoDB?

回答1:

The query to count the "unique" occurances within an "EndpointId" of each of the "Uid" in "Tags" and the "Type" in "Sensors" would be:

db.collection.aggregate([
  { "$unwind": "$Tags" },
  { "$unwind": "$Tags.Sensors" },
  { "$group": {
    "_id": {
      "EndpointId": "$EndpointId",
      "Uid": "$Tags.Uid",
      "Type": "$Tags.Sensors.Type"
    },
  }},
  { "$group": {
    "_id": {
      "EndpointId": "$_id.EndpointId",
      "Uid": "$_id.Uid",
    },
    "count": { "$sum": 1 }
  }},
  { "$group": {
    "_id": "$_id.EndpointId",
    "tagCount": { "$sum": 1 },
    "sensorCount": { "$sum": "$count" }
  }}
])

Or for C#

    var results = collection.AsQueryable()
      .SelectMany(p => p.Tags, (p, tag) => new
        {
          EndpointId = p.EndpointId,
          Uid = tag.Uid,
          Sensors = tag.Sensors
        }
      )
      .SelectMany(p => p.Sensors, (p, sensor) => new
        {
          EndpointId = p.EndpointId,
          Uid = p.Uid,
          Type = sensor.Type
        }
      )
      .GroupBy(p => new { EndpointId = p.EndpointId, Uid = p.Uid, Type = p.Type })
      .GroupBy(p => new { EndpointId = p.Key.EndpointId, Uid = p.Key.Uid },
        (k, s) => new { Key = k, count = s.Count() }
      )
      .GroupBy(p => p.Key.EndpointId,
        (k, s) => new
        {
          EndpointId = k,
          tagCount = s.Count(),
          sensorCount = s.Sum(x => x.count)
        }
      );

Which outputs:

{
  "EndpointId" : "89799bcc-e86f-4c8a-b340-8b5ed53caf83",
  "tagCount" : 4,
  "sensorCount" : 16
}

Though actually the "most efficient" way to do this considering that the documents presented have unique values for "Uid" anyway would be to $reduce the amounts within the documents itself:

db.collection.aggregate([
  { "$group": {
    "_id": "$EndpointId",
    "tags": {
      "$sum": {
        "$size": { "$setUnion": ["$Tags.Uid",[]] }
      }
    },
    "sensors": {
      "$sum": {
        "$sum": {
          "$map": {
            "input": { "$setUnion": ["$Tags.Uid",[]] },
            "as": "tag",
            "in": {
              "$size": {
                "$reduce": {
                  "input": {
                    "$filter": {
                      "input": {
                        "$map": {
                          "input": "$Tags",
                          "in": {
                            "Uid": "$$this.Uid",
                            "Type": "$$this.Sensors.Type"
                          }
                        }
                      },
                      "cond": { "$eq": [ "$$this.Uid", "$$tag" ] }
                    }
                  },
                  "initialValue": [],
                  "in": { "$setUnion": [ "$$value", "$$this.Type" ] }
                }
              }
            }
          }
        }
      }
    }
  }}
])

The statement does not really map well to LINQ however, so you would be required to use the BsonDocument interface to build the BSON for the statement. And of course where the same "Uid" values "did" in fact occur within multiple documents in the collection, then the $unwind statements are necessary in order to "group" those together across documents from within the array entries.


Original

You solve this by obtaining the $size of the arrays. For the outer array this is simply applying to field path of the array in the document, and for the inner array items you need to process with $map in order to process each "Tags" element and then obtain the $size of "Sensors" and $sum the resulting array to reduce to the overall count.

Per document that would be:

db.collection.aggregate([
  { "$project": {
    "tags": { "$size": "$Tags" },
    "sensors": {
      "$sum": {
        "$map": {
          "input": "$Tags",
           "in": { "$size": "$$this.Sensors" }
        }
      }
    }
  }}
])

Which where you have assigned to classes in your C# code would be like:

collection.AsQueryable()
  .Select(p => new
    {
      tags = p.Tags.Count(),
      sensors = p.Tags.Select(x => x.Sensors.Count()).Sum()
    }
  );

Where those return:

{ "tags" : 3, "sensors" : 13 }
{ "tags" : 2, "sensors" : 8 }

Where you want to $group the results, as for example over the whole collection, then you would do:

db.collection.aggregate([
  /* The shell would use $match for "query" conditions */
  //{ "$match": { "EndpointId": "89799bcc-e86f-4c8a-b340-8b5ed53caf83" } },
  { "$group": {
    "_id": null,
    "tags": { "$sum": { "$size": "$Tags" } },
    "sensors": {
      "$sum": {
        "$sum": {
          "$map": {
            "input": "$Tags",
             "in": { "$size": "$$this.Sensors" }
          }
        }
      }
    }
  }}
])

Which for your C# code like before would be:

collection.AsQueryable()
  .GroupBy(p => "", (k,s) => new
    {
      tags = s.Sum(p => p.Tags.Count()),
      sensors = s.Sum(p => p.Tags.Select(x => x.Sensors.Count()).Sum())
    }
  );

Where those return:

{ "tags" : 5, "sensors" : 21 }

And for "EndpointId, then you simply use that field as the grouping key, rather than the null or 0 as it gets applied by the C# driver mapping:

collection.AsQueryable()
  /* Use the Where if you want a query to match only those documents */
  //.Where(p => p.EndpointId == "89799bcc-e86f-4c8a-b340-8b5ed53caf83")            
  .GroupBy(p => p.EndpointId, (k,s) => new
    {
      tags = s.Sum(p => p.Tags.Count()),
      sensors = s.Sum(p => p.Tags.Select(x => x.Sensors.Count()).Sum())
    }
  );

Which is of course the same sum of the two document sample you gave us:

{ "tags" : 5, "sensors" : 21 }

So these are very simple results, with simple pipeline execution once you get used to the syntax.

I suggest familiarizing yourself with the Aggregation Operators from the core documentation, and of course the "LINQ Cheat Sheet" of expressions and their usage mapping from withing the C# Driver code repository.

Also see the general LINQ Reference in the C# Driver reference for other examples of how this maps onto the Aggregation Framework of MongoDB in general.