Count Distinct Sub document field and Output as Na

Posted 2020-04-01 03:42

Question:

In MongoDB, if I had a collection that looked like the following:

{
    "auctionId" : 22,
    "startDt" : "2017-08-28T06:00:00.000Z",
    "endDt" : "2017-09-04T06:00:00.000Z",
    "status" : "Open",
    "pickupDt" : "2017-09-07T06:00:00.000Z",
    "itmLst" : 
    [ 
        {
            "itemId" : 1,
            "location" : "Open",
            "currentBid" : 13.0,
            "highBidder" : 1897,
            "bidCnt" : 4,
            "catgegory" : "ANTIQUES"
             ...

How could I query it with an aggregate function to retrieve a count of items per distinct category for auctionId = 22, grouping by itmLst.category, so that it returns a result set like this:

{
    "ANTIQUES": 56,
    "TOOLS": 89,
    "JEWLRY": 45,
    ...
}

Answer 1:

The basic case here is to use .aggregate() with $unwind, because you need access to the values in the array as your grouping keys, and of course $group, because that is how you "group" things:

db.collection.aggregate([
  { "$match": { "auctionId": 22 } },
  { "$unwind": "$itmLst" },
  { "$group": {
    "_id": "$itmLst.category",
    "count": { "$sum": 1 }
  }}
])

This will give you output like:

{ "_id": "ANTIQUES", "count": 56 }
{ "_id": "TOOLS", "count": 89 }
{ "_id": "JEWLRY", "count":  45 }
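
To see what $match, $unwind and $group are doing in the pipeline above, here is a rough plain-JavaScript sketch of the same logic over an in-memory array. It is not MongoDB code: the countByCategory helper and the sample documents are made up for illustration, and the sample uses a category key to match the pipeline as written.

```javascript
// Hypothetical in-memory stand-in for the collection (illustrative data only).
const docs = [
  {
    auctionId: 22,
    itmLst: [
      { itemId: 1, category: "ANTIQUES" },
      { itemId: 2, category: "TOOLS" },
      { itemId: 3, category: "ANTIQUES" }
    ]
  },
  { auctionId: 23, itmLst: [{ itemId: 1, category: "TOOLS" }] }
];

// Imitates $match -> $unwind -> $group with { $sum: 1 }.
function countByCategory(docs, auctionId) {
  const counts = new Map();
  docs
    .filter(d => d.auctionId === auctionId) // $match: keep only the requested auction
    .flatMap(d => d.itmLst)                 // $unwind: one entry per array item
    .forEach(item => {                      // $group: running count per category
      counts.set(item.category, (counts.get(item.category) || 0) + 1);
    });
  // Shape the result like the cursor output: one document per group.
  return [...counts].map(([key, count]) => ({ _id: key, count }));
}

const grouped = countByCategory(docs, 22);
console.log(grouped);
```

The point of the sketch is only that each array item becomes its own row before counting, which is why $unwind has to come before $group.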

Now you really should learn to live with that, because a "list" in the default cursor format is a good thing: it is naturally iterable. Also, IMHO, named keys do not lend themselves to data presentation, and you generally want a common property in an iterable list.

If you are really intent on the single-document output with named keys, then you are going to need MongoDB 3.4.4 or greater to have access to $arrayToObject, which allows you to use the values as the names of keys, and of course $replaceRoot in order to use that expression output as the new document:

db.collection.aggregate([
  { "$match": { "auctionId": 22 } },
  { "$unwind": "$itmLst" },
  { "$group": {
    "_id": "$itmLst.category",
    "count": { "$sum": 1 }
  }},
  { "$group": {
    "_id": null,
    "data": { "$push": { "k": "$_id", "v": "$count" } }
  }},
  { "$replaceRoot": {
    "newRoot": {
      "$arrayToObject": "$data"
    }
  }}
])
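
The second $group collects every category count into one array of { k, v } pairs, which is the input shape $arrayToObject accepts. As a rough illustration in plain JavaScript (not MongoDB code; the data array below is a hand-written sample using the counts from the example output):

```javascript
// Hand-written sample of the "data" array built by the second $group stage.
const data = [
  { k: "ANTIQUES", v: 56 },
  { k: "TOOLS", v: 89 },
  { k: "JEWLRY", v: 45 }
];

// Rough equivalent of { $arrayToObject: "$data" }:
// each k becomes a key and each v becomes that key's value.
const newRoot = Object.fromEntries(data.map(({ k, v }) => [k, v]));

console.log(newRoot);
```

$replaceRoot then promotes that object to be the whole output document, so the pipeline returns a single result rather than one per category.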

Or if you don't have that option, then you should instead convert the cursor output in code:

db.collection.aggregate([
  { "$match": { "auctionId": 22 } },
  { "$unwind": "$itmLst" },
  { "$group": {
    "_id": "$itmLst.category",
    "count": { "$sum": 1 }
  }}
]).toArray().reduce((acc,curr) => 
  Object.assign(acc,{ [curr._id]: curr.count }),
  {}
)

Both merge the original aggregation output into a single object with named keys:

{
    "ANTIQUES": 56,
    "TOOLS": 89,
    "JEWLRY": 45,
    ...
}
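
If you want to try the reduce step without a database, the same fold can be run on a plain array. This is only a sketch: the toNamedKeys helper name is hypothetical, and the rows are copied from the example output above.

```javascript
// Sample rows shaped like the aggregation output (values from the example above).
const rows = [
  { _id: "ANTIQUES", count: 56 },
  { _id: "TOOLS", count: 89 },
  { _id: "JEWLRY", count: 45 }
];

// Hypothetical helper: fold a list of { _id, count } rows into one object keyed by _id.
function toNamedKeys(rows) {
  return rows.reduce(
    (acc, curr) => Object.assign(acc, { [curr._id]: curr.count }),
    {} // start from an empty object so an empty result set yields {}
  );
}

const named = toNamedKeys(rows);
console.log(named);
```

In the mongo shell .toArray() returns the array directly, as in the snippet above; with the Node.js driver it returns a Promise, so the fold would be applied to the awaited result instead.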

And that goes to show that the original output was really enough. Typically you want that kind of "final reshaping" done in the code that consumes the cursor output, if you need it at all, since the basic data was already returned.



Answer 2:

I have found the answer I was looking for. This query:

db.auctions.aggregate([
   { "$match": { "auctionId": 22 } },
   { $unwind:'$itmLst' }, 
   { $group:{_id:'$itmLst.catgegory', freq:{$sum:1}} },
   { $sort:{ "_id": 1 } }
]);

returns:

{
    "_id" : "ANTIQUES",
    "freq" : 9.0
}
{
    "_id" : "APPLIANCES",
    "freq" : 1.0
}
{
    "_id" : "ARTS CRAFTS",
    "freq" : 9.0
}
{
    "_id" : "BOOKS MAGAZINES",
    "freq" : 6.0
}
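
Note that this query groups on catgegory, which is how the field is spelled in the sample document in the question. A pipeline that groups on $itmLst.category against such documents finds no matching field, and $group puts documents with a missing key under a single null _id. The plain JavaScript sketch below only imitates that effect in memory; the groupBy helper and the sample items are hypothetical.

```javascript
// Illustrative items using the field spelling from the question's sample document.
const items = [
  { itemId: 1, catgegory: "ANTIQUES" },
  { itemId: 2, catgegory: "TOOLS" },
  { itemId: 3, catgegory: "ANTIQUES" }
];

// Hypothetical helper: count items per value of `field`, treating a missing
// field as null (an imitation of how $group buckets a missing grouping key).
function groupBy(items, field) {
  const counts = new Map();
  for (const item of items) {
    const key = field in item ? item[field] : null;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts].map(([key, freq]) => ({ _id: key, freq }));
}

console.log(groupBy(items, "catgegory")); // grouped by the field as actually stored
console.log(groupBy(items, "category"));  // no such field: everything lands under null
```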

Thank you Neil for pointing me in the right direction though.