MongoDb How to group by month and year from string

2020-02-29 04:45发布

问题:

I am having field dateStr in collection

  {  ....    "dateStr" : "07/01/2020" .... }
  {  ....    "dateStr" : "07/01/1970"   .... }

I want to group by month and year from dateStr field

I have tried

    db.collection.aggregate( 
       {$project : { 
              month : {$month : new Date("$dateStr")}, 
              year : {$year :  new Date("$dateStr")}
          }}, 
        {$group : { 
                _id : {month : "$month" ,year : "$year" },  
              count : {$sum : 1} 
        }})

Output :

{
"result" : [ 
        {
            "_id" : {
                "month" : 1,
                "year" : 1970
            },
            "count" : 2
        }
    ],
    "ok" : 1
}

But I am having two years 1970,2020. Why I am getting single record?

回答1:

You cannot use the date aggregation operators on anything else that is tho a Date object itself. Your ultimate best option is to convert these "strings" to proper Date objects so you can query correctly in this and future operations.

That said, if your "strings" always have a common structure then there is a way to do this with the aggregation framework tools. It requires a lot of manipulation thought that does not makes this an "optimal" approach to dealing with the problem. But with a set structure of "double digits" and a consistent delimiter this is possible with the $substr operator:

db.collection.aggregate([
   { "$group": {
       "_id": {
           "year": { "$substr": [ "$dateStr", 7, 4 ] },
           "month": { "$substr": [ "$dateStr", 4, 2 ] }
       },
       "count": { "$sum": 1 }
   }}
])

So JavaScript casting does not work inside the aggregation framework. You can always "feed" input to the pipeline based on "client code" evaluation, but the aggregation process itself does not evaluate any code. Just like the basic query engine, this is all based on a "data structure" implementation that uses "native operator" instructions to do the work.

You cannot convert strings to dates in the aggregation pipeline. You should work with real BSON Date objects, but you can do it with strings if there is a consistent format that you can present in a "lexical order".

I still suggest that you convert these to BSON Dates ASAP. And beware that the "ISODate" or UTC value is constructed with a different string form. ie:

new Date("2020-01-07")

Being in "yyyy-mm-dd" format. At least for the JavaScript invocation.