Return the last “true” value for each group

2019-01-12 10:04发布

问题:

I have collection in which documents are like:

{
    _id: ObjectId(),
     user: ObjectId(),  
     studentName: String,
     createdAt: Date,
     isAbondoned: boolean
}

example of documents are:

1-

{ 
    "_id" : ObjectId("56cd2d36a489a5b875902f0e"), 
    "user" : ObjectId("56c4cafabd5f92cd78ae49d4"), 
    "studentName" : "Aman", 
    "createdAt" : ISODate("2016-02-24T04:10:30.486+0000"), 
    "isAbandoned" : true
}

2-

{ 
    "_id" : ObjectId("56cd2dcda489a5b875902fcd"), 
    "user" : ObjectId("56c4cafabd5f92cd78ae49d4"), 
    "studentName" : "Aman",  
    "createdAt" : ISODate("2016-02-24T04:13:01.932+0000"), 
    "isAbandoned" : false
}

3-

{ 
    "_id" : ObjectId("56cee51503b7cb7b0eda9c4c"), 
    "user" : ObjectId("56c85244bd5f92cd78ae4bc1"), 
    "studentName" : "Rajeev",
    "createdAt" : ISODate("2016-02-25T11:27:17.281+0000"), 
    "isAbandoned" : true, 
}

Now I want to find the list of students for which their 'isAbandoned' is true for their last 'createdAt' document.

Required output for above example is:

{
    "user" : ObjectId("56c85244bd5f92cd78ae4bc1"), 
    "studentName" : "Rajeev"
}

because for studentName "Aman" max(createdAt) is 2nd document and 'isAbandoned' is false for that.

回答1:

The best way to do this is using the aggregation framework. You need to $group your documents by "user" and return the last document for each user using the $last accumulator operator but for this to work, you need a preliminary sorting stage using the $sort aggregation pipeline operator. To sort your documents, you need to consider both the "createdAt" field and the "user" field.

The last stage in the pipeline is the $match stage where you select only those last documents where "isAbandoned" equals true.

db.students.aggregate([
    { "$sort": { "user": 1, "createdAt": 1 } }, 
    { "$group": { 
        "_id": "$user", 
        "last": { "$last": "$$ROOT" }
    }}, 
    { "$match": { "last.isAbandoned": true } }
])

which returns something like this:

{ 
    "_id" : ObjectId("56c85244bd5f92cd78ae4bc1"),
    "last" : {
        "_id" : ObjectId("56cee51503b7cb7b0eda9c4c"),
        "user" : ObjectId("56c85244bd5f92cd78ae4bc1"),
        "studentName" : "Rajeev",
        "createdAt" : ISODate("2016-02-25T11:27:17.281Z"),
        "isAbandoned" : true
    }
}

To get the expected result, we need to use the $replaceRoot pipeline operator starting from verion 3.4 to promote the embedded document to the top level

{
    $replaceRoot: { newRoot: "$last" }
}

In older version, you need to use the $project aggregation pipeline operation to reshape our documents. So if we extend our pipeline with the following stage:

{ 
    "$project": { 
        "_id": "$last._id", 
        "user": "$last.user", 
        "studentName": "$last.studentName", 
        "createdAt": "$last.createdAt", 
        "isAbandoned": "$last.isAbandoned"
}}

it produces the expected output:

{
    "_id" : ObjectId("56cee51503b7cb7b0eda9c4c"),
    "user" : ObjectId("56c85244bd5f92cd78ae4bc1"),
    "studentName" : "Rajeev",
    "createdAt" : ISODate("2016-02-25T11:27:17.281Z"),
    "isAbandoned" : true
}


回答2:

This is a good example of need to group data by specific filed (createdAt) and then compare result set match criteria.

  1. find max by student id,
  2. match only entries by max entry = createdAt
  3. check if they are passing criteria
  4. reshape document

Code:

db.student.aggregate([{
   $group : {
       _id : "$user",
        created : {
            $max : "$createdAt"
        },
        documents : {
            $push : "$$ROOT"
        }
    }
 }, {
$project : {
    _id : 0,
    documents : {
        $filter : {
        input : "$documents",
        as : "item",
            cond : {
                $eq : ["$$item.createdAt", "$created"]
            }
        }
    }}
}, {
$match : {
    "documents.isAbandoned" : true
}},
{ $unwind : "$documents" },
{
$project : {
    _id : "$documents._id",
    user : "$documents.user",
    studentName : "$documents.studentName",
    createdAt : "$documents.createdAt",
    isAbandoned : "$documents.isAbandoned",
 }}
])