I have collection in which documents are like:
{
_id: ObjectId(),
user: ObjectId(),
studentName: String,
createdAt: Date,
isAbondoned: boolean
}
example of documents are:
1-
{
"_id" : ObjectId("56cd2d36a489a5b875902f0e"),
"user" : ObjectId("56c4cafabd5f92cd78ae49d4"),
"studentName" : "Aman",
"createdAt" : ISODate("2016-02-24T04:10:30.486+0000"),
"isAbandoned" : true
}
2-
{
"_id" : ObjectId("56cd2dcda489a5b875902fcd"),
"user" : ObjectId("56c4cafabd5f92cd78ae49d4"),
"studentName" : "Aman",
"createdAt" : ISODate("2016-02-24T04:13:01.932+0000"),
"isAbandoned" : false
}
3-
{
"_id" : ObjectId("56cee51503b7cb7b0eda9c4c"),
"user" : ObjectId("56c85244bd5f92cd78ae4bc1"),
"studentName" : "Rajeev",
"createdAt" : ISODate("2016-02-25T11:27:17.281+0000"),
"isAbandoned" : true,
}
Now I want to find the list of students for which their 'isAbandoned' is true for their last 'createdAt' document.
Required output for above example is:
{
"user" : ObjectId("56c85244bd5f92cd78ae4bc1"),
"studentName" : "Rajeev"
}
because for studentName "Aman" max(createdAt) is 2nd document and 'isAbandoned' is false for that.
The best way to do this is using the aggregation framework. You need to $group
your documents by "user" and return the last document for each user using the $last
accumulator operator but for this to work, you need a preliminary sorting stage using the $sort
aggregation pipeline operator. To sort your documents, you need to consider both the "createdAt" field and the "user" field.
The last stage in the pipeline is the $match
stage where you select only those last documents where "isAbandoned" equals true
.
db.students.aggregate([
{ "$sort": { "user": 1, "createdAt": 1 } },
{ "$group": {
"_id": "$user",
"last": { "$last": "$$ROOT" }
}},
{ "$match": { "last.isAbandoned": true } }
])
which returns something like this:
{
"_id" : ObjectId("56c85244bd5f92cd78ae4bc1"),
"last" : {
"_id" : ObjectId("56cee51503b7cb7b0eda9c4c"),
"user" : ObjectId("56c85244bd5f92cd78ae4bc1"),
"studentName" : "Rajeev",
"createdAt" : ISODate("2016-02-25T11:27:17.281Z"),
"isAbandoned" : true
}
}
To get the expected result, we need to use the $replaceRoot
pipeline operator starting from verion 3.4 to promote the embedded document to the top level
{
$replaceRoot: { newRoot: "$last" }
}
In older version, you need to use the $project
aggregation pipeline operation to reshape our documents. So if we extend our pipeline with the following stage:
{
"$project": {
"_id": "$last._id",
"user": "$last.user",
"studentName": "$last.studentName",
"createdAt": "$last.createdAt",
"isAbandoned": "$last.isAbandoned"
}}
it produces the expected output:
{
"_id" : ObjectId("56cee51503b7cb7b0eda9c4c"),
"user" : ObjectId("56c85244bd5f92cd78ae4bc1"),
"studentName" : "Rajeev",
"createdAt" : ISODate("2016-02-25T11:27:17.281Z"),
"isAbandoned" : true
}
This is a good example of need to group data by specific filed (createdAt) and then compare result set match criteria.
- find max by student id,
- match only entries by max entry = createdAt
- check if they are passing criteria
- reshape document
Code:
db.student.aggregate([{
$group : {
_id : "$user",
created : {
$max : "$createdAt"
},
documents : {
$push : "$$ROOT"
}
}
}, {
$project : {
_id : 0,
documents : {
$filter : {
input : "$documents",
as : "item",
cond : {
$eq : ["$$item.createdAt", "$created"]
}
}
}}
}, {
$match : {
"documents.isAbandoned" : true
}},
{ $unwind : "$documents" },
{
$project : {
_id : "$documents._id",
user : "$documents.user",
studentName : "$documents.studentName",
createdAt : "$documents.createdAt",
isAbandoned : "$documents.isAbandoned",
}}
])