I have 2 collections (with example documents):

reports:

{
    id: "R1",
    type: "xyz"
}

reportfiles:

{
    id: "F1",
    reportid: "R1",
    time: ISODate("2016-06-13T14:20:25.812Z")
},
{
    id: "F14",
    reportid: "R1",
    time: ISODate("2016-06-15T09:20:29.809Z")
}

As you can see, one report may have multiple reportfiles.
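For reference, the sample documents can be recreated in the mongo shell like this (names exactly as above, just to have something to experiment with):

// test data matching the example documents above
db.reports.insertOne({ id: "R1", type: "xyz" })
db.reportfiles.insertMany([
    { id: "F1", reportid: "R1", time: ISODate("2016-06-13T14:20:25.812Z") },
    { id: "F14", reportid: "R1", time: ISODate("2016-06-15T09:20:29.809Z") }
])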
I'd like to perform a query matching a report id, returning the report document as is, plus an additional key storing as a subdocument the reportfile with the most recent time (even better without reportid, as it would be redundant), e.g.
{
    id: "R1",
    type: "xyz",
    reportfile: {
        id: "F14",
        reportid: "R1",
        time: ISODate("2016-06-15T09:20:29.809Z")
    }
}
My problem here is that every report type has its own set of properties, so using $project in an aggregation pipeline is not the best way. So far I have:
db.reports.aggregate([
    { $match: { id: 'R1' } },
    {
        $lookup: {
            from: 'reportfiles',
            localField: 'id',
            foreignField: 'reportid',
            as: 'reportfile'
        }
    }
])
returning of course as reportfile the list of all files with the given reportid. How can I efficiently filter that list to get the only element I need?
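To make "that list" concrete, with the sample documents above the result of the $lookup stage looks roughly like this:

{
    id: "R1",
    type: "xyz",
    reportfile: [
        { id: "F1", reportid: "R1", time: ISODate("2016-06-13T14:20:25.812Z") },
        { id: "F14", reportid: "R1", time: ISODate("2016-06-15T09:20:29.809Z") }
    ]
}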
By "efficiently" I mean that I tried using $unwind as the next pipeline step, but the resulting document was frighteningly and pointlessly long. Thanks in advance for any suggestion!
You need to add another $project stage to your aggregation pipeline after the $lookup stage.
{ "$project": {
"id": "R1",
"type": "xyz",
"reportfile": {
"$let": {
"vars": {
"obj": {
"$arrayElemAt": [
{ "$filter": {
"input": "$reportfile",
"as": "report",
"cond": { "$eq": [ "$$report.time", { "$max": "$reportfile.time" } ] }
}},
0
]
}
},
"in": { "id": "$$obj.id", "time": "$$obj.time" }
}
}
}}
The $filter operator filters the $lookup result and returns an array with the documents that satisfy your condition. The condition here is $eq, which returns true when the document's time equals the $max of all the times in the array. The $arrayElemAt operator then returns the single element from that filtered array, which you assign to a variable using the $let operator. From there, you can easily access the fields you want in your result with dot notation.
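Put together with the $match and $lookup stages from the question, the whole pipeline would look roughly like this (a sketch reusing the field and collection names above):

db.reports.aggregate([
    { "$match": { "id": "R1" } },
    { "$lookup": {
        "from": "reportfiles",
        "localField": "id",
        "foreignField": "reportid",
        "as": "reportfile"
    }},
    { "$project": {
        "id": 1,
        "type": 1,
        // pick the most recent reportfile out of the joined array
        "reportfile": {
            "$let": {
                "vars": {
                    "obj": {
                        "$arrayElemAt": [
                            { "$filter": {
                                "input": "$reportfile",
                                "as": "report",
                                "cond": { "$eq": [ "$$report.time", { "$max": "$reportfile.time" } ] }
                            }},
                            0
                        ]
                    }
                },
                "in": { "id": "$$obj.id", "time": "$$obj.time" }
            }
        }
    }}
])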
What you would require is to run the aggregation operation on the reportfiles collection instead, do the "join" on the reports collection with $lookup, flatten the array from the $lookup pipeline (with $unwind), and order the documents (with $sort). The result can then be grouped by the reportid, outputting the desired shape with the $first accumulator operator.
The following demonstrates this approach:
db.reportfiles.aggregate([
    { "$match": { "reportid": "R1" } },
    {
        "$lookup": {
            "from": "reports",
            "localField": "reportid",
            "foreignField": "id",
            "as": "report"
        }
    },
    { "$unwind": "$report" },
    { "$sort": { "time": -1 } },
    {
        "$group": {
            "_id": "$reportid",
            "type": { "$first": "$report.type" },
            "reportfile": {
                "$first": {
                    "id": "$id",
                    "reportid": "$reportid",
                    "time": "$time"
                }
            }
        }
    }
])
Sample Output:
{
    "_id" : "R1",
    "type" : "xyz",
    "reportfile" : {
        "id" : "F14",
        "reportid" : "R1",
        "time" : ISODate("2016-06-15T09:20:29.809Z")
    }
}
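As a side note on efficiency: with either approach, the initial $match and the $lookup equality match can use indexes on the join fields. A minimal sketch, assuming the field names above:

// speeds up { reportid: "R1" } matches and the lookup from reports.id
db.reportfiles.createIndex({ reportid: 1 })
// speeds up { id: "R1" } matches and the lookup from reportfiles.reportid
db.reports.createIndex({ id: 1 })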