Can Mongo use a reference and compare values durin

2019-09-08 16:06发布

问题:

Given this document:

{
        "_id" : ObjectId("53f7287881a97c71e58e3514"),
        "title" : "My Page",
        "currentVersion" : 1,
        "versions" : [ 
            {
                "version" : 0,
                "content" : [ 
                    {
                        "data" : "foo",
                    }
                ]
            }, 
            {
                "version" : 1,
                "content" : [ 
                    {
                        "data" : "bar",
                    }
                ]
            }
        ]
}

Is there anyway, (via the aggregation framework or otherwise) to perform a document projection where the only content in versions is the one that satisfies versions.version==currentVersion for each document? I understand that the first step might be to $unwind but I'm stumped as to how to filter out documents where currentVersion!=versions.version.

Ideally, I would like to dispense with versions.version and use currentVersion as a positional pointer for versions.

Thanks!

Update

I've come up with the following, which does the trick, but feels slightly inelegant

db.test.aggregate(
    [        
        { $unwind: "$versions" },
        { $project: 
            {
                title:1,
                versionMatch: { $cmp: [ "$currentVersion","$versions.version"] },
                content: "$versions.content"
            }
        },
        { $match: 
            {
                versionMatch: {$eq:0}
            }
        },
        { $project: 
            {
                title:1,
                content: 1              
            }
        },
    ]
)

The better solution would be to be able to pluck the entry directly from versionsusing currentVersion as a pointer but I am not sure that's entirely possible.

回答1:

Sure you can improve upon this. Provided you have MongoDB 2.6 or greater then you can actually filter your already unique versions without needing to process an $unwind. This makes use of the $map and $setDifference operators introduced as of that release:

db.test.aggregate([
    { "$project": {
       "title": 1,
       "content": {
           "$setDifference": [
               { "$map": {
                   "input": "$versions",
                   "as": "el",
                   "in": {
                       "$cond": [
                           { "$eq": [ "$currentVersion", "$$el.version" ] },
                           "$$el.content",
                           false
                       ]
                   }
               }},
               [false]
           ]
       }
    }}
])

Or even possibly with $redact that is also introduced. Mind that this is recursive so this involves an arbitrary "project" of a matching value where the field to compare does not exist:

db.test.aggregate([
   { "$redact": {
       "$cond": [
           { "$eq": [ 
               { "$ifNull": [ "$version", "$$ROOT.currentVersion" ] },
               "$ROOT.currentVersion"
           ]},
           "$$DESCEND",
           "$$PRUNE"
       ]
   }}
])

Both will match the "versions" content to the "currentVersion" and remove other elements, though both do still keep a nested array and even the latter without the prettier projection, but you can always process $unwind on these with little performance penalty as the data is already reduced to one match.

Earlier versions of MongoDB of course do not have the operators required to do this, so you are pretty much stuck with using $unwind and $project to get a logical match between the two:

db.test.aggregate([
  { "$unwind": "$versions" },
  { "$project": {
      "title": 1,
      "content": "$versions.content",
      "matched": {
          "$eq": [ "$currentVersion", "$versions.version" ]
      }  
  }},
  { "$match": { "matched": true } }
])

Noting that in all cases the best logical test is with $eq as $cmp should only be needed where the BSON Type varies such as NumberLong and NumberInt or you actually intend to compare types in by comparison order. It should also be more efficient being a basic test.

For larger result sets, dropping that final $project helps as well since the cost of processing through all results just to strip an unwanted field outweighs any benefit of doing so, and your generally better of ignoring or otherwise removing in post processing.

But where possible the first two forms will give the best performance which is largely due to removing the need to "unwind" which produces significantly more documents to process in the pipeline stages.