MongoDB elemmatch multiple elements in array

2019-01-22 22:47发布

问题:

I have an mongodb document like

    {
        "_id" : ObjectId("54e66b2da7b5f3a92e09dc6c"),
        "SomeMetric" : [ 
            {
                //some object
            }
            {
                //some object
            } 
         ],
        "FilterMetric" : [ 
            {
                "min" : "0.00",
                "max" : "16.83",
                "avg" : "0.00",
                "class" : "s1"
            }, 
            {
                "min" : "0.00",
                "max" : "16.83",
                "avg" : "0.00",
                "class" : "s2"
            }, 
            {
                "min" : "0.00",
                "max" : "16.83",
                "avg" : "0.00",
                "class" : "s1"
            }, 
            {
                "min" : "0.00",
                "max" : "16.83",
                "avg" : "0.00",
                "class" : "s2"
            } 
        ]
    }

Typically it contains many nested arrays like this. I want to project one metric alone, with only the arrays that have my search criteria. I have the query

db.sample.find(
{"filtermetric.class" : "s2"},{"filtermetric" : { $elemMatch : {class: "s2"}}}
)

This gives me only the first object in the array. The second object with class : s2 is not returned.

If i try

    db.sample.find(
   {"filtermetric" : { $elemMatch : {class: "s2"}}}
    )

It give me all the 4 objects in the array.

How do I get all the objects that match a criteria in such a case?

回答1:

You cannot return multiple elements of an array matching your criteria in any form of a basic .find() query. To match more than one element you need to use the .aggregate() method instead.

The main difference here is that the "query" does exactly what it is intended to do and matches "documents" that meet your conditions. You can try to use the positional $ operator within a projection argument, but the rules there is that it will only match the "first" array element that matches the query conditions.

In order to "filter" for multiple array elements, proceed as follows:

db.sample.aggregate([
    // Filter possible documents
    { "$match": { "filtermetric.class": "s2" } },

    // Unwind the array to denormalize
    { "$unwind": "$filtermetric" },

    // Match specific array elements
    { "$match": { "filtermetric.class": "s2" } },

    // Group back to array form
    { "$group": {
        "_id": "$_id",
        "filtermetric": { "$push": "$filtermetric" }
    }}
])

In modern versions of MongoDB that are version 2.6 or greater you can do this with $redact :

db.sample.aggregate([
    // Filter possible documents
    { "$match": { "filtermetric.class": "s2" } },

    // Redact the entries that do not match
    { "$redact": {
        "$cond": [
            { "$eq": [ { "$ifNull": [ "$class", "s2" ] }, "s2" ] },
            "$$DESCEND",
            "$$PRUNE"
        ]
    }}
])

That is probably your most efficient option, but it is recursive so consider your document structure first as the same named field cannot exist with any other condition at any level.

Possibly safer but only useful where the results in the array are "truly unique" is this technique with $map and $setDifference:

db.sample.aggregate([
    { "$project": {
        "filtermetric": { "$setDifference": [
            { "$map": [
                "input": "$filtermetric",
                "as": "el",
                "in": {"$cond": [
                    { "$eq": [ "$$el.class", "s2" ] },
                    "$$el",
                    false
                ]}
            ]},
            [false]
        ]}
    }}
])

Also noting that in both the $group and $project operational pipeline stages you need to specify all of the fields you intend to return in your result documents from that stage.

The final note is that $elemMatch is not required when you are only querying the value of a single key within an array. "Dot notation" is preferred and recommended when only accessing a single key of the array. $elemMatch should only be needed when "multiple" keys in the document within the array "element" need to match a query condition.