MongoDB Aggregation - match if value in array

Posted 2020-02-10 02:34

Question:

I have a collection that I'm performing an aggregation on and I've basically gotten it down to

{array:[1,2,3], value: 1},
{array:[1,2,3], value: 4}

How would I perform an aggregation match to check if the value is in the array? I tried using {$match: {"array": {$in: ["$value"]}}} but it doesn't find anything.

I would want the output (if using the above as an example) to be:

{array:[1,2,3], value:1}

Answer 1:

A slight variation based on @chridam's answer:

db.test.aggregate([
    { "$unwind": "$array" },
    { "$group": {
                  _id: { "_id": "$_id", "value": "$value" },
                  array: { $push: "$array" },
                  mcount: { $sum: {$cond: [{$eq: ["$value","$array"]},1,0]}}
                }
    },
    { $match: {mcount: {$gt: 0}}},
    { "$project": { "value": "$_id.value", "array": 1, "_id": 0 }}
])

The idea is to $unwind and $group back the array, counting in mcount the number of items matching the value. After that, a simple $match on mcount > 0 will filter out unwanted documents.
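
To make the unwind-and-count idea concrete, here is a minimal sketch in plain JavaScript that mimics the stages on an in-memory array. It does not call MongoDB; the `_id` values and the names `docs`, `unwindArray`, and `keepMatching` are hypothetical and exist only for illustration.

```javascript
// Sample documents from the question, with hypothetical _id values added.
const docs = [
  { _id: 1, array: [1, 2, 3], value: 1 },
  { _id: 2, array: [1, 2, 3], value: 4 },
];

// Mimics $unwind: one output row per array element.
function unwindArray(input) {
  return input.flatMap((doc) =>
    doc.array.map((item) => ({ _id: doc._id, value: doc.value, array: item }))
  );
}

// Mimics $group + $match + $project: rebuild the array per document,
// count elements equal to value, and keep documents with mcount > 0.
function keepMatching(rows) {
  const groups = new Map();
  for (const row of rows) {
    if (!groups.has(row._id)) {
      groups.set(row._id, { value: row.value, array: [], mcount: 0 });
    }
    const group = groups.get(row._id);
    group.array.push(row.array);
    if (row.array === row.value) group.mcount += 1;
  }
  return [...groups.values()]
    .filter((g) => g.mcount > 0)
    .map((g) => ({ array: g.array, value: g.value }));
}

const matched = keepMatching(unwindArray(docs));
console.log(JSON.stringify(matched)); // [{"array":[1,2,3],"value":1}]
```

The second document never increments mcount, so it is dropped by the final filter, just as the $match on mcount > 0 drops it in the pipeline.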



Answer 2:

As stated, $where is a good option where you do not need to continue the logic in the aggregation pipeline.

But if you do, then use $redact, with $map to transform the "value" into a single-element array and $setIsSubset to compare it against "array". This avoids duplicating documents with $unwind, which should make it the fastest way to do this:

db.collection.aggregate([
   { "$redact": {
       "$cond": {
           "if": { "$setIsSubset": [
                { "$map": {
                    "input": { "$literal": ["A"] },
                    "as": "a",
                    "in": "$value"
                }},
                "$array"
           ]},
           "then": "$$KEEP",
           "else": "$$PRUNE"
       }
   }}
])

The $redact pipeline stage evaluates a logical condition within $cond and uses the system variables $$KEEP to "keep" the document where the condition is true, or $$PRUNE to "remove" the document where the condition is false.

This allows it to work like $project with a subsequent $match, but in a single pipeline stage, which is more efficient.

Since these are natively coded operators rather than JavaScript, this is likely "the" fastest way to perform your match. So provided you are using MongoDB 2.6 or above, this is the way you should compare these elements in your document.
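
The comparison at the heart of the $redact stage is a set-subset test: wrap the value in a single-element array and check that it is contained in the document's array. A rough plain-JavaScript sketch of that test, run on in-memory objects rather than through MongoDB, might look like this; `isSubset` and `redactKeep` are hypothetical helper names.

```javascript
// Returns true when every element of `candidate` also appears in `target`.
// Like a set comparison, duplicates are ignored. Only primitive values are
// handled in this sketch.
function isSubset(candidate, target) {
  const targetSet = new Set(target);
  return candidate.every((item) => targetSet.has(item));
}

// Mimics the $$KEEP / $$PRUNE decision: keep documents whose value is in array.
function redactKeep(docs) {
  return docs.filter((doc) => isSubset([doc.value], doc.array));
}

const sample = [
  { array: [1, 2, 3], value: 1 },
  { array: [1, 2, 3], value: 4 },
];
const kept = redactKeep(sample);
console.log(JSON.stringify(kept)); // [{"array":[1,2,3],"value":1}]
```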



Answer 3:

You can use aggregation expressions in a regular query as of version 3.6.

db.collection_name.find({"$expr": {"$in": ["$value", "$array"]}})
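
The reason $expr is needed is that a plain query operator such as $in treats "$value" as a literal string, not as a reference to the value field, which is why the attempt in the question matched nothing. The following plain-JavaScript sketch imitates the two interpretations on in-memory objects; it does not use MongoDB, and `matchLiteral` and `matchFieldRef` are hypothetical names.

```javascript
const documents = [
  { array: [1, 2, 3], value: 1 },
  { array: [1, 2, 3], value: 4 },
];

// Imitates {array: {$in: ["$value"]}}: the list holds the literal string
// "$value", so the test is whether any array element equals that string.
function matchLiteral(doc) {
  const literals = ["$value"];
  return doc.array.some((item) => literals.includes(item));
}

// Imitates {$expr: {$in: ["$value", "$array"]}}: "$value" and "$array" are
// resolved against the document before the membership test.
function matchFieldRef(doc) {
  return doc.array.includes(doc.value);
}

const literalMatches = documents.filter(matchLiteral);
const fieldRefMatches = documents.filter(matchFieldRef);
console.log(literalMatches.length); // 0
console.log(JSON.stringify(fieldRefMatches)); // [{"array":[1,2,3],"value":1}]
```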

Using Aggregation:

You can use $match with $expr in version 3.6 and later.

db.collection_name.aggregate({"$match": {"$expr": {"$in": ["$value", "$array"]}}})

You can try $redact with the $in expression in version 3.4.

db.collection_name.aggregate({
  "$redact": {
    "$cond": [
      {
        "$in": [
          "$value",
          "$array"
        ]
      },
      "$$KEEP",
      "$$PRUNE"
    ]
  }
})


Answer 4:

A more efficient approach would involve a single pipeline that uses the $redact operator as follows:

db.collection.aggregate([
    { 
        "$redact": {
            "$cond": [
                { 
                    "$setIsSubset": [ 
                        ["$value"],
                        "$array"  
                    ] 
                },
                "$$KEEP",
                "$$PRUNE"
            ]
        }
    }
])

For earlier versions of MongoDB that do not support $redact (versions < 2.6), consider this aggregation pipeline that uses the $unwind operator:

db.collection.aggregate([
    { "$unwind": "$array" },
    {
        "$project": {
            "isInArray": {
                "$cond": [
                    { "$eq": [ "$array", "$value" ] },
                    1,
                    0
                ]
            },
            "value": 1,
            "array": 1
        }
    },
    { "$sort": { "isInArray": -1 } },
    {
        "$group": {
            "_id": {
                "_id": "$_id",
                "value": "$value"
            },
            "array": { "$push": "$array" },
            "isInArray": { "$first": "$isInArray" }
        }
    },
    { "$match": { "isInArray": 1 } },
    { "$project": { "value": "$_id.value", "array": 1, "_id": 0 } }
])


Answer 5:

A little late to answer, but this presents another solution:

Using $addFields and $match separately gives more flexibility than $redact. You can expose several fields and then combine them with other matching logic based on the results.

db.applications.aggregate([
    {$addFields: {"containsValueInArray": {$cond:[{$setIsSubset: [["valueToMatch"], "$arrayToMatchIn"]},true,false]}}},
    {$match: {"containsValueInArray":true}}
]);


Answer 6:

You can use $where if aggregation is not required:

db.collection.find({ $where: function(){ 
    return (this.array.indexOf(this.value) !== -1)}
})
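
Because the $where predicate is ordinary JavaScript, its logic can be checked outside the database by binding `this` to a sample document. A small sketch, where `valueInArray` is a hypothetical name given to the same function body:

```javascript
// Same body as the $where function above; `this` is the current document.
function valueInArray() {
  return this.array.indexOf(this.value) !== -1;
}

const candidates = [
  { array: [1, 2, 3], value: 1 },
  { array: [1, 2, 3], value: 4 },
];

// call() binds `this` to each sample document in turn.
const whereMatches = candidates.filter((doc) => valueInArray.call(doc));
console.log(JSON.stringify(whereMatches)); // [{"array":[1,2,3],"value":1}]
```

Note that indexOf compares with strict equality, so in plain JavaScript this check only behaves as expected for primitive values such as numbers and strings.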


Answer 7:

Try the combination of $eq and $setIntersection:

{$group: {
  _id: "$id",
  yourName: { $sum:
    { $cond: [
        {$and: [
           {$eq: [{$setIntersection: ["$someArrayField", ["$value"]]}, ["$value"]]}
        ]},
        1, 0]
    }
  }
}}



Answer 8:

I prefer an approach without grouping; there is an easy one since v3.4 ($filter has existed since v3.2, but $addFields requires v3.4):

...aggregate([
      {
        $addFields: {
          arrayFilter: {
            $filter: {
              input: '$array',
              as: 'item',
              cond: { $eq: ['$$item', '$value'] }
            }
          }
        }
      },
      {
        $unwind: '$arrayFilter'
      },
      {
        $project: {
          arrayFilter: 0
        }
      }
    ]);
  1. Add a temporary filter field
  2. $unwind on the resulting array (pipeline results with empty arrays get removed)
  3. (optional) remove filter field from result via project
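
The three steps can be traced in plain JavaScript on in-memory objects. This is only an illustrative sketch, assuming the filter condition tests each element for equality with value; it does not call MongoDB, and the third sample document is a hypothetical addition that shows what happens when the value occurs more than once.

```javascript
const inputDocs = [
  { array: [1, 2, 3], value: 1 },
  { array: [1, 2, 3], value: 4 },
  { array: [5, 5], value: 5 }, // value occurs twice in the array
];

// Step 1: add a temporary field holding the elements equal to value.
const withFilter = inputDocs.map((doc) => ({
  ...doc,
  arrayFilter: doc.array.filter((item) => item === doc.value),
}));

// Step 2: unwind the temporary field; an empty array yields no rows,
// and an array with n elements yields n rows.
const unwound = withFilter.flatMap((doc) =>
  doc.arrayFilter.map((item) => ({ ...doc, arrayFilter: item }))
);

// Step 3: drop the temporary field.
const result = unwound.map(({ arrayFilter, ...rest }) => rest);
console.log(JSON.stringify(result));
// [{"array":[1,2,3],"value":1},{"array":[5,5],"value":5},{"array":[5,5],"value":5}]
```

The third document comes out twice, so this approach returns one copy of a document per matching element. If the array can contain the value more than once, that duplication needs to be handled separately.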


Answer 9:

You can do it with a simple $project and $match:

db.test.aggregate([
    {
        $project: {
            // arrayValue is the array field (named "array" in the question)
            arrayValue: 1,
            value: 1,
            "has_same_value": { $in: ["$value", "$arrayValue"] }
        }
    },
    { $match: { has_same_value: true } },
    { $project: { has_same_value: 0 } }
])