Here is my MongoDB collection schema:
company: String
model: String
cons: [String] // array of tags that were marked as "cons"
pros: [String] // array of tags that were marked as "pros"
I need to aggregate it so I get the following output:
[{
"_id": {
"company": "Lenovo",
"model": "T400"
},
"tags": {
tag: "SomeTag"
pros: 124 // number of times, "SomeTag" tag was found in "pros" array in `Lenovo T400`
cons: 345 // number of times, "SomeTag" tag was found in "cons" array in `Lenovo T400`
}
}...]
I tried to do the following:
var aggParams = {};
aggParams.push({ $unwind: '$cons' });
aggParams.push({ $unwind: '$pros' });
aggParams.push({$group: {
_id: {
company: '$company',
model: '$model',
consTag: '$cons'
},
consTagCount: { $sum: 1 }
}});
aggParams.push({$group: {
_id: {
company: '$_id.company',
model: '$_id.model',
prosTag: '$pros'
},
prosTagCount: { $sum: 1 }
}});
aggParams.push({$group: {
_id: {
company:'$_id.company',
model: '$_id.model'
},
tags: { $push: { tag: { $or: ['$_id.consTag', '$_id.prosTag'] }, cons: '$consTagCount', pros: '$prosTagCount'} }
}});
But I got the following result:
{
"_id": {
"company": "Lenovo",
"model": "T400"
},
"tags": [
{
"tag": false,
"pros": 7
}
]
}
What is the right way to do this with aggregation
?
Yes this is a bit harder considering that there are multiple arrays, and if you try both at the same time you end up with a "cartesian condition" where one arrray multiplies the contents of the other.
Therefore, just combine the array content at the beginning, which probably indicates how you should be storing the data in the first place:
Model.aggregate(
[
{ "$project": {
"company": 1,
"model": 1,
"data": {
"$setUnion": [
{ "$map": {
"input": "$pros",
"as": "pro",
"in": {
"type": { "$literal": "pro" },
"value": "$$pro"
}
}},
{ "$map": {
"input": "$cons",
"as": "con",
"in": {
"type": { "$literal": "con" },
"value": "$$con"
}
}}
]
}
}},
{ "$unwind": "$data" }
{ "$group": {
"_id": {
"company": "$company",
"model": "$model",
"tag": "$data.value"
},
"pros": {
"$sum": {
"$cond": [
{ "$eq": [ "$data.type", "pro" ] },
1,
0
]
}
},
"cons": {
"$sum": {
"$cond": [
{ "$eq": [ "$data.type", "con" ] },
1,
0
]
}
}
}
],
function(err,result) {
}
)
So via the first $project
stage the $map
operators are adding the "type" value to each item of each array. Not that it really matters here as all items should process "unique" anyway, the $setUnion
operator "contatenates" each array into a singular array.
As mentioned earlier, you probably should be storing in this way in the first place.
Then process $unwind
followed by $group
, wherein each "pros" and "cons" is then evaluated via $cond
to for it's matching "type", either returning 1
or 0
where the match is respectively true/false
to the $sum
aggregation accumulator.
This gives you a "logical match" to count each respective "type" within the aggregation operation as per the grouping keys specified.