When an array is nested inside another array, how can I count the occurrences of a specific value? For example, I want to count each distinct "answer" in the document below. The query should return that there are 2 apples and 1 banana.
{
    "_id" : ObjectId("52c1d909fc7fc68ddd999a73"),
    "name" : "Some survey",
    "questions" : [
        {
            "_id" : ObjectId("52c1e250fc7fc68ddd999a75"),
            "answers" : [
                {
                    "userId" : "some GUIDs",
                    "answer" : "apple"
                },
                {
                    "userId" : "some GUID",
                    "answer" : "apple"
                },
                {
                    "userId" : "some GUID",
                    "answer" : "banana"
                }
            ],
            "questionText" : "blah blah blah...",
            "questionType" : "multiple choice"
        }
    ]
}
There are a few ways to approach this depending on how much data you need to process. You could use the Aggregation Framework in MongoDB 2.2+, or possibly Map/Reduce. See Aggregation Commands Comparison for a summary of the features and limitations.
Here's an example using the Aggregation Framework:
db.fruit.aggregate([
    // Limit matching documents (can take advantage of index)
    { $match: {
        "_id" : ObjectId("52c1d909fc7fc68ddd999a73")
    }},

    // Unpack the question & answer arrays
    { $unwind: "$questions" },
    { $unwind: "$questions.answers" },

    // Group by the answer values
    { $group: {
        _id: "$questions.answers.answer",
        count: { $sum: 1 }
    }}
])
For your sample document this returns:
{
    "result" : [
        {
            "_id" : "banana",
            "count" : 1
        },
        {
            "_id" : "apple",
            "count" : 2
        }
    ],
    "ok" : 1
}
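If you go the Map/Reduce route mentioned above instead, the map and reduce functions might look roughly like the sketch below. It is plain JavaScript that runs outside MongoDB: the `surveys` array, the `emit` placeholder and the `runMapReduce` helper are hypothetical stand-ins for the collection and the server, added only so the example is self-contained. In the mongo shell you would pass just the two functions, for example `db.fruit.mapReduce(mapAnswers, reduceCounts, { out: { inline: 1 } })`. Note that newer MongoDB releases deprecate mapReduce in favor of the aggregation pipeline.

```javascript
// Hypothetical stand-in for the collection (same shape as the sample document).
const surveys = [
  {
    name: "Some survey",
    questions: [
      {
        answers: [
          { userId: "some GUID", answer: "apple" },
          { userId: "some GUID", answer: "apple" },
          { userId: "some GUID", answer: "banana" }
        ]
      }
    ]
  }
];

// Placeholder for MongoDB's built-in emit(); the local harness below assigns it.
let emit;

// Map: runs once per document (available as `this`) and emits (answer, 1) per answer.
function mapAnswers() {
  this.questions.forEach(function (question) {
    question.answers.forEach(function (entry) {
      emit(entry.answer, 1);
    });
  });
}

// Reduce: sums the values emitted for a single key.
function reduceCounts(key, values) {
  return values.reduce(function (total, n) { return total + n; }, 0);
}

// Hypothetical local harness that imitates what the server does with map and reduce.
function runMapReduce(docs, map, reduce) {
  const emitted = {};
  emit = function (key, value) {
    (emitted[key] = emitted[key] || []).push(value);
  };
  docs.forEach(function (doc) { map.call(doc); });
  return Object.keys(emitted).map(function (key) {
    return { _id: key, value: reduce(key, emitted[key]) };
  });
}

const mapReduceCounts = runMapReduce(surveys, mapAnswers, reduceCounts);
console.log(mapReduceCounts);
```

For a simple count like this the aggregation pipeline is shorter; Map/Reduce is mostly worth considering when the per-document logic is too custom to express as pipeline stages.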
Here is one way to skin this cat using the aggregation framework. Once you learn it, you can do a lot of good stuff with your data.
db.so.aggregate([
    { $unwind: "$questions" },
    { $unwind: "$questions.answers" },
    { $group: { _id: "$questions.answers.answer", fruit_count: { $sum: 1 } } }
])
Gives me this:
{
    "result" : [
        {
            "_id" : "banana",
            "fruit_count" : 1
        },
        {
            "_id" : "apple",
            "fruit_count" : 2
        }
    ],
    "ok" : 1
}
To be doubly sure, I added this doc:
db.so.insert({
    "_id" : ObjectId("52c1d909fc7fc68ddd999a75"),
    "name" : "Some survey",
    "questions" : [
        {
            "_id" : ObjectId("52c1e250fc7fc68ddd999a75"),
            "answers" : [
                {
                    "userId" : "some GUIDs",
                    "answer" : "orange"
                },
                {
                    "userId" : "some GUID",
                    "answer" : "orange"
                },
                {
                    "userId" : "some GUID",
                    "answer" : "banana"
                }
            ],
            "questionText" : "blah blah blah...",
            "questionType" : "multiple choice"
        }
    ]
})
My query now gives me:
{
    "result" : [
        {
            "_id" : "orange",
            "fruit_count" : 2
        },
        {
            "_id" : "banana",
            "fruit_count" : 2
        },
        {
            "_id" : "apple",
            "fruit_count" : 2
        }
    ],
    "ok" : 1
}
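To see why the counts come out this way, here is a rough plain-JavaScript sketch of what the two $unwind stages and the $group stage do to the two documents. The `docs` array and the `getPath`, `setPath`, `unwind` and `groupCount` helpers are hypothetical; they only mimic the pipeline's behaviour for this simple case and are not MongoDB APIs.

```javascript
// Hypothetical in-memory copies of the two survey documents (trimmed to the relevant fields).
const docs = [
  { questions: [{ answers: [{ answer: "apple" }, { answer: "apple" }, { answer: "banana" }] }] },
  { questions: [{ answers: [{ answer: "orange" }, { answer: "orange" }, { answer: "banana" }] }] }
];

// Read a dotted path such as "questions.answers" from a nested object.
function getPath(obj, path) {
  return path.split(".").reduce(function (value, key) { return value[key]; }, obj);
}

// Return a deep copy of obj with the value at the dotted path replaced.
function setPath(obj, path, newValue) {
  const copy = JSON.parse(JSON.stringify(obj));
  const keys = path.split(".");
  const last = keys.pop();
  const parent = keys.reduce(function (value, key) { return value[key]; }, copy);
  parent[last] = newValue;
  return copy;
}

// Mimic $unwind: emit one output row per element of the array found at `path`.
function unwind(rows, path) {
  const out = [];
  rows.forEach(function (row) {
    getPath(row, path).forEach(function (element) {
      out.push(setPath(row, path, element));
    });
  });
  return out;
}

// Mimic $group with { $sum: 1 }: count rows per distinct value at `path`.
function groupCount(rows, path) {
  const counts = {};
  rows.forEach(function (row) {
    const key = getPath(row, path);
    counts[key] = (counts[key] || 0) + 1;
  });
  return counts;
}

const perQuestion = unwind(docs, "questions");              // one row per question
const perAnswer = unwind(perQuestion, "questions.answers"); // one row per answer
const fruitCounts = groupCount(perAnswer, "questions.answers.answer");
console.log(perAnswer.length, fruitCounts);
```

After both unwinds every answer sits in its own row, so grouping on `questions.answers.answer` counts answers across all questions and all documents, which is why banana goes up to 2 once the second document is inserted.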