The map-reduce examples I see use aggregation functions like count, but what is the best way to get, say, the top 3 items in each category using map-reduce?
I'm assuming I can also use the group function, but I was curious since they state that sharded environments cannot use `group()`. However, I'm actually interested in seeing a `group()` example as well.
For the sake of simplification, I'll assume you have documents of the form:
I've created 1000 documents covering 100 categories with:
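The original snippets for the document shape and the data generator are not reproduced here, so the following is only a sketch of what they might look like. It assumes each document has just a `category` and a `score` field; the helper name `makeDocs`, the 0 to 999 score range, and the round-robin category assignment are illustrative assumptions, and a plain array stands in for the `foo` collection.

```javascript
// Hypothetical generator: builds documents of the assumed form
// { category: <number>, score: <number> } in a plain array.
function makeDocs(count, categories) {
  var docs = [];
  for (var i = 0; i < count; i++) {
    docs.push({
      // Round-robin assignment so every category is guaranteed to appear.
      category: i % categories,
      // Random integer score in the assumed range 0..999.
      score: Math.floor(Math.random() * 1000)
    });
  }
  return docs;
}

var docs = makeDocs(1000, 100);

// In the mongo shell, each document would be stored instead, e.g.:
//   docs.forEach(function (doc) { db.foo.insert(doc); });
```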
Our mapper is pretty simple, just emit the category as key, and an object containing an array of scores as the value:
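A minimal sketch of such a mapper, assuming the `category` and `score` field names used in this answer. MongoDB supplies `emit()` and binds `this` to the current document; the `emit` stub and the `emitted` array below are stand-ins so the function can be tried outside the shell.

```javascript
// Stand-in for the emit() that MongoDB provides to map functions.
var emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

// The map function: key is the category, value wraps a one-element
// array of scores in an object.
function mapper() {
  emit(this.category, { scores: [this.score] });
}

// Simulate MongoDB invoking the mapper on one document.
mapper.call({ category: 7, score: 42 });
// emitted is now [{ key: 7, value: { scores: [42] } }]
```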
MongoDB's reducer cannot return an array, and the reducer's output must be of the same type as the values we `emit`, so we must wrap it in an object. We need an array of scores, as this will let our reducer compute the top 3 scores:
Finally, invoke the map-reduce:
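As a sketch of the reducer and the invocation described above: the reducer merges every `scores` array it receives, sorts descending, and keeps the top 3, returning the same `{ scores: [...] }` shape that the mapper emits so that it can safely be fed its own output. The output collection name `top_foos` is a hypothetical choice, not one taken from the original answer.

```javascript
// Reduce function: values is an array of { scores: [...] } objects,
// which may include results of earlier reduce calls for the same key.
function reducer(key, values) {
  var scores = [];
  values.forEach(function (value) {
    scores = scores.concat(value.scores);
  });
  // Sort numerically, highest first, then keep at most 3.
  scores.sort(function (a, b) { return b - a; });
  return { scores: scores.slice(0, 3) };
}

// Re-reducing a partial result gives a consistent answer.
var partial = reducer(7, [{ scores: [42] }, { scores: [99] }, { scores: [5] }, { scores: [63] }]);
var merged = reducer(7, [partial, { scores: [70] }]);
// merged.scores is [99, 70, 63]

// In the mongo shell, with a mapper as described earlier
// ("top_foos" is an assumed collection name):
//   db.foo.mapReduce(mapper, reducer, { out: "top_foos" });
```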
Now we have a collection containing one document per category, and the top 3 scores across all documents from `foo` in that category:
(Your exact values may vary if you used the same `Math.random()` data generator as I have above.)
You can now use this to query `foo` for the actual documents having those top scores:
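One way such a lookup might be built is sketched below. It assumes the map-reduce output documents have the usual `{ _id: <category>, value: { scores: [...] } }` shape; `buildTopScoresQuery` is a hypothetical helper name, and the function only constructs the query object, leaving the actual `find` call to the shell.

```javascript
// Build an $or query with one clause per category, matching documents
// in that category whose score is among the category's top scores.
function buildTopScoresQuery(topDocs) {
  var clauses = topDocs.map(function (top) {
    return { category: top._id, score: { $in: top.value.scores } };
  });
  return { $or: clauses };
}

var query = buildTopScoresQuery([
  { _id: 7, value: { scores: [99, 70, 63] } },
  { _id: 8, value: { scores: [88, 12] } }
]);

// In the mongo shell, the result would be passed to find, e.g.:
//   db.foo.find(query)
```

Because each clause matches on score values rather than on specific documents, every document that shares one of the top scores in its category is matched.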
This code won't handle ties, or rather, if ties exist, more than 3 documents might be returned in the final cursor produced by `find_top_scores`.
The solution using `group` would be somewhat similar, though the reducer will only have to consider two documents at a time, rather than an array of scores for the key.
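A rough sketch of what the `group` variant could look like, under the same assumed `category` and `score` fields. The reduce function receives the current document and the running result for its key and updates the result in place; `simulateGroup` is a hypothetical in-memory stand-in for the grouping that MongoDB would perform, included only so the reduce function can be exercised outside the shell.

```javascript
// group-style reduce: fold one document into the running result.
function groupReduce(doc, result) {
  result.scores.push(doc.score);
  result.scores.sort(function (a, b) { return b - a; }); // highest first
  result.scores = result.scores.slice(0, 3);             // keep top 3
}

// Hypothetical stand-in for the server-side grouping by category.
function simulateGroup(docs) {
  var groups = {};
  docs.forEach(function (doc) {
    if (!Object.prototype.hasOwnProperty.call(groups, doc.category)) {
      groups[doc.category] = { category: doc.category, scores: [] };
    }
    groupReduce(doc, groups[doc.category]);
  });
  return Object.keys(groups).map(function (k) { return groups[k]; });
}

var grouped = simulateGroup([
  { category: 1, score: 10 }, { category: 1, score: 50 },
  { category: 2, score: 7 },  { category: 1, score: 30 },
  { category: 1, score: 40 }
]);
// The category 1 entry has scores [50, 40, 30]; category 2 has [7].

// In the legacy mongo shell, the equivalent call would be along the lines of:
//   db.foo.group({ key: { category: 1 }, initial: { scores: [] }, reduce: groupReduce });
```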