I am using MapReduce. If the number of input documents is greater than 100, I do not get the expected number of results; if it is 100 or fewer, the results are as expected.
Sample output that I am getting:
{
    "_id" : "5504",
    "value" : [
        ObjectId("51c921bae4b0f0f776b339d2"),
        ObjectId("51b06b5be4b021e44bc69755")
    ]
}
Problem: If there are 100 or fewer documents for the user (id: 5504), I get that many ids in the output array, but with more than 100 documents I get very few ids. The output above is from a run with 101 documents for this user; with 100 documents, I got 100 ids. Why does this happen, and what is the solution?
Map Function:
db.system.js.save({
    _id: "map1",
    value: function () {
        var value = {
            "data": [{
                "_id": this._id,
                "creation_time": this.creation_time
            }]
        };
        emit(this.user_id, value);
    }
});
Reduce Function:
db.system.js.save({
    _id: "reduce1",
    value: function (key, values) {
        var reducedValue = [];
        for (var i = 0; i < values.length; i++) {
            reducedValue.push({
                "_id": values[i].data[0]._id,
                "creation_time": values[i].data[0].creation_time
            });
        }
        return {
            data: reducedValue
        };
    }
});
Finalize Function:
db.system.js.save({
    _id: "finalize1",
    value: function (key, reducedValue) {
        var a = reducedValue.data.sort(compare1);
        var ids = [];
        for (var i = 0; i < a.length; i++) {
            ids.push(a[i]._id);
        }
        return ids;
    }
});
Compare Function:
db.system.js.save({
    _id: "compare1",
    value: function (a, b) {
        if (a.creation_time < b.creation_time) return 1;
        if (a.creation_time > b.creation_time) return -1;
        return 0;
    }
});
MapReduce() call:
db.notifications.mapReduce(map1, reduce1, {out: "notifications_result", query: {delivered:true, user_id:"5504"}, finalize: finalize1});
Because MongoDB may call the reduce function more than once for the same key, passing the result of an earlier call back in as one of the values, the reduce function must be idempotent. A small modification to your reduce function solves the problem:
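A minimal sketch of such a modification, reconstructed from the description rather than quoted from the answer, might look like this. It assumes the only change needed is an inner loop over values[i].data (the loop variable j is illustrative), and the commented db.system.js.save line mirrors how the question stores its functions.

```javascript
// Sketch of a re-reduce-safe reduce function. It returns the same shape
// that map1 emits ({ data: [...] }), so its own output is valid input.
function reduce1(key, values) {
    var reducedValue = [];
    for (var i = 0; i < values.length; i++) {
        // values[i] is either a fresh map1 emit (one entry in data) or the
        // result of an earlier reduce1 call (many entries), so walk all of data.
        for (var j = 0; j < values[i].data.length; j++) {
            reducedValue.push({
                "_id": values[i].data[j]._id,
                "creation_time": values[i].data[j].creation_time
            });
        }
    }
    return { data: reducedValue };
}

// To store it the way the question does (mongo shell only):
// db.system.js.save({ _id: "reduce1", value: reduce1 });
```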
Note that the values[i].data array is now traversed too, because the results of other reduce1 calls appear in the values array.
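To see why the count collapses at 101 documents, the re-reduce can be simulated outside MongoDB. The sketch below is plain JavaScript that runs without a database. The batch size of 100 and the helpers simulateReduce and makeEmits are assumptions chosen to match the symptom in the question, not a description of MongoDB's internals.

```javascript
// Same logic as the question's reduce1: reads only data[0] of each value.
function reduceOriginal(key, values) {
    var out = [];
    for (var i = 0; i < values.length; i++) {
        out.push(values[i].data[0]);
    }
    return { data: out };
}

// Re-reduce-safe variant: reads every entry of each value's data array.
function reduceFixed(key, values) {
    var out = [];
    for (var i = 0; i < values.length; i++) {
        for (var j = 0; j < values[i].data.length; j++) {
            out.push(values[i].data[j]);
        }
    }
    return { data: out };
}

// Hypothetical helper: reduce one batch of emits, then feed that partial
// result back in together with the next batch, and so on.
function simulateReduce(reduceFn, key, emitted, batchSize) {
    var result = null;
    for (var start = 0; start < emitted.length; start += batchSize) {
        var batch = emitted.slice(start, start + batchSize);
        if (result !== null) batch.unshift(result);
        result = reduceFn(key, batch);
    }
    return result;
}

// Hypothetical helper: build n emits shaped like map1's output.
function makeEmits(n) {
    var emits = [];
    for (var i = 0; i < n; i++) {
        emits.push({ data: [{ _id: "doc" + i, creation_time: i }] });
    }
    return emits;
}

console.log(simulateReduce(reduceOriginal, "5504", makeEmits(100), 100).data.length); // 100
console.log(simulateReduce(reduceOriginal, "5504", makeEmits(101), 100).data.length); // 2
console.log(simulateReduce(reduceFixed, "5504", makeEmits(101), 100).data.length);    // 101
```

With 101 emits, the second call to the original function receives two values: the 100-entry partial result and the one remaining emit. It keeps only data[0] from each, which leaves two ids, the same count as in the sample output.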