I have a huge collection of documents and I want to extract some statistics from it. The job needs to run periodically, every 15 minutes.
Most of the stats are based on document size, so I need to fetch the documents and calculate their sizes.
The output is just a single line with some stats about the size of the documents. (I am not processing the whole collection, just a subset of it, so I cannot use the collection stats provided by MongoDB.)
What I'd like is to run this on the server side and avoid transferring all the documents to the client just to calculate their size.
I am executing it with the mongo shell, making sure I connect to a secondary. The mongo shell always runs on a remote machine, which is the main reason to avoid transferring all the documents over the network.
After reading the mongo shell documentation I expected it to be executed "server-side" as it states, but it does not work that way: it is executed on the same machine as the mongo shell (which is more client-side than server-side in my opinion).
I am pasting an extract of my code in case it helps:
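// Assumed initial values; the initialization is not shown in the original extract.
var stats = { count: 0, total: 0, avg: 0, min: Infinity, max: 0, maxid: null };
db.cache.find(query).forEach(function(obj) {
    // Size of the document in bytes, computed in the shell process.
    var curr = Object.bsonsize(obj);
    if (stats.max < curr) {
        stats.max = curr;
        stats.maxid = obj._id;
    }
    if (stats.min > curr) {
        stats.min = curr;
    }
    stats.count++;
    stats.total += curr;
    stats.avg = stats.total / stats.count;
});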
It takes about 3-4 seconds when I run the mongo shell locally, and more than 1 minute when the mongo shell runs remotely.
Any ideas on how to make this server-side JavaScript really execute on the server?
UPDATE:
To summarize the options mentioned in the answer:
- Use the system.js collection + db.eval: I cannot use it because eval is deprecated, and also because eval needs to run on the master, whereas I have to run it on a secondary.
- Use the system.js collection + loadServerScripts: it executes the JavaScript code on the machine running the mongo shell, which is the "client".
- Cron job: I'd need to run it on a specific node, and since the master may move to another node, I could end up running it against the master, which I should avoid. Besides, I am not allowed to do so: one of the requirements is to run it from a remote shell. (There are several DBs like this one that will need this kind of statistics, and it is easier to maintain with everything in one place.)
You could store js code as a kind of stored procedure.
As per this article you can store js as a system call:
then call it like:
extra documentation here
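As a rough illustration of the stored-function approach, here is a minimal sketch assuming the legacy mongo shell. The function name sizeStats, the variable sizeStatsDoc and the function body are made up for this example and are not taken from the linked article. Keep in mind that db.loadServerScripts() defines the stored functions in the shell session, so they still run wherever the shell runs.

```javascript
// Hypothetical stored function: the name "sizeStats" and its body are
// illustrative only.
var sizeStatsDoc = {
    _id: "sizeStats",
    // Folds an array of document sizes (in bytes) into summary statistics.
    value: function (sizes) {
        var stats = { count: 0, total: 0, min: Infinity, max: 0, avg: 0 };
        sizes.forEach(function (size) {
            if (size > stats.max) { stats.max = size; }
            if (size < stats.min) { stats.min = size; }
            stats.count++;
            stats.total += size;
        });
        if (stats.count > 0) { stats.avg = stats.total / stats.count; }
        return stats;
    }
};

// The global `db` only exists inside the mongo shell, so the database calls
// are skipped when the file is run anywhere else.
if (typeof db !== "undefined") {
    db.system.js.save(sizeStatsDoc);   // store (or overwrite) the function
    db.loadServerScripts();            // define stored functions in this shell session
    printjson(sizeStats([120, 48, 512]));
}
```

Storing the function only centralizes the code; it does not by itself move the computation to the server.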
Another alternative to eval is to have a cron job calling a JavaScript file launched locally on the server.
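A sketch of what such a cron-launched script might look like, assuming the legacy mongo shell and a replica set. The file name stats.js, the helper mayRunStats, the host, the database name and the empty query are all placeholders for illustration. The guard is there because the node running the job could become the master at some point.

```javascript
// stats.js (hypothetical file name). A crontab entry along these lines would
// run it every 15 minutes; host, database and path are placeholders:
//   */15 * * * * mongo --quiet localhost:27017/mydb /path/to/stats.js

// Returns true only when the isMaster() reply says this node is a secondary.
function mayRunStats(isMasterReply) {
    return isMasterReply.secondary === true;
}

// The global `db` only exists inside the mongo shell.
if (typeof db !== "undefined") {
    if (mayRunStats(db.isMaster())) {
        rs.slaveOk();                  // legacy shell: allow reads on a secondary
        var query = {};                // placeholder for the real subset filter
        var count = 0, total = 0;
        db.cache.find(query).forEach(function (doc) {
            count++;
            total += Object.bsonsize(doc);
        });
        print("count=" + count + " total=" + total);
    } else {
        print("not a secondary, skipping stats run");
    }
}
```

Because the shell runs on the same host as mongod here, the documents never cross the network, even though the JavaScript itself still executes in the shell process.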