Find largest document size in MongoDB

Posted 2019-01-17 03:30

Question:

Is it possible to find the largest document size in MongoDB?

db.collection.stats() shows the average size, which is not really representative because in my case document sizes can differ considerably.

Answer 1:

You can use a small shell script to get this value.

Note: the following performs a full collection scan.

// Track the largest BSON size (in bytes) seen so far
var max = 0;
db.test.find().forEach(function(obj) {
    var curr = Object.bsonsize(obj);
    if (max < curr) {
        max = curr;
    }
});
print(max);
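
On MongoDB 4.4 or newer, the same comparison can probably be pushed to the server with the $bsonSize aggregation operator, so that documents are not shipped to the shell one by one. This is only a sketch and not part of the original answer: the helper name largestDocsPipeline is made up for illustration, and the pipeline still has to read every document in the collection.

```javascript
// Hypothetical helper: builds an aggregation pipeline that returns the
// n largest documents as { _id, size } pairs.
// Assumes MongoDB 4.4+, where the $bsonSize operator is available.
function largestDocsPipeline(n) {
    return [
        // Compute each document's BSON size in bytes; _id is kept by default.
        { $project: { size: { $bsonSize: "$$ROOT" } } },
        // Largest first.
        { $sort: { size: -1 } },
        // Keep only the n largest.
        { $limit: n }
    ];
}

// Assumed usage in the shell, against the "test" collection used above:
// db.test.aggregate(largestDocsPipeline(1));
```

Because the result includes _id, the largest document can be fetched afterwards with a normal find on that _id.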


Answer 2:

Note: this attempts to hold the whole result set in memory (because of .toArray()). Be careful with big data sets, and do not use it in production! Abishek's answer has the advantage of working over a cursor instead of an in-memory array.

If you also want the _id, try this. Given a collection called "requests":

// Builds a list of {size, _id} sorted by size (ascending), then pops the last (largest) entry
db.requests.find().toArray().map(function(request) { return {size:Object.bsonsize(request), _id:request._id}; }).sort(function(a, b) { return a.size-b.size; }).pop();

// { "size" : 3333, "_id" : "someUniqueIdHere" }
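
To keep the _id without calling .toArray(), a cursor loop can track the id alongside the size. The following is a minimal sketch under that assumption: findLargest and its sizeOf parameter are names introduced here for illustration, and in the mongo shell you would pass Object.bsonsize as sizeOf.

```javascript
// Hypothetical helper: walks anything that has forEach (a shell cursor or a
// plain array) and remembers only the largest document seen so far.
function findLargest(docs, sizeOf) {
    var largest = null;
    docs.forEach(function(doc) {
        var size = sizeOf(doc);
        if (largest === null || size > largest.size) {
            largest = { size: size, _id: doc._id };
        }
    });
    return largest; // null when there are no documents
}

// Assumed usage in the shell:
// findLargest(db.requests.find(), Object.bsonsize);
```

Only one {size, _id} pair is held at a time, so memory use does not grow with the size of the collection.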


Answer 3:

If you're working with a huge collection, loading it all into memory at once will not work, since you would need more RAM than the size of the entire collection.

Instead, you can process the entire collection in batches using the following package I created: https://www.npmjs.com/package/mongodb-largest-documents

All you have to do is provide the MongoDB connection string and collection name. The script will output the top X largest documents when it finishes traversing the entire collection in batches.
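
The package's internals are not shown here, but the general idea of keeping only the top X while streaming through a collection can be sketched roughly as follows. The names topLargest, sizeOf, and limit are hypothetical and are not the package's API; in the mongo shell, sizeOf could be Object.bsonsize.

```javascript
// Hypothetical sketch, not the package's code: keep only the `limit` largest
// entries while walking the documents one at a time.
function topLargest(docs, sizeOf, limit) {
    var top = []; // sorted by size, largest first, never longer than `limit`
    docs.forEach(function(doc) {
        var entry = { size: sizeOf(doc), _id: doc._id };
        // Find the first position whose entry is smaller than the new one.
        var i = 0;
        while (i < top.length && top[i].size >= entry.size) {
            i++;
        }
        // Insert only if the new entry ranks within the top `limit`.
        if (i < limit) {
            top.splice(i, 0, entry);
            if (top.length > limit) {
                top.pop(); // drop the smallest entry
            }
        }
    });
    return top;
}

// Assumed usage in the shell:
// topLargest(db.requests.find(), Object.bsonsize, 10);
```

The list never holds more than limit entries, so memory use depends on X rather than on the size of the collection.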



Tags: mongodb