How to use unordered bulk inserting with Mongoskin

2020-02-13 03:58发布

问题:

I'm having trouble using Mongoskin to perform bulk inserting (MongoDB 2.6+) on Node.

var dbURI = urigoeshere;
var db = mongo.db(dbURI, {safe:true});
var bulk = db.collection('collection').initializeUnorderedBulkOp();

for (var i = 0; i < 200000; i++) {
    bulk.insert({number: i}, function() {
        console.log('bulk inserting: ', i);
    });
}

bulk.execute(function(err, result) {
    res.json('send response statement');
}); 

The above code gives the following warnings/errors:

(node) warning: possible EventEmitter memory leak detected. 51 listeners added. Use emitter.setMaxListeners() to increase limit.
TypeError: Object #<SkinClass> has no method 'execute'
(node) warning: possible EventEmitter memory leak detected. 51 listeners added. Use emitter.setMaxListeners() to increase limit.
TypeError: Object #<SkinClass> has no method 'execute'

Is it possible to use Mongoskin to perform unordered bulk operations? If so, what am I doing wrong?

回答1:

You can do it but you need to change your calling conventions to do this as only the "callback" form will actually return a collection object from which the .initializeUnorderedBulkOp() method can be called. There are also some usage differences to how you think this works:

var dbURI = urigoeshere;
var db = mongo.db(dbURI, {safe:true});
db.collection('collection',function(err,collection) {
    var bulk = collection.initializeUnorderedBulkOp();
    count = 0;

    for (var i = 0; i < 200000; i++) {
        bulk.insert({number: i});
        count++;

        if ( count % 1000 == 0 )
            bulk.execute(function(err,result) {
               // maybe do something with results
               bulk = collection.initializeUnorderedBulkOp(); // reset after execute
            });      

    });

    // If your loop was not a round divisor of 1000
    if ( count % 1000 != 0 )
        bulk.execute(function(err,result) {
          // maybe do something here
        });
});

So the actual "Bulk" methods themselves don't require callbacks and work exactly as shown in the documentation. The exeception is .execute() which actually sends the statements to the server.

While the driver will sort this out for you somewhat, it probably is not a great idea to queue up too many operations before calling execute. This basically builds up in memory, and though the driver will only send in batches of 1000 at a time ( this is a server limit as well as the complete batch being under 16MB ), you probably want a little more control here, at least to limit memory usage.

That is the point of the modulo tests as shown, but if memory for building the operations and a possibly really large response object are not a problem for you then you can just keep queuing up operations and call .execute() once.

The "response" is in the same format as given in the documentation for BulkWriteResult.