Duplicate a MongoDB collection

Posted 2019-08-11 12:16

Question:

What is the proper way to duplicate a collection in MongoDB on the same server using C#? MongoVUE has a 'Duplicate collection' option; is there something similar for C#?

Answer 1:

The C# driver has no built-in way to copy a collection, but you can still do it quite simply:

// Legacy driver API (MongoCollection): every document is read by the client and re-inserted.
var source = db.GetCollection("test");
var dest = db.GetCollection("testcopy");
dest.InsertBatch(source.FindAll());

Note, however, that this won't copy any indexes from the source collection. The shell's copyTo method has the same limitation, so it is probably implemented in a similar way.
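
If the indexes matter, they can be recreated on the copy in a separate step. The following is only a rough sketch, assuming the 2.x C# driver (IMongoDatabase / IMongoCollection) rather than the legacy API used above. The class and method names (IndexCopier, CopyIndexes) are made up for illustration. Only the name, unique and sparse settings are carried over; other options (TTL, partial filters, collation) and special index types such as text indexes would need extra handling.

```csharp
using MongoDB.Bson;
using MongoDB.Driver;

static class IndexCopier
{
    // Hypothetical helper: recreates the source collection's secondary indexes on the destination.
    public static void CopyIndexes(IMongoDatabase db, string sourceName, string destName)
    {
        var source = db.GetCollection<BsonDocument>(sourceName);
        var dest = db.GetCollection<BsonDocument>(destName);

        // Each listed index is a document with (at least) "name" and "key" fields.
        foreach (var index in source.Indexes.List().ToList())
        {
            var name = index["name"].AsString;
            if (name == "_id_")
            {
                continue; // the _id index is created automatically
            }

            var keys = new BsonDocumentIndexKeysDefinition<BsonDocument>(index["key"].AsBsonDocument);
            var options = new CreateIndexOptions
            {
                Name = name,
                // Assumption: only these two options are copied; anything else is dropped.
                Unique = index.GetValue("unique", false).ToBoolean(),
                Sparse = index.GetValue("sparse", false).ToBoolean()
            };
            dest.Indexes.CreateOne(new CreateIndexModel<BsonDocument>(keys, options));
        }
    }
}
```

Calling IndexCopier.CopyIndexes(db, "test", "testcopy") after the documents have been inserted should leave testcopy with the same basic indexes as test.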



Answer 2:

I had the exact same problem. The accepted answer works, but I also needed the copy to be as fast as possible.

The fastest way to copy a collection is apparently an aggregation with an $out pipeline stage. This way you don't have to download all the documents and then re-upload them; they are simply copied inside the database.

This is trivial to execute inside the mongo shell:

db.originalColl.aggregate([ { $match: {} }, { $out: "resultColl"} ]);

However, I had a lot of trouble running this from C#. Since eval has been deprecated, you can't just put the above in a string and have it executed on the server. Instead, you need to construct a BSON document that represents the same command.

Here's how I made it work:

// Build the aggregate command as a BSON document.
var aggDoc = new Dictionary<string,object>
{
    {"aggregate" , "originalCollection"},
    {"pipeline", new []
        {
            new Dictionary<string, object> { { "$match" , new BsonDocument() }},
            new Dictionary<string, object> { { "$out" , "resultCollection"}}
        }
    },
    // MongoDB 3.6+ rejects an aggregate command that has no cursor option.
    {"cursor", new BsonDocument()}
};

var doc = new BsonDocument(aggDoc);
var command = new BsonDocumentCommand<BsonDocument>(doc);
db.RunCommand(command);

This turns out to be very fast (about 3 minutes to copy 5M documents), and no data is transferred between the database and the application running the code. One drawback is that $out writes to a temporary collection, so resultCollection stays empty (or does not exist) until the operation completes. A progress bar based on the size of resultCollection will therefore not work.
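
With a 2.x driver, the same server-side copy can probably be written more compactly through the fluent aggregation API instead of a hand-built command document. This is a minimal sketch under that assumption. CollectionCopier and Copy are hypothetical names, and the collection names are passed in by the caller.

```csharp
using MongoDB.Bson;
using MongoDB.Driver;

static class CollectionCopier
{
    // Hypothetical helper: copies sourceName into destName entirely on the server.
    public static void Copy(IMongoDatabase db, string sourceName, string destName)
    {
        var source = db.GetCollection<BsonDocument>(sourceName);

        // An aggregation with no other stages passes every document through;
        // Out(...) appends the $out stage and runs the pipeline.
        source.Aggregate().Out(destName);
    }
}
```

Usage would be CollectionCopier.Copy(db, "originalCollection", "resultCollection"). As with the raw command, the documents never leave the server, and the target collection only appears once the operation has finished.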