Search and replace in MongoDB?

Posted 2019-03-05 00:47

Question:

Given a set of 100 posts, each with a body attribute holding the post content, and that content containing image URLs like "http://example.com/wp-content/uploads/5.jpg":

Is there a way to go through each post's body content, look for anything that matches "http://example.com/wp-content/uploads/5.jpg", and replace it with something like "http://amazon-bucket.aws.com/wp-content/uploads/5.jpg"?

Thanks!

Answer 1:

Not exactly. A single update statement can do this only if you are matching an "exact string" and always replacing it with the "same" fixed string.

Essentially, you seem to want a "regex replace" for documents that can be performed via .update(). While it is possible to search with $regex, there is no "capture", nor any option to feed captured portions to the "update" part of a statement such as $set.

So in order to do this sort of update, you need to loop your documents and modify in code. But the Bulk Operations API can be of some assistance here:

var bulk = db.collection.initializeOrderedBulkOp();
var counter = 0;

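// The backslash is doubled so the regex engine sees "\." (a literal dot);
// inside a JavaScript string, a single "\." collapses to "." (any character).
var query = { "url": { "$regex": "^http://example\\.com" }};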
db.collection.find(query).forEach(function(doc) {

    // Inspect and replace the part of the string
    bulk.find({ "_id": doc._id }).updateOne(
        { "$set": { "url": doc.url.replace("example.com","bucket.aws.com") } }
    );
    counter++;

    // Update once every 1000 documents
    if ( counter % 1000 == 0 ) {
        bulk.execute();
        bulk = db.collection.initializeOrderedBulkOp();
    }

})

// Process any remaining
if ( counter % 1000 != 0 )
    bulk.execute();

So that still requires looping but at least the updates are only sent to the server once every 1000 documents processed.
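The loop above rewrites a dedicated "url" field. For the question's case, where URLs are embedded in the "body" text, the per-document string step could use a capturing regex in application code, which is exactly the part that $set cannot do. This is only a minimal sketch: the function name rewriteUploadUrls and the character class used to find the end of a URL are assumptions, not part of the answer.

```javascript
// Hypothetical helper: rewrite every upload URL found in a post body.
// The capture group keeps the path so that only the host changes.
function rewriteUploadUrls(body) {
    // Assumption: a URL ends at whitespace, a quote, or an angle bracket.
    var pattern = /http:\/\/example\.com(\/wp-content\/uploads\/[^\s"'<>]+)/g;
    // "$1" re-inserts the captured path after the new host.
    return body.replace(pattern, "http://amazon-bucket.aws.com$1");
}

var sample = 'See <img src="http://example.com/wp-content/uploads/5.jpg"> ' +
             'and http://example.com/about';
var rewritten = rewriteUploadUrls(sample);
```

Inside the forEach loop, the $set value would then be rewriteUploadUrls(doc.body) rather than a single doc.url.replace(...) call, which only replaces the first occurrence.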



Answer 2:

Though it is generally not recommended, I think this is one of the rare valid use cases for MongoDB's server-side JavaScript feature, provided you need to do this regularly.

The advantage is that you don't have to transfer the documents back and forth; they are changed on the server. You don't even have to implement any triggering logic: you can simply call your server-side JS function from a cron job with --eval.
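As a rough illustration, the function called from the cron job might look like the sketch below. The name migratePostBodies and its parameters are hypothetical, and the code only assumes that the collection object exposes find(...).forEach(...) and updateOne(...), as collections in the mongo shell do. It shows the logic only; how the function is stored and invoked on the server depends on the MongoDB version in use.

```javascript
// Hypothetical function: replace oldPrefix with newPrefix in every post body.
// "collection" is assumed to behave like a mongo shell collection (e.g. db.posts).
function migratePostBodies(collection, oldPrefix, newPrefix) {
    var changed = 0;
    collection.find({ "body": { "$exists": true } }).forEach(function (doc) {
        if (typeof doc.body !== "string") {
            return; // skip documents whose body is not text
        }
        // split/join replaces every literal occurrence without needing a regex
        var body = doc.body.split(oldPrefix).join(newPrefix);
        if (body !== doc.body) {
            collection.updateOne({ "_id": doc._id }, { "$set": { "body": body } });
            changed++;
        }
    });
    return changed; // number of documents that were rewritten
}
```

Assuming the function is saved in a file such as migrate.js (a made-up name), a cron entry could run something like mongo mydb --eval 'load("migrate.js"); migratePostBodies(db.posts, "http://example.com/", "http://amazon-bucket.aws.com/")'.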