I have the following situation: two MongoDB instances running on different servers.
For example:
Mongodb instance on server "one" (host1:27017) with database: "test1"
Mongodb instance on server "two" (host2:27017) with database: "test2"
Now I need to synchronize the "test1" database on "host1:27017" with the "test2" database on "host2:27017".
By "synchronize" I mean the following:
If a collection from the "test1" database doesn't exist in "test2", it should be copied in full to the "test2" database.
If a document from a "test1" collection doesn't exist in the same collection in "test2", it must be added; otherwise it must be updated. If a document doesn't exist in collection A in "test1" but does exist in collection A in "test2", it must be deleted from "test2".
Here is the problem. For example:
"test1" database has collection "A" with the following document:
{
_id: "1",
name: "some name"
}
"test2" database has collection "A" with the following documents:
{
_id: "1",
name: "some name"
}
{
_id: "2",
name: "some name2"
}
If I run db.copyDatabase('test1', 'test2', "host2:27017") I get this error:
"errmsg" : "exception: E11000 duplicate key error index: test1.A.$id dup key: { : \"1\" }"
The same happens with the cloneDatabase command. How can I resolve this?
In general, what are the ways to synchronize databases?
I know the simplest way is to just copy the files from one server to the other, but maybe there are better ways.
Please help. I'm a newcomer to MongoDB. Thanks.
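The one-way sync rules described in the question (add or update everything from "test1", delete what exists only in "test2") can be sketched as a list of bulk write operations. This is only a minimal sketch, not a tested sync tool: buildSyncOps is a hypothetical helper name, the documents are assumed to fit in memory, and _id values are compared by their string form. The operation shapes (replaceOne with upsert, deleteOne) are the ones accepted by collection.bulkWrite().

```javascript
// Hypothetical helper: build the bulkWrite operations that make the
// target collection match the source collection (one-way sync).
function buildSyncOps(sourceDocs, targetDocs) {
  // Ids present in the source, compared by string form (an assumption).
  const sourceIds = new Set(sourceDocs.map((doc) => String(doc._id)));

  // Every source document is inserted if missing, replaced otherwise.
  const ops = sourceDocs.map((doc) => ({
    replaceOne: { filter: { _id: doc._id }, replacement: doc, upsert: true },
  }));

  // Documents that exist only in the target are deleted.
  for (const doc of targetDocs) {
    if (!sourceIds.has(String(doc._id))) {
      ops.push({ deleteOne: { filter: { _id: doc._id } } });
    }
  }
  return ops;
}

// The example data from the question.
const source = [{ _id: "1", name: "some name" }];
const target = [
  { _id: "1", name: "some name" },
  { _id: "2", name: "some name2" },
];

const ops = buildSyncOps(source, target);
console.log(JSON.stringify(ops, null, 2));
```

With a driver connection to host2, the resulting array could then be passed to the "A" collection's bulkWrite(ops). Because existing documents are replaced rather than inserted, this avoids the duplicate key error, and a collection that is missing in "test2" is created implicitly by the first upsert.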
I haven't tried this, but the current MongoDB documentation describes a replica set configuration that is equivalent to master-slave replication:
Deploy Master-Slave Equivalent using Replica Sets
If you want a replication configuration that resembles master-slave replication, using replica sets, consider the following replica configuration document. In this deployment hosts <master> and <slave> provide replication that is roughly equivalent to a two-instance master-slave deployment:
{
_id : 'setName',
members : [
{ _id : 0, host : "<master>", priority : 1 },
{ _id : 1, host : "<slave>", priority : 0, votes : 0 }
]
}
See Replica Set Configuration for more information about replica set configurations.
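As a rough illustration of how that configuration document might be filled in for the two hosts from the question, here is a small sketch. buildReplicaSetConfig is a hypothetical helper, and the set name "setName" plus the host strings are placeholders; both mongod instances are assumed to have been started with the same --replSet name.

```javascript
// Hypothetical helper: fill in the master-slave-like replica set config.
function buildReplicaSetConfig(setName, primaryHost, secondaryHost) {
  return {
    _id: setName,
    members: [
      // The only member that can become primary.
      { _id: 0, host: primaryHost, priority: 1 },
      // Never becomes primary and does not vote in elections.
      { _id: 1, host: secondaryHost, priority: 0, votes: 0 },
    ],
  };
}

const config = buildReplicaSetConfig("setName", "host1:27017", "host2:27017");
console.log(JSON.stringify(config, null, 2));

// In the mongo shell connected to host1, the document would then be
// applied with: rs.initiate(config)
```

Note that replication copies whole instances within the set, so host2 would end up with a "test1" database rather than a differently named "test2".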
Use _id instead of id. There is no need to declare it in your model.
If you have plenty of servers:
On each server I use a small pre-save hook that creates a controlled, unique _id. The ObjectId that mongoose generates has a well-defined layout (https://docs.mongodb.com/manual/reference/method/ObjectId/#ObjectIDs-BSONObjectIDSpecification): a timestamp, then a machine- and process-specific part, then a counter. I overwrite four of its hex digits (positions 6 to 9 in the code below) with a per-server value, because I have multiple servers and want to make sure there is no collision. If you have just a few servers, there is probably little risk in not doing this. Even in my case I think it is overly paranoid.
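{ObjectId} = require('mongoose').Types

# True only when a 4-character instance id is configured in the environment
exports.useProcessId = ->
  process.env.INSTANCE_PROCESS_ID? and process.env.INSTANCE_PROCESS_ID.length == 4

# Replace hex characters 6..9 of the id with the instance id
exports.manipulateMongooseId = (id) ->
  id = id.toString()
  new ObjectId(id.slice(0, 6) + process.env.INSTANCE_PROCESS_ID + id.slice(10, 24))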
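The same splice can be shown on plain hex strings, which makes it easy to check without a database. This is a sketch under assumptions: withInstanceId is a hypothetical name, and the validation is an addition that the code above does not have. It matters because an instance id containing non-hex characters would produce a string that is not a valid ObjectId.

```javascript
// Hypothetical helper: overwrite hex characters 6..9 of a 24-character
// ObjectId string with a 4-character hex instance id.
function withInstanceId(hexId, instanceId) {
  if (!/^[0-9a-f]{24}$/i.test(hexId)) {
    throw new Error("expected a 24-character hex ObjectId string");
  }
  if (!/^[0-9a-f]{4}$/i.test(instanceId)) {
    throw new Error("instance id must be exactly 4 hex characters");
  }
  return hexId.slice(0, 6) + instanceId + hexId.slice(10);
}

const patched = withInstanceId("507f1f77bcf86cd799439011", "ab12");
console.log(patched); // 507f1fab12f86cd799439011
```

One thing to keep in mind: the first 8 hex characters of an ObjectId encode the creation time in seconds, so overwriting characters 6 and 7 changes the low-order byte of that timestamp, and the time read back from the id can be off by up to a few minutes.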
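In the schema:
# useProcessId, instanceType and processIdConfig.changeMongooseId are defined
# elsewhere (not shown); useProcessId is assumed to hold the result of the
# exported useProcessId() check above.
myModelSchema.pre('save', (next) ->
  data = @
  async.parallel
    # Run the model validation
    myModel: (done) ->
      myModelValidator.base(data, done)
    # Rewrite the _id only on "manager" instances with an instance id set
    changeMongooseId: (done) ->
      if useProcessId and instanceType == 'manager'
        processIdConfig.changeMongooseId(data, done)
      else
        done()
  , (err) ->
    # Pass any error on to mongoose, otherwise continue with the save
    return next new Error(err) if err?
    next()
)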