MongoDB balancer timeout with delayed replica

We have a setup of two MongoDB shards. Each shard contains a master, a slave, a slave with a 24h slaveDelay, and an arbiter. However, the balancer fails to migrate any chunks because it waits for the delayed slave to catch up. I have tried setting _secondaryThrottle to false in the balancer config, but I still have the issue.

It seems the migration goes on for a day and then fails (there are a ton of "waiting for slave" messages in the logs). Eventually it gives up and starts a new migration. The message says it is waiting for 3 slaves, but the delayed slave is hidden and has priority 0, so it should not wait for that one. And if _secondaryThrottle worked, it should not wait for any slave, right?

It's been like this for a few months now, so the config should have been reloaded on all mongoses. Some of the mongoses running the balancer have been restarted recently.

Does anyone have any idea how to solve the problem? We did not have these issues before starting the delayed slave, but that is just our theory.

Config:

{ "_id" : "balancer", "_secondaryThrottle" : false, "stopped" : false }

Log from shard1 master process:

[migrateThread] warning: migrate commit waiting for 3 slaves for 'xxx.xxx' { shardkey: ObjectId('4fd2025ae087c37d32039a9e') } -> {shardkey: ObjectId('4fd2035ae087c37f04014a79') } waiting for: 529dc9d9:7a

[migrateThread] Waiting for replication to catch up before entering critical section

Log from shard2 master process:

Tue Dec 3 14:52:25.302 [conn1369472] moveChunk data transfer progress: { active: true, ns: "xxx.xxx", from: "shard2/mongo2:27018,mongob2:27018", min: { shardkey: ObjectId('4fd2025ae087c37d32039a9e') }, max: { shardkey: ObjectId('4fd2035ae087c37f04014a79') }, shardKeyPattern: { shardkey: 1.0 }, state: "catchup", counts: { cloned: 22773, clonedBytes: 36323458, catchup: 0, steady: 0 }, ok: 1.0 } my mem used: 0

Update: I confirmed that removing slaveDelay got the balancer working again. As soon as the formerly delayed slaves caught up, chunks started moving. So the problem seems to be related to slaveDelay. I also confirmed that the balancer runs with "secondaryThrottle" : false. It does seem to wait for slaves anyway.

Shard2:

Tue Dec 10 11:44:25.423 [migrateThread] warning: migrate commit waiting for 3 slaves for 'xxx.xxx' { shardkey: ObjectId('4ff1213ee087c3516b2f703f') } -> { shardkey: ObjectId('4ff12a5eddf2b32dff1e7bea') } waiting for: 52a6f089:81

Tue Dec 10 11:44:26.423 [migrateThread] Waiting for replication to catch up before entering critical section

Tue Dec 10 11:44:27.423 [migrateThread] Waiting for replication to catch up before entering critical section

Tue Dec 10 11:44:28.423 [migrateThread] Waiting for replication to catch up before entering critical section

Tue Dec 10 11:44:29.424 [migrateThread] Waiting for replication to catch up before entering critical section

Tue Dec 10 11:44:30.424 [migrateThread] Waiting for replication to catch up before entering critical section

Tue Dec 10 11:44:31.424 [migrateThread] Waiting for replication to catch up before entering critical section

Tue Dec 10 11:44:31.424 [migrateThread] migrate commit succeeded flushing to secondaries for 'xxx.xxx' { shardkey: ObjectId('4ff1213ee087c3516b2f703f') } -> { shardkey: ObjectId('4ff12a5eddf2b32dff1e7bea') }

Tue Dec 10 11:44:31.425 [migrateThread] migrate commit flushed to journal for 'xxx.xxx' { shardkey: ObjectId('4ff1213ee087c3516b2f703f') } -> { shardkey: ObjectId('4ff12a5eddf2b32dff1e7bea') }

Tue Dec 10 11:44:31.647 [migrateThread] migrate commit succeeded flushing to secondaries for 'xxx.xxx' { shardkey: ObjectId('4ff1213ee087c3516b2f703f') } -> { shardkey: ObjectId('4ff12a5eddf2b32dff1e7bea') }

Tue Dec 10 11:44:31.667 [migrateThread] migrate commit flushed to journal for 'xxx.xxx' { shardkey: ObjectId('4ff1213ee087c3516b2f703f') } -> { shardkey: ObjectId('4ff12a5eddf2b32dff1e7bea') }

Answer 1:

The balancer is properly waiting for the MAJORITY of the replica set of the destination shard to have the documents being migrated before initiating the delete of those documents on the source shard.

The issue is that you have FOUR members in your replica set (master, a slave, a 24h slave delay slave and an arbiter). That means three is the majority. I'm not sure why you added an arbiter, but if you remove it, then TWO will be the majority and the balancer will not have to wait for the delayed slave.
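To make the arithmetic above concrete, here is a minimal sketch in plain JavaScript (it also runs in the mongo shell). It assumes, following this answer's reasoning, that the majority is a strict majority of the members counted in the set; the helper name majorityOf is made up for illustration and is not a MongoDB function.

```javascript
// Smallest number of members that forms a strict majority of `total`.
// Hypothetical helper for illustration, not part of MongoDB.
function majorityOf(total) {
  return Math.floor(total / 2) + 1;
}

// Setup from the question: master, slave, delayed slave, arbiter.
var withArbiter = 4;
// Same set after removing the arbiter.
var withoutArbiter = 3;

// With four members, three acknowledgements are needed. The arbiter holds
// no data, so the third one can only come from the delayed slave.
var neededWithArbiter = majorityOf(withArbiter);       // 3

// With three members, master + regular slave are already enough.
var neededWithoutArbiter = majorityOf(withoutArbiter); // 2
```

Under that assumption, the delayed slave is on the critical path only in the four-member layout.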

The alternative way of achieving the same result is to configure the delayed slave with the votes:0 property and leave the arbiter as the third voting node.
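A rough sketch of that second option, assuming the delayed member is the one with a non-zero slaveDelay in the replica set config. The function name and the sample hosts below are hypothetical; in the mongo shell you would pass it the document returned by rs.conf() and hand the result to rs.reconfig() on the master.

```javascript
// Set votes to 0 on every member that has a slaveDelay configured.
// Modifies the config document in place and returns it.
// Hypothetical helper for illustration, not part of MongoDB.
function makeDelayedMembersNonVoting(cfg) {
  for (var i = 0; i < cfg.members.length; i++) {
    var m = cfg.members[i];
    if (m.slaveDelay && m.slaveDelay > 0) {
      m.votes = 0;
    }
  }
  return cfg;
}

// Sample config shaped like the setup in the question (host names are made up).
var sampleCfg = {
  _id: "shard1",
  members: [
    { _id: 0, host: "hostA:27018" },
    { _id: 1, host: "hostB:27018" },
    { _id: 2, host: "hostC:27018", priority: 0, hidden: true, slaveDelay: 86400 },
    { _id: 3, host: "hostD:27018", arbiterOnly: true }
  ]
};

var updatedCfg = makeDelayedMembersNonVoting(sampleCfg);

// In the mongo shell, connected to the master, the equivalent would be:
//   rs.reconfig(makeDelayedMembersNonVoting(rs.conf()));
```

Only the delayed member changes; the master, the regular slave and the arbiter keep their default vote.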

Answer 2:

What version are you running? There is a known bug in 2.4.2 and below, as well as 2.2.4 and below that causes an incorrect count of the number of secondaries in the set (and hence makes it impossible to satisfy the default w:majority write for the migration). This is the bug (fixed in 2.4.3+ and 2.2.5+):

https://jira.mongodb.org/browse/SERVER-8420

Turning off the secondary throttle should be a valid workaround, but you may want to do a flushRouterConfig on any mongos processes (or just restart all the mongos processes) to make sure the setting is taking effect for your migrations, especially if they are taking a day to time out. As another potential fix prior to upgrade, you can also drop the local.slaves collection (it will be recreated).
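For reference, the throttle setting and the flush mentioned above could be applied from a mongo shell connected to a mongos roughly as follows. This is only a sketch: the wrapper function name is made up, and the database handle is passed in as a parameter so the same code can be tried against a stub. In the shell you would pass the global db.

```javascript
// Hypothetical wrapper for illustration; `db` is the mongo shell database
// handle of a connection to a mongos.
function refreshBalancerSettings(db) {
  // Store _secondaryThrottle: false in config.settings (third argument = upsert).
  db.getSiblingDB("config").settings.update(
    { _id: "balancer" },
    { $set: { _secondaryThrottle: false } },
    true
  );
  // Ask this mongos to reload its cached cluster metadata.
  return db.adminCommand({ flushRouterConfig: 1 });
}

// In the mongo shell, repeated on each mongos:
//   refreshBalancerSettings(db);
```

Dropping local.slaves, the other workaround mentioned above, would be done separately on the shard's master rather than on a mongos.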