We are using Apache Cassandra 3.11.4 .Recently we are seeing overloaded readrepair ERROR messages in the entire cluster because that we are getting timeouts ..I'm not able to find the root cause for this . Appreciate any inputs on this issue ..
ERROR [ReadRepairStage:2537] 2019-07-18 17:08:15,119 CassandraDaemon.java:228 - Exception in thread Thread[ReadRepairStage:2537,5,main] org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 1 responses. at org.apache.cassandra.service.DataResolver$RepairMergeListener.close(DataResolver.java:202) ~[apache-cassandra-3.11.3.jar:3.11.3] at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$2.close(UnfilteredPartitionIterators.java:175) ~[apache-cassandra-3.11.3.jar:3.11.3] at org.apache.cassandra.db.transform.BaseIterator.close(BaseIterator.java:92) ~[apache-cassandra-3.11.3.jar:3.11.3] at org.apache.cassandra.service.DataResolver.compareResponses(DataResolver.java:79) ~[apache-cassandra-3.11.3.jar:3.11.3] at org.apache.cassandra.service.AsyncRepairCallback$1.runMayThrow(AsyncRepairCallback.java:50) ~[apache-cassandra-3.11.3.jar:3.11.3] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-3.11.3.jar:3.11.3] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_212] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_212] at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) ~[apache-cassandra-3.11.3.jar:3.11.3] at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_212]
reduced dclocalreadrepair to 0.0
Timeouts are a common issue while attempting repairs, and without more specifics of the errors, or your configuration, this will be a shot in the dark.
Also note that since Cassandra 3.11.3, the settings
read_repair_chance
anddc_read_repair_chance
were removed, as their names were misleading with the result obtained. Adding them won't have any effect.