timeouts on ReadRepairStage error messages

We are using Apache Cassandra 3.11.4 .Recently we are seeing overloaded readrepair ERROR messages in the entire cluster because that we are getting timeouts ..I'm not able to find the root cause for this . Appreciate any inputs on this issue ..

ERROR [ReadRepairStage:2537] 2019-07-18 17:08:15,119 CassandraDaemon.java:228 - Exception in thread Thread[ReadRepairStage:2537,5,main] org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 1 responses. at org.apache.cassandra.service.DataResolver$RepairMergeListener.close(DataResolver.java:202) ~[apache-cassandra-3.11.3.jar:3.11.3] at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$2.close(UnfilteredPartitionIterators.java:175) ~[apache-cassandra-3.11.3.jar:3.11.3] at org.apache.cassandra.db.transform.BaseIterator.close(BaseIterator.java:92) ~[apache-cassandra-3.11.3.jar:3.11.3] at org.apache.cassandra.service.DataResolver.compareResponses(DataResolver.java:79) ~[apache-cassandra-3.11.3.jar:3.11.3] at org.apache.cassandra.service.AsyncRepairCallback$1.runMayThrow(AsyncRepairCallback.java:50) ~[apache-cassandra-3.11.3.jar:3.11.3] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-3.11.3.jar:3.11.3] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_212] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_212] at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) ~[apache-cassandra-3.11.3.jar:3.11.3] at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_212]

reduced dclocalreadrepair to 0.0

标签： cassandra datastax-enterprise

1条回答

Rolldiameter

2楼-- · 2019-08-17 03:30

Timeouts are a common issue while attempting repairs, and without more specifics of the errors, or your configuration, this will be a shot in the dark.

Repairs depend on disk space, as it will create temporary copies of files, as a rule of thumb the disk utilization should be lower than or equal to 50% to ensure that you'll have enough space.
Repairs can be delayed/aborted if the cluster is stressed, if that is the case, you may need to scale up the cluster to increase the available resources.
You may want to take a look in these other recommendations from Aaron regarding updates of the JVM settings in repairs.

Also note that since Cassandra 3.11.3, the settings read_repair_chance and dc_read_repair_chance were removed, as their names were misleading with the result obtained. Adding them won't have any effect.

0人赞添加讨论(0) 举报

timeouts on ReadRepairStage error messages

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间