Symptom
DEBUG [ReadRepairStage:636754] 2017-05-30 14:49:44,259 ReadCallback.java:234 - Digest mismatch:
org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey(4329955402556695061, 00080000000000000844000008000001579b425c4000) (343b7ef24feb594118ecb4bf7680d07f vs d41d8cd98f00b204e9800998ecf8427e)
at org.apache.cassandra.service.DigestResolver.resolve(DigestResolver.java:85) ~[apache-cassandra-3.0.9.jar:3.0.9]
at org.apache.cassandra.service.ReadCallback$AsyncRepairRunner.run(ReadCallback.java:225) ~[apache-cassandra-3.0.9.jar:3.0.9]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_101]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_101]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
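One detail in the log line is worth checking: the second digest, d41d8cd98f00b204e9800998ecf8427e, is the well-known MD5 of empty input. The short sketch below verifies this with Python's standard hashlib. Reading this as "one replica had no data for the key" is an assumption about how the digest was built (an MD5 that was fed no bytes), not something the log states directly.

```python
import hashlib

# Second digest copied from the DigestMismatchException line in the log.
LOGGED_DIGEST = "d41d8cd98f00b204e9800998ecf8427e"

# MD5 computed over zero bytes of input.
empty_md5 = hashlib.md5(b"").hexdigest()

# True when the logged digest is exactly the MD5 of empty input.
matches_empty = empty_md5 == LOGGED_DIGEST
print(empty_md5, matches_empty)
```

If that reading holds, the mismatch here is between a replica that returned a row and a replica that returned nothing for the same key.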
Analysis and Resolution
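Read repair is governed by two table options: read_repair_chance and dclocal_read_repair_chance. Setting both to 0 does not fully disable read repair. The two options were deprecated as of Cassandra 3.11.3 and removed in 4.0 (CASSANDRA-13910), so probabilistic background read repair no longer runs; repair triggered by a digest mismatch is a separate mechanism. The following exchange explains the issue: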
But sometimes I use QUORUM instead of ONE. Read QUORUM is reached
when two replicas respond, but the two records can be different.
Cassandra returns the one with the most recent timestamp.
Does it try to repair the other one?
Assuming that my RF is 3 and write and read CL are QUORUM.
What I mean by saying that 'wanted CL is reached' is that two replicas
at least respond, this is QUORUM. But nothing proves that the two
records are identical. Cassandra returns the one with the most
recent timestamp. But my question is does it try to repair the
other record?
The answer is YES.
Cassandra will try to repair the other one, even though
read_repair_chance = 0 and dclocal_read_repair_chance = 0
It is called 'digest mismatch'. The only way to avoid read repair
is reading at LOCAL_ONE or ONE where no digest mismatch can occur.
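The behavior described in this answer can be sketched as a toy model. This is not Cassandra's code: Cell, digest and quorum_read are hypothetical names, and the digest format is invented for illustration. The sketch only shows why the mismatch path does not depend on read_repair_chance: the coordinator compares a full response with a digest from a second replica, and any difference forces reconciliation by timestamp plus a write-back.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Cell:
    value: str
    timestamp: int  # write timestamp; the higher one wins


def digest(cell: Optional[Cell]) -> str:
    """MD5 over the cell content; a missing cell hashes zero bytes."""
    h = hashlib.md5()
    if cell is not None:
        h.update(f"{cell.value}@{cell.timestamp}".encode())
    return h.hexdigest()


def quorum_read(
    replicas: Dict[str, Optional[Cell]], contacted: Tuple[str, str]
) -> Tuple[Optional[Cell], bool]:
    """Toy coordinator: full data from the first replica, digest from the second.

    Returns (result, repaired). No 'chance' setting is consulted anywhere.
    """
    data_node, digest_node = contacted
    data = replicas[data_node]
    if digest(data) == digest(replicas[digest_node]):
        return data, False  # digests agree, nothing to repair

    # Digest mismatch: compare the full cells and keep the newest one.
    candidates = [c for c in (data, replicas[digest_node]) if c is not None]
    winner = max(candidates, key=lambda c: c.timestamp)

    # Write the winner back to the contacted replicas (the "repair").
    for node in contacted:
        replicas[node] = winner
    return winner, True


# RF = 3, replica "b" missed the write; the read contacts "a" and "b".
replicas = {"a": Cell("v2", 200), "b": None, "c": Cell("v2", 200)}
result, repaired = quorum_read(replicas, ("a", "b"))
print(result, repaired, replicas["b"])
```

With a single contacted replica (ONE or LOCAL_ONE) there is no second response to compare against, which is why no digest mismatch can occur at those consistency levels.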
See https://issues.apache.org/jira/browse/cassandra-13910,
https://issues.apache.org/jira/browse/cassandra-11409,
https://issues.apache.org/jira/browse/cassandra-13863,
http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html