Why is my dormant Cassandra DC seeing reads?

Steven Lacerda
Oct 27, 2020

I have seen an issue where a DC that is not supposed to be receiving traffic is still showing reads. Why is that?

Writes are expected, of course, because replication sends them to every DC listed in the keyspace’s replication settings. Reads, however, should be routed according to the driver’s configuration. So there are a couple of likely causes. One is that the driver is configured incorrectly: you may be using something like a round-robin load balancing policy rather than a DC-aware round-robin load balancing policy. Another is that the contact points list includes a node in the dormant DC.
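
To confirm that the dormant DC really is part of replication, and so should be receiving writes, you can look up the keyspace’s replication settings. This is a minimal sketch, assuming Cassandra 3.0 or later (where schema metadata lives in system_schema) and a keyspace named demo, as in the example table in this post:

-- Shows the replication class and the per-DC replication factors for the keyspace.
SELECT keyspace_name, replication
FROM system_schema.keyspaces
WHERE keyspace_name = 'demo';

If the dormant DC appears in the replication map, write traffic to it is normal.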

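However, in our case it was neither of the above. What we found was that the reads were being generated by the table’s default speculative retry setting of 99PERCENTILE. With that setting, the coordinator sends an extra read whenever a request takes longer than the table’s 99th percentile read latency, so speculative retry kicks in for roughly 1% of queries. Here’s an example table: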

CREATE TABLE demo.users (
    id text,
    isactive boolean,
    isprivateactive boolean,
    lastmodified timestamp,
    primarysites list<text>,
    privateid text,
    publics map<text, boolean>,
    secondarysites list<text>,
    PRIMARY KEY (id)
) WITH read_repair_chance = 0.0
    AND dclocal_read_repair_chance = 0.0
    AND gc_grace_seconds = 864000
    AND bloom_filter_fp_chance = 0.1
    AND caching = { 'keys' : 'ALL', 'rows_per_partition' : 'ALL' }
    AND comment = ''
    AND compaction = { 'class' : 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy', 'max_threshold' : 32, 'min_threshold' : 4 }
    AND compression = { 'chunk_length_in_kb' : 64, 'class' : 'org.apache.cassandra.io.compress.LZ4Compressor' }
    AND default_time_to_live = 0
    AND speculative_retry = '99PERCENTILE'
    AND min_index_interval = 128
    AND max_index_interval = 2048
    AND crc_check_chance = 1.0
    AND cdc = false
    AND memtable_flush_period_in_ms = 0;

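Notice the speculative_retry setting. Each time it triggers, the coordinator sends one extra read request to another replica, and that extra read is not DC-specific, so it can land on a node in the dormant DC.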
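
If you want to check the setting on an existing table without reading through the full schema, one option is to query the schema tables directly. This is a sketch that assumes Cassandra 3.0 or later, where table options are exposed in system_schema.tables:

-- Returns the current speculative_retry value for demo.users.
SELECT keyspace_name, table_name, speculative_retry
FROM system_schema.tables
WHERE keyspace_name = 'demo' AND table_name = 'users';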

The setting can be changed if you know what you’re doing. For example, you could tie it to a fixed latency threshold that suits your queries rather than to a percentile. Here are the options:

  • ALWAYS: Send extra read requests to all other replicas after every read.
  • Xpercentile: Cassandra constantly tracks each table's read latency (in milliseconds). If you set speculative retry to Xpercentile, Cassandra sends a redundant read request if the coordinator has not received a response within the table's Xth percentile read latency.
  • Nms: Send extra read requests to all other replicas if the coordinator node has not received any responses within N milliseconds.
  • NONE: Do not send extra read requests after any read.
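
As a sketch of how one of these options might be applied to the example table, here are three alternative statements. The 50ms threshold and the 95PERCENTILE value are illustrative assumptions, not recommendations, so pick values based on your own latency numbers and run only the statement you want:

-- Option 1: turn speculative retry off for this table.
ALTER TABLE demo.users WITH speculative_retry = 'NONE';

-- Option 2: retry only after a fixed threshold (50ms is an assumed example value).
ALTER TABLE demo.users WITH speculative_retry = '50ms';

-- Option 3: use a different percentile (95 is an assumed example value).
ALTER TABLE demo.users WITH speculative_retry = '95PERCENTILE';

Turning the setting off removes the extra reads entirely, but it also removes the latency protection that speculative retry provides when a replica is slow.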

I hope that helps.

By Steve Lacerda


Steve Lacerda is a software engineer specializing in web development. His favorite 80’s song is Let’s Put the X in Sex by Kiss.