Why Cassandra COUNT(*) on a specific partition tak

2019-07-28 02:24发布

I have a table defined like:

Keyspace:

CREATE KEYSPACE messages WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;

Table:

CREATE TABLE messages.textmessages (
    categoryid int,
    date timestamp,
    messageid timeuuid,
    message text,
    userid int,
    PRIMARY KEY ((categoryid, date), messageid)
) WITH CLUSTERING ORDER BY (messageid ASC);

The goal is to have a wide row time-series storage such that categoryid and date(beginning of day) constitutes my partition key and the messageid provides the clustering. This enables me to do queries like:

SELECT * FROM messages.textmessages WHERE categoryid=2 AND date='2019-05-14 00:00:00.000+0300' AND messageId > maxTimeuuid('2019-05-14 00:00:00.000+0300') AND messageId < minTimeuuid('2019-05-15 00:00:00.000+0300')

to get messages in a given day; it works so well so fast!

Problem

I need to be able to count the messages in a given day by substituting SELECT * above with SELECT COUNT(*). This takes very long even with a little less than 100K entries in the column family; it actually times out on cqlsh.

I have read and understood quite a bit why COUNT is an expensive operation for a distributed database like Cassandra in Counting keys? Might as well be counting stars

Question

Why would this query take so long even when:

SELECT COUNT(*) FROM messages.textmessages WHERE categoryid=2 AND date='2019-05-14 00:00:00.000+0300' AND messageId > maxTimeuuid('2019-05-14 00:00:00.000+0300') AND messageId < minTimeuuid('2019-05-15 00:00:00.000+0300')
  1. The count is on a specific partition with less than 100K records
  2. I have only one Cassandra node on a performant Macbook Pro
  3. No active writes/reads in the instance; less than 20 partitions on development laptop

1条回答
我想做一个坏孩纸
2楼-- · 2019-07-28 03:03

This is understandably caused by a common pitfall when the concept of 'everything-is-a-write' in Cassandra is overlooked and thence why tombstones happen.

When executing a scan, within or across a partition, we need to keep the tombstones seen in memory so we can return them to the coordinator, which will use them to make sure other replicas also know about the deleted rows. With workloads that generate a lot of tombstones, this can cause performance problems and even exhaust the server heap.

Thanks to @JimWartnick's suggestion on possible tombstone related latency; this was casued by overwhelming amount of tombstones generated by my inserts that had NULL fields. I did not expect this to cause tombstones, neither did I expect tombstones to be a big deal in query performance; especially the COUNT.

Solution

  1. Use default unset values in the fields when not present or omit them altogether in the inserts/updates
  2. Be cognisant of the below facts as outlined by Common Problems with Cassandra Tombstones - Alla Babkina

One common misconception is that tombstones only appear when the client issues DELETE statements to Cassandra. Some developers assume that it is safe to choose a way of operations which relies on Cassandra being completely tombstone free. In reality there are other many other things causing tombstones apart from issuing DELETE statements. Inserting null values, inserting collections and expiring data using TTL are common sources of tombstones.

查看更多
登录 后发表回答