Leveled Compaction Strategy with low disk space

Posted 2019-06-12 15:01

We have Cassandra 1.1.1 servers with Leveled Compaction Strategy.

The workload consists of read and delete operations. Every half a year we delete approximately half of the data while new data comes in. Sometimes disk usage climbs to 75% even though we know the real data takes about 40-50%; the rest of the space is occupied by tombstones. To avoid running out of disk space, we force compaction of our tables by dropping all SSTables to Level 0: we remove the .json manifest file and restart the Cassandra node. (The gc_grace option does not help, since compaction starts only after a level is filled.)

Starting with Cassandra 2.0, the manifest information was moved into the SSTable files themselves: https://issues.apache.org/jira/browse/CASSANDRA-4872

We are considering migrating to Cassandra 2.x, but we are afraid we will no longer be able to force leveled compaction this way.

My question is: how can we give a table a disk space limit, e.g. 150 GB, so that exceeding the limit triggers compaction automatically? The question is mostly about Cassandra 2.x, but alternative solutions for Cassandra 1.1.1 are also welcome.

1 answer
SAY GOODBYE
#2 · 2019-06-12 15:42

It seems like I've found the answers myself.

  • Starting with version 2.x there is a tool, sstablelevelreset, which performs a level reset similar to deleting the manifest file. The tool is located in the tools directory of the Cassandra distribution, e.g. apache-cassandra-2.1.2/tools/bin/sstablelevelreset.

  • Starting with Cassandra 1.2 (https://issues.apache.org/jira/browse/CASSANDRA-4234), Leveled Compaction Strategy supports tombstone removal through the tombstone_threshold option, which sets the maximum ratio of tombstones allowed in an SSTable before it becomes a candidate for compaction.
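
The two options above could be scripted roughly as follows. This is only a sketch: the keyspace and table names are placeholders, the tool path is assumed to be relative to the Cassandra distribution directory, and the 0.1 ratio is an arbitrary illustration (the default tombstone_threshold is 0.2). As far as I know, sstablelevelreset must be run while the node is stopped and requires the --really-reset flag as confirmation.

```python
def level_reset_command(keyspace, table, tool="tools/bin/sstablelevelreset"):
    """Build the argv for an offline level reset (run only while the node is stopped)."""
    # --really-reset is the confirmation flag the tool asks for.
    return [tool, "--really-reset", keyspace, table]


def tombstone_threshold_cql(keyspace, table, threshold):
    """Build a CQL3 statement that sets tombstone_threshold on an LCS table."""
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold is a ratio and must be in (0, 1]")
    return (
        "ALTER TABLE {ks}.{tbl} WITH compaction = "
        "{{'class': 'LeveledCompactionStrategy', "
        "'tombstone_threshold': {thr}}};"
    ).format(ks=keyspace, tbl=table, thr=threshold)


if __name__ == "__main__":
    # Placeholder names; substitute your own keyspace and table.
    print(" ".join(level_reset_command("my_keyspace", "my_table")))
    print(tombstone_threshold_cql("my_keyspace", "my_table", 0.1))
```

The first printed line is meant to be run from a shell on the stopped node; the second is a statement to paste into cqlsh once the node is back up.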
