How are Cassandra's 0.7 Secondary Indexes stored?

Posted 2019-06-28 03:03

Question:

We have been using Cassandra 0.6 and now have Column Families with millions of keys. We are interested in using the new Secondary Index feature available in 0.7, but couldn't find any documentation on how the new index is stored.

Is there any disk-space limitation, or is the index stored similarly to keys, in that it's spread over multiple nodes?

I've tried combing through the Cassandra site for an answer but to no avail.

Answer 1:

Secondary indexes are stored as Column Families that are not accessible by the user. Their size will roughly be:

(cardinality of the set of indexed values * the avg size of the index values) + (the number of keys in the indexed column family * the avg size of keys in the column family).
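To get a feel for the numbers, here is a minimal Python sketch of the estimate above. The function name, parameter names, and sample figures are made up for illustration and are not part of Cassandra; the result ignores replication and any on-disk overhead, which the formula does not account for.

```python
def estimate_index_size_bytes(distinct_values, avg_value_size,
                              num_keys, avg_key_size):
    """Rough size of one secondary index, following the formula above.

    All sizes are in bytes. This is a back-of-the-envelope estimate only.
    """
    # Cardinality of the indexed values * average size of an indexed value
    values_part = distinct_values * avg_value_size
    # Number of keys in the indexed column family * average key size
    keys_part = num_keys * avg_key_size
    return values_part + keys_part


# Hypothetical workload: 10 million rows with 16-byte keys, and an
# indexed column holding 1,000 distinct values of about 8 bytes each.
size = estimate_index_size_bytes(1_000, 8, 10_000_000, 16)
print(size)  # 160008000 bytes, roughly 160 MB
```

With numbers like these the second term dominates, so the index size tracks the number of keys far more than the number of distinct values.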

Nodes only index rows that are stored locally -- that is, only rows for which they are a replica.