What is the difference between a secondary index a

Posted 2019-03-16 07:49

When I read about these two, I thought both described the same approach. I googled but found nothing. Is the difference in the implementation: Cassandra maintains the secondary index itself, whereas an inverted index has to be implemented by me?

Which is faster for searching, by the way?

Tags: search indexing cassandra inverted-index

1 answer

做个烂人

#2 · 2019-03-16 08:29

The main difference is that secondary indexes in Cassandra are not distributed in the same way a manual inverted index would be. With the inbuilt secondary indexes, each node indexes the data it stores locally (using the LocalPartitioner). With manual indexing, the indexes are distributed independently of the nodes that store the values.

This means that, for the inbuilt indexes, each query must go to each node, whereas if you did inverted indexing manually you would go to just one node (plus replicas) to query the value you were looking up. One advantage of having the index stored locally is that indexes can be updated atomically with the data. (Since Cassandra 1.2, atomic batches can be used for this instead, although they are a bit slower.)
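To make the fan-out difference concrete, here is a minimal toy model in plain Python. It is not Cassandra code: the `Node` class, the `owner` hash partitioner, the `city` column, and the four-node cluster are all assumptions made up for illustration, and replicas are ignored. It only shows where each kind of index entry ends up and how many nodes a lookup has to contact.

```python
from collections import defaultdict
import hashlib

NUM_NODES = 4  # hypothetical cluster size


def owner(key: str) -> int:
    """Toy partitioner: hash a key to a node id (stands in for the token ring)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % NUM_NODES


class Node:
    def __init__(self):
        self.rows = {}                       # row_key -> {column: value}
        self.local_index = defaultdict(set)  # value -> row keys stored on THIS node
        self.inverted = defaultdict(set)     # manual index partitions owned by this node


nodes = [Node() for _ in range(NUM_NODES)]


def write(row_key: str, city: str) -> None:
    # Inbuilt-style index: the entry lives on the same node as the row.
    n = nodes[owner(row_key)]
    n.rows[row_key] = {"city": city}
    n.local_index[city].add(row_key)
    # Manual inverted index: partitioned by the indexed value instead,
    # so it may land on a different node than the row itself.
    nodes[owner(city)].inverted[city].add(row_key)


def query_builtin(city: str):
    """Ask every node for its local matches; returns (row keys, nodes contacted)."""
    result, contacted = set(), 0
    for n in nodes:
        contacted += 1
        result |= n.local_index.get(city, set())
    return result, contacted


def query_manual(city: str):
    """Ask only the node that owns the index partition for this value."""
    return set(nodes[owner(city)].inverted.get(city, set())), 1


if __name__ == "__main__":
    for i in range(10):
        write(f"user-{i}", "Paris" if i % 3 == 0 else "Rome")
    print(query_builtin("Paris"))
    print(query_manual("Paris"))
```

Both queries return the same row keys; the difference is that `query_builtin` contacts every node while `query_manual` contacts one. The manual variant pays for this at write time, since each write touches a second partition that may live on another node.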

This is why Cassandra indexes are not recommended for very high-cardinality data. If you are doing a lookup on each node but there are only one or two results, it is inefficient, and a manual inverted index will be better. If your lookup returns many results, then you will need to look up on each node anyway, so the inbuilt indexes work well.

A further advantage of using Cassandra's inbuilt indexing is that the indexes are updated lazily, so you don't need to do a read on every update. (See CASSANDRA-2897.) This can be a significant speed improvement for indexed tables with high write throughput.
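As a rough illustration of what "updated lazily" can mean, the sketch below models an index that is written without first reading the old value, and that filters out stale entries when it is read. This is a simplified toy model under my own assumptions (the `rows` and `index` dictionaries and the `update` and `lookup` functions are hypothetical), not how Cassandra implements it internally.

```python
from collections import defaultdict

rows = {}                 # row_key -> current indexed value
index = defaultdict(set)  # indexed value -> candidate row keys (may hold stale entries)


def update(row_key: str, value: str) -> None:
    # No read of the previous value: write the row and the new index entry only.
    # The old index entry, if any, is left behind as a stale candidate.
    rows[row_key] = value
    index[value].add(row_key)


def lookup(value: str) -> set:
    hits = set()
    for row_key in list(index.get(value, ())):
        if rows.get(row_key) == value:
            hits.add(row_key)              # entry still matches the row
        else:
            index[value].discard(row_key)  # stale entry: drop it lazily
    return hits
```

The write path does no read at all, which is where the throughput gain comes from; the cost is moved to the read path, which has to check each candidate against the current row.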
