Cassandra: sorting problem, ordering is wrong

Posted 2020-04-21 07:39

I have a question about Cassandra. In the table "entities_by_time", sorting by column1 works as long as the uuids are 18 digits long, but the ordering goes wrong once the uuids grow to 19 digits. Please help me.

cqlsh:minds> select * from entities_by_time where key='activity:user:990192934408163330' order by column1 desc limit 10;
 key                              | column1            | value
----------------------------------+--------------------+--------------------
 activity:user:990192934408163330 | 999979571363188746 | 999979571363188746
 activity:user:990192934408163330 | 999979567064027139 | 999979567064027139
 activity:user:990192934408163330 | 999979562764865555 | 999979562764865555
 activity:user:990192934408163330 | 999979558465703953 | 999979558465703953
 activity:user:990192934408163330 | 999979554170736649 | 999979554170736649
 activity:user:990192934408163330 | 999979549871575047 | 999979549871575047
 activity:user:990192934408163330 | 999979545576607752 | 999979545576607752
 activity:user:990192934408163330 | 999979541290029073 | 999979541290029073
 activity:user:990192934408163330 | 999979536990867461 | 999979536990867461
 activity:user:990192934408163330 | 999979532700094475 | 999979532700094475

cqlsh:minds> select * from entities_by_time where key='activity:user:990192934408163330' order by column1 asc limit 10;

 key                              | column1             | value
----------------------------------+---------------------+---------------------
 activity:user:990192934408163330 | 1000054880351555598 | 1000054880351555598
 activity:user:990192934408163330 | 1000054884671688706 | 1000054884671688706
 activity:user:990192934408163330 | 1000054888966656017 | 1000054888966656017
 activity:user:990192934408163330 | 1000054893257429005 | 1000054893257429005
 activity:user:990192934408163330 | 1000054897552396308 | 1000054897552396308
 activity:user:990192934408163330 | 1000054901843169290 | 1000054901843169290
 activity:user:990192934408163330 | 1000054906138136577 | 1000054906138136577
 activity:user:990192934408163330 | 1000054910433103883 | 1000054910433103883
 activity:user:990192934408163330 | 1000054914723876869 | 1000054914723876869
 activity:user:990192934408163330 | 1000054919010455568 | 1000054919010455568


CREATE TABLE minds.entities_by_time (
    key text,
    column1 text,
    value text,
    PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE
    AND CLUSTERING ORDER BY (column1 ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'enabled': 'false'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.1
    AND speculative_retry = '99PERCENTILE';

Querying the table shows that, in Cassandra, 1007227353832624141 is less than 963426376394739730. Why?

Tags: cassandra cql
1 answer
够拽才男人
#2 · 2020-04-21 07:46

Good call Chris! The table definition tells it all! I recreated your table and ran queries sorting in both directions:

flynn@cqlsh:stackoverflow> SELECT * FROM entities_by_time
     WHERE key='activity:user:990192934408163330'  ORDER BY column1 DESC;

 key                              | column1             | value
----------------------------------+---------------------+---------------------
 activity:user:990192934408163330 |  999979571363188746 |  999979571363188746
 activity:user:990192934408163330 |  999979567064027139 |  999979567064027139
 activity:user:990192934408163330 |  963426376394739730 |  963426376394739730
 activity:user:990192934408163330 | 1007227353832624141 | 1007227353832624141
 activity:user:990192934408163330 | 1000054884671688706 | 1000054884671688706
 activity:user:990192934408163330 | 1000054880351555598 | 1000054880351555598

(6 rows)

flynn@cqlsh:stackoverflow> SELECT * FROM entities_by_time
     WHERE key='activity:user:990192934408163330'  ORDER BY column1 ASC;

 key                              | column1             | value
----------------------------------+---------------------+---------------------
 activity:user:990192934408163330 | 1000054880351555598 | 1000054880351555598
 activity:user:990192934408163330 | 1000054884671688706 | 1000054884671688706
 activity:user:990192934408163330 | 1007227353832624141 | 1007227353832624141
 activity:user:990192934408163330 |  963426376394739730 |  963426376394739730
 activity:user:990192934408163330 |  999979567064027139 |  999979567064027139
 activity:user:990192934408163330 |  999979571363188746 |  999979571363188746

(6 rows)

So to your question...

in Cassandra, 1007227353832624141 is less than 963426376394739730. Why?

Simply put, because 9 > 1, that's why.

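Your table definition clusters on column1, which is a TEXT/UTF8 string and not a numeric. Essentially, Cassandra is sorting strings the only way it knows how: in ASCII-betical order, comparing character by character from the left, which is not numeric order. Every 19-digit value here starts with "1" and every 18-digit value starts with "9", so all of the 19-digit values sort ahead of the 18-digit ones.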
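To see the same effect outside Cassandra, here is a minimal sketch in Python (used only for illustration, it is not part of the original setup). It sorts the six IDs from the query output above, first as text and then as integers. The variable names are my own.

```python
# IDs copied from the query output above.
ids = [
    "999979571363188746",
    "999979567064027139",
    "963426376394739730",
    "1007227353832624141",
    "1000054884671688706",
    "1000054880351555598",
]

# Text comparison works character by character from the left, so any
# string starting with "1" sorts before any string starting with "9",
# no matter how many digits follow.
as_text = sorted(ids)

# Numeric comparison looks at the value, so the 18-digit IDs come first.
as_numbers = sorted(ids, key=int)

print("sorted as text:   ", as_text)
print("sorted as numbers:", as_numbers)
```

The text-sorted list comes out in the same order as the ORDER BY column1 ASC result, with the three 19-digit IDs first. The numeric sort puts 963426376394739730 first and 1007227353832624141 last.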

Store your numerics as numerics, and sorting will behave in ways that are more predictable.
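One candidate numeric type, assuming the IDs are always whole numbers, is CQL's bigint, a 64-bit signed integer. The following is a rough Python sketch (again only for illustration; fits_in_bigint is a hypothetical helper of mine, not a Cassandra API) that checks whether an ID stored as text would fit in that range.

```python
# Range of a 64-bit signed integer, which is what CQL's bigint holds.
BIGINT_MIN = -(2**63)
BIGINT_MAX = 2**63 - 1


def fits_in_bigint(text_id: str) -> bool:
    """Return True if the decimal string fits in a 64-bit signed integer."""
    return BIGINT_MIN <= int(text_id) <= BIGINT_MAX


# Smallest and largest IDs from the example above.
sample_ids = ["963426376394739730", "1007227353832624141"]
print(all(fits_in_bigint(i) for i in sample_ids))
```

Both sample IDs fit, so a bigint clustering column looks workable for values of this size. IDs that could ever exceed 9223372036854775807 would need a wider type such as varint.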
