DELETE rows stored on a particular node

Posted 2019-09-17 08:46

Question:

How can I write a CQL 3 DELETE row specification (WHERE clause) that will select only rows that are stored on a given node? If that is not possible, is there a SELECT relation (WHERE clause) that will indicate which rows are stored on a particular node?

I want to do this so I can have a housekeeping daemon (in Java) running on each data-store node, which deletes old records from that node, so it can ensure that its node does not run out of disk space. As I am writing a daemon, rather than performing a one-off cleanup, it is not appropriate to use the nodetool program to query for the token ranges stored on a node.

Answer 1:

Here's one way that might work (but see below for a better idea). If you don't have vnodes enabled, you could identify the node's token ranges (from the nodetool ring command), then use them as part of your delete command. For example:

delete from MyTable where
    token(MyPK) > Token1 and
    token(MyPK) <= Token2 and
    (your delete logic here)
;
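
Since the daemon is in Java, here is a minimal sketch of how it might turn one of those token ranges into CQL that lists the partition keys in the range, so they can then be deleted by key. It assumes the default Murmur3Partitioner (tokens are signed 64-bit values) and that a range is start-exclusive and end-inclusive. The class and method names are hypothetical, and MyTable and MyPK are the placeholders used above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: builds CQL SELECTs that list the partition keys in one token range. */
final class TokenRangeQueries {

    private TokenRangeQueries() {}

    /**
     * Treats the range as (start, end]: start-exclusive, end-inclusive.
     * A range with start >= end wraps past the end of the ring, so it is
     * split into two queries, because one pair of bounds cannot express it.
     */
    static List<String> selectKeysInRange(String table, String pk, long start, long end) {
        List<String> queries = new ArrayList<>();
        String prefix = "SELECT " + pk + " FROM " + table + " WHERE ";
        if (start < end) {
            queries.add(prefix + "token(" + pk + ") > " + start
                    + " AND token(" + pk + ") <= " + end);
        } else {
            // Wrapping range: everything above start, plus everything up to end.
            queries.add(prefix + "token(" + pk + ") > " + start);
            queries.add(prefix + "token(" + pk + ") <= " + end);
        }
        return queries;
    }

    public static void main(String[] args) {
        // Example ranges only; real values would come from the cluster's ring.
        selectKeysInRange("MyTable", "MyPK", 100L, 200L).forEach(System.out::println);
        selectKeysInRange("MyTable", "MyPK", 200L, -50L).forEach(System.out::println);
    }
}
```

The wrapping case matters because the node that owns the lowest token on the ring also owns the range that runs from the highest token around to it.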

However, a much simpler and safer method is to let Cassandra work out where the data is, and issue the delete from any node:

delete from MyTable where 
    (your delete logic here)
;


Answer 2:

nodetool getendpoints tells you which nodes hold the replicas of a given partition key: http://www.datastax.com/documentation/cassandra/2.1/cassandra/tools/toolsGetEndPoints.html?scroll=toolsGetEndPoints__toolsGetEndPtEx