So I'm on a quest to build a lua script that uses SCAN to find keys based on a pattern and Delete them (atomically). I first prepared the following script
local keys = {};
local done = false;
local cursor = "0"
repeat
local result = redis.call("SCAN", cursor, "match", ARGV[1], "count", ARGV[2])
cursor = result[1];
keys = result[2];
for i, key in ipairs(keys) do
redis.call("DEL", key);
end
if cursor == "0" then
done = true;
end
until done
return true;
Which would spit back the following "Err: @user_script: 9: Write commands not allowed after non deterministic commands " So I thought about it a bit and came up with the following script:
local all_keys = {};
local keys = {};
local done = false;
local cursor = "0"
repeat
local result = redis.call("SCAN", cursor, "match", ARGV[1], "count", ARGV[2])
cursor = result[1];
keys = result[2];
for i, key in ipairs(keys) do
all_keys[#all_keys+1] = key;
end
if cursor == "0" then
done = true;
end
until done
for i, key in ipairs(all_keys) do
redis.call("DEL", key);
end
return true;
which still returns the same error (@user_script: 17: Write commands not allowed after non deterministic commands). This has me stumped. Is there any way to circumvent this issue?
script was run using phpredis and the following
$args_arr = [
0 => 'test*', //pattern
1 => 100, //count for SCAN
];
var_dump($redis->eval($script, $args_arr, 0));
UPDATE: the below applies to Redis versions up to 3.2. From that version, effect-based replication lifts the ban on non-determinism so all bets are off (or rather, on).
You can't (and shouldn't) mix the
SCAN
family of commands with any write command in a script because the former's reply is dependent on the internal Redis data structures that, in turn, are unique to the server process. Put differently, two Redis processes (e.g. master and slave) are not guaranteed to return the same replies (so in Redis replication context [which isn't operation- but statement-based] that would break it).Redis tries to protect itself against such cases by blocking any write command (such as
DEL
) if it is executed after a random command (e.g.SCAN
but alsoTIME
,SRANDMEMBER
and similar). I'm sure there are ways to get around that, but would you want to do that? Remember, you'll be going into unknown territory where the system's behavior is not defined.Instead, accept the fact that you shouldn't mix random reads and writes and try to think of a different approach for solving your problem, namely deleting a bunch of keys according to a pattern in an atomic way.
First ask yourself if you can relax any of the requirements. Does it have to be atomic? Atomicity means that Redis will be blocked for the duration of the deletion (regardless the final implementation) and that the length of the operation depends on the size of the job (i.e. number of keys that are deleted and their contents [deleting a large set is more expensive than deleting a short string for example]).
If atomicity isn't a must, periodically/lazily
SCAN
and delete in small batches. If it is a must, understand that you're basically trying to emulate the evilKEYS
command :) But you can do better if you have prior knowledge of the pattern.Assuming the pattern is known during runtime of your application, you can collect the relevant keys (e.g. in a Set) and then use that collection to actualize the delete in an atomic and replication-safe manner that's more efficient compared to going over the entire keyspace.
However, the most "difficult" problem is if you need to run ad-hoc pattern matching while ensuring atomicity. If so, the problem boils down to obtaining a filtered-by-pattern snapshot of the keyspace immediately followed by a succession of deletes (re-emphasizing: while the database is blocked). In that case you can very well use
KEYS
within your Lua script and hope for the best... (but knowing full well that you may resort toSHUTDOWN NOSAVE
quite quickly :P).The Last Optimization is to index the keyspace itself. Both
SCAN
andKEYS
are basically full table scans, so what if we were to index that table? Imagine keeping an index on keys' names that can be queried during a transaction - you can probably use a Sorted Set and lexicographical ranges (HT @TwBert) to do away with most of the pattern matching needs. But at a significant cost... not only will you be doing double bookkeeping (storing each key's name costs in RAM and CPU), you'd be forced to add complexity to your application. Why adding complexity? Because to implement such an index you'd have to maintain it yourself in the application layer (and possibly all your other Lua scripts), carefully wrapping each write operation to Redis in a transaction that also updates the index.Assuming you did all that (and taking into account the obvious pitfalls like the added complexity's potential for bugs, at-least doubled write load on Redis, RAM & CPU, restrictions on scaling and so forth...) you can pat yourself on the shoulder and congratulate yourself for using Redis in a way that it wasn't designed for. While upcoming versions of Redis may (or may not) include better solutions for this challenge (@TwBert - want to do a joint RCP/contrib and again hack Redis a little?), before trying this I really urge you to rethink the original requirements and verify that you're using Redis correctly (i.e. designing your "schema" according to your data access needs).