I have database items that, in addition to their primary key, need an index unique to the group in which the items belong. Let's call the property nbr
, and the property that groups items together and defines the scope of unique nbr
:s we'll call group
. This nbr
must be in the [1-N] range, and may be set when items are imported from an external source. Since all items must have a nbr
, the task then becomes how to track which values are used, to enable picking a free nbr
for new items that are added manually.
I'm using DynamoDB and Redis. I cannot have a DynamoDB index on nbr
. The idea I have so far is to use Redis to track which numbers have been used for a particular group, so that for a Redis key such as <MYGROUP>-item-nbrs
I could store all the used nbr
:s and implement logic to find the next free nbr
. Holes in the range of used nbr
is acceptable, but holes should be filled before considering the numbers exhausted.
Essentially I want to find unused indices of a sparse array of max size N.
What would be a good structure for storing this information in Redis to quickly find a free nbr
? My ideas so far include:
A single comma-separated string of all used nbr's in sorted order? To find a free nbr, a
GET
command is issued and the string is parsed until a hole is found or the end of the list, the picked number is inserted into the string and then the entire string is replaced. When N is large, this seems very inefficient.A hash where each used
nbr
is stored as its own field, and using e.g.HSCAN
to iterate through the fields of the hash to find a freenbr
. When N is large, the HSCAN must scan a lot of fields.Partitioning my
nbr
:s into fields called say p1-20, p21-40, p41-60, ..., each containing a sorted set of the usednbr
:s within that partition only, and when a partition is exhausted (no more freenbr
:s), remove it completely to speed up further iteration. Use HSCAN to iterate, and HSET to start a new partition.Storing all free
nbr
instead of all used, and using sorted sets and ZPOPMIN or regular lists and LPOP, possibly partitioned into sub-sets. Pre-populating Redis with all freenbr
1-N seems ugly though.
Let's say N is in the magnitude of 65536.
Are any of the solutions above better/worse, for performance or other reasons? Is there a better/smarter way, maybe utilizing some clever aspect of Redis that I'm unaware of?
Edit:
Kevin's answer resulted in the following solution (pseudo code):
function getFreeNbr() {
while (true) {
send "WATCH numbers"
nbr = send "BITPOS numbers 0"
if nbr < N
send "MULTI"
send "SETBIT numbers $nbr 1"
if send "EXEC" != NULL
return nbr
end if
else
send "UNWATCH numbers"
return -1
end if
}
}