Use Redis to generate unique ID from a limited ran

2019-05-22 02:40发布

问题:

I have database items that, in addition to their primary key, need an index unique to the group in which the items belong. Let's call the property nbr, and the property that groups items together and defines the scope of unique nbr:s we'll call group. This nbr must be in the [1-N] range, and may be set when items are imported from an external source. Since all items must have a nbr, the task then becomes how to track which values are used, to enable picking a free nbr for new items that are added manually.

I'm using DynamoDB and Redis. I cannot have a DynamoDB index on nbr. The idea I have so far is to use Redis to track which numbers have been used for a particular group, so that for a Redis key such as <MYGROUP>-item-nbrs I could store all the used nbr:s and implement logic to find the next free nbr. Holes in the range of used nbr is acceptable, but holes should be filled before considering the numbers exhausted.

Essentially I want to find unused indices of a sparse array of max size N.

What would be a good structure for storing this information in Redis to quickly find a free nbr? My ideas so far include:

  • A single comma-separated string of all used nbr's in sorted order? To find a free nbr, a GET command is issued and the string is parsed until a hole is found or the end of the list, the picked number is inserted into the string and then the entire string is replaced. When N is large, this seems very inefficient.

  • A hash where each used nbr is stored as its own field, and using e.g. HSCAN to iterate through the fields of the hash to find a free nbr. When N is large, the HSCAN must scan a lot of fields.

  • Partitioning my nbr:s into fields called say p1-20, p21-40, p41-60, ..., each containing a sorted set of the used nbr:s within that partition only, and when a partition is exhausted (no more free nbr:s), remove it completely to speed up further iteration. Use HSCAN to iterate, and HSET to start a new partition.

  • Storing all free nbr instead of all used, and using sorted sets and ZPOPMIN or regular lists and LPOP, possibly partitioned into sub-sets. Pre-populating Redis with all free nbr 1-N seems ugly though.

Let's say N is in the magnitude of 65536.

Are any of the solutions above better/worse, for performance or other reasons? Is there a better/smarter way, maybe utilizing some clever aspect of Redis that I'm unaware of?

Edit:

Kevin's answer resulted in the following solution (pseudo code):

function getFreeNbr() {
  while (true) {
    send "WATCH numbers"
    nbr = send "BITPOS numbers 0"

    if nbr < N
      send "MULTI"
      send "SETBIT numbers $nbr 1"
      if send "EXEC" != NULL
        return nbr
      end if
    else 
      send "UNWATCH numbers"
      return -1
    end if
  }
}

回答1:

How about using Bitmaps to record, for every possible nbr, whether or not that value is used?

To record that a value is taken use SETBIT:

SETBIT key [nbr] 1

To find a free nbr use BITPOS:

BITPOS key 0

To avoid race conditions you'll want to make sure your get-and-set is atomic. [The OP addresses this in a follow-up question.]

This will require very little memory (8K bytes for 65536 possible values). BITPOS is O(n), but that's unlikely to be a real problem.