Ways to convert special-purpoes-strings to Integer

2019-08-06 01:22发布

I need to have a in-memory data-structure of Key-Value pair (400 MB worth of data). I have following constraints on keys:

  1. Both key and values are text strings of length 256 and 1024 respectively.
  2. Any key generally looks like k1k2k3k4k5, each k(i) being 4-8 byte string in itself. Some k(i) may or may not be there in the keys.
  3. Every k(i) has 6-8 possibilities. However k3 and k4 have 256000 possibilities.
  4. One might iterate the DS with prefix_key. DS should be optimized for this operation. This operation allocates an iterator i.e it iterates the whole DS and returns list of key-values that match prefix_key (e.g. "k1k2k3.*", k(i) defined as above). Every iteration iterates on this iterator(list). Freeing the iterator frees the list.

Coming up with DS for string keys makes key comparisons too expensive. And thus certain option for the DS (Hash, B+Tree) get ruled out.

My question is how creatively can we convert String keys to integer keys?The solution needs to have following property:

For a key pattern "k1k2k3.*" it should genarate upper and lower bound on integers so that based on these bounds only handful number of entries are looked up in the DS.

I am asking this question in context of solution towards this

1条回答
\"骚年 ilove
2楼-- · 2019-08-06 01:57

Every k(i) has 6-8 possibilities. However k3 and k4 have 256000 possibilities.

If you can split key in k1 k2 k3 k4 k5, you could encode it like this:

 3 bits for k1  
 3 bits for k2  
18 bits for k3  
18 bits for k4  
 3 bits for k5

this makes 45bits. So you can boil down your key to an integer between 0 and 2^45-1. This seams to be a lot especially if you only use a few of the possible values for k3 and k4.

So I would take the 6 bits of k1 k2 for a exact mapping to an index and than depending how dense the k3 k4 is, some kind of tree structue for k3 and k4 and than an exact mapping to an index for k5 again.

查看更多
登录 后发表回答