Looking for a Hash Function /Ordered Int/ to /Shuf

2019-02-07 11:00发布

问题:

I am looking for constant time algorithm can change an ordered integer index value into a random hash index. It would nice if it is reversible. I need that hash key is unique for each index. I know that this could be done with a table look up in a large file. I.E. create an ordered set of all ints and then shuffle them randomly and write to a file in random sequence. You could then read them back as you need them. But this would require a seek into a large file. I wonder if there is a simple way to use say a pseudo random generator to create the sequence as needed?

Generating shuffled range using a PRNG rather than shuffling the answer by erikkallen of Linear Feedback Shift Registers looks like the right sort of thing. I just tried it but it produces repeats and holes.

Regards David Allan Finch

回答1:

The question is now if you need a really random mapping, or just a "weak" permutation. Assuming the latter, if you operate with unsigned 32-bit integers (say) on 2's complement arithmetics, multiplication by any odd number is a bijective and reversible mapping. Of course the same goes for XOR, so a simple pattern which you might try to use is e.g.

unsigned int hash(int x) {
   return (((x ^ 0xf7f7f7f7) * 0x8364abf7) ^ 0xf00bf00b) * 0xf81bc437;
}

There is nothing magical in the numbers. So you can change them, and they can be even randomized. The only thing is that the multiplicands must be odd. And you must be calculating with rollaround (ignoring overflows). This can be inverted. To do the inversion, you need to be able to calculate the correct complementary multiplicands A and B, after which the inversion is

unsigned int rhash(int h) {
    return (((x * B) ^ 0xf00bf00b) * A) ^ 0xf7f7f7f7;
}

You can calculate A and B mathematically, but the easier thing for you is just to run a loop and search for them (once offline, that is).

The equation uses XORs mixed with multiplications to make the mapping nonlinear.



回答2:

You could try building a suitable Feistel network. These are normally used for cryptography (e.g. DES), but with at least 64 bits, so you may need to build one yourself that suits your needs. They are invertible by construction.



回答3:

Assuming your goal is to spread out grouped values across the whole range,
it seems like shuffling the bits in some pre-defined order might do the trick.
i.e. given 8 bits ABCDEFGH, arrange them like EGDBHCFA, or some such pattern.

The code would just be a simple sequence of masks, shifts and adds.



回答4:

Mmm... depending if you have a lot of numbers, you could use a normal stl list, and order it by a "random" criteria

bool
nonsort(int i, int j)
{
    return  random() & 31 >16 ? true : false;
}

std::list<int> li;
// insert elements
li.sort(nonsort);

Then, you can get all the integers with a normal iterator. Remember to initialize random with srand() with time or any other pseudo-random value.



回答5:

For the set of constraints there really is no solution. An attempt to hash 32 bit unsigned, into a 32 bit unsigned, will give you collisions, unless you do a something simple, like a 1 to 1 mapping. Every number is its own hash.