I'm looking for an easy and reversible method of obfuscating integer IDs. Ideally, I'd want the resulting obfuscation to be at most eight characters in length and non-sequential, meaning that the obfuscation of "1" should look nothing like the obfuscation for "2" and so on.
This isn't meant to be secure by any means, so security isn't a huge concern. Additionally, the integers I'll be obfuscating aren't large (between one and 10,000), but I don't want any collisions, either.
Does anybody have any ideas for something that would fit these criteria?
I derived an idea from Pearson hashing that works for arbitrary inputs as well, not just 32-bit integers. I don't know whether this is exactly the same as Greg's answer, since I couldn't work out what he meant. What I do know is that the memory requirements are constant here: no matter how big the input, this is still a reliable obfuscation/encryption trick.
For the record, this method is not hashing, and it does not have collisions. It's a perfectly sound method of obfuscating a byte string.
What you need for this to work is a secret key, _encryptionTable, which is a random permutation of the inclusive range 0..255. You use this to shuffle bytes around. To make it really hard to reverse, it uses XOR to mix the byte string a bit.
You can then use BitConverter to go between values and byte arrays, or convert to base 64 or base 32 to get a textual representation. Base 32 encoding can be URL friendly if that's important. Decrypting is as simple as reversing the operation by computing the inverse of the _encryptionTable.
You can also do other fun things if you're working on a 32-bit integer and only care about numbers greater than or equal to 0, which makes it harder to guess an obfuscated number.
I also use a secret word to seed a pseudorandom number generator and use that to set up the initial permutation. That's why I can simply recover the value by knowing which secret word I used to create everything.
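Here is a minimal Python sketch of the scheme described above, not the original code. The names make_tables, obfuscate, deobfuscate, encode_id and decode_id are hypothetical, as is the choice of a 4-byte little-endian layout and unpadded base 32. It assumes Python's random.Random as the seeded PRNG, which is deterministic for a given secret word but not cryptographically secure. The starting value for c is drawn from the PRNG rather than fixed at 0.

```python
import base64
import random


def make_tables(secret_word):
    """Derive the permutation, its inverse and a starting value from a secret word."""
    rng = random.Random(secret_word)      # seeded PRNG: same word, same tables
    table = list(range(256))
    rng.shuffle(table)                    # random permutation of 0..255
    inverse = [0] * 256
    for position, value in enumerate(table):
        inverse[value] = position         # inverse[table[i]] == i
    initial = rng.randrange(256)          # non-constant starting value for c
    return table, inverse, initial


def obfuscate(data, table, initial):
    c = initial
    out = bytearray()
    for b in data:
        c = table[b ^ c]                  # mix with the previous output, then permute
        out.append(c)
    return bytes(out)


def deobfuscate(data, inverse, initial):
    c = initial
    out = bytearray()
    for b in data:
        out.append(inverse[b] ^ c)        # undo the permutation, then the XOR
        c = b                             # the previous output byte is the next c
    return bytes(out)


def encode_id(n, tables):
    table, _, initial = tables
    # Little-endian puts the fast-changing byte first, so every later byte
    # depends on it through the chaining value.
    raw = n.to_bytes(4, "little")
    text = base64.b32encode(obfuscate(raw, table, initial)).decode("ascii")
    return text.rstrip("=")               # 4 bytes give 7 characters plus padding


def decode_id(text, tables):
    _, inverse, initial = tables
    padded = text + "=" * (-len(text) % 8)
    raw = deobfuscate(base64.b32decode(padded), inverse, initial)
    return int.from_bytes(raw, "little")


tables = make_tables("my secret word")
token = encode_id(1234, tables)
print(token, decode_id(token, tables))
```

Because each step is a bijection on bytes, two different ids can never produce the same token, and a 32-bit id always comes out as seven base 32 characters.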
This is somewhat secure. The biggest flaw is that the first byte is XORed with 0, which happens to be the identity of XOR and doesn't change the value (a ^ 0 == a). Thus the first encrypted byte simply represents the random position of that byte. To work around this, you can pick an initial value for c that is not constant, based on the secret key, by just asking the PRNG (after initializing it with the seed) for a random byte. That way it's immensely more difficult to crack the encryption, even with a large sample, as long as you can't observe both input and output.
I realize this is an old post, but I thought it might be helpful to post my technique for obfuscating integer ids.
Cons: does use more than 8 characters, only good for id values under 33 million
Pros: does not require a key to de-obfuscate, URL/cookie friendly, generates a different value every time (which makes it harder to break), no collisions, and includes a checksum feature to eliminate random / brute-force attempts to break it. (One issue that the above posts do not address is people trying to "scrape" your site. If I see a URL ending in id=123, I know I can try id=124 and so on to get additional data; this is why some of the XOR examples are likely not a good idea.)
I would recommend tweaking this a bit (which I've done for mine), as I don't think you should ever use a publicly published obfuscation technique, but it is a good place to start.
Happy coding!
You could play with the bit patterns of the number, e.g. rotates and swaps on the bits. That will give you a way to move between a number of, say, 26 bits and another number of 26 bits that won't be immediately obvious to a human observer, though it's by no means "secure".
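As a rough illustration of that idea, here is a small Python sketch that works on 26-bit values. The function names and the rotation amounts (7 and 11) are arbitrary choices of mine, not anything prescribed. One thing to be aware of: pure rotates and swaps only move bits around, so the number of set bits is preserved and very small ids still map to sparse-looking values.

```python
BITS = 26
MASK = (1 << BITS) - 1
EVEN = 0x1555555   # bits 0, 2, ..., 24
ODD = 0x2AAAAAA    # bits 1, 3, ..., 25


def rotl(x, r):
    """Rotate a 26-bit value left by r positions (0 < r < 26)."""
    return ((x << r) | (x >> (BITS - r))) & MASK


def rotr(x, r):
    """Rotate a 26-bit value right by r positions (0 < r < 26)."""
    return ((x >> r) | (x << (BITS - r))) & MASK


def swap_adjacent(x):
    """Swap each even bit with its odd neighbour; applying it twice undoes it."""
    return ((x & EVEN) << 1) | ((x & ODD) >> 1)


def scramble(x):
    x = rotl(x, 7)
    x = swap_adjacent(x)
    return rotl(x, 11)


def unscramble(y):
    # Undo the steps of scramble in reverse order.
    y = rotr(y, 11)
    y = swap_adjacent(y)
    return rotr(y, 7)


print(scramble(1), scramble(2), unscramble(scramble(1234)))
```

Every step is a permutation of the bits, so the mapping is reversible and collision-free for any input below 2^26.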
I realise this was asked 7 months ago so you will have found a solution by now, but a solution I've come across is a combination of the Skip32/Skipjack cipher + a base32 encoding. The Perl example (since that's where I know of one) shows:
I don't know of a C# implementation, but a Perl one is http://search.cpan.org/perldoc?Crypt::Skip32::Base32Crockford and the two constituent parts for a Ruby one are https://github.com/levinalex/base32 and https://github.com/patdeegan/integer-obfuscator. Between the two of them you should be able to port it to any language you need.
Nice project for handling that with libraries in most languages: http://hashids.org/
If you've only got about 10,000 integers, then the easiest and most reliable way would probably be a mapping table between the integer and a randomly generated string. Either generate a bunch of random identifiers up-front that correspond to each integer, or just fill them in on demand.
This way you can guarantee no collisions, and don't have to worry about encryption because there's nothing to decrypt as the strings are not derived from the integers themselves.
You could implement it in a database table or in memory (e.g. a two-way dictionary) depending on your needs.
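For the in-memory variant, a two-way dictionary that fills in tokens on demand could look something like this Python sketch. The class name IdMap, its method names, and the 8-character lowercase-plus-digits alphabet are my own assumptions; a database table with a unique constraint on the token column would play the same role.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase + string.digits


class IdMap:
    """Two-way mapping between integer ids and random tokens, filled on demand."""

    def __init__(self, length=8):
        self._length = length
        self._by_id = {}
        self._by_token = {}

    def token_for(self, n):
        token = self._by_id.get(n)
        if token is None:
            # Draw random tokens until one is unused, so collisions cannot occur.
            while True:
                token = "".join(
                    secrets.choice(ALPHABET) for _ in range(self._length)
                )
                if token not in self._by_token:
                    break
            self._by_id[n] = token
            self._by_token[token] = n
        return token

    def id_for(self, token):
        # Raises KeyError for tokens that were never issued.
        return self._by_token[token]


ids = IdMap()
token = ids.token_for(42)
print(token, ids.id_for(token))
```

Since the tokens are random rather than derived from the ids, the mapping has to be persisted somewhere if it needs to survive a restart.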