Integer ID obfuscation techniques

Posted 2020-02-01 02:33

Question:

I'm looking for an easy and reversible method of obfuscating integer IDs. Ideally, I'd want the resulting obfuscation to be at most eight characters in length and non-sequential, meaning that the obfuscation of "1" should look nothing like the obfuscation for "2" and so on.

This isn't meant to be secure by any means, so cryptographic strength isn't a huge concern. Additionally, the integers I'll be obfuscating aren't large - between one and 10,000 - but I don't want any collisions, either.

Does anybody have any ideas for something that would fit these criteria?

Answer 1:

I derived an idea from Pearson hashing that works for arbitrary inputs, not just 32-bit integers. I don't know whether it is exactly the same as Greg's answer, since I couldn't work out what he meant. What I do know is that the memory requirements are constant: no matter how big the input, the only state is the fixed-size lookup table, so this remains a reliable obfuscation/encryption trick.

For the record, this method is not hashing, and it does not have collisions. It's a perfectly sound method of obfuscating a byte string.

What you need for this to work is a secret key, _encryptionTable, which is a random permutation of the inclusive range 0..255. You use it to substitute bytes. To make the result harder to reverse, each byte is XORed with the previous output byte before the lookup, which mixes the byte string a bit.

public byte[] Encrypt(byte[] plaintext)
{
    if (plaintext == null)
    {
        throw new ArgumentNullException("plaintext");
    }
    byte[] ciphertext = new byte[plaintext.Length];
    int c = 0;
    for (int i = 0; i < plaintext.Length; i++)
    {
        c = _encryptionTable[plaintext[i] ^ c];
        ciphertext[i] = (byte)c;
    }
    return ciphertext;
}

You can then use BitConverter to go between values and byte arrays, or convert to Base64 or Base32 to get a textual representation. Base32 encoding can be URL-friendly if that's important. Decrypting is as simple as reversing the operation, using the inverse of the _encryptionTable.

    public byte[] Decrypt(byte[] ciphertext)
    {
        if (ciphertext == null)
        {
            throw new ArgumentNullException("ciphertext");
        }
        byte[] plaintext = new byte[ciphertext.Length];
        int c = 0;
        for (int i = 0; i < ciphertext.Length; i++)
        {
            plaintext[i] = (byte)(_decryptionTable[ciphertext[i]] ^ c);
            c = ciphertext[i];
        }
        return plaintext;
    }
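For readers who are not using C#, here is a minimal Python sketch of the same round trip, ending in the Base32 text form mentioned above. The function names, the use of Python's `random.Random` in place of a Mersenne Twister class, and the choice of little-endian packing are assumptions for illustration, not part of the original answer.

```python
import base64
import random
import struct

def make_tables(secret: str):
    """Build a permutation of 0..255 and its inverse from a secret word."""
    rng = random.Random(secret)          # deterministic for a given secret
    enc = list(range(256))
    rng.shuffle(enc)
    dec = [0] * 256
    for i, v in enumerate(enc):
        dec[v] = i                       # dec undoes enc
    return enc, dec

def encrypt(data: bytes, enc, c: int = 0) -> bytes:
    out = bytearray()
    for b in data:
        c = enc[b ^ c]                   # chain with the previous output byte
        out.append(c)
    return bytes(out)

def decrypt(data: bytes, dec, c: int = 0) -> bytes:
    out = bytearray()
    for b in data:
        out.append(dec[b] ^ c)
        c = b                            # previous ciphertext byte
    return bytes(out)

def obfuscate_id(n: int, enc) -> str:
    raw = struct.pack("<I", n)           # little-endian: low byte goes first
    return base64.b32encode(encrypt(raw, enc)).decode("ascii").rstrip("=")

def deobfuscate_id(text: str, dec) -> int:
    padded = text + "=" * (-len(text) % 8)
    raw = decrypt(base64.b32decode(padded), dec)
    return struct.unpack("<I", raw)[0]
```

Four bytes encode to seven Base32 characters once the padding is stripped, which fits the eight-character limit in the question. Packing the low byte first matters here: the chaining only propagates forward, so for small IDs a big-endian layout would leave the leading output bytes identical.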

You can also do other fun things if you're working with a 32-bit integer and only care about numbers greater than or equal to 0, which makes an obfuscated number harder to guess.

I also use a secret word to seed a pseudorandom number generator and use that to set up the initial permutation. That's why I can recreate the tables just by knowing which secret word I used to create everything.

var mt = new MersenneTwister(secretKey.ToUpperInvariant());
var mr = new byte[256];
for (int i = 0; i < 256; i++)
{
    mr[i] = (byte)i;
}
var encryptionTable = mt.NextPermutation(mr);
var decryptionTable = new byte[256];
for (int i = 0; i < 256; i++)
{
    decryptionTable[encryptionTable[i]] = (byte)i;
}
this._encryptionTable = encryptionTable;
this._decryptionTable = decryptionTable;

This is somewhat secure. The biggest flaw is that c starts at 0, and 0 is the identity of XOR, so the first byte goes into the lookup unchanged (a ^ 0 == a). Thus the first encrypted byte is simply the table entry for the first plaintext byte. To work around this, pick an initial value for c that is not constant but derived from the secret key, by asking the PRNG (after seeding it) for a random byte. That way it's much more difficult to crack the encryption, even with a large sample, as long as an attacker can't observe both input and output.
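The workaround can be sketched in Python as follows. This is a sketch under assumptions: `make_cipher` and the parameter name `iv` are hypothetical, and `random.Random` stands in for whatever seeded PRNG you use. The only change from the basic scheme is that c starts from a key-derived byte instead of 0.

```python
import random

def make_cipher(secret: str):
    """Derive the table, its inverse, and a starting value from one secret."""
    rng = random.Random(secret)
    enc = list(range(256))
    rng.shuffle(enc)
    dec = [0] * 256
    for i, v in enumerate(enc):
        dec[v] = i
    iv = rng.randrange(1, 256)           # non-zero starting value for c
    return enc, dec, iv

def encrypt(data: bytes, enc, iv: int) -> bytes:
    c = iv                               # was 0 in the basic version
    out = bytearray()
    for b in data:
        c = enc[b ^ c]
        out.append(c)
    return bytes(out)

def decrypt(data: bytes, dec, iv: int) -> bytes:
    c = iv
    out = bytearray()
    for b in data:
        out.append(dec[b] ^ c)
        c = b
    return bytes(out)
```

With this change the first output byte is the table entry for (first byte XOR iv) rather than for the first byte itself, so it no longer exposes a table entry directly.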



Answer 2:

If you've only got about 10,000 integers, then the easiest and most reliable way would probably be a mapping table between each integer and a randomly generated string. Either generate a bunch of random identifiers up front, one for each integer, or just fill them in on demand.

This way you can guarantee no collisions, and don't have to worry about encryption because there's nothing to decrypt as the strings are not derived from the integers themselves.

You could implement it in a database table or in memory (e.g. a two-way dictionary) depending on your needs.
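A minimal in-memory version of such a two-way mapping might look like the Python sketch below. The class name `IdMap`, the alphabet, and the fill-on-demand policy are assumptions for illustration; a database table with a unique constraint on the token column would play the same role.

```python
import secrets

# Letters and digits that are hard to confuse with one another (an arbitrary choice).
ALPHABET = "ABCDEFGHJKMNPQRSTVWXYZ23456789"

class IdMap:
    """Two-way dictionary between integer IDs and random tokens."""

    def __init__(self, length: int = 8):
        self._length = length
        self._to_token = {}
        self._to_id = {}

    def token_for(self, id_: int) -> str:
        token = self._to_token.get(id_)
        while token is None:
            candidate = "".join(
                secrets.choice(ALPHABET) for _ in range(self._length)
            )
            if candidate not in self._to_id:   # retry on the rare clash
                token = candidate
                self._to_token[id_] = token
                self._to_id[token] = id_
        return token

    def id_for(self, token: str):
        """Return the ID for a token, or None if the token is unknown."""
        return self._to_id.get(token)
```

Because a token is only stored after checking that it is unused, collisions are ruled out by construction, and an unknown token simply maps to nothing.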



Answer 3:

I realise this was asked 7 months ago, so you will have found a solution by now, but a solution I've come across is a combination of the Skip32/Skipjack cipher and a Base32 encoding. The Perl example (since that's where I know of one) shows:

use Crypt::Skip32::Base32Crockford;
my $key    = pack( 'H20', "112233445566778899AA" ); # Always 10 bytes!
my $cipher = Crypt::Skip32::Base32Crockford->new($key);
my $b32    = $cipher->encrypt_number_b32_crockford(3493209676); # 1PT4W80
my $number = $cipher->decrypt_number_b32_crockford('1PT4W80'); # 3493209676

I don't know of a C# implementation, but a Perl one is http://search.cpan.org/perldoc?Crypt::Skip32::Base32Crockford and the two constituent parts for a Ruby one are https://github.com/levinalex/base32 and https://github.com/patdeegan/integer-obfuscator. Between the two of them you should be able to port it to any language you need.
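For anyone porting this, the encoding half is the easy part. Below is a small Python sketch of Crockford-style Base32 for non-negative integers; the function names are hypothetical, and the Skip32 cipher step is deliberately left out, so this only turns an already-encrypted number into text. The '1PT4W80' sample above is the output of cipher plus encoding, so this sketch alone will not reproduce it.

```python
# Crockford's alphabet leaves out I, L, O and U to avoid ambiguous characters.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
_DECODE = {ch: i for i, ch in enumerate(CROCKFORD)}
_DECODE.update({"O": 0, "I": 1, "L": 1})   # lenient decoding of look-alikes

def b32_encode(n: int) -> str:
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    digits = []
    while True:
        n, rem = divmod(n, 32)
        digits.append(CROCKFORD[rem])
        if n == 0:
            break
    return "".join(reversed(digits))

def b32_decode(text: str) -> int:
    n = 0
    for ch in text.upper().replace("-", ""):   # hyphens are allowed as separators
        n = n * 32 + _DECODE[ch]
    return n
```

Any 32-bit value fits in seven characters, since 32^7 = 2^35.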



Answer 4:

XOR is a nice and fast way of obfuscating integers:

1 xor 1234 = 1235
2 xor 1234 = 1232
3 xor 1234 = 1233
100 xor 1234 = 1206
120 xor 1234 = 1194

It's fast, and xor-ing again with the same number gives you back the original! The only trouble is, if an "attacker" knows any of the numbers, they can trivially figure out the xor mask... by xor-ing the result with the known original!

For example, I (the "attacker") now know that the 4th number in that list is an obfuscated "100". So I'll do:

100 xor 1206 = 1234

... and now I've got the XOR mask and I can un-obfuscate any of the numbers. Happily, there is a trivial solution to that problem: algorithmically alter the XOR mask. For example, if you need to obfuscate 1000 integers in an array, start with an XOR mask of "1234" and increment the mask by 4 for each number in the array.
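A short Python sketch of that rolling mask, assuming the mask depends only on the position in the array (the function name and defaults are illustrative):

```python
def xor_with_rolling_mask(values, mask=1234, step=4):
    """XOR each value with a mask that grows by `step` per position.

    Applying the function twice with the same parameters restores the input,
    because XOR with the same mask is its own inverse.
    """
    return [v ^ (mask + i * step) for i, v in enumerate(values)]

ids = [100, 100, 120]
hidden = xor_with_rolling_mask(ids)      # [1206, 1202, 1186]
restored = xor_with_rolling_mask(hidden)
```

Note the trade-off: the same ID now obfuscates differently depending on its position, so whoever decodes a value needs to know its index as well as the starting mask.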



Answer 5:

In case other people are interested, somebody adapted a 32-bit block cipher a few years back that's especially useful for this task.

  • http://www.qualcomm.com.au/PublicationsDocs/skip32.c

There are also Perl and Ruby ports of the above available:

  • https://github.com/gitpan/Crypt-Skip32
  • https://github.com/patdeegan/integer-obfuscator

If you need the result in 8 characters or less, you can use a hex or base64 representation.
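To show why a 32-bit block cipher suits this task, here is a toy Python stand-in. It is not Skip32: it is a generic four-round Feistel network that uses HMAC-SHA256 as the round function, and every name in it is an assumption for illustration. Like any block cipher it is a permutation of the 32-bit values, so distinct IDs can never collide, and the result always fits in eight hex characters.

```python
import hashlib
import hmac

def _round(key: bytes, rnd: int, half: int) -> int:
    """Keyed round function: 16 bits in, 16 bits out."""
    msg = bytes([rnd]) + half.to_bytes(2, "big")
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest()[:2], "big")

def encrypt32(n: int, key: bytes, rounds: int = 4) -> int:
    left, right = n >> 16, n & 0xFFFF
    for r in range(rounds):
        left, right = right, left ^ _round(key, r, right)
    return (left << 16) | right

def decrypt32(n: int, key: bytes, rounds: int = 4) -> int:
    left, right = n >> 16, n & 0xFFFF
    for r in reversed(range(rounds)):    # undo the rounds in reverse order
        left, right = right ^ _round(key, r, left), left
    return (left << 16) | right

def to_hex(n: int) -> str:
    return format(n, "08x")              # always eight characters
```

Usage would be along the lines of `to_hex(encrypt32(42, b"secret"))` on the way out and `decrypt32(int(text, 16), b"secret")` on the way back. For real use, prefer one of the Skip32 ports listed above over this sketch.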



Answer 6:

A nice project for handling this, with libraries in most languages: http://hashids.org/



Answer 7:

Update May 2017

Feel free to use (or modify) the library I developed, installable via Nuget with:

Install-Package Kent.Cryptography.Obfuscation

This converts a non-negative ID such as 127 to an 8-character string, e.g. xVrAndNb, and back (with some available options to randomize the sequence each time it's generated).

Example Usage

var obfuscator = new Obfuscator();
string maskedID = obfuscator.Obfuscate(15);

Full documentation at: GitHub.


Old Answer

Just adding variety to an old answer. Perhaps someone will need it. This is an obfuscation class I made sometime back.

Obfuscation.cs - Github

You can use it by:

Obfuscation obfuscation = new Obfuscation();
string maskedValue = obfuscation.Obfuscate(5);
int? value = obfuscation.DeObfuscate(maskedValue);

Cheers, hopefully it can be of use.



Answer 8:

You could play with the bit patterns of the number, e.g. rotates and swaps on the bits. That will give you a way to move between a number of, say, 26 bits and another number of 26 bits that won't be immediately obvious to a human observer, though it's by no means "secure".
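One possible combination of rotates and swaps on a 26-bit value, sketched in Python. The particular steps (swap the two 13-bit halves, rotate left by 7, swap adjacent bit pairs) are arbitrary choices for illustration; any sequence of invertible bit moves works the same way.

```python
BITS = 26
MASK = (1 << BITS) - 1
EVEN = 0x1555555                         # bits 0, 2, ..., 24

def rotl(x: int, k: int) -> int:
    k %= BITS
    return ((x << k) | (x >> (BITS - k))) & MASK

def rotr(x: int, k: int) -> int:
    return rotl(x, BITS - (k % BITS))

def swap_halves(x: int) -> int:
    """Exchange the low and high 13 bits (its own inverse)."""
    return ((x & 0x1FFF) << 13) | (x >> 13)

def swap_pairs(x: int) -> int:
    """Exchange each even bit with its odd neighbour (its own inverse)."""
    return ((x & EVEN) << 1) | ((x >> 1) & EVEN)

def scramble(x: int) -> int:
    return swap_pairs(rotl(swap_halves(x), 7))

def unscramble(y: int) -> int:
    return swap_halves(rotr(swap_pairs(y), 7))
```

Since this only moves bits around, the number of set bits is preserved and 0 maps to 0, which is one reason it should not be mistaken for encryption.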



Answer 9:

I realize this is an old post, but I thought it might be helpful to post my technique for obfuscating integer IDs.

Cons: uses more than 8 characters; only good for ID values under 33 million.

Pros: does not require a key to de-obfuscate; URL/cookie friendly; generates a different value every time, which makes it harder to break; no collisions; includes a checksum to defeat random or brute-force guessing. One issue the posts above do not address is people trying to "scrape" your site: if I see a URL ending in id=123, I know I can try id=124 and so on to get additional data, which is why some of the XOR examples are likely not a good idea.

I would recommend tweaking this a bit (which I've done for mine), as I don't think you should ever use a publicly published obfuscation technique as-is, but it is a good place to start.

Happy coding!

    public static string ObfuscateId(int id)
    {
        try
        {
            string rtn;
            int sid = id + 279;
            int xm = sid * 3;
            int xl = xm.ToString().Length + 10;
            string sc = xl.ToString().Substring(1, 1);
            string fc = xl.ToString().Substring(0, 1);
            string csum = sid.ToString().Substring(sid.ToString().Length - 3);
            rtn = Guid.NewGuid().ToString().Replace("-", "").ToLower();
            rtn = sc + rtn.Substring(2, 26) + fc;
            rtn = rtn.Remove(4, 3).Insert(4, csum);
            rtn = rtn.Remove(xl, (xl - 10)).Insert(xl, xm.ToString());
            rtn = rtn.Replace('1', 'g');
            rtn = rtn.Replace('2', 'h');
            rtn = rtn.Replace('3', 'i');
            rtn = rtn.Replace('4', 'w');
            rtn = rtn.Replace('5', 'y');
            rtn = rtn.Replace('6', 'u');
            rtn = rtn.Replace('7', 'z');
            rtn = rtn.Replace('8', 'l');
            rtn = rtn.Replace('9', 'v');
            rtn = rtn.Replace('0', 'n');
            rtn = rtn.Replace('c', 'j');
            rtn = rtn.Replace('d', 'p');
            rtn = rtn.Replace('f', 'q');

            return rtn.ToUpper();
        }
        catch
        {
            return "ERROR BAD ID";
        }
    }

    public static int DeObfuscateId(string obtxt)
    {
        try
        {
            string rtn;
            int id;

            rtn = obtxt.ToLower();
            rtn = rtn.Replace('g', '1');
            rtn = rtn.Replace('h', '2');
            rtn = rtn.Replace('i', '3');
            rtn = rtn.Replace('w', '4');
            rtn = rtn.Replace('y', '5');
            rtn = rtn.Replace('u', '6');
            rtn = rtn.Replace('z', '7');
            rtn = rtn.Replace('l', '8');
            rtn = rtn.Replace('v', '9');
            rtn = rtn.Replace('n', '0');
            rtn = rtn.Replace('j', 'c');
            rtn = rtn.Replace('p', 'd');
            rtn = rtn.Replace('q', 'f');

            string sc = rtn.Substring(0, 1);
            string fc = rtn.Substring(rtn.Length - 1);
            int xl = int.Parse(fc + sc);
            int mv = int.Parse(rtn.Substring(xl, (xl - 10)));
            int sid = mv / 3;
            id = sid - 279;
            string csum = sid.ToString().Substring(sid.ToString().Length - 3);
            string xsum = rtn.Substring(4, 3);

            if (csum!=xsum)
            {
                return -99999;
            }

            return id;
        }
        catch
        {
            return -99999;
        }
    }



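Answer 10:

Just take an MD5/SHA-1 hash of the integer's byte representation. With only 10,000 inputs, a collision between full digests is vanishingly unlikely, though not strictly guaranteed. Note that a hash is one-way, so to get the integer back you need to keep a lookup table from hash to ID, and truncating the digest to eight characters makes collisions more likely.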
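A hash cannot be reversed, so in practice this approach needs a lookup table from token to ID, and it is worth checking for collisions when that table is built. The Python sketch below does both; the function names and the little-endian packing are assumptions. As a rough birthday estimate, eight hex characters carry 32 bits, so for 10,000 IDs the chance of at least one clash is about (10000 × 9999 / 2) / 2^32, or roughly 1.2%.

```python
import hashlib
import struct

def token(n: int, length: int = 8) -> str:
    """First `length` hex characters of SHA-1 over the 4-byte integer."""
    digest = hashlib.sha1(struct.pack("<I", n)).hexdigest()
    return digest[:length]

def build_lookup(ids, length: int = 8):
    """Map token -> id, refusing to continue if two IDs share a token."""
    table = {}
    for n in ids:
        t = token(n, length)
        if t in table:
            raise ValueError(f"collision between {table[t]} and {n}")
        table[t] = n
    return table
```

If `build_lookup` raises for your ID range, a longer token or a different encoding of the digest is needed; the table itself is what makes the scheme reversible.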