MD5 etc. as a hash function

2019-02-23 19:46发布

问题:

Let's say you are planning to design a hash function which will generate keys between 0-256. Will using first 2 digits of MD5-digest be a great idea for a uniform distribution? What do you think on this? Is it expensive to md5() some word (2-10 letters)?

I know it is a rough definition of requirements but it would be great to discuss this.

回答1:

You could try calculating an 8-bit CRC.



回答2:

There's no reason to use a cryptographic strength hash for something as simple as generating 3 digit hashes. You're better off using a more simple hash there.

I'm not certain specifically how expensive MD5 is relative to others, but there are plenty of better ways to create a small hash (see this article for some algorithm ideas).



回答3:

MD5 is designed to uniformaly spread the input over all the output bytes so it's as good as any other general hash function - sounds like a bit of overkill if you only want 256 values.

Note the output of MD5 is 128bytes (16bytes), it's only the text representation that is hex digits - so there is really no first two digits of MD5 - just use the bottom 8bits.



回答4:

You haven't explained how you're going to use the hash, and what you're going to do with the collisions that are inevitable given that you have only 256 output values.

I think even MD5 (which is not cryptographically secure any more) is overkill for the likely applications.

I'd probably go with a CRC (cyclic redundancy check) algorithm that would generate a 16-bit or 32-bit number for you, and would likely give you good enough distribution.



标签: hash