Confused about hashes

Posted 2019-02-10 14:09

Say I have a blob of text 5,000 characters long. I run it through a hashing program and it generates a 40-character hash. Now I run another blob of text, 10,000 characters long, and it still generates a 40-character hash. The same is true for text of any length.

My question: if the hashes are all unique, wouldn't I be able to compress anything into a 40-character string?

Tags: hash
12 answers
Deceive 欺骗
#2 · 2019-02-10 14:52

They are not unique, but with a high-quality algorithm such as SHA-1 you're much more likely to drop dead of a heart attack than to stumble on two different documents with the same hash.
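To put a rough number on "much more likely", here is a minimal sketch using the standard birthday-bound approximation, p ≈ 1 - exp(-n(n-1) / 2^(b+1)), for n documents and a b-bit hash. It assumes the hash behaves like a uniformly random function, which is an idealization; the function name `collision_probability` is made up for this example.

```python
import math


def collision_probability(num_docs: int, hash_bits: int) -> float:
    """Birthday-bound approximation of the chance that at least two of
    num_docs random inputs share the same hash_bits-bit hash value."""
    # Expected number of colliding pairs: n(n-1)/2 pairs, each colliding
    # with probability 1 / 2^bits.
    exponent = num_docs * (num_docs - 1) / (2 ** (hash_bits + 1))
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    # A trillion documents hashed with a 160-bit hash such as SHA-1.
    print(collision_probability(10**12, 160))
    # A hundred thousand documents hashed with a 32-bit hash.
    print(collision_probability(100_000, 32))
```

Under that assumption, a trillion documents with a 160-bit hash give a probability on the order of 10^-25, while a hundred thousand documents with a 32-bit hash already collide roughly two times out of three.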

Deceive 欺骗
#3 · 2019-02-10 15:00

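Don't get confused by the .NET GetHashCode(). It's not very good for this purpose, as it's only 32 bits, compared to the 40-character hash in the original question (320 bits if each character is 8 bits, or 160 bits if the 40 characters are hex digits, as in a SHA-1 digest).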
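A quick way to see why 32 bits is so little: the sketch below builds a stand-in 32-bit hash by keeping only the first 4 bytes of SHA-1, then hashes numbered strings until two of them collide. This is not how .NET's GetHashCode() works; `hash32`, `find_collision`, and the `document-N` inputs are all hypothetical names chosen for illustration.

```python
import hashlib


def hash32(text: str) -> int:
    """Illustrative 32-bit hash: the first 4 bytes of the SHA-1 digest."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def find_collision(limit: int = 1_000_000):
    """Hash 'document-0', 'document-1', ... and return the first pair of
    different texts that share a 32-bit hash, or None if none is found."""
    seen = {}  # hash value -> first text that produced it
    for i in range(limit):
        text = f"document-{i}"
        value = hash32(text)
        if value in seen:
            return seen[value], text, value
        seen[value] = text
    return None


if __name__ == "__main__":
    result = find_collision()
    if result is not None:
        first, second, value = result
        print(f"{first!r} and {second!r} both hash to {value:#010x}")
```

By the birthday bound, a collision is expected after somewhere around a hundred thousand inputs, so this finishes in well under a second. The same search against the full 160-bit digest would not finish in any practical amount of time.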

我想做一个坏孩纸
#4 · 2019-02-10 15:02

Hashes are not guaranteed to be unique. The Wikipedia entry on the topic is pretty good: http://en.wikipedia.org/wiki/Hash_function

Lonely孤独者°
#6 · 2019-02-10 15:04

You can compress the signature of any text into a hash, but you cannot reverse the calculation to recover the text that produced it. Simply put, the only way to find out what text gave you the hash is to brute-force candidate texts through the hash function until you find a match.

See Wikipedia
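As a rough illustration of that brute-force search, the sketch below tries every lowercase string up to a small length and compares its SHA-1 hex digest to a target. The helper names `sha1_hex` and `brute_force`, the lowercase-only alphabet, and the length limit are assumptions made to keep the example fast, not a realistic attack.

```python
import hashlib
import itertools
import string


def sha1_hex(text: str) -> str:
    """Return the 40-character hex SHA-1 digest of text."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def brute_force(target_hex: str, max_len: int = 3,
                alphabet: str = string.ascii_lowercase):
    """Try every string over alphabet up to max_len characters and return
    the first one whose SHA-1 matches target_hex, or None if none does."""
    for length in range(1, max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            candidate = "".join(chars)
            if sha1_hex(candidate) == target_hex:
                return candidate
    return None


if __name__ == "__main__":
    target = sha1_hex("cat")
    print(len(target))           # digest length is fixed, whatever the input
    print(brute_force(target))   # searches the 1- to 3-letter strings
```

The search space grows exponentially with the text length, so this only works for very short inputs. For a 5,000-character text there is a second problem: far more possible texts exist than hash values, so even a match is not guaranteed to be the text you started with.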

对你真心纯属浪费
#7 · 2019-02-10 15:06

Consider looking at this from the point of view of the pigeonhole principle. If you're stuffing n items into a smaller number of buckets k, some bucket will necessarily hold multiple items. So to answer your question: no, hashes are not unique.
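The pigeonhole argument can be watched in action with a deliberately tiny hash. In this sketch the "hash" is just the first byte of SHA-1, so there are k = 256 buckets, and feeding it n = 257 distinct items has to produce at least one shared bucket. The names `bucket` and `first_collision` and the `item-N` inputs are invented for the example.

```python
import hashlib


def bucket(text: str) -> int:
    """Tiny 8-bit hash: the first byte of SHA-1, so only 256 buckets exist."""
    return hashlib.sha1(text.encode("utf-8")).digest()[0]


def first_collision(items):
    """Return (earlier_item, later_item, bucket) for the first two distinct
    items that land in the same bucket, or None if every bucket is unique."""
    seen = {}  # bucket number -> first item placed in it
    for item in items:
        b = bucket(item)
        if b in seen:
            return seen[b], item, b
        seen[b] = item
    return None


if __name__ == "__main__":
    items = [f"item-{i}" for i in range(257)]  # one more item than buckets
    print(first_collision(items))
```

The same counting applies to the original question. Assuming 8-bit characters and a hex-encoded 160-bit hash, a 5,000-character text carries 40,000 bits, so on average about 2^39840 different texts map to each 40-character hash. That is why the hash cannot serve as a compressed copy of the text.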
