Can anyone tell me why the number 5381 is used in the DJB hash function?
The DJB hash function is:
h(0) = 5381
h(i) = 33 * h(i-1) ^ str[i]
A C program:
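unsigned int DJBHash(char* str, unsigned int len)
{
    unsigned int hash = 5381;   /* initial value h(0) */
    unsigned int i = 0;

    for (i = 0; i < len; str++, i++)
    {
        /* (hash << 5) + hash is hash * 33; the character is then added
           (this code uses addition where the formula uses XOR) */
        hash = ((hash << 5) + hash) + (*str);
    }

    return hash;
}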
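To make the recurrence concrete, here is a minimal sketch (not from the original post) of the additive form with an explicit multiply by 33 instead of the shift. The name djb_hash_mul is made up for illustration, and the cast to unsigned char is my own choice so that the result does not depend on whether char is signed. For plain ASCII input it should give the same values as DJBHash.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (illustrative name): the additive "times 33"
 * recurrence h(i) = 33 * h(i-1) + str[i], starting from h(0) = 5381. */
static unsigned int djb_hash_mul(const char *str, size_t len)
{
    unsigned int hash = 5381u;   /* h(0) */
    size_t i;

    assert(str != NULL || len == 0);
    for (i = 0; i < len; i++) {
        /* unsigned arithmetic wraps around on overflow, which is intended */
        hash = hash * 33u + (unsigned char)str[i];
    }
    return hash;
}
```

With this sketch an empty string hashes to 5381, and "a" hashes to 5381 * 33 + 97 = 177670.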
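5381 is just a number that, in testing, resulted in fewer collisions and better avalanching. You will find "magic constants" in just about every hash algorithm.
I stumbled across a comment that sheds some light on what DJB is up to: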
/*
* DJBX33A (Daniel J. Bernstein, Times 33 with Addition)
*
* This is Daniel J. Bernstein's popular `times 33' hash function as
* posted by him years ago on comp.lang.c. It basically uses a function
* like ``hash(i) = hash(i-1) * 33 + str[i]''. This is one of the best
* known hash functions for strings. Because it is both computed very
* fast and distributes very well.
*
* The magic of number 33, i.e. why it works better than many other
* constants, prime or not, has never been adequately explained by
* anyone. So I try an explanation: if one experimentally tests all
* multipliers between 1 and 256 (as RSE did now) one detects that even
* numbers are not useable at all. The remaining 128 odd numbers
* (except for the number 1) work more or less all equally well. They
* all distribute in an acceptable way and this way fill a hash table
* with an average percent of approx. 86%.
*
* If one compares the Chi^2 values of the variants, the number 33 not
* even has the best value. But the number 33 and a few other equally
* good numbers like 17, 31, 63, 127 and 129 have nevertheless a great
* advantage to the remaining numbers in the large set of possible
* multipliers: their multiply operation can be replaced by a faster
* operation based on just one shift plus either a single addition
* or subtraction operation. And because a hash function has to both
* distribute good _and_ has to be very fast to compute, those few
* numbers should be preferred and seems to be the reason why Daniel J.
* Bernstein also preferred it.
*
*
* -- Ralf S. Engelschall <rse@engelschall.com>
*/
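This is a slightly different hash function from the one you are looking at, though it does use the 5381 magic number. The code below the comment at the link target has been unrolled.
Then I found this: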
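The "one shift plus either a single addition or subtraction" point in the comment above can be sketched as follows. This is only an illustration: the function names are invented here, and a modern optimizing compiler will often perform this rewrite on its own when it sees a multiply by such a constant.

```c
#include <assert.h>

/* Each multiplier named in the comment is 2^k + 1 or 2^k - 1, so the
 * multiply can be written as one shift plus one add or subtract.
 * All names below are illustrative, not taken from any library. */
static unsigned int times17(unsigned int h)  { return (h << 4) + h; }
static unsigned int times31(unsigned int h)  { return (h << 5) - h; }
static unsigned int times33(unsigned int h)  { return (h << 5) + h; }
static unsigned int times63(unsigned int h)  { return (h << 6) - h; }
static unsigned int times127(unsigned int h) { return (h << 7) - h; }
static unsigned int times129(unsigned int h) { return (h << 7) + h; }

/* Checks that every shift form agrees with a plain multiply for h.
 * Unsigned arithmetic wraps, so this also holds when the product overflows. */
static void check_shift_forms(unsigned int h)
{
    assert(times17(h)  == h * 17u);
    assert(times31(h)  == h * 31u);
    assert(times33(h)  == h * 33u);
    assert(times63(h)  == h * 63u);
    assert(times127(h) == h * 127u);
    assert(times129(h) == h * 129u);
}
```

For example, times33(5381) gives 177573, the same as 5381 * 33.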
Magic Constant 5381:
1. odd number
2. prime number
3. deficient number
4. 001/010/100/000/101 b
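These four properties are easy to check by hand or in code. The sketch below is my own illustration (the helper names are invented): it tests primality by trial division and deficiency by summing proper divisors. The bit pattern can be checked directly, because the three-bit groups 001/010/100/000/101 read as the octal digits 1, 2, 4, 0, 5, and the C octal literal 012405 equals 5381.

```c
#include <assert.h>

/* Trial-division primality test; adequate for small n such as 5381. */
static int is_prime(unsigned int n)
{
    unsigned int d;

    if (n < 2)
        return 0;
    for (d = 2; d * d <= n; d++)
        if (n % d == 0)
            return 0;
    return 1;
}

/* Sum of the proper divisors of n (all divisors except n itself). */
static unsigned int proper_divisor_sum(unsigned int n)
{
    unsigned int d, sum = 0;

    assert(n > 0);
    for (d = 1; d < n; d++)
        if (n % d == 0)
            sum += d;
    return sum;
}

/* A number is deficient when its proper divisors sum to less than itself. */
static int is_deficient(unsigned int n)
{
    return proper_divisor_sum(n) < n;
}
```

Every prime is deficient, since its only proper divisor is 1, so property 3 follows from property 2.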
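There is also this answer to "Can anybody explain the logic behind djb2 hash function?" It references a post by DJB himself to a mailing list that mentions 5381 (excerpted from that answer):
[...] practically any good multiplier works. I think you're worrying about the fact that 31c + d doesn't cover any reasonable range of hash values if c and d are between 0 and 255. That's why, when I discovered the 33 hash function and started using it in my compressors, I started with a hash value of 5381. I think you'll find that this does just as well as a 261 multiplier.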
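I found a very interesting property of this number that may be a reason for it.
5381 is the 709th prime.
709 is the 127th prime.
127 is the 31st prime.
31 is the 11th prime.
11 is the 5th prime.
5 is the 3rd prime.
3 is the 2nd prime.
2 is the 1st prime.
5381 is the first number for which this happens 8 times. The 5381st prime may exceed the limit of a signed int, so it is a good point to stop the chain.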
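The chain can be checked with a small sieve. This is a sketch under my own assumptions: the function names and the sieve limit of 60000 are arbitrary choices for illustration, and build_sieve() must be called once before the other functions are used.

```c
#include <assert.h>
#include <string.h>

#define SIEVE_LIMIT 60000u

/* composite[n] != 0 means n is not prime; filled in by build_sieve(). */
static unsigned char composite[SIEVE_LIMIT + 1];

/* Sieve of Eratosthenes over 0..SIEVE_LIMIT. */
static void build_sieve(void)
{
    unsigned int i, j;

    memset(composite, 0, sizeof composite);
    composite[0] = composite[1] = 1;
    for (i = 2; i * i <= SIEVE_LIMIT; i++)
        if (!composite[i])
            for (j = i * i; j <= SIEVE_LIMIT; j += i)
                composite[j] = 1;
}

/* 1-based position of n among the primes, or 0 if n is not prime. */
static unsigned int prime_index(unsigned int n)
{
    unsigned int k, count = 0;

    assert(n <= SIEVE_LIMIT);
    if (composite[n])
        return 0;
    for (k = 2; k <= n; k++)
        if (!composite[k])
            count++;
    return count;
}

/* The k-th prime (k >= 1), or 0 if it lies beyond SIEVE_LIMIT. */
static unsigned int nth_prime(unsigned int k)
{
    unsigned int n, count = 0;

    for (n = 2; n <= SIEVE_LIMIT; n++)
        if (!composite[n] && ++count == k)
            return n;
    return 0;
}

/* How many times n can be replaced by its prime index
 * before the value stops being prime. */
static unsigned int chain_length(unsigned int n)
{
    unsigned int steps = 0;

    assert(n <= SIEVE_LIMIT);
    while (!composite[n]) {
        n = prime_index(n);
        steps++;
    }
    return steps;
}
```

After build_sieve(), chain_length(5381) gives 8, matching the eight lines above, and nth_prime(5381) gives 52711. That is above 32767, so the remark about the signed int limit is consistent if a 16-bit int is meant, which is my reading rather than something the post states.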