How to store hashes in MySQL databases without usi

2019-04-13 11:48发布

I'm storing unique user-agents in a MySQL MyISAM table so when I have to look if it exists in the table, I check the md5 hash that is stored next to the TEXT field.

User-Agents 
{
    id         - INT
    user-agent - TEXT
    hash       - VARCHAR(32) // md5
}

There is any way to do the same but using a 32-bit integer and not a text hash? Maybe the md5 in raw format will be faster? That will requiere a binary search.

[EDIT]

MySQL don't handle hash searches for complete case-sensitive strings?

标签: php mysql hash md5
9条回答
欢心
2楼-- · 2019-04-13 12:16

You could do this instead:

User-Agents 
{
    id         - INT
    user-agent - TEXT
    hash       - UNSIGNED INT (CRC32, indexed)
}


$crc32 = sprintf("%u", crc32($user_agent));

SELECT * FROM user_agents WHERE hash=$crc32 AND user_agent='$user_agent';

It's unlikely that you'll get collisions with crc32 for this kind of data.

To guarantee that collisions will not cause problems, add a secondary search parameter. MySQL will be able to use the index to quickly find the record. Then it can do a simple string search to guarantee that match is correct.

PS: The sprintf() is there to work around signed 32-bit integers. Should be unnecessary on 64-bit systems.

查看更多
爷的心禁止访问
3楼-- · 2019-04-13 12:20

You can't store an MD5 hash in a 32-bit int: it simply won't fit. (It's 32 characters when written in hex, but it's 128-bits of data)

You could look at MySQL's BINARY and VARBINARY types. See http://dev.mysql.com/doc/refman/5.1/en/binary-varbinary.html. These types store binary data. In your case, BINARY(16) or VARBINARY(16), but since MD5 hashes are always 16 bytes, the latter seems a bit pointless.

查看更多
混吃等死
4楼-- · 2019-04-13 12:27

Store the UNHEX(MD5($value)) in a BINARY(16).

查看更多
登录 后发表回答