What algorithm can I use to generate a 48-bit hash?

Posted 2019-05-21 02:34

Is there a simple-to-use hashing algorithm that generates a 48-bit hash? I need to generate unique MAC addresses from unique input strings. Security is not a concern here; it is only a matter of mapping the strings into the MAC address space, which is 48 bits.

I thought about CRC32, which is easy to use (cksum is on every Linux system) but only 32 bits wide. I could use it for the lower 32 bits, but the collision rate gets fairly high with more than a few hosts.

If I could get a 48-bit hash, I could set the second-least-significant bit of the most significant octet to mark the result as a Locally Administered Address. Losing a single bit is minor.

Alternatively, I could use a longer hashing algorithm (MD5, SHA1, etc.) and take only the 48 most significant or least significant bits.

Is there a simple way to do this?

My preference is a command-line utility, but if I have to write a short Python script or similar, that is no big deal.

1 answer
Deceive
#2 · 2019-05-21 03:28

Two years later, here is an idea from a real application (very close to what you needed).

I needed a serial number of only 48 bits for a custom board that has no non-volatile memory. The board features an STM32 processor that has a unique 96-bit ID (STM32_UUID).

Here is the complete C code:

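#include <stdint.h>
#include <string.h>

// Base address of the 96-bit (12-byte) unique device ID.
// It differs between STM32 families, so check the reference manual.
#define STM32_UUID                      ((uint8_t*)0x1FFFF7E8)

// Declared here because setBoardSerial() calls it before its definition.
uint64_t fastHash64(const void * buf, size_t len, uint64_t seed);

// board SN 48 bit
static uint8_t BoardSerial[6];

void setBoardSerial(void)
{
  // Hash the 12-byte ID and keep the first 6 bytes of the result as stored
  // in memory (the low 48 bits on a little-endian core).
  uint64_t hash = fastHash64(STM32_UUID, 12, 1234554321);
  memcpy(BoardSerial, &hash, 6);
}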

// Bit mixer: xor-shift, multiply, xor-shift.
static inline uint64_t mix(uint64_t h)
{
    h ^= h >> 23;
    h *= 0x2127599bf4325c37ULL;
    h ^= h >> 47;
    return h;
}

uint64_t fastHash64(const void * buf, size_t len, uint64_t seed)
{
    const uint64_t m = 0x880355f21e6d1965ULL;
    const uint64_t * pos = (const uint64_t*)buf;
    const uint64_t * end = pos + (len / 8);
    const unsigned char * pos2;
    uint64_t h = seed ^ (len * m);
    uint64_t v;

    while(pos != end)
    {
        v  = *pos++;
        h ^= mix(v);
        h *= m;
    }

    // Fold the remaining 0-7 tail bytes into v.
    pos2 = (const unsigned char*)pos;
    v = 0;

    switch(len & 7)
    {
        // Every case deliberately falls through to the ones below it.
        case 7: v ^= (uint64_t)pos2[6] << 48;
        case 6: v ^= (uint64_t)pos2[5] << 40;
        case 5: v ^= (uint64_t)pos2[4] << 32;
        case 4: v ^= (uint64_t)pos2[3] << 24;
        case 3: v ^= (uint64_t)pos2[2] << 16;
        case 2: v ^= (uint64_t)pos2[1] << 8;
        case 1: v ^= (uint64_t)pos2[0];
                h ^= mix(v);
                h *= m;
    }

    return mix(h);
}

I tested this solution on a batch of about 200 boards and saw no collisions. Many people run into this problem when they need a short device ID derived from a longer unique serial number.
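
For a rough sense of why no collisions showed up, the birthday approximation can be sketched as below. It assumes the hash spreads inputs uniformly over the output space, which has not been verified for fastHash64 here, and the function name collision_estimate is made up for this illustration.

```c
#include <stdint.h>

/* Birthday-bound approximation: for n items hashed uniformly into 2^bits
 * values, P(at least one collision) ~= n * (n - 1) / 2^(bits + 1).
 * Only meaningful while the result is much smaller than 1, and only for
 * bits < 64 because of the shift below. */
double collision_estimate(uint64_t n, unsigned bits)
{
    double pairs = (double)n * (double)(n - 1) / 2.0;  /* number of item pairs */
    double space = (double)(1ULL << bits);             /* number of hash values */
    return pairs / space;
}
```

Under that uniformity assumption, collision_estimate(200, 48) is roughly 7e-11, while collision_estimate(200, 32) is roughly 4.6e-6, so a clean batch of 200 boards is the expected outcome rather than strong evidence about the hash quality.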

Alternatively, you may search for an implementation of the Bobcat 48-bit hash.
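
To tie this back to the MAC address in the question, here is a minimal sketch of turning a string into a locally administered unicast address. It uses 64-bit FNV-1a purely as a simple stand-in hash (it is not the fastHash64 code above and not Bobcat), keeps the top 48 bits, sets the locally administered bit, and clears the multicast bit, which is the least significant bit of the first octet. The name mac_from_string is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 64-bit FNV-1a, used here only as a simple stand-in hash. */
uint64_t fnv1a64(const void *buf, size_t len)
{
    const unsigned char *p = (const unsigned char *)buf;
    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;            /* FNV prime */
    }
    return h;
}

/* Hypothetical helper: fill mac[0..5] from the top 48 bits of the hash. */
void mac_from_string(const char *s, uint8_t mac[6])
{
    uint64_t h = fnv1a64(s, strlen(s));
    for (int i = 0; i < 6; i++)
        mac[i] = (uint8_t)(h >> (56 - 8 * i));  /* most significant byte first */
    mac[0] |= 0x02;   /* set the locally administered bit */
    mac[0] &= 0xFE;   /* clear the multicast bit so the address is unicast */
}
```

Forcing two bits leaves 46 bits of hash output in the address. The six bytes can then be printed with a format such as "%02x:%02x:%02x:%02x:%02x:%02x".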
