Compile-time (preprocessor) hashing of string

2019-02-06 04:34发布

问题:

Is there any way to create a hash of string at compile time using the C/C++ preprocessor (or even template-metaprogramming)?

e.g. UNIQUE_SALT("HelloWord", 3DES);

The idea is that HelloWorld will not be present in the compiled binary, just a hash.

Edit: There are many of these declarations spread over a large codebase.

回答1:

With C++0x, this is possible as covered by answers in #1 and #2.

In C++03 there was no compile time string processing. With the preprocessor you can't seperate the string into tokens, with templates you can't access single characters. There was however a discussion on the speculated approach using C++0x.

What you could do for C++03 is to pass the string character-wise (possible using multi-character literals):

foo = hash<3DES, str<'a','b','c'> >::result;
// or:
foo = hash<3DES, str<'abc','def'> >::result;

... or simply do it as a pre-build step.



回答2:

Why not make generating the hash part of your build process? You can write a simple batch file to generate the hash (assuming you have a program to do such a thing - if not, write one) and have it output a preprocessor directive something like:

#define MY_HASH 123456789 

to a .h file that then gets #included in your application.



回答3:

While this is not a proper answer to the question, see this blog entry for an example of a hash function for strings of up to 256 characters implemented purely as a C macro:

http://lolengine.net/blog/2011/12/20/cpp-constant-string-hash

Here is the actual code from the blog:

#include <string.h>
#include <stdint.h>
#include <stdio.h>

#define H1(s,i,x)   (x*65599u+(uint8_t)s[(i)<strlen(s)?strlen(s)-1-(i):strlen(s)])
#define H4(s,i,x)   H1(s,i,H1(s,i+1,H1(s,i+2,H1(s,i+3,x))))
#define H16(s,i,x)  H4(s,i,H4(s,i+4,H4(s,i+8,H4(s,i+12,x))))
#define H64(s,i,x)  H16(s,i,H16(s,i+16,H16(s,i+32,H16(s,i+48,x))))
#define H256(s,i,x) H64(s,i,H64(s,i+64,H64(s,i+128,H64(s,i+192,x))))

#define HASH(s)    ((uint32_t)(H256(s,0,0)^(H256(s,0,0)>>16)))

If you know ahead of time that you will only use it for static strings you could replace strlen() with sizeof().



回答4:

This can be done with Boost.MPL but it might not be the type of hash you are after.

http://arcticinteractive.com/2009/04/18/compile-time-string-hashing-boost-mpl/



回答5:

Even if this can't be (reasonably) done with the preprocessor, if you used a string literal or declared it as static const and did not create any lasting references to it the compiler will likely go ahead and do all of the math to generate the result and omit the string in the object file if you compile with optimizations. The hardest part of this is that you can't make the code to initialize a global or static variable too complicated or the compiler will say "hey, you! Don't you know you can't use a for loop outside of a function?".



回答6:

I stumbled across a solution using the good 'ol C++ standard (I'm not sure what version it's considered, but let's just say this solution works in Visual Studio). Here's the link: link.

Also, here's a short version of a JSHash function using the above technique. The shown one here supports up to 4 characters, though you can add as many as you want.

template<const char A = 0, const char B = 0, const char C = 0, const char D = 0>
struct cHash
{
    template<const char C, size_t hash = 1315423911>
    struct HashCalc
    {
        enum { value = (C == 0) ? hash : hash ^ ((hash << 5) + C + (hash >> 2)) };
    };

    enum { value = HashCalc<D,HashCalc<C,HashCalc<B,HashCalc<A>::value>::value>::value>::value };
};

As noted, because this is a compile time hash, you can do something like this:

namespace Section
{
    enum Enum
    {
        Player = cHash<'p','l','a','y'>::value
    };
}

It's not the most elegant solution, so I plan on doing more research in this area, however as this is the only thing I've gotten to work in VisualStudio2010 I'm a little limited as far as my current project is concerned.



回答7:

Answers claiming that strings cannot be parsed at compile time are wrong. Character pointers can't be parsed at compile time, but string literals are not character pointers; they are arrays of characters whose length is part of the type. It's easy to forget that, because in most cases it's far more useful to let them decay into a char*. But they don't start out that way.

Ah, but how to actually define a function that takes a fixed-length character array, especially if we'd really rather use it on strings of arbitrary length? That's where template argument deduction comes in super handy:

template<size_t L>
constexpr int hash(const char (&str)[L], int n = L - 1) {
    // hash goes here. You can define recursively as a
    // function of str[n] and hash(str, n-1). Remember to
    // treat 0 as a special case.
}

That should get you started. Obviously the hash itself needs to be simple enough for compile-time computation, but that's probably ok.



回答8:

This is how I do this compile-time string hash with C++0x:

class StringHash
{
public:
    template <unsigned N, unsigned I>
    struct HashHelper
    {
        constexpr static unsigned Calculate(const char (&str)[N])
        {
            return (HashHelper<N, I - 1>::Calculate(str) ^ (str[I - 1] & 0xFF)) * StringHash::PRIME;
        }
    };

    template <unsigned N>
    struct HashHelper<N, 1>
    {
        constexpr static unsigned Calculate(const char (&str)[N])
        {
            return (StringHash::OFFSET ^ (str[0] & 0xFF)) * StringHash::PRIME;
        }
    };

    template<unsigned N>
    constexpr static unsigned StaticHash(const char (&str)[N])
    {
        return HashHelper<N, N>::Calculate(str);
    }

    static const unsigned OFFSET = 0x01234567;
    static const unsigned PRIME = 0x89ABCDEF;
}

Usage:

static hash = StringHash::StaticHash("Hello"); // You may even use this expression in `switch case`