I need to compare a string to multiple other constant strings in C. I am curious which is faster: hashing the string and comparing that hash against the hashes of the constant strings, or simply comparing the strings as strings. Thank you in advance.
Thank you for the answers. I am going to be doing many comparisons. Can anyone suggest a good, fast, low-resource algorithm to use? The only hash I know of is MD5, and I have a feeling that is overkill.
I also want to add that the strings are 20 or 30 characters long at most, with most being around 7.
It depends. What's the hashing algorithm? How long are the strings? What's the platform?
Also note that a matching hash doesn't guarantee matching strings.
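It's difficult to get ahead: string hashing functions are O(n) in the length of the string, and string comparison is O(n) as well, with a smaller constant factor (a comparison can also stop at the first mismatching character). You would only be ahead if you can store the hash values you compute and use them repeatedly, for both the input string and the constant strings.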
Simple sample C hash functions are here.
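As one illustration of a simple, cheap string hash (not necessarily one of the functions linked above), here is a minimal sketch using 32-bit FNV-1a. The table contents and the names `keywords`, `keywords_init`, and `keyword_find` are made up for the example. The hashes of the constant strings are computed once and stored, and a hash match is confirmed with `strcmp()` because equal hashes do not guarantee equal strings.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit FNV-1a: one XOR and one multiply per input byte. */
uint32_t fnv1a_32(const char *s)
{
    uint32_t h = 2166136261u;        /* FNV offset basis */
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;              /* FNV prime */
    }
    return h;
}

/* Hypothetical table of constant strings with cached hashes. */
struct keyword {
    const char *text;
    uint32_t hash;
};

static struct keyword keywords[] = {
    { "start", 0 }, { "stop", 0 }, { "pause", 0 }, { "resume", 0 },
};
#define KEYWORD_COUNT (sizeof keywords / sizeof keywords[0])

/* Compute the hash of every constant string once, at startup. */
void keywords_init(void)
{
    for (size_t i = 0; i < KEYWORD_COUNT; i++)
        keywords[i].hash = fnv1a_32(keywords[i].text);
}

/* Return the index of s in the table, or -1 if it is not present.
   The input is hashed once; strcmp() runs only when the hashes match. */
int keyword_find(const char *s)
{
    uint32_t h = fnv1a_32(s);
    for (size_t i = 0; i < KEYWORD_COUNT; i++)
        if (keywords[i].hash == h && strcmp(keywords[i].text, s) == 0)
            return (int)i;
    return -1;
}
```

Call `keywords_init()` once before the first lookup; after that, `keyword_find("stop")` returns the table index and an unknown string returns -1. Whether this beats plain `strcmp()` calls for strings of around 7 characters is something only a measurement on your own data can tell.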
Another approach that could work is to keep your constant strings sorted and do a binary (dichotomic) search for your string. This way you need at most about log2(n) comparisons (for example, only about 10 comparisons for 1024 strings, or about 20 for 1,000,000 strings). I don't know if it is applicable to your problem, but I have had really good results with that approach. Hashing is really difficult to get right: the corner cases can get really nasty, and computing the key can often be quite costly.
Is the comparison going to be done once or many times? If the comparison is going to be done only once, then you are likely better off doing a straight comparison. If you are going to need to compare very many strings to this set of constant strings, then you can probably save time in the long run by doing it with hashes.
This is a simple enough problem that you can easily write it both ways and see which works better for a representative set of input.
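For trying the sorted-array approach mentioned earlier in the thread, a minimal sketch using the standard library's `bsearch()` could look like the following. The strings and the names `sorted_keywords`, `compare_key`, and `sorted_find` are hypothetical; the only requirement is that the array is kept in `strcmp()` order.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical constant strings; they must stay sorted in strcmp() order. */
static const char *const sorted_keywords[] = {
    "pause", "resume", "start", "stop",
};
#define SORTED_COUNT (sizeof sorted_keywords / sizeof sorted_keywords[0])

/* bsearch() passes the key first and a pointer to an array element second. */
static int compare_key(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Return the index of s in the sorted array, or -1 if it is not present. */
int sorted_find(const char *s)
{
    const char *const *hit = bsearch(s, sorted_keywords, SORTED_COUNT,
                                     sizeof sorted_keywords[0], compare_key);
    return hit ? (int)(hit - sorted_keywords) : -1;
}
```

With this table, `sorted_find("start")` returns 2 and an unknown string returns -1. Timing this against a hash-based lookup and a plain loop of `strcmp()` calls on representative input is the straightforward way to see which one wins for your data.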