C string compare vs hash compare

Posted 2019-04-11 05:24

I need to compare a string against several constant strings in C. Which is faster: hashing the input string and comparing that hash against the hashes of the constant strings, or simply comparing the strings directly? Thank you in advance.

Thank you for the answers. I am going to be doing many comparisons. Can anyone suggest a good, fast algorithm with low resource use? The only hash I know of is MD5, and I have a feeling that is overkill.

I also want to add that the strings are at most 20 or 30 characters long, with most being around 7.

10 answers
【Aperson】
#2 · 2019-04-11 06:04

It depends. What's the hashing algorithm? How long are the strings? What's the platform?

Also note that a matching hash doesn't guarantee matching strings.
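To illustrate that last point, here is a minimal sketch of using a hash only as a quick filter and confirming with `strcmp`. The choice of 32-bit FNV-1a and the names `fnv1a` and `same_string` are my own assumptions for the example, not something from the question.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 32-bit FNV-1a, used here only as an example of a cheap
   non-cryptographic hash (an assumption, not a recommendation). */
static uint32_t fnv1a(const char *s)
{
    assert(s != NULL);
    uint32_t h = 2166136261u;        /* FNV offset basis */
    for (; *s != '\0'; ++s) {
        h ^= (unsigned char)*s;
        h *= 16777619u;              /* FNV prime */
    }
    return h;
}

/* Returns 1 only if the hashes match AND the bytes match. */
static int same_string(const char *a, uint32_t hash_a,
                       const char *b, uint32_t hash_b)
{
    if (hash_a != hash_b)
        return 0;                    /* different hashes: strings differ */
    return strcmp(a, b) == 0;        /* equal hashes: confirm, collisions happen */
}
```

Different hashes prove the strings differ, so the `strcmp` call only runs when the hashes agree.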

男人必须洒脱
#3 · 2019-04-11 06:05

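It's difficult to come out ahead: hashing a string is O(n), and comparing two strings is O(n) as well, but with a smaller constant factor, and a comparison can stop at the first mismatching character. Hashing only pays off if you can store the hash values you compute and reuse them across many comparisons, for both sides of the comparison.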

Simple sample C hash functions are here.
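As a rough sketch of "store the hash values and reuse them", the constants below are hashed once up front and each input is hashed once per lookup. The djb2 hash, the keyword set, and the names `init_keywords` and `find_keyword` are assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* djb2 string hash; picked only as a well-known simple example. */
static unsigned long djb2(const char *s)
{
    unsigned long h = 5381;
    for (; *s != '\0'; ++s)
        h = h * 33 + (unsigned char)*s;
    return h;
}

/* Hypothetical constant set; hash is filled in once by init_keywords(). */
struct keyword { const char *text; unsigned long hash; };

static struct keyword keywords[] = {
    { "open", 0 }, { "close", 0 }, { "read", 0 }, { "write", 0 },
};
#define KEYWORD_COUNT (sizeof keywords / sizeof keywords[0])

/* Call once at startup so the constants are never re-hashed. */
static void init_keywords(void)
{
    for (size_t i = 0; i < KEYWORD_COUNT; ++i)
        keywords[i].hash = djb2(keywords[i].text);
}

/* Returns the index of the matching keyword, or -1 if there is none. */
static int find_keyword(const char *s)
{
    assert(s != NULL);
    unsigned long h = djb2(s);       /* hash the input once */
    for (size_t i = 0; i < KEYWORD_COUNT; ++i)
        if (keywords[i].hash == h && strcmp(keywords[i].text, s) == 0)
            return (int)i;           /* strcmp guards against collisions */
    return -1;
}
```

After `init_keywords()`, a call such as `find_keyword("read")` does one hash computation, a few integer comparisons, and normally a single `strcmp`.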

可以哭但决不认输i
#4 · 2019-04-11 06:09

Another approach that could work is to keep your constant strings sorted and do a binary (dichotomic) search for your string. That way you need at most about log2(n) string comparisons: roughly 10 for 1024 strings, or roughly 20 for 1,000,000 strings. I don't know whether it applies to your problem, but I have had really good results with that approach. Hashing is difficult to get right: the corner cases can get nasty, and computing the key can often be quite costly.
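A minimal sketch of the sorted-array idea using the standard library's `bsearch`. The word list and the names `sorted_words`, `compare_key_to_element`, and `find_word` are hypothetical; the only real requirement is that the array is sorted in the same order the comparison function uses.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical constant set; it must stay sorted in strcmp() order. */
static const char *const sorted_words[] = {
    "close", "open", "read", "seek", "write",
};
#define WORD_COUNT (sizeof sorted_words / sizeof sorted_words[0])

/* bsearch() passes the key first and a pointer to an array element second. */
static int compare_key_to_element(const void *key, const void *element)
{
    return strcmp((const char *)key, *(const char *const *)element);
}

/* Returns the index of s in sorted_words, or -1 if it is absent. */
static int find_word(const char *s)
{
    assert(s != NULL);
    const char *const *hit = bsearch(s, sorted_words, WORD_COUNT,
                                     sizeof sorted_words[0],
                                     compare_key_to_element);
    return hit ? (int)(hit - sorted_words) : -1;
}
```

With five entries this takes at most three `strcmp` calls per lookup, and there is no hash function to choose or tune.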

ら.Afraid
#5 · 2019-04-11 06:10

Is the comparison going to be done once or many times? If the comparison is going to be done only once then you are likely better off doing a straight comparison. If you are going to need to compare very many strings to this set of constant strings, then you can probably save time in the long run by doing it with hashes.

This is a simple enough problem that you can easily write it both ways and see which works better for a representative set of input.
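For the "write it both ways and measure" part, a small timing harness along these lines might be enough. Everything here is a sketch under my own assumptions: `time_lookup`, the `lookup_fn` type, and the baseline `linear_lookup` are hypothetical names, and `clock()` measures CPU time at fairly coarse resolution, so you would want many rounds and realistic input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Any lookup under test: returns an index, or -1 when nothing matches. */
typedef int (*lookup_fn)(const char *);

/* Baseline candidate: plain strcmp against a hypothetical constant set. */
static int linear_lookup(const char *s)
{
    static const char *const words[] = { "open", "close", "read", "write" };
    for (size_t i = 0; i < sizeof words / sizeof words[0]; ++i)
        if (strcmp(words[i], s) == 0)
            return (int)i;
    return -1;
}

/* Runs lookup over every input, rounds times; returns elapsed CPU seconds.
   Results are summed into *checksum so the calls are not optimized away. */
static double time_lookup(lookup_fn lookup, const char *const *inputs,
                          size_t input_count, unsigned rounds, long *checksum)
{
    assert(lookup != NULL && inputs != NULL && checksum != NULL);
    long sum = 0;
    clock_t start = clock();
    for (unsigned r = 0; r < rounds; ++r)
        for (size_t i = 0; i < input_count; ++i)
            sum += lookup(inputs[i]);
    clock_t stop = clock();
    *checksum = sum;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

You would pass each candidate (the plain `strcmp` version, a hashed version, and so on) to `time_lookup` with the same inputs, check that the checksums agree, and compare the returned times.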
