I'm using a std::map
(VC++ implementation) and it's a little slow for lookups via the map's find method.
The key type is std::string
.
Can I increase the performance of this std::map
lookup via a custom key compare override for the map? For example, maybe std::string
< compare doesn't take into consideration a simple string::size()
compare before comparing its data?
Any other ideas to speed up the compare?
In my situation the map will always contain < 15 elements, but it is being queried non stop and performance is critical. Maybe there is a better data structure that I can use that would be faster?
Update: The map contains file paths.
Update2: The map's elements are changing often.
Where you have long common substrings, a trie might be a better data structure than a map or a hash_map. I said "might", though - a hash_map already only traverses the key once per lookup, so should be fairly fast. I won't discuss it further since others already have.
You could also consider a splay tree if some keys are more frequently looked up than others, but of course this makes the worst-case lookup worse than a balanced tree, and lookups are mutating operations, which may matter to you if you're using e.g. a reader-writer lock.
If you care about the performance of lookups more than modifications, you might do better with an AVL tree than a red-black, which I think is what STL implementations generally use for map. An AVL tree is typically better balanced and so will on average require fewer comparisons per lookup, but the difference is marginal.
Finding an implementation of these that you're happy with might be an issue. A search on the Boost main page suggests they have a splay and AVL tree but not a trie.
You mentioned in a comment that you never have a lookup that fails to find anything. So you could in theory skip the final comparison, which in a tree of 15 < 2^4 elements could give you something like a 20-25% speedup without doing anything else. In fact, maybe more than that, since equal strings are the slowest to compare. Whether it's worth writing your own container just for this optimisation is another question.
You might also consider locality of reference - I don't know whether you could avoid the occasional page miss by allocating the keys and the nodes out of a small heap. If you only need about 15 entries at a time, then assuming a file name limit below 256 bytes you could ensure that everything accessed during a lookup fits into a single 4k page (apart from the key being looked up, of course). It may be that comparing the strings is insignificant compared with a couple of page loads. However, if this is your bottleneck there must be an enormous number of lookups going on, so I'd guess that everything is reasonably close to the CPU. Worth checking, maybe.
Another thought: if you are using pessimistic locking on a structure where there's a lot of contention (you said in a comment the program is massively multi-threaded) then regardless of what the profiler tells you (what code the CPU cycles are spent in), it might be costing you more than you think by effectively limiting you to 1 core. Try a reader-writer lock?
Try std::tr1::unordered_map (found in the header <tr1/unordered_map>). This is a hash map, and, while it doesn't maintain a sorted order of elements, will likely be far faster than a regular map.
If your compiler doesn't support TR1, get a newer version. MSVC and gcc both support TR1, and I believe the newest versions of most other compilers also have support. Unfortunately, a lot of the library reference sites haven't been updated, so TR1 remains a largely-unknown piece of technology.
I hope C++0x isn't the same way.
EDIT: Note that the default hashing method for tr1::unordered_map is tr1::hash, which needs to be specialized to work on a UDT, probably.
std::map's comparator isn't std::equal_to it's std::less, I'm not sure what the best way to short circuit a < compare so that it would be faster than the built in one.
If there are always < 15 elems, perhaps you could use a key besides std::string?
The first thing is to try using a hash_map if that's possible - you are right that the standard string compare doesn't first check for size (since it compares lexicographically), but writing your own map code is something you'd be better off avoiding. From your question it sounds like you do not need to iterate over ranges; in that case map doesn't have anything hash_map doesn't.
It also depends on what sort of keys you have in your map. Are they typically very long? Also what does "a little slow" mean? If you have not profiled the code it's quite possible that it's a different part taking time.
Update: Hmm, the bottleneck in your program is a map::find, but the map always has less than 15 elements. This makes me suspect that the profile was somehow misleading, because a find on a map this small should not be slow, at all. In fact, a map::find should be so fast, just the overhead of profiling could be more than the find call itself. I have to ask again, are you sure this is really the bottleneck in your program? You say the strings are paths, but you're not doing any sort of OS calls, file system access, disk access in this loop? Any of those should be orders of magnitude slower than a map::find on a small map. Really any way of getting a string should be slower than the map::find.
Maybe you could reverse the strings prior to using them as keys in the map? That could help if the first few letters of each string are identical.
First, turn off all the profiling and DEBUG switches. These can slow down STL immensely.
If that's not it, part of the problem may be that your strings are identical for the first 80-90% of the string. This isn't bad for map, necessarily, but it is for string comparisons. If this is the case, your search can take much longer.
For example, in this code find() will likely result in a couple of string compares, but each will return after comparing the first character until "david", and then the first three characters will be checked. So at most, 5 characters will be checked per call.
On the other hand, in the following code, find() will likely check 135+ characters:
That's because the string comparisons have to search deeper to find a match since the beginning of each string is the same.
Using size() in your comparison for equality won't help you much here since your data set is so small. A std::map is kept sorted so its elements can be searched with a binary search. Each call to find should result in less than 5 string comparisons for a miss, and an average of 2 comparisons for a hit. But it does depend on your data. If most of your path strings are of different lengths, then a size check like Motti describes could help a lot.
Something to consider when thinking of alternative algorithms is how many many "hits" you get. Are most of your find() calls returning end() or a hit? If most of your find()s return end() (misses) then you are searching the entire map every time (2logn string compares).
Hash_map is a good idea; it should cut your search time in about half for hits; more for misses.
A custom algorithm may be called for because of the nature of path strings, especially if your data set has common ancestry like in the above code.
Another thing to consider is how you get your search strings. If you are reusing them, it may help to encode them into something that is easier to compare. If you use them once and discard them, then this encoding step is probably too expensive.
I used something like a Huffman coding tree once (a long time ago) to optimize string searches. A binary string search tree like that may be more efficient in some cases, but its pretty expensive for small sets like yours.
Finally, look into alternative std::map implementations. I've heard bad things about some of VC's stl code performance. The DEBUG library in particular is bad about checking you on every call. StlPort used to be a good alternative, but I haven't tried it in a few years. I've always loved Boost too.