I read here that Intel introduced SSE 4.2 instructions
for accelerating string processing.
Quote from the article:
The SSE 4.2 instruction set, first implemented in Intel's Core i7,
provides string and text processing instructions (STTNI) that utilize
SIMD operations for processing character data. Though originally
conceived for accelerating string, text, and XML processing, the
powerful new capabilities of these instructions are useful outside of
these domains, and it is worth revisiting the search and recognition
stages of numerous applications to utilize STTNI to improve
performance.
- Does gcc make use of these instructions if they are available?
- If so, which version?
- If it doesn't, are there any open source libraries
which offer this?
As for libraries, take a look at Agner Fog's asmlib. It is a collection of routines optimized in assembly, including several string functions that use SSE4.2. It also provides functions I find useful for querying the CPU, such as the cache size at each level and which instruction set extensions (e.g. SSE4.2) are supported.
http://www.agner.org/optimize/asmlib.zip
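If you only need the "is SSE4.2 supported?" check and do not want to pull in a library, GCC and Clang have a builtin for it. This is a minimal sketch, not asmlib's API; the function name cpu_has_sse42 is made up for the example, and it assumes an x86 target.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if the CPU we are running on reports
   SSE4.2 support, 0 otherwise. Relies on GCC/Clang builtins (x86 only). */
int cpu_has_sse42(void)
{
    __builtin_cpu_init();                        /* harmless if already initialized */
    return __builtin_cpu_supports("sse4.2") != 0;
}

/* Hypothetical helper: prints which code path a program would pick. */
void report_string_path(void)
{
    printf("using %s string routines\n",
           cpu_has_sse42() ? "SSE4.2" : "generic");
}
```

A check like this is what you would put in front of any hand-written SSE4.2 code path, so the program still runs on older processors.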
To enable SSE4.2 in GCC, compile with -msse4.2. If your processor supports AVX, you can use -mavx instead, which also enables SSE4.2.
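For illustration, here is a minimal sketch of using one of the string instructions (PCMPESTRI) directly through the intrinsics in nmmintrin.h. The helper name find_first_of_sse42 is made up for this example, and it assumes an x86 target with GCC or Clang. The target attribute lets this one function use SSE4.2 even when the file is not built with -msse4.2; with the flag you can drop the attribute.

```c
#include <assert.h>
#include <nmmintrin.h>   /* SSE4.2 intrinsics */
#include <stddef.h>
#include <string.h>

/* Unsigned bytes, "equal any" (match any byte of the set), lowest index first. */
#define FIND_MODE (_SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ANY | _SIDD_LEAST_SIGNIFICANT)

/* Hypothetical helper: index of the first byte in s[0..len) that equals any
   byte of set[0..setlen), or len if there is none. setlen must be 1..16. */
__attribute__((target("sse4.2")))
size_t find_first_of_sse42(const char *s, size_t len,
                           const char *set, size_t setlen)
{
    assert(setlen >= 1 && setlen <= 16);

    char setbuf[16] = {0};
    memcpy(setbuf, set, setlen);
    __m128i vset = _mm_loadu_si128((const __m128i *)setbuf);

    for (size_t i = 0; i < len; i += 16) {
        size_t n = (len - i < 16) ? len - i : 16;

        /* Copy into a padded buffer so we never read past the end of s. */
        char chunk[16] = {0};
        memcpy(chunk, s + i, n);
        __m128i v = _mm_loadu_si128((const __m128i *)chunk);

        /* Returns the index of the first match within the chunk, or 16. */
        int idx = _mm_cmpestri(vset, (int)setlen, v, (int)n, FIND_MODE);
        if (idx < (int)n)
            return i + (size_t)idx;
    }
    return len;
}
```

For example, find_first_of_sse42("hello, world", 12, ",!", 2) returns 5, the index of the comma. The copy into a padded buffer keeps the sketch simple and safe; a tuned version would load full 16-byte blocks straight from the string. Note that calling this on a CPU without SSE4.2 will fault, so a real program should check for support first.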
I'm not sure whether gcc itself uses them, but it shouldn't matter much, since text processing is generally done through glibc. If you use the standard string functions from string.h (cstring will probably behave the same) and have a reasonably recent glibc, you should get the optimized versions automatically.
I searched for this, and it seems glibc 2.15 (possibly older versions as well) already has SSE4.2 optimizations for strcasecmp:
http://upstream.rosalinux.ru/changelogs/glibc/2.15/changelog.html
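In other words, plain code like the following needs no special flags or intrinsics; whether an SSE4.2 variant actually runs depends on your glibc version and CPU. This is only a sketch, and the helper names same_header_name and first_delim are made up for the example.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical helper: case-insensitive comparison of two names.
   The C library decides which strcasecmp implementation to run. */
int same_header_name(const char *a, const char *b)
{
    return strcasecmp(a, b) == 0;
}

/* Hypothetical helper: index of the first ',' or ';' in s,
   or strlen(s) if neither occurs. */
size_t first_delim(const char *s)
{
    return strcspn(s, ",;");
}
```

With this, same_header_name("Content-Type", "content-type") is true and first_delim("a,b") is 1, and the same binary picks up whatever optimized routines the installed C library provides.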