I was hanging out in my profiler for a while trying to figure out how to speed up a common log parser which was bottlenecked around the date parsing, and I tried various algorithms to speed things up.
The thing I tried that was fastest for me was also by far the most readable, but potentially non-standard C.
This worked quite well in GCC, icc, and my really old and picky SGI compiler. As it's a quite readable optimization, where doesn't it do what I want?
    static int parseMonth(const char *input) {
        int rv=-1;
        int inputInt=0;
        int i=0;
        for(i=0; i<4 && input[i]; i++) {
            inputInt = (inputInt << 8) | input[i];
        }
        switch(inputInt) {
            case 'Jan/': rv=0; break;
            case 'Feb/': rv=1; break;
            case 'Mar/': rv=2; break;
            case 'Apr/': rv=3; break;
            case 'May/': rv=4; break;
            case 'Jun/': rv=5; break;
            case 'Jul/': rv=6; break;
            case 'Aug/': rv=7; break;
            case 'Sep/': rv=8; break;
            case 'Oct/': rv=9; break;
            case 'Nov/': rv=10; break;
            case 'Dec/': rv=11; break;
        }
        return rv;
    }
Comeau compiler
You're just computing a hash of those four characters. Why not predefine some integer constants that compute the hash in the same way and use those? You get the same readability without depending on any implementation-specific idiosyncrasies of the compiler.
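One way this suggestion might look, as a rough sketch rather than a drop-in replacement: the MONTH_KEY macro below is a made-up name, and the sketch assumes ASCII input, 8-bit bytes, and a compiler that provides uint32_t. The macro packs the four characters the same way the loop does, so the case labels no longer rely on how the compiler encodes a multi-character constant.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original post): packs four characters,
   first character in the most significant byte, exactly like the loop in
   parseMonth. The result is an ordinary integer constant expression. */
#define MONTH_KEY(a, b, c, d)                      \
    (((uint32_t)(unsigned char)(a) << 24) |        \
     ((uint32_t)(unsigned char)(b) << 16) |        \
     ((uint32_t)(unsigned char)(c) << 8)  |        \
      (uint32_t)(unsigned char)(d))

static int parseMonth(const char *input) {
    uint32_t key = 0;
    int i;

    /* Same packing loop as the original, but in unsigned arithmetic. */
    for (i = 0; i < 4 && input[i]; i++) {
        key = (key << 8) | (unsigned char)input[i];
    }

    switch (key) {
    case MONTH_KEY('J', 'a', 'n', '/'): return 0;
    case MONTH_KEY('F', 'e', 'b', '/'): return 1;
    case MONTH_KEY('M', 'a', 'r', '/'): return 2;
    case MONTH_KEY('A', 'p', 'r', '/'): return 3;
    case MONTH_KEY('M', 'a', 'y', '/'): return 4;
    case MONTH_KEY('J', 'u', 'n', '/'): return 5;
    case MONTH_KEY('J', 'u', 'l', '/'): return 6;
    case MONTH_KEY('A', 'u', 'g', '/'): return 7;
    case MONTH_KEY('S', 'e', 'p', '/'): return 8;
    case MONTH_KEY('O', 'c', 't', '/'): return 9;
    case MONTH_KEY('N', 'o', 'v', '/'): return 10;
    case MONTH_KEY('D', 'e', 'c', '/'): return 11;
    default: return -1;   /* unknown or too-short input */
    }
}
```

The switch still compiles down to integer comparisons, so the speed should be in the same ballpark as the multi-character version, though that is worth confirming in the profiler.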
There are at least 3 things that keep this program from being portable:
1. Multi-character constants such as 'Jan/' have implementation-defined values, so different compilers may encode them differently.
2. A byte may be wider than 8 bits on some systems, and then so is char, since char is by definition one byte long; your program will not function properly on such systems.
3. On machines where int is only 16 bits (the smallest size allowed for int), including embedded devices and legacy machines, your program will fail as well.
Machine word size issues aside, your compiler may promote input[i] to a negative integer, which will set the upper bits of inputInt in the OR operation, so I suggest being explicit about the signedness of char variables.
But since in the US no one cares about the 8th bit, it is probably a non-issue for you.
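To make the signedness point concrete, here is a small sketch. All three function names are invented for illustration, and it assumes 8-bit bytes (the build stops otherwise). It shows what value reaches the OR when a byte above 0x7F is read through a signed char, and what an explicit cast to unsigned char changes. Using unsigned long for the key also sidesteps the 16-bit int problem, since unsigned long is at least 32 bits wide.

```c
#include <limits.h>

/* Assumption for this sketch: 8-bit bytes. */
#if CHAR_BIT != 8
#error "this sketch assumes 8-bit bytes"
#endif

/* Value that reaches the OR when the byte is read through a signed char:
   integer promotion keeps the sign, so the byte 0xE9 arrives as -23. */
static int asPromotedSigned(signed char c) {
    return c;
}

/* Value that reaches the OR after an explicit cast: always in 0..255. */
static int asPromotedUnsigned(signed char c) {
    return (unsigned char)c;
}

/* One step of the packing loop with the cast applied. The key is an
   unsigned long, so four 8-bit bytes always fit. */
static unsigned long appendByte(unsigned long key, char c) {
    return (key << 8) | (unsigned char)c;
}
```

With the cast in place, a stray high-bit byte in the log can only produce a key that matches none of the months, instead of smearing ones across the upper bits.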
Slightly less readable and not so much validating, but perhaps even faster, no?