Suppose I have a function in a single threaded program that looks like this
void f(some arguments){
char buffer[32];
some operations on buffer;
}
and f appears inside some loop that gets called often, so I'd like to make it as fast as possible. It looks to me like the buffer needs to get allocated every time f is called, but if I declare it to be static, this wouldn't happen. Is that correct reasoning? Is that a free speed up? And just because of that fact (that it's an easy speed up), does an optimizing compiler already do something like this for me?
If there are any local automatic variables in the function at all, the stack pointer needs to be adjusted. The time taken for the adjustment is constant, and will not vary based on the number of variables declared. You might save some time if your function is left with no local automatic variables whatsoever.
If a static variable is initialized, there will be a flag somewhere to determine if the variable has already been initialized. Checking the flag will take some time. In your example the variable is not initialized, so this part can be ignored.
Static variables should be avoided if your function has any chance of being called recursively or from two different threads.
Note that block-level
static
variables in C++ (as opposed to C) are initialized on first use. This implies that you'll be introducing the cost of an extra runtime check. The branch potentially could end up making performance worse, not better. (But really, you should profile, as others have mentioned.)Regardless, I don't think it's worth it, especially since you'd be intentionally sacrificing re-entrancy.
Depending on what exactly the variable is doing and how its used, the speed up is almost nothing to nothing. Because (on x86 systems) stack memory is allocated for all local vars at the same time with a simple single func(sub esp,amount), thus having just one other stack var eliminates any gain. the only exception to this is really huge buffers in which case a compiler might stick in _chkstk to alloc memory(but if your buffer is that big you should re-evaluate your code). The compiler cannot turn stack memory into static memory via optimization, as it cannot assume that the function is going to be used in a single threaded enviroment, plus it would mess with object constructors & destructors etc
With gcc, I do see some speedup:
And the time:
changing buffer to static:
No, it's not a free speedup.
First, the allocation is almost free to begin with (since it consists merely of adding 32 to the stack pointer), and secondly, there are at least two reasons why a static variable might be slower
So it's not a free speedup. But it is possible that it is faster in your case (although I doubt it). So try it out, benchmark it, and see what works best in your particular scenario.
It will make the function substantially slower on most real cases. This is because the static data segment is not near the stack and you will lose cache coherency, so you will get a cache miss when you try to access it. However when you allocate a regular char[32] on the stack, it is right next to all your other needed data and costs very little to access. The initialization costs of a stack-based array of char are meaningless.
This is ignoring that statics have many other problems.
You really need to actually profile your code and see where the slowdowns are, because no profiler will tell you that allocating a statically-sized buffer of characters is a performance problem.