This question may sound fairly elementary, but this is a debate I had with another developer I work with.
I was taking care to stack allocate things where I could, instead of heap allocating them. He was talking to me and watching over my shoulder and commented that it wasn't necessary because they are the same performance wise.
I was always under the impression that growing the stack was constant time, and heap allocation's performance depended on the current complexity of the heap for both allocation (finding a hole of the proper size) and de-allocating (collapsing holes to reduce fragmentation, as many standard library implementations take time to do this during deletes if I am not mistaken).
This strikes me as something that would probably be very compiler dependent. For this project in particular I am using a Metrowerks compiler for the PPC architecture. Insight on this combination would be most helpful, but in general, for GCC, and MSVC++, what is the case? Is heap allocation not as high performing as stack allocation? Is there no difference? Or are the differences so minute it becomes pointless micro-optimization.
As others have said, stack allocation is generally much faster.
However, if your objects are expensive to copy, allocating on the stack may lead to an huge performance hit later when you use the objects if you aren't careful.
For example, if you allocate something on the stack, and then put it into a container, it would have been better to allocate on the heap and store the pointer in the container (e.g. with a std::shared_ptr<>). The same thing is true if you are passing or returning objects by value, and other similar scenarios.
The point is that although stack allocation is usually better than heap allocation in many cases, sometimes if you go out of your way to stack allocate when it doesn't best fit the model of computation, it can cause more problems than it solves.
Usually stack allocation just consists of subtracting from the stack pointer register. This is tons faster than searching a heap.
Sometimes stack allocation requires adding a page(s) of virtual memory. Adding a new page of zeroed memory doesn't require reading a page from disk, so usually this is still going to be tons faster than searching a heap (especially if part of the heap was paged out too). In a rare situation, and you could construct such an example, enough space just happens to be available in part of the heap which is already in RAM, but allocating a new page for the stack has to wait for some other page to get written out to disk. In that rare situation, the heap is faster.
I'd like to say actually code generate by GCC (I remember VS also) doesn't have overhead to do stack allocation.
Say for following function:
Following is the code generate:
So whatevery how much local variable you have (even inside if or switch), just the 3880 will change to another value. Unless you didn't have local variable, this instruction just need to execute. So allocate local variable doesn't have overhead.
I don't think stack allocation and heap allocation are generally interchangable. I also hope that the performance of both of them is sufficient for general use.
I'd strongly recommend for small items, whichever one is more suitable to the scope of the allocation. For large items, the heap is probably necessary.
On 32-bit operating systems that have multiple threads, stack is often rather limited (albeit typically to at least a few mb), because the address space needs to be carved up and sooner or later one thread stack will run into another. On single threaded systems (Linux glibc single threaded anyway) the limitation is much less because the stack can just grow and grow.
On 64-bit operating systems there is enough address space to make thread stacks quite large.
Stack allocation is a couple instructions whereas the fastest rtos heap allocator known to me (TLSF) uses on average on the order of 150 instructions. Also stack allocations don't require a lock because they use thread local storage which is another huge performance win. So stack allocations can be 2-3 orders of magnitude faster depending on how heavily multithreaded your environment is.
In general heap allocation is your last resort if you care about performance. A viable in-between option can be a fixed pool allocator which is also only a couple instructions and has very little per-allocation overhead so it's great for small fixed size objects. On the downside it only works with fixed size objects, is not inherently thread safe and has block fragmentation problems.
There's a general point to be made about such optimizations.
The optimization you get is proportional to the amount of time the program counter is actually in that code.
If you sample the program counter, you will find out where it spends its time, and that is usually in a tiny part of the code, and often in library routines you have no control over.
Only if you find it spending much time in the heap-allocation of your objects will it be noticeably faster to stack-allocate them.