I'm looking for a way to wrap stack allocations in abstract data types. For example, I'd like to have a vector which can work strictly via allocations on the stack. My biggest hurdle of course is that alloca
works only within the current stack frame -- thus I don't see an easy way to wrap this into a function.
So far the only way I see to do this is by using macro-like functions which are guaranteed to be compiled into a given stack frame. I don't like this approach since it isn't as type friendly as one would hope, and requires more verbose naming than desired.
Is there anyway I can get a function to allocate on its caller stack? I understand this would normally destroy the immediately calling stack, thus likely the function would also have to be forced inline somehow. I'm not clear on what options I have, so I'm looking for some ideas, or pointers towards possible options.
Notes:
The ultimate goal is something like a std::vector
which works strictly on the immediate functions stack. Obviously it would only be passed as a const
object to callees, and its life ends with the function.
C approach is fine so long as it is better than my macro based approach. Though some support macros are also acceptable.
I understand this is a fairly specific optimization, and optimally I'd like to be able to (with a flag) turn it on/off (using just an normal std::vector for debugging). It would give a minor speed boost to significant parts of our code, but probably not enough to justify making it unreadable via too many odd constructs.
Answer: Is most likely that it isn't possible and that only the macro approach would work.
You can't.
When a function returns, its stack is unwound, and the stack pointer goes back where it was before. It has to, if you don't want a real mess. All alloca does is move the stack pointer, so a function return undoes this allocation.
Macros would work, because they just add code to the same function. But it would be ugly, with no real hope of improvement.
The main benefit of using stack allocation is probably the by-pass of the standard library malloc/new allocator. In this light, using the stack is not the only option.
One alternative to stack allocation is using a custom memory allocator based on mmap()
system call. The memory allocated by mmap()
can be used as storage for the custom allocator instead of the stack. To avoid calling mmap()
often the memory region allocated by mmap()
should be cached, in a global thread-specific variable, for example.
My biggest hurdle of course is that alloca works only within the current stack frame -- thus I don't see an easy way to wrap this into a function.
Well, if you only need to allocate once (that is, if you have a maximum capacity that will always be sufficient), you can call alloca
inside a default argument:
template<typename T, size_t Capacity>
class stack_vector
{
T* start_;
size_t size_;
public:
explicit stack_vector(void* memory = alloca(sizeof(T) * Capacity))
{
start_ = static_cast<T*>(memory);
size_ = 0;
}
};
You can always implement your own custom allocator that will be as efficient as thread stack. From my experience alloca can be very dangerous, and if hidden in some C++ class hierarchies (ie. in for loop) can easily blow your stack.
Here's an example macro for allocating an on-stack array, which takes advantage of C++ type-safety and compile-time checking as much as possible via a helper inline function:
#include <type_traits>
#include <alloca.h>
namespace Utils {
// A wrapper for alloca which allocates an array of default-constructible, trivially-destructible objects on the stack
#define ALLOC_ON_STACK_ARRAY(T, nMembers) \
::Utils::InitArrayOfTriviallyDestructibles<T>(alloca(sizeof(T) * nMembers), size_t(nMembers))
// Helper function for ALLOC_ON_STACK_ARRAY() defined above. Initialize a block of memory as an array.
template <typename T>
inline T* InitArrayOfTriviallyDestructibles(void* p, size_t nMembers)
{
static_assert(std::is_trivially_destructible<T>::value, "The type is not trivially destructible");
return new (p) T[nMembers] {};
}
} // namespace Utils
The stack is really not a good fit for the kind of allocations that a container class uses. For example, when a vector needs to expand its capacity
, it most likely allocates a new region of storage, copies the existing items over (hence the C++ requirements on copy and default constructors for objects used in containers), and releases the original storage. Needless to say, this plays havoc with stack-based storage, which cannot be released until the function exits. (The only way vector
could expand capacity
in-place without copying is using the realloc
function, which has no C++ equivalent, and most importantly to you, no alloca
equivalent.)
Further, alloca
only really works with POD-types in C++, which containers most definitely are not.
EDIT: The answer to this question partly solves the problem: It allocates the initial storage for the vector
from the stack, but if further capacity is needed, then it allocates from the heap.