I know this sounds like a general question and I've seen many similar questions (both here and on the web) but none of them are really like my dilemma.
Say I have this code:
void GetSomeData(char* buffer)
{
// put some data in buffer
}
int main()
{
char buffer[1024];
while(1)
{
GetSomeData(buffer);
// do something with the data
}
return 0;
}
Would I gain any performance if I declared buffer[1024] globally?
I ran some tests on unix via the time command and there are virtually no differences between the execution times.
But I'm not really convinced...
In theory should this change make a difference?
Your question doesn't really have an answer; it depends on what else you are doing. Generally speaking, most machines use the same "memory" structure over the entire process, so regardless of where (heap, stack or global memory) the variable resides, access time will be identical. On the other hand, most modern machines have a hierarchial memory structure, with a memory pipeline, several levels of cache, main memory, and virtual memory. Depending on what has gone on previously on the processor, the actual access may be to any one of these (regardless of whether it is heap, stack or global), and the access times here vary enormously, from a single clock if the memory is in the right place in the pipeline, to something around 10 milliseconds if the system has to go to virtual memory on disk.
In all cases, the key is locality. If an access is "near" a previous access, you greatly improve the chance of finding it in one of the faster locations: cache, for example. In this regard, putting smaller objects on the stack may be faster, because when you access the arguments of a function, you're access on stack memory (with an Intel 32-bit processor, at least---with better designed processors, arguments are more likely to be in registers). But this will probably not be an issue when an array is involved.
when allocating buffers on stack the optimization scope is not the cost of accessing the memory but rather the elimination of often very expensive dynamic memory allocation on the heap (stack buffer allocation can be considered instantaneous as the stack as a whole is allocated at thread startup).
Quoting from Jeff Hill's answer:
Giving that variables and variable arrays that are declared on the heap is slower is just a fact. Think about it this way;
Globally created variables are allocated once and deallocated once the program is closing. For a heap object your variable has to be allocated on the spot each time the function is ran, and deallocated in the end of the function..
Ever tried allocating an object pointer within a function? Well better free / delete it before the function exits, orelse you will have yourself a memoryleak giving that you are not doing this in a class object where it is free'd/deleted inside the deconstructor.
When it comes to accessing of an array they all work the same, a memory block is first allocated by sizeof(DataType) * elements. Later can be accessed by ->
Not inherently... on every architecture I've ever worked on, all the process "memory" can be expected to operate at the same set of speeds, based on which level of CPU cache / RAM / swap file is holding the current data, and any hardware-level synchronisation delays that operations on that memory may trigger to make it visible to other processes, incorporate other processes'/CPU (core)'s changes etc..
The OS (which is responsible for page faulting / swapping), and the hardware (CPU) trapping on accesses to swapped-out or not-yet-accessed pages, would not even be tracking which pages are "stack" vs "heap"... a memory page is a memory page. That said, the virtual address of global data may be able to be calculated and hardcoded at compile time, the addresses of stack-based data are typically stack-pointer relative, while memory on the heap must almost always be accessed using pointers, which might be slightly slower on some systems - it depends on the CPU addressing modes and cycles, but it's almost always insignificant - not even worth a look or second thought unless you're writing something where millionths of a second are enormously important.
Anyway, in your example you're contrasting a global variable with a function-local (stack/automatic) variable... there's no heap involved. Heap memory comes from
new
ormalloc
/realloc
. For heap memory, the performance issue worth noting is that the application itself is keeping track of how much memory is in use at which addresses - the records of all that take some time to update as pointers to memory are handed out bynew
/malloc
/realloc
, and some more time to update as the pointers aredelete
d orfree
d.For global variables, the allocation of memory may effectively be done at compile time, while for stack based variables there's normally a stack pointer that's incremented by the compile-time-calculated sum of the sizes of local variables (and some housekeeping data) each time a function is called. So, when
main()
is called there may be some time to modify the stack pointer, but it's probably just being modified by a different amount rather than not modified if there's nobuffer
and modified if there is, so there's no difference in runtime performance at all.