Writing code in C, never formally learned any of it, using GNU's GSL library, quick fundamental question.
Correct me if I'm wrong, but the way I understand it, when I allocate memory to use for my matrices (using the built-in var = gsl_matrix_alloc(x,x)) and store them in a variable, I'm essentially creating a pointer, which is simply some address of memory, like:
0x01234749162
which POINTS to the first pointer/memory location of my GSL matrix. Tracking when to deallocate the memory of the structure associated with a pointer (again, with the built-in gsl_matrix_free(var)) is no problem, and I understand I need to do this before I reassign the pointer of the structure, otherwise I've created a memory leak.
So now to my question, and again, I know this is basic, but hear me out--I couldn't find a particularly straight answer on stackoverflow, mostly because a lot of the answers involve C++ rather than C--how do I free the pointer to the structure itself?
Everyone says "oh, just set it to NULL". Why would that work? That's just changing the memory address that POINTS to the deallocated structure. Does that tell the MMU that that memory location is okay to use now? When I debug my program in XCode, for example, all of the properties of the gsl_matrix structure are successfully deallocated; everything just becomes this garbage string of random hex characters, which is what freeing memory is supposed to do. But, I can still see the variable name (pointer) while stepping through the debugger... even if I set the variable to NULL. I would interpret that as meaning I did not free the pointer, I just freed the structure and set the pointer to 0x0000000 (NULL).
Am I doing everything correctly, and this is just a feature of XCode, or am I missing something basic?
And I realize that a single pointer to a structure, if the structure is deallocated, could be argued to not be a big deal, but it matters.
Here's some code to try to illustrate my thoughts.
gsl_matrix* my_matrix;
// create single memory address in memory, not pointing to anything yet
my_matrix = gsl_matrix_alloc(5, 5);
// allocates 25 memory spaces for the values that the pointer held by my_matrix
// points to
// Note: so, now there's 26 memory spots allocated to the matrix, excluding other
// properties created along with the my-matrix structure, right?
gsl_matrix_free(my_matrix); // deallocates those 25 spaces the structure had,
// along with other properties that may have been automatically created
free(my_matrix); // SIGABRT error. Is the pointer to the deallocated structure
// still using that one memory address?
my_matrix = NULL; // this doesn't make sense to me. I get that any future referral
// to the my_matrix pointer will just return garbage, and so setting a pointer to
// that can help in debugging, but can the pointer--that is just one memory
// address--be completely deallocated such that in the debugger the variable name
// disappears?
Everyone says "oh, just set it to NULL". Why would that work?
They probably mean that this would fix the problem where you are calling free on a pointer to some data that has already been de-allocated, which is what you are doing here:
gsl_matrix_free(my_matrix); // deallocate
free(my_matrix); // Mistake, BIG PROBLEM: my_matrix points to de-allocated data
It fixes the problem because calling free on a null pointer is a no-op:
gsl_matrix_free(my_matrix); // deallocate
my_matrix = NULL;
free(my_matrix); // Mistake, but no problem
Note: my_matrix itself has automatic storage, so there is no need to de-allocate it manually. Its memory will be reclaimed when it goes out of scope. The only thing that needs to be de-allocated is the memory that was dynamically allocated (and to which my_matrix points).
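As a rough illustration of the note above, here is a minimal sketch that uses plain malloc and free rather than GSL, so it does not depend on the library being installed. The function name free_then_null_demo is made up for this example. It shows that the pointer variable itself needs no cleanup, and that free on a null pointer does nothing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical demo function (not part of GSL or of the question's code).
   Returns 1 when the sequence completes, 0 if malloc failed. */
int free_then_null_demo(void) {
    /* p is an ordinary local variable: its own storage is automatic. */
    double *p = malloc(25 * sizeof *p);  /* the 25 doubles live on the heap */
    if (p == NULL) {
        return 0;
    }
    p[0] = 1.0;

    free(p);   /* heap block released; p still holds the old, now invalid address */
    p = NULL;  /* overwrite the stale address so it cannot be used by accident */
    free(p);   /* free(NULL) is defined by the C standard to do nothing */

    assert(p == NULL);
    return 1;
}   /* p's own storage is reclaimed here, without any call to free */
```

Calling free_then_null_demo() from main should return 1. Setting the pointer to NULL does not release anything; it only makes a later accidental free or dereference easier to spot.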
What you miss here is the knowledge of how "local variables" work at the machine level and the concept of "the stack".
The stack is a block of free memory that is allocated for your program when it starts. Suppose, for the sake of a simple example, that your program is allocated a stack of size 1MB. The stack is accompanied by a special register, called the "stack pointer", which initially points to the end of the stack (don't ask why not the beginning, historical reasons). Here's how it looks:
[---------- stack memory, all yours for taking ------------]
                                                           ^
                                                           |
                                               Stack pointer
Now suppose your program defines a bunch of variables in the main function, i.e. something like
int main() {
int x;
What this means is that for the moment when the main function is invoked at the start of your program, the compiler will have generated instructions equivalent to the following:
sp = sp - 4; // Decrement stack pointer
x_address = sp;
and remember (for purposes of further compilation) that x is now a 4-byte integer located at memory position x_address. Your stack now looks as follows:
[---------- stack memory, all yours for taking --------[-x--]
                                                       ^
                                                       |
                                           Stack pointer
Next, suppose you invoke some function f from within main. Suppose f defines inside it another variable,
int f() {
char z[8];
Guess what happens now? On entering f the compiled code will perform:
sp = sp - 8;
z_address = sp;
I.e. you'll get:
[---------- stack memory, all yours for taking -[--z----][-x--]
                                                ^
                                                |
                                    Stack pointer
If you now invoke another function, the stack pointer will move deeper into the stack, "creating" more space for the local variables. Each time you exit a function, though, the stack pointer is restored to where it was before the function was invoked. E.g. after you exit f, your stack will look as follows:
[---------- stack memory, all yours for taking -[--z----][-x--]
                                                         ^
                                                         |
                                             Stack pointer
Note that the z array was not actually erased: its bytes are still sitting there on the stack, but you do not care. Why don't you care? Because everything past the stack pointer (to its left in the picture) counts as free space and is simply overwritten by the next function call, and the whole stack is automatically deallocated when your application terminates. This is the reason why you do not need to manually deallocate "variables on the stack", i.e. those which are defined as local to your functions. In particular, your my_matrix pointer is just another variable like that.
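To see the "each call gets its own locals" behaviour for yourself, here is a small sketch. The function nested_locals and its parameters are made up for this illustration. It prints the address of a local variable at each level of nesting and counts how many calls saw an address different from the caller's local.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper for illustration only.
   Recurses `depth` more times and returns the number of calls whose local
   variable sits at a different address than the caller's local. */
int nested_locals(int depth, const int *caller_local) {
    assert(depth >= 0);
    int local = depth;  /* automatic storage: created on entry, gone on return */
    printf("depth %d: local lives at %p\n", depth, (void *)&local);

    /* The caller's local is still alive while we run, so the two objects
       must occupy different memory. */
    int distinct = (&local != caller_local) ? 1 : 0;
    if (depth > 0) {
        distinct += nested_locals(depth - 1, &local);
    }
    return distinct;  /* nothing to free: the frame is dropped on return */
}
```

Calling nested_locals(3, NULL) makes four nested calls and returns 4. On most common platforms the printed addresses get smaller as the depth grows, matching the pictures above, but the C standard does not promise any particular direction or spacing.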
PS: there is a bit more happening on the stack than I described. In particular, the stack pointer value is stored on the stack before decrementing it, so that it can be restored after exiting the function. In addition, function arguments are often passed by putting them onto the stack. In this sense they look like local variables for purposes of memory management and you don't need to free them.
PPS: In principle, the compiler is free to optimize your code (especially if you compile with the -O flag) and rather than allocate your local variables on the stack it may:
- Decide to avoid allocating them at all (e.g. if they turn out to be useless)
- Decide to allocate them temporarily in the registers (which are fixed memory slots in the processor that do not need to be freed). This is often done for loop variables (the ones within for (int i = ...)).
- ... and, well, do whatever else comes to its twisted mind as long as the result does not contradict the semantics.
PPPS: Now you are prepared to learn how buffer overflow works. Go read about it, really, it is an amazing trick. Oh, and, once you're at it, check out the meaning of stack overflow. ;)
Why would 26 memory spots be allocated for a 5x5 matrix? I'd say trust the library-provided gsl_matrix_free function to do the right thing and deallocate the whole structure.
In general, you only need to call free if you called malloc, calloc or realloc yourself. Library functions that provide an allocator usually provide a matching deallocator so that you don't have to keep track of the internals.
If the 26th spot you're worried about is the pointer itself (in other words, the memory needed to store the address of the matrix), that space is part of the stack frame for your function, and it is automatically popped when the function returns.
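To make the allocator/deallocator pairing concrete, here is a toy sketch of how such a pair might be built. Everything in it (toy_matrix, toy_matrix_alloc, toy_matrix_free, live_allocations) is hypothetical and is not GSL's actual code; the real gsl_matrix has a different layout. The point is only that the deallocator knows about every block the allocator created, so the caller never calls free directly.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a library matrix type; an assumption for illustration. */
typedef struct {
    size_t rows, cols;
    double *data;
} toy_matrix;

static int live_allocations = 0;  /* heap blocks handed out and not yet freed */

/* Plays the role of the library allocator: two heap blocks per matrix. */
toy_matrix *toy_matrix_alloc(size_t rows, size_t cols) {
    assert(rows > 0 && cols > 0);
    toy_matrix *m = malloc(sizeof *m);               /* block 1: the struct */
    if (m == NULL) return NULL;
    m->data = calloc(rows * cols, sizeof *m->data);  /* block 2: the values */
    if (m->data == NULL) { free(m); return NULL; }
    m->rows = rows;
    m->cols = cols;
    live_allocations += 2;
    return m;
}

/* Plays the role of the library deallocator: releases both blocks. */
void toy_matrix_free(toy_matrix *m) {
    if (m == NULL) return;
    free(m->data);
    free(m);
    live_allocations -= 2;
}

/* Returns the number of blocks still allocated after one alloc/free cycle,
   or -1 if the allocation failed. */
int live_after_round_trip(void) {
    toy_matrix *m = toy_matrix_alloc(5, 5);  /* m itself lives in this stack frame */
    if (m == NULL) return -1;
    toy_matrix_free(m);                      /* one call releases everything */
    m = NULL;                                /* optional: drop the stale address */
    return live_allocations;
}
```

After the round trip the counter is back to 0. Calling free(m) on top of toy_matrix_free(m) would release the struct a second time, which is the same kind of double free that triggers the abort in the question.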