I'm currently working on an embedded project (STM32F103RB, CooCox CoIDE v.1.7.6 with arm-none-eabi-gcc 4.8 2013q4) and I'm trying to understand how malloc() behaves in plain C when the RAM is full.
My STM32 has 20 kB = 0x5000 bytes of RAM, of which 0x200 bytes are used for the stack.
    #include <stdlib.h>
    #include "stm32f10x.h"

    struct list_el {
        char weight[1024];
    };
    typedef struct list_el item;

    int main(void)
    {
        item * curr;
        // allocate until RAM is full
        do {
            curr = (item *)malloc(sizeof(item));
        } while (curr != NULL);
        // I know, free() is missing. Program is supposed to crash
        return 0;
    }
I would expect malloc() to return NULL as soon as the heap is too small for allocating:
0x5000 (RAM) - 0x83C (bss) - 0x200 (stack) = 0x45C4 (heap)
So that should happen when executing malloc() for the 18th time, since one item is 1024 = 0x400 bytes large.
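The arithmetic above can be double-checked with a few lines of C. This is only a sketch: the macro and function names are made up for illustration, the sizes are the ones quoted in the question, and the allocator's own per-chunk overhead is ignored, so the real limit can only be lower.

```c
#include <stdint.h>

/* Sizes quoted in the question; the names are hypothetical. */
#define RAM_SIZE    0x5000u  /* 20 kB of SRAM */
#define BSS_SIZE    0x083Cu
#define STACK_SIZE  0x0200u
#define ITEM_SIZE   0x0400u  /* sizeof(item) */

/* Bytes left for the heap once .bss and the stack are subtracted. */
uint32_t heap_bytes(void)
{
    return RAM_SIZE - BSS_SIZE - STACK_SIZE;
}

/* Upper bound on the number of items that fit, ignoring allocator overhead. */
uint32_t max_items(void)
{
    return heap_bytes() / ITEM_SIZE;
}
```

With these numbers at most 17 items fit, which is why the 18th call is the one expected to fail.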
But instead the uC calls the HardFault_Handler(void) after the 18th time (not even the MemManager_Handler(void)).
Does anybody have advice on how to forecast a malloc() failure, since waiting for a NULL return doesn't seem to work?
Thank you.
With the standard C malloc it is very hard to tell what is going on, and malloc seems buggy from my point of view. So you can manage memory yourself by implementing a custom malloc on top of your RAM address range.
I am not sure whether this will help you, but I have done a custom malloc in a controller-related project of mine, as follows.
These are basically macro definitions for the RAM address. I manually chose a larger number of blocks for the block sizes that need to be allocated frequently. For example, I needed 36-byte blocks most often, so I reserved more of them.
This is the init function for memory initialization:
This one is for allocation:
This one is for free:
After that, you can use the above functions like this:
You can then also watch your used memory as follows:
In general, calculate the memory requirement first and then assign the block counts as I have.
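A minimal sketch of what such a fixed-block allocator could look like is given here. It is not the code from the project mentioned above: it handles a single block size only, and every name in it (`pool_init`, `pool_alloc`, `pool_free`, `pool_blocks_used`, `POOL_BLOCK_SIZE`, `POOL_BLOCK_COUNT`) is an assumption made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical budget: 8 blocks of 36 bytes each. */
#define POOL_BLOCK_SIZE   36u
#define POOL_BLOCK_COUNT  8u

/* A block is user payload while in use, or a free-list link while free. */
typedef union pool_block {
    union pool_block *next;
    uint8_t payload[POOL_BLOCK_SIZE];
} pool_block;

static pool_block pool_storage[POOL_BLOCK_COUNT]; /* static storage, not the heap */
static pool_block *pool_free_list;
static size_t pool_used;

/* Chain every block onto the free list. Call once at startup. */
void pool_init(void)
{
    size_t i;

    pool_free_list = NULL;
    for (i = 0; i < POOL_BLOCK_COUNT; i++) {
        pool_storage[i].next = pool_free_list;
        pool_free_list = &pool_storage[i];
    }
    pool_used = 0;
}

/* Returns one block of POOL_BLOCK_SIZE bytes, or NULL when none is left. */
void *pool_alloc(void)
{
    pool_block *block = pool_free_list;

    if (block == NULL) {
        return NULL;
    }
    pool_free_list = block->next;
    pool_used++;
    return block;
}

/* Gives a block obtained from pool_alloc() back to the pool. NULL is ignored. */
void pool_free(void *ptr)
{
    pool_block *block = ptr;

    if (block == NULL) {
        return;
    }
    block->next = pool_free_list;
    pool_free_list = block;
    pool_used--;
}

/* Number of blocks currently handed out. */
size_t pool_blocks_used(void)
{
    return pool_used;
}
```

Because the storage is a static array, the pool shows up in the map file at build time, and running out of blocks yields a plain NULL instead of a fault. The sketch does not detect double frees or pointers that never came from the pool.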
Your program most likely crashes because of an illegal memory access, which is almost always an indirect (subsequent) result of a legal memory access, but one that you did not intend to perform.
For example (which is also my guess as to what's happening on your system):
Your heap most likely begins right after the stack. Now, suppose you have a stack overflow in main. Then one of the operations that you perform in main, which is naturally a legal operation as far as you're concerned, overwrites the beginning of the heap with some "junk" data.
As a subsequent result, the next time that you attempt to allocate memory from the heap, the pointer to the next available chunk of memory is no longer valid, eventually leading to a memory access violation.
So to begin with, I strongly recommend that you increase the stack size from 0x200 bytes to 0x400 bytes. This is typically defined within the linker-command file, or through the IDE, in the project's linker settings.
If your project is on IAR, then you can change it in the icf file:
Other than that, I suggest that you add code in your HardFault_Handler, in order to reconstruct the call-stack and register values prior to the crash. This might allow you to trace the runtime error and find out exactly where it happened.
In file 'startup_stm32f03xx.s', make sure that you have the following piece of code:
Then, in the same file, add the following interrupt handler (where all other handlers are located):
Then, in file 'stm32f03xx.c', add the following ISR:
If you can't use printf at the point in the execution when this specific Hard-Fault interrupt occurs, then save all the above data in a global buffer instead, so you can view it after reaching the while (1).
Then, refer to the 'Cortex-M Fault Exceptions and Registers' section at http://www.keil.com/appnotes/files/apnt209.pdf in order to understand the problem, or publish the output here if you want further assistance.
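As a rough illustration of the "global buffer" variant: on exception entry a Cortex-M core pushes eight words (R0-R3, R12, LR, PC, xPSR) onto the active stack, so the C side only has to copy them somewhere a debugger can see. The sketch below assumes that a small assembly stub passes the stack pointer that was active at the time of the fault; the names `fault_frame`, `g_last_fault` and `fault_frame_capture` are hypothetical and not part of any ST or CMSIS file.

```c
#include <stdint.h>

/* Register order of the basic exception frame pushed by a Cortex-M core. */
typedef struct {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;   /* return address of the interrupted function */
    uint32_t pc;   /* instruction that was executing when the fault hit */
    uint32_t psr;
} fault_frame;

/* Global copy, so it can be inspected with a debugger after the fault. */
volatile fault_frame g_last_fault;

/*
 * Copies the eight stacked words into g_last_fault.
 * 'stacked' is assumed to be the stack pointer (MSP or PSP) that was in use
 * when the exception was taken, handed over by an assembly stub.
 */
void fault_frame_capture(const uint32_t *stacked)
{
    g_last_fault.r0  = stacked[0];
    g_last_fault.r1  = stacked[1];
    g_last_fault.r2  = stacked[2];
    g_last_fault.r3  = stacked[3];
    g_last_fault.r12 = stacked[4];
    g_last_fault.lr  = stacked[5];
    g_last_fault.pc  = stacked[6];
    g_last_fault.psr = stacked[7];
}
```

The saved pc value can then be looked up in the map or listing file to find the faulting instruction.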
UPDATE:
In addition to all of the above, make sure that the base address of the heap is defined correctly. It is possibly hard-coded within the project settings (typically right after the data-section and the stack). But it can also be determined during runtime, at the initialization phase of your program. In general, you need to check the base addresses of the data-section and the stack of your program (in the map file created after building the project), and make sure that the heap does not overlap either one of them.
I once had a case where the base address of the heap was set to a constant address, which was fine to begin with. But then I gradually increased the size of the data-section, by adding global variables to the program. The stack was located right after the data-section, and it "moved forward" as the data-section grew larger, so there were no problems with either one of them. But eventually, the heap was allocated "on top of" part of the stack. So at some point, heap-operations began to overwrite variables on the stack, and stack-operations began to overwrite the contents of the heap.
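One way to keep the heap from running into a neighbouring region is to give the heap-growth routine an explicit upper bound. The sketch below assumes the C library obtains heap memory through an sbrk-style hook (newlib does this through `_sbrk`); whether that applies to a given toolchain setup needs to be checked. `bounded_sbrk` and `HEAP_REGION_SIZE` are hypothetical names, the static array stands in for bounds that would normally come from linker symbols, and unlike the conventional sbrk the function reports failure with NULL.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical heap region, sized like the heap computed in the question. */
#define HEAP_REGION_SIZE 0x45C4u

static uint8_t heap_region[HEAP_REGION_SIZE];
static size_t heap_used; /* bytes handed out so far */

/*
 * sbrk-style growth with an explicit upper bound.
 * Returns the start of the newly reserved area, or NULL if the request
 * would run past the end of the region.
 */
void *bounded_sbrk(size_t increment)
{
    uint8_t *previous_break;

    if (increment > HEAP_REGION_SIZE - heap_used) {
        return NULL;
    }
    previous_break = &heap_region[heap_used];
    heap_used += increment;
    return previous_break;
}
```

With a check of this kind in place, an allocator built on top of the hook has a failure it can turn into a NULL return instead of writing outside the heap.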
It does not look like malloc is doing any checks at all. The fault that you get comes from hardware detecting a write to an invalid address, which is probably coming from malloc itself.
When malloc allocates memory, it takes a chunk from its internal pool, and returns it to you. However, it needs to store some information for the free function to be able to complete deallocation. Usually, that's the actual length of the chunk. In order to save that information, malloc takes a few bytes from the beginning of the chunk itself, writes the info there, and returns you the address past the spot where it has written its own information.
For example, let's say you asked for a 10-byte chunk. malloc would grab an available 16-byte chunk, say, at addresses 0x3200..0x320F, write the length (i.e. 16) into bytes 1 and 2, and return 0x3202 back to you. Now your program can use ten bytes from 0x3202 to 0x320B. The other four bytes are available, too - if you call realloc and ask for 14 bytes, there would be no reallocation.
The crucial point comes when malloc writes the length into the chunk of memory that it is about to return to you: the address to which it writes needs to be valid. It appears that after the 18-th iteration the address of the next chunk is negative (which translates to a very large positive) so CPU traps the write, and triggers the hard fault.
In situations when the heap and the stack grow toward each other there is no reliable way to detect an out of memory while letting you use every last byte of memory, which is often a very desirable thing. malloc cannot predict how much stack you are going to use after the allocation, so it does not even try. That is why the byte counting in most cases is on you.
In general, on embedded hardware when the space is limited to a few dozen kilobytes, you avoid malloc calls in "arbitrary" places. Instead, you pre-allocate all your memory upfront using some pre-calculated limits, and parcel it out to structures that need it, and never call malloc again.
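To make both points concrete, here is a toy sketch that parcels out a buffer reserved up front and stores a 2-byte length in front of each chunk, as in the 10-byte example earlier in this answer. It is not how any real malloc is implemented: the names (`arena_alloc`, `arena_chunk_size`, `ARENA_SIZE`) are invented, there is no free, chunks are not rounded up to 16 bytes, and the returned pointer is only byte-aligned. The part worth noticing is the bounds check that runs before the header is written.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARENA_SIZE   64u  /* hypothetical budget, fixed at build time */
#define HEADER_SIZE  2u   /* 2-byte length stored in front of the payload */

static uint8_t arena[ARENA_SIZE]; /* reserved up front in static storage */
static size_t arena_next;         /* offset of the first unused byte */

/* Reserves 'size' payload bytes; returns NULL when the arena is exhausted. */
void *arena_alloc(uint16_t size)
{
    size_t chunk = (size_t)size + HEADER_SIZE;
    uint16_t stored;

    /* Validate before writing, so the header never lands outside the arena. */
    if (chunk > ARENA_SIZE - arena_next) {
        return NULL;
    }
    stored = (uint16_t)chunk;
    memcpy(&arena[arena_next], &stored, HEADER_SIZE);
    arena_next += chunk;
    return &arena[arena_next - chunk + HEADER_SIZE];
}

/* Reads back the chunk length (header included) stored before 'payload'. */
uint16_t arena_chunk_size(const void *payload)
{
    uint16_t stored;

    memcpy(&stored, (const uint8_t *)payload - HEADER_SIZE, HEADER_SIZE);
    return stored;
}
```

A request for 10 bytes consumes 12 bytes of the arena here, and the caller gets the address 2 bytes past the start of the chunk, mirroring the 0x3200 / 0x3202 example.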