The GNU ld (linker script) manual, Section 3.5.5 Source Code Reference, has some really important information on how to access linker script "variables" (which are actually just integer addresses) in C source code. I have used this information extensively with linker script variables, and I wrote this answer here: How to get value of variable defined in ld linker script from C.
However, because this is a bit esoteric, it is easy to make the mistake of trying to access a linker script variable's value instead of its address. The manual (link above) says:
This means that you cannot access the value of a linker script defined symbol - it has no value - all you can do is access the address of a linker script defined symbol.
Hence when you are using a linker script defined symbol in source code you should always take the address of the symbol, and never attempt to use its value.
The question: So, if you do attempt to access a linker script variable's value, is this "undefined behavior"?
Quick refresher:
Imagine that in your linker script (ex: STM32F103RBTx_FLASH.ld) you have:
/* Specify the memory areas */
MEMORY
{
FLASH (rx) : ORIGIN = 0x8000000, LENGTH = 128K
RAM (xrw) : ORIGIN = 0x20000000, LENGTH = 20K
}
/* Some custom variables (addresses) I intend to access from my C source code */
__flash_start__ = ORIGIN(FLASH);
__flash_end__ = ORIGIN(FLASH) + LENGTH(FLASH);
__ram_start__ = ORIGIN(RAM);
__ram_end__ = ORIGIN(RAM) + LENGTH(RAM);
And in your C source code you do:
// 1. correct way A:
extern uint32_t __flash_start__;
printf("__flash_start__ addr = 0x%lX\n", (uint32_t)&__flash_start__);
// OR 2. correct way B (my preferred approach):
extern uint32_t __flash_start__[]; // not a true array; [] is required to access linker script variables (addresses) as though they were normal variables
printf("__flash_start__ addr = 0x%lX\n", (uint32_t)__flash_start__);
// OR 3. COMPLETELY WRONG WAY TO DO IT!
// - IS THIS UNDEFINED BEHAVIOR?
extern uint32_t __flash_start__;
printf("__flash_start__ addr = 0x%lX\n", __flash_start__);
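A side note on the printing itself: `%lX` expects an `unsigned long`, so pairing it with a `(uint32_t)` cast only matches on toolchains where `uint32_t` happens to be `unsigned long` (typical for arm-none-eabi, but an assumption elsewhere). A more portable sketch converts the address to `uintptr_t` and prints it with `PRIXPTR`. The helper name `format_address` is hypothetical and is not part of the original code.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes an address as hex text, e.g. "0x8000000".
 * Returns what snprintf returns: the number of characters needed,
 * not counting the terminating NUL. */
static int format_address(char *buf, size_t size, const void *addr)
{
    assert(buf != NULL && size > 0);
    /* uintptr_t is the integer type intended to hold a converted pointer,
     * and PRIXPTR is its matching uppercase-hex conversion specifier. */
    return snprintf(buf, size, "0x%" PRIXPTR, (uintptr_t)addr);
}
```

With a linker script symbol, the call would be `format_address(buf, sizeof buf, &__flash_start__)` for the scalar declaration, or `format_address(buf, sizeof buf, __flash_start__)` for the `[]` declaration.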
Sample printed output (this is real output: it was actually compiled, run, and printed by an STM32 MCU):
__flash_start__ addr = 0x8000000
__flash_start__ addr = 0x8000000
__flash_start__ addr = 0x20080000 <== NOTICE, LIKE I SAID ABOVE: this one is completely wrong (even though it compiles and runs)!
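Here is a hosted sketch of why the third line differs from the first two. It does not use a real linker script: the array `fake_flash`, its made-up contents, and the macros `SYM_SCALAR` and `SYM_ARRAY` are all hypothetical stand-ins for the memory located at `__flash_start__` and for the two ways of declaring the symbol. Ways 1 and 2 produce the location itself, while way 3 loads whatever 32-bit word is stored there. On Cortex-M parts the first word of flash usually holds the initial stack pointer, which would be consistent with a RAM-range value like the one printed, though that is an inference and not something the output itself proves.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: made-up words standing in for whatever is stored at the start
 * of flash. On the target, the linker only assigns the address; it does not
 * reserve an object. */
static uint32_t fake_flash[2] = { 0x20005000u, 0x08000101u };

/* Mirrors `extern uint32_t __flash_start__;` (an lvalue located at the symbol). */
#define SYM_SCALAR (fake_flash[0])
/* Mirrors `extern uint32_t __flash_start__[];` (an array located at the symbol). */
#define SYM_ARRAY  (fake_flash)

/* Way 1: take the address of the scalar declaration. */
static uintptr_t way1_address(void) { return (uintptr_t)&SYM_SCALAR; }

/* Way 2: the array name decays to a pointer to the same location. */
static uintptr_t way2_address(void) { return (uintptr_t)SYM_ARRAY; }

/* Way 3: the bare name loads the 32-bit word stored at that location. */
static uint32_t way3_contents(void) { return SYM_SCALAR; }

/* Self-check of the model: both address forms agree, and way 3 returns
 * the stored word, not the address. */
static void check_model(void)
{
    assert(way1_address() == way2_address());
    assert(way3_contents() == fake_flash[0]);
}
```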
Update:
Response to @Eric Postpischil's 1st comment:
The C standard does not define anything at all about linker script symbols. Any specification of behavior is up to the GNU tools. That said, if a linker script symbol identifies a place in memory where some valid object is stored, I would expect accessing the value of that object to work, if it were accessed with its proper type. Supposing flash_start is normally accessible memory, and except for any requirements of your system about what is at flash_start, you could, in theory, put a uint32_t (using appropriate input to the linker) and then access it via flash_start.
Yes, but that's not my question. I'm not sure if you're picking up the subtlety of my question. Take a look at the examples I provide. It is true you can access this location just fine, but make sure you understand how you do so, and then my question will become apparent. Look especially at example 3 above, which is wrong even though to a C programmer it looks right. To read a uint32_t, for ex, at __flash_start__, you'd do this:
extern uint32_t __flash_start__;
uint32_t u32 = *((uint32_t *)&__flash_start__); // correct, even though it *looks like* you're taking the address (&) of an address (__flash_start__)
OR this:
extern uint32_t __flash_start__[];
uint32_t u32 = *((uint32_t *)__flash_start__); // also correct, and my preferred way of doing it because it looks more correct to the trained "C-programmer" eye
But most definitely NOT this:
extern uint32_t __flash_start__;
uint32_t u32 = __flash_start__; // originally marked incorrect; <==UPDATE: THIS IS ALSO CORRECT! (and more straightforward too, actually; see comment discussion under this question)
and NOT this:
extern uint32_t __flash_start__;
uint32_t u32 = *((uint32_t *)__flash_start__); // incorrect, but *looks* right
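The four read variants above can be compared in a hosted sketch. As an assumption for illustration only, `fake_flash` and its made-up contents stand in for the memory at `__flash_start__`, and the macros `SYM_SCALAR` and `SYM_ARRAY` stand in for the scalar and `[]` declarations of the symbol. None of these names come from the original code. The first three variants read the same word. The fourth would use the stored word as if it were an address, so the sketch only computes that address and never dereferences it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: made-up words standing in for the memory at __flash_start__. */
static uint32_t fake_flash[2] = { 0x20005000u, 0x08000101u };

/* Mirrors `extern uint32_t __flash_start__;` */
#define SYM_SCALAR (fake_flash[0])
/* Mirrors `extern uint32_t __flash_start__[];` */
#define SYM_ARRAY  (fake_flash)

/* Variant 1: scalar declaration, take the address, then dereference. */
static uint32_t read_via_address_of(void) { return *((uint32_t *)&SYM_SCALAR); }

/* Variant 2: array declaration, the name decays to a pointer, then dereference. */
static uint32_t read_via_array(void) { return *((uint32_t *)SYM_ARRAY); }

/* Variant 3: scalar declaration, read the name directly. */
static uint32_t read_direct(void) { return SYM_SCALAR; }

/* Variant 4 would dereference this value: the stored word reinterpreted
 * as an address. It is computed here but deliberately not dereferenced. */
static uintptr_t fourth_variant_target(void) { return (uintptr_t)SYM_SCALAR; }

/* Self-check: the first three variants all yield the stored word. */
static void check_reads(void)
{
    assert(read_via_address_of() == read_direct());
    assert(read_via_array() == read_direct());
}
```

In this model, `fourth_variant_target()` returns the contents of the first word, not the location of `fake_flash`, which is why that variant reads from an unrelated place on a real target.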
Related:
- Why do STM32 gcc linker scripts automatically discard all input sections from these standard libraries: libc.a, libm.a, libgcc.a?
- https://stackoverflow.com/a/54728097/4561887