I'd like to ask whether it is portable to rely on a string literal's address across translation units. That is: if a given file foo.c refers to the string literal "I'm a literal!", is it correct and portable to rely on the same string literal "I'm a literal!" having the same memory address in another file, bar.c for instance? Consider that each file is translated to an individual .o file.
To illustrate, here is some example code:
# File foo.c
/* ... */
const char * x = "I'm a literal!";
# File bar.c
/* ... */
const char * y = "I'm a literal!";
# File test.c
#include <assert.h>
/* ... */
extern const char * x;
extern const char * y;
int main(void)
{
    assert (x == y); //Is this assertion going to fail?
}
And example gcc command lines:
gcc -c -o foo.o -Wall foo.c
gcc -c -o bar.o -Wall bar.c
gcc -c -o test.o -Wall test.c
gcc -o test foo.o bar.o test.o
What about in the same translation unit? Would this be reliable if the string literals are in the same translation unit?
You cannot rely on identical string literals having the same memory location; it is an implementation decision. The C99 draft standard tells us that it is unspecified whether identical string literals are distinct. From section 6.4.5 String literals: "It is unspecified whether these arrays are distinct provided their elements have the appropriate values."
For C++ this is covered in the draft standard, section 2.14.5 String literals, which says: "Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation-defined."
The compiler is allowed to pool string literals, but you would have to understand how pooling works from compiler to compiler, so relying on it would not be portable and the behavior could change. Visual Studio includes an option for string literal pooling (/GF). Note that its documentation qualifies the behavior with "In some cases".
gcc does support pooling, including across compilation units, and you can turn it on via -fmerge-constants. Note the use of "attempt" and "if ... support it" in the option's description.
As for a rationale, at least in C, for not requiring string literals to be pooled: an archived comp.std.c discussion on string literals indicates that it was due to the wide variety of implementations at the time.
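No, you can't expect the same address. If it happens, it happens, but nothing enforces it.
§ 2.14.5/p12: "Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation-defined. The effect of attempting to modify a string literal is undefined."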
The compiler can do as it pleases. The literals can be stored at different addresses if they are in different translation units, or even if they are in the same translation unit, regardless of the fact that they live in read-only memory.
On MSVC, for instance, the addresses are totally different in both cases, but again: nothing prevents the compiler from merging the pointers' values, and nothing dictates where the literals are placed beyond the read-only constraint.