I have this extremely trivial piece of C code:
static int arr[];
int main(void) {
    *arr = 4;
    return 0;
}
I understand that the first statement is illegal (I've declared a file-scope array with static storage duration and internal linkage but no specified size), but why does it result in a linker error?
/usr/bin/ld: /tmp/cch9lPwA.o: in function `main':
unit.c:(.text+0xd): undefined reference to `arr'
collect2: error: ld returned 1 exit status
Shouldn't the compiler be able to catch this before the linker?
It is also strange to me that, if I omit the static storage class, the compiler simply assumes the array is of length 1 and produces no error beyond that:
int arr[];
int main(void) {
    *arr = 4;
    return 0;
}
Results in:
unit.c:5:5: warning: array 'arr' assumed to have one element
int arr[];
Why does omitting the storage class result in different behavior here and why does the first piece of code produce a linker error? Thanks.
Empty arrays (static int arr[];) and zero-length arrays (static int arr[0];) were non-standard gcc extensions.
The intention of these extensions was to act as a fix for the old "struct hack". Back in the C90 days, people wrote code such as this:
typedef struct
{
    header stuff;
    ...
    int data[1]; // the "struct hack"
} protocol;
where data would then be used as if it extended beyond its declared single element, with the actual size depending on what's in the header part. Such code was buggy: it wrote data to padding bytes and, in general, invoked undefined behavior through out-of-bounds array access.
gcc fixed this problem by adding empty/zero arrays as a compiler extension, making the code behave without bugs, although it was no longer portable.
The C standard committee recognized that this gcc feature was useful, so they added flexible array members to the C language in 1999. Since then, the gcc feature should be regarded as obsolete, and the standard C flexible array member is preferred.
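For illustration, here is a minimal sketch of the standard replacement for the struct hack. The type name message and the helpers message_create and message_sum are made up for this example; they are not taken from the question or from any library. The sketch also assumes count is small enough that the size computation does not overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical type: a fixed header followed by a variable-size payload. */
typedef struct
{
    size_t count; /* number of valid elements in data[] */
    int data[];   /* C99 flexible array member, must be the last member */
} message;

/* Allocates a message with room for `count` ints, all set to 0.
   Returns NULL if the allocation fails. Assumes the size does not overflow. */
static message *message_create(size_t count)
{
    /* sizeof *m does not include data[], so the payload is added explicitly. */
    message *m = malloc(sizeof *m + count * sizeof m->data[0]);
    if (m == NULL)
    {
        return NULL;
    }
    m->count = count;
    for (size_t i = 0; i < count; i++)
    {
        m->data[i] = 0;
    }
    return m;
}

/* Sums the payload; every index stays within the allocated storage. */
static int message_sum(const message *m)
{
    assert(m != NULL);
    int sum = 0;
    for (size_t i = 0; i < m->count; i++)
    {
        sum += m->data[i];
    }
    return sum;
}
```

Unlike the int data[1] version, indexing data up to count - 1 here is well-defined, because the storage for those elements was actually allocated and the type says the array has no fixed bound.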
As recognized by the linked gcc documentation:
Declaring zero-length arrays in other contexts, including as interior members of structure objects or as non-member objects, is discouraged.
And this is what your code does.
Note that gcc with no compiler options passed defaults to -std=gnu90 (gcc < 5.0) or -std=gnu11 (gcc 5.0 and later; newer releases default to later GNU dialects such as -std=gnu17). This gives you all the non-standard extensions enabled, so the program compiles but does not link.
If you want standard compliant behavior, you must compile as
gcc -std=c11 -pedantic-errors
The -pedantic-errors flag makes gcc reject code that relies on these extensions instead of silently accepting it, and the linker error switches to a compiler error as expected. For an empty array as in your case, you get:
error: array size missing in 'arr'
And for a zero-length array you get:
error: ISO C forbids zero-size array 'arr' [-Wpedantic]
The reason why int arr[] works is that it is a tentative definition of an array with external linkage (see C17 6.9.2). It is valid C and can be regarded as a forward declaration. It means that elsewhere in the code, the compiler (or rather the linker) should expect to find, for example, int arr[10], which then refers to the same variable. This way, arr can be used in the code before its size is known. If no sized declaration ever appears in the translation unit, the array is treated as having one element, which is what the warning in the question reports. (I wouldn't recommend using this language feature, as it is a form of "spaghetti programming".)
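A small sketch of that mechanism within a single translation unit follows. The function names set_first, get_first and arr_length are invented for the example, and the size 10 is just the figure used above.

```c
#include <assert.h>
#include <stddef.h>

int arr[]; /* tentative definition: incomplete type, external linkage */

/* arr can already be used here, but sizeof arr would not compile yet
   because the array type is still incomplete at this point. */
void set_first(int value)
{
    *arr = value;
}

int arr[10]; /* same object as above; this declaration completes the type */

/* From here on the size is known at compile time. */
static_assert(sizeof arr / sizeof arr[0] == 10, "arr has 10 elements");

size_t arr_length(void)
{
    return sizeof arr / sizeof arr[0];
}

int get_first(void)
{
    return arr[0];
}
```

Both declarations name one object, so a value stored through set_first is visible through get_first.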
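When you use static, you block the possibility of having the array size specified elsewhere, because you force the variable to have internal linkage instead. C17 6.9.2 says that a tentative definition with internal linkage shall not have an incomplete type, so static int arr[]; is not valid standard C.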
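If the array really is meant to be private to the file, one way out is to give the size in the static declaration itself. The sketch below assumes a single element is what was intended, since the question only ever writes *arr; the helper name store_first is invented for the example.

```c
#include <assert.h>

/* Internal linkage: the type must be complete in this declaration. */
static int arr[1];

static_assert(sizeof arr == sizeof(int), "arr holds exactly one int");

/* Stores value in the first element and returns what was stored. */
static int store_first(int value)
{
    *arr = value; /* equivalent to arr[0] = value */
    return arr[0];
}
```

With the size present, this should compile and link under gcc -std=c11 -pedantic-errors as well as with the default GNU dialect.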
Maybe one reason for this behavior is that the compiler only issues a warning, treats the static variable as unused and optimizes it away, so the linker then complains about the missing symbol.
If it is not static, it cannot simply be ignored, because other modules might reference it, so the linker can at least find the symbol arr.