Why does the C preprocessor in GCC interpret the word linux (small letters) as the constant 1?
test.c:
#include <stdio.h>
int main(void)
{
    int linux = 5;
    return 0;
}
Result of $ gcc -E test.c (stop after the preprocessing stage):
....
int main(void)
{
    int 1 = 5;
    return 0;
}
Which -of course- yields an error.
(BTW: There is no #define linux in the stdio.h file.)
Use this command
to get this
From info gcc (emphasis mine):
(It uses vax in the example instead of linux, maybe because vax was more popular when it was written ;-).
The basic idea is that GCC only tries to fully comply with the ISO standards when it is invoked with the -ansi option.
This appears to be an (undocumented) "GNU extension": [correction: I finally found a mention in the docs. See below.]
The following command uses the -dM option to print all preprocessor defines; since the input "file" is empty, it shows exactly the predefined macros. It was run with gcc-4.7.3 on a standard Ubuntu install. You can see that the preprocessor is standard-aware. In total, there are 243 macros with -std=gnu99 and 240 with -std=c99; I filtered the output for relevance.
The "gnu standard" versions also #define unix. (Using c11 and gnu11 produces the same results.)
I suppose they had their reasons, but it seems to me to make the default installation of gcc (which compiles C code with -std=gnu89 unless otherwise specified) non-conformant, and -- as in this question -- surprising. Polluting the global namespace with macros whose names don't begin with an underscore is not permitted in a conformant implementation. (6.10.8p2: "Any other predefined macro names shall begin with a leading underscore followed by an uppercase letter or a second underscore," but, as mentioned in Appendix J.5 (portability issues), such names are often predefined.)
When I originally wrote this answer, I wasn't able to find any documentation in gcc about this issue, but I did finally discover it, not in C implementation-defined behaviour nor in C extensions but in the cpp manual section 3.7.3, where it notes that:
In the Old Days (pre-ANSI), predefining symbols such as unix and vax was a way to allow code to detect at compile time what system it was being compiled for. There was no official language standard back then (beyond the reference material at the back of the first edition of K&R), and C code of any complexity was typically a complex maze of #ifdefs to allow for differences between systems. These macro definitions were generally set by the compiler itself, not defined in a library header file. Since there were no real rules about which identifiers could be used by the implementation and which were reserved for programmers, compiler writers felt free to use simple names like unix and assumed that programmers would simply avoid using those names for their own purposes.
The 1989 ANSI C standard introduced rules restricting what symbols an implementation could legally predefine. A macro predefined by the compiler could only have a name starting with two underscores, or with an underscore followed by an uppercase letter, leaving programmers free to use identifiers not matching that pattern and not used in the standard library.
As a result, any compiler that predefines unix or linux is non-conforming, since it will fail to compile perfectly legal code that uses something like int linux = 5;.
As it happens, gcc is non-conforming by default -- but it can be made to conform (reasonably well) with the right command-line options:
See the gcc manual for more details.
gcc will be phasing out these definitions in future releases, so you shouldn't write code that depends on them. If your program needs to know whether it's being compiled for a Linux target or not, it can check whether __linux__ is defined (assuming you're using gcc or a compiler that's compatible with it). See the GNU C preprocessor manual for more information.
A largely irrelevant aside: the "Best One Liner" winner of the 1987 International Obfuscated C Code Contest, by David Korn (yes, the author of the Korn Shell), took advantage of the predefined unix macro:
It prints "unix", but for reasons that have absolutely nothing to do with the spelling of the macro name.
Because linux is a built-in macro defined when the compiler is running on, or compiling for (if it is a cross-compiler), Linux.
There are a lot of such predefined macros. With GCC, you can use:
to get a list of macros. (I've not managed to persuade GCC to accept /dev/null directly, but the empty file seems to work OK.) With GCC 4.8.1 running on Mac OS X 10.8.5, I got the output:
That's 236 macros from an empty file. When I added #include <stdio.h> to the file, the number of macros defined went up to 505. These include all sorts of platform-identifying macros.