I know that global variables in C sometimes have the extern keyword. What is an extern variable? What is the declaration like? What is its scope?
This is related to sharing variables across source files, but how does that work precisely? Where do I use extern?
With xc8 you have to be careful to declare a variable with the same type in each file, because you could erroneously declare something as an int in one file and as a char in another. This could lead to corruption of variables.
This problem was elegantly solved in a Microchip forum some 15 years ago /* See "http:www.htsoft.com" / / "forum/all/showflat.php/Cat/0/Number/18766/an/0/page/0#18766"
But this link seems to no longer work...
So I'll quickly try to explain it: make a file called global.h.
In it declare the following
Now in the file main.c
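A minimal single-file sketch of the pattern described here, with comments marking which part would live in global.h and which in main.c. The macro names MAIN_C and GLOBAL and the two helper functions are hypothetical, chosen only for illustration; the variable testing_mode and its unsigned char type come from the description.

```c
/* Single-file sketch: comments mark which file each part would live in.
 * MAIN_C and GLOBAL are hypothetical names chosen for this illustration. */

/* ---- main.c, before including the header ---- */
#define MAIN_C 1

/* ---- global.h ---- */
#ifdef MAIN_C
#define GLOBAL              /* main.c: no storage class, so the line below defines the variable */
#else
#define GLOBAL extern       /* any other file: the line below only declares it */
#endif

GLOBAL unsigned char testing_mode;   /* the type is written once, for every file */

/* ---- main.c, after including the header ---- */
#undef MAIN_C

/* Hypothetical helpers, only here so the sketch has something to call. */
void set_testing_mode(unsigned char mode) { testing_mode = mode; }
unsigned char get_testing_mode(void) { return testing_mode; }
```

Because the type appears on a single line that every file sees, no file can end up with a different idea of the variable's size.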
This means that in main.c the variable will be declared as an unsigned char.
Now, in other files, simply including global.h will have it declared as an extern for that file, but it will still be correctly declared as an unsigned char.
The old forum post probably explained this a bit more clearly. But this is a real potential gotcha when using a compiler that allows you to declare a variable in one file and then declare it extern with a different type in another. For example, if you declared testing_mode as an int in another file, that file would treat it as a 16-bit variable and overwrite some other part of RAM, potentially corrupting another variable. Difficult to debug!
First off, the extern
keyword is not used for defining a variable; rather, it is used for declaring a variable. Note that extern is a storage class, not a data type.
extern is used to let other C files or external components know that this variable is already defined somewhere. Example: if you are building a library, there is no need to define the global variable somewhere in the library itself. The library compiles on its own; the definition is only looked for when the file is linked.
extern is used so one first.c file can have full access to a global parameter in another second.c file.
The extern declaration can be placed in the first.c file or in any of the header files first.c includes.
extern allows one module of your program to access a global variable or function declared in another module of your program. You usually have extern variables declared in header files.
If you don't want a program to access your variables or functions, you use static, which tells the compiler that this variable or function cannot be used outside of this module.
Using
extern is only of relevance when the program you're building consists of multiple source files linked together, where some of the variables defined, for example, in source file file1.c need to be referenced in other source files, such as file2.c.
It is important to understand the difference between defining a variable and declaring a variable: a declaration tells the compiler that the variable exists and what its type is, without allocating storage; a definition also allocates the storage for it.
You may declare a variable multiple times (though once is sufficient); you may only define it once within a given scope. A variable definition is also a declaration, but not all variable declarations are definitions.
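As a small illustration of that distinction, the sketch below puts two declarations and one definition of the same variable in a single translation unit. The names shared_count and bump_shared_count are hypothetical and not part of the original answer.

```c
/* A declaration: states the name and type; allocates no storage. */
extern int shared_count;

/* Repeating an identical declaration is allowed. */
extern int shared_count;

/* The one definition: allocates storage and, here, initializes it.
 * It is also a declaration. */
int shared_count = 5;

/* Note: "extern int shared_count = 5;" would also be a definition,
 * because an initializer always makes a definition, with or without extern. */

/* Hypothetical helper that uses the variable. */
int bump_shared_count(void) { return ++shared_count; }
```

Adding a second `int shared_count = 5;` line would be rejected by the compiler as a redefinition.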
Best way to declare and define global variables
The clean, reliable way to declare and define global variables is to use a header file to contain an extern declaration of the variable.
The header is included by the one source file that defines the variable and by all the source files that reference the variable. For each program, one source file (and only one source file) defines the variable. Similarly, one header file (and only one header file) should declare the variable. The header file is crucial; it enables cross-checking between independent TUs (translation units; think source files) and ensures consistency.
Although there are other ways of doing it, this method is simple and reliable. It is demonstrated by file3.h, file1.c and file2.c:
file3.h
file1.c
file2.c
That's the best way to declare and define global variables.
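A rough single-file sketch of how the three files divide the work, with comments marking the file each part would belong to. The name global_variable is the one used later in this answer; the initial value 37 and the helper functions increment and use_it are assumptions made for illustration.

```c
#include <stdio.h>

/* ---- file3.h: the only place the variable is declared ---- */
extern int global_variable;

/* ---- file1.c: includes file3.h and provides the single definition ---- */
int global_variable = 37;              /* 37 is an arbitrary example value */

int increment(void)                    /* hypothetical helper */
{
    return global_variable++;
}

/* ---- file2.c: includes file3.h and only uses the variable ---- */
void use_it(void)                      /* hypothetical helper */
{
    printf("Global variable: %d\n", global_variable++);
}
```

Because file1.c also includes the header, the compiler sees the declaration and the definition together and will complain if their types ever disagree.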
The next two files complete the source for prog1:
The complete programs shown use functions, so function declarations have crept in. Both C99 and C11 require functions to be declared or defined before they are used (whereas C90 did not, for good reasons). I use the keyword extern in front of function declarations in headers for consistency, to match the extern in front of variable declarations in headers. Many people prefer not to use extern in front of function declarations; the compiler doesn't care, and ultimately neither do I, as long as you're consistent, at least within a source file.
prog1.h
prog1.c
prog1 uses prog1.c, file1.c, file2.c, file3.h and prog1.h.
The file prog1.mk is a makefile for prog1 only. It will work with most versions of make produced since about the turn of the millennium. It is not tied specifically to GNU Make.
prog1.mk
Guidelines
Rules to be broken by experts only, and only with good reason:
A header file contains only extern declarations of variables, never static or unqualified variable definitions.
A source file never contains extern declarations of variables; source files always include the (sole) header that declares them.
A function should never need to declare a variable using extern.
The source code and text of this answer are available in my SOQ (Stack Overflow Questions) repository on GitHub in the src/so-0143-3204 sub-directory.
If you're not an experienced C programmer, you could (and perhaps should) stop reading here.
Not so good way to define global variables
With some (indeed, many) C compilers, you can get away with what's called a 'common' definition of a variable too. 'Common', here, refers to a technique used in Fortran for sharing variables between source files, using a (possibly named) COMMON block. What happens here is that each of a number of files provides a tentative definition of the variable. As long as no more than one file provides an initialized definition, then the various files end up sharing a common single definition of the variable:
file10.c
file11.c
file12.c
This technique does not conform to the letter of the C standard and the 'one definition rule' — it is officially undefined behaviour:
However, the C standard also lists it in informative Annex J as one of the Common extensions.
Because this technique is not always supported, it is best to avoid using it, especially if your code needs to be portable. Using this technique, you can also end up with unintentional type punning. If one of the files declared i as a double instead of as an int, C's type-unsafe linkers probably would not spot the mismatch. If you're on a machine with 64-bit int and double, you'd not even get a warning; on a machine with 32-bit int and 64-bit double, you'd probably get a warning about the different sizes. The linker would use the largest size, exactly as a Fortran program would take the largest size of any common blocks.
The next two files complete the source for prog2:
prog2.h
prog2.c
prog2 uses prog2.c, file10.c, file11.c, file12.c, prog2.h.
Warning
As noted in comments here, and as stated in my answer to a similar question, using multiple definitions for a global variable leads to undefined behaviour (J.2; §6.9), which is the standard's way of saying "anything could happen". One of the things that can happen is that the program behaves as you expect; and J.5.11 says, approximately, "you might be lucky more often than you deserve". But a program that relies on multiple definitions of an extern variable — with or without the explicit 'extern' keyword — is not a strictly conforming program and not guaranteed to work everywhere. Equivalently: it contains a bug which may or may not show itself.
Violating the guidelines
There are, of course, many ways in which these guidelines can be broken. Occasionally, there may be a good reason to break the guidelines, but such occasions are extremely unusual.
faulty_header.h
Note 1: if the header defines the variable without the extern keyword, then each file that includes the header creates a tentative definition of the variable. As noted previously, this will often work, but the C standard does not guarantee that it will work.
broken_header.h
Note 2: if the header defines and initializes the variable, then only one source file in a given program can use the header. Since headers are primarily for sharing information, it is a bit silly to create one that can only be used once.
seldom_correct.h
Note 3: if the header defines a static variable (with or without initialization), then each source file ends up with its own private version of the 'global' variable.
If the variable is actually a complex array, for example, this can lead to extreme duplication of code. It can, very occasionally, be a sensible way to achieve some effect, but that is very unusual.
Summary
Use the header technique I showed first. It works reliably and everywhere. Note, in particular, that the header declaring the global_variable is included in every file that uses it, including the one that defines it. This ensures that everything is self-consistent.
Similar concerns arise with declaring and defining functions; analogous rules apply. But the question was about variables specifically, so I've kept the answer to variables only.
End of Original Answer
If you're not an experienced C programmer, you probably should stop reading here.
Late Major Addition
Avoiding Code Duplication
One concern that is sometimes (and legitimately) raised about the 'declarations in headers, definitions in source' mechanism described here is that there are two files to be kept synchronized — the header and the source. This is usually followed up with an observation that a macro can be used so that the header serves double duty — normally declaring the variables, but when a specific macro is set before the header is included, it defines the variables instead.
Another concern can be that the variables need to be defined in each of a number of 'main programs'. This is normally a spurious concern; you can simply introduce a C source file to define the variables and link the object file produced with each of the programs.
A typical scheme works like this, using the original global variable illustrated in file3.h:
file3a.h
file1a.c
file2a.c
The next two files complete the source for prog3:
prog3.h
prog3.c
prog3 uses prog3.c, file1a.c, file2a.c, file3a.h, prog3.h.
Variable initialization
The problem with this scheme as shown is that it does not provide for initialization of the global variable. With C99 or C11 and variable argument lists for macros, you could define a macro to support initialization too. (With C89 and no support for variable argument lists in macros, there is no easy way to handle arbitrarily long initializers.)
file3b.h
Reverse contents of #if and #else blocks, fixing bug identified by Denis Kniazhev
file1b.c
file2b.c
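A rough single-file sketch of what such a header might contain, using a C99 variadic macro so that an initializer containing commas passes through intact. The macro names EXTERN, INITIALIZER and DEFINE_VARIABLES and the variable oddball_struct appear in this answer; the exact layout, the value 37 and the anonymous struct type are assumptions for illustration.

```c
/* ---- file1b.c: the one source file that asks for definitions ---- */
#define DEFINE_VARIABLES

/* ---- file3b.h ---- */
#ifdef DEFINE_VARIABLES
#define EXTERN                   /* nothing: the lines below become definitions */
#define INITIALIZER(...) = __VA_ARGS__
#else
#define EXTERN extern            /* the lines below become declarations */
#define INITIALIZER(...)         /* nothing: declarations carry no initializer */
#endif

/* With DEFINE_VARIABLES set these expand to definitions with initializers;
 * without it they expand to plain extern declarations. */
EXTERN int global_variable INITIALIZER(37);
EXTERN struct { int a; int b; } oddball_struct INITIALIZER({ 41, 43 });
```

A file that includes the header without first defining DEFINE_VARIABLES sees only `extern int global_variable;` and the matching extern declaration of oddball_struct.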
Clearly, the code for the oddball structure is not what you'd normally write, but it illustrates the point. The first argument to the second invocation of INITIALIZER is { 41 and the remaining argument (singular in this example) is 43 }. Without C99 or similar support for variable argument lists for macros, initializers that need to contain commas are very problematic.
Correct header file3b.h included (instead of fileba.h) per Denis Kniazhev
The next two files complete the source for prog4:
prog4.h
prog4.c
prog4 uses prog4.c, file1b.c, file2b.c, prog4.h, file3b.h.
Header Guards
Any header should be protected against reinclusion, so that type definitions (enum, struct or union types, or typedefs generally) do not cause problems. The standard technique is to wrap the body of the header in a header guard such as:
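A minimal sketch of such a guard, assuming a hypothetical guard macro name FILE3B_H_INCLUDED. To keep it in one file, the guarded body is written out twice, which is what the preprocessor effectively sees when the header is included twice; struct oddball stands in for the header's type definitions and oddball_sum is a hypothetical helper.

```c
/* First inclusion of the header body: the guard macro is not yet defined. */
#ifndef FILE3B_H_INCLUDED
#define FILE3B_H_INCLUDED
struct oddball { int a; int b; };      /* would be an error if seen twice */
#endif /* FILE3B_H_INCLUDED */

/* Second inclusion: the guard is now defined, so the body is skipped. */
#ifndef FILE3B_H_INCLUDED
#define FILE3B_H_INCLUDED
struct oddball { int a; int b; };
#endif /* FILE3B_H_INCLUDED */

/* Hypothetical use of the type, to show it was defined exactly once. */
int oddball_sum(struct oddball o) { return o.a + o.b; }
```

Without the guard, the second copy of the struct definition would be a redefinition error.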
The header might be included twice indirectly. For example, if file4b.h includes file3b.h for a type definition that isn't shown, and file1b.c needs to use both header file4b.h and file3b.h, then you have some more tricky issues to resolve. Clearly, you might revise the header list to include just file4b.h. However, you might not be aware of the internal dependencies, and the code should, ideally, continue to work.
Further, it starts to get tricky because you might include file4b.h before including file3b.h to generate the definitions, but the normal header guards on file3b.h would prevent the header being reincluded.
So, you need to include the body of file3b.h at most once for declarations, and at most once for definitions, but you might need both in a single translation unit (TU: a combination of a source file and the headers it uses).
Multiple inclusion with variable definitions
However, it can be done subject to a not too unreasonable constraint. Let's introduce a new set of file names:
external.h for the EXTERN macro definitions, etc.
file1c.h to define types (notably, struct oddball, the type of oddball_struct).
file2c.h to define or declare the global variables.
file3c.c which defines the global variables.
file4c.c which simply uses the global variables.
file5c.c which shows that you can declare and then define the global variables.
file6c.c which shows that you can define and then (attempt to) declare the global variables.
In these examples, file5c.c and file6c.c directly include the header file2c.h several times, but that is the simplest way to show that the mechanism works. It means that if the header was indirectly included twice, it would also be safe.
The restrictions for this to work are:
external.h
file1c.h
file2c.h
file3c.c
file4c.c
file5c.c
file6c.c
The next source file completes the source (provides a main program) for prog5, prog6 and prog7:
prog5.c
prog5 uses prog5.c, file3c.c, file4c.c, file1c.h, file2c.h, external.h.
prog6 uses prog5.c, file5c.c, file4c.c, file1c.h, file2c.h, external.h.
prog7 uses prog5.c, file6c.c, file4c.c, file1c.h, file2c.h, external.h.
This scheme avoids most problems. You only run into a problem if a header that defines variables (such as file2c.h) is included by another header (say file7c.h) that defines variables. There isn't an easy way around that other than "don't do it".
You can partially work around the problem by revising file2c.h into file2d.h:
file2d.h
The issue becomes 'should the header include #undef DEFINE_VARIABLES?' If you omit that from the header and wrap any defining invocation with #define and #undef:
in the source code (so the headers never alter the value of DEFINE_VARIABLES), then you should be clean. It is just a nuisance to have to remember to write the extra line. An alternative might be:
externdef.h
This is getting a tad convoluted, but seems to be secure (using the file2d.h, with no #undef DEFINE_VARIABLES in the file2d.h).
file7c.c
file8c.h
file8c.c
The next two files complete the source for prog8 and prog9:
prog8.c
file9c.c
prog8 uses prog8.c, file7c.c, file9c.c.
prog9 uses prog8.c, file8c.c, file9c.c.
However, the problems are relatively unlikely to occur in practice, especially if you take the standard advice to
Avoid global variables
Does this exposition miss anything?
Confession: The 'avoiding duplicated code' scheme outlined here was developed because the issue affects some code I work on (but don't own), and is a niggling concern with the scheme outlined in the first part of the answer. However, the original scheme leaves you with just two places to modify to keep variable definitions and declarations synchronized, which is a big step forward over having external variable declarations scattered throughout the code base (which really matters when there are thousands of files in total). However, the code in the files with the names fileNc.[ch] (plus external.h and externdef.h) shows that it can be made to work. Clearly, it would not be hard to create a header generator script to give you the standardized template for a variable defining and declaring header file.
NB These are toy programs with just barely enough code to make them marginally interesting. There is repetition within the examples that could be removed, but isn't to simplify the pedagogical explanation. (For example: the difference between prog5.c and prog8.c is the name of one of the headers that are included. It would be possible to reorganize the code so that the main() function was not repeated, but it would conceal more than it revealed.)
Adding an extern turns a variable definition into a variable declaration. See this thread as to what's the difference between a declaration and a definition.