In C++
I know static
and global
objects are constructed before the main
function. But as you know, in C
, there is no such kind initialization procedure
before main
.
For example, in my code:
int global_int1 = 5;
int global_int2;
static int static_int1 = 4;
static int static_int2;
- When are these four variables initialized?
- Where values for initialization like
5
and 4
are stored during compilation? How to manage them when initialization?
EDIT:
Clarification of 2nd question.
- In my code I use
5
to initialize global_int1
, so how can the compiler assign 5
to global_int
? For example, maybe the compiler first store the 5
value at somewhere (i.e. a table), and get this value when initialization begins.
- As to "How to manage them when initialization?", it is realy vague and I myself does not how to interpret yet. Sometimes, it is not easy to explain a question. Overlook it since I have not mastered the question fully yet.
By static and global objects, I presume you mean objects with
static lifetime defined at namespace scope. When such objects
are defined with local scope, the rules are slightly different.
Formally, C++ initializes such variables in three phases:
1. Zero initialization
2. Static initialization
3. Dynamic initialization
The language also distinguishes between variables which require
dynamic initialization, and those which require static
initialization: all static objects (objects with static
lifetime) are first zero initialized, then objects with static
initialization are initialized, and then dynamic initialization
occurs.
As a simple first approximation, dynamic initialization means
that some code must be executed; typically, static
initialization doesn't. Thus:
extern int f();
int g1 = 42; // static initialization
int g2 = f(); // dynamic initialization
Another approximization would be that static initialization is
what C supports (for variables with static lifetime), dynamic
everything else.
How the compiler does this depends, of course, on the
initialization, but on disk based systems, where the executable
is loaded into memory from disk, the values for static
initialization are part of the image on disk, and loaded
directly by the system from the disk. On a classical Unix
system, global variables would be divided into three "segments":
- text:
-
The code, loaded into a write protected area. Static
variables with `const` types would also be placed here.
- data:
-
Static variables with static initializers.
- bss:
-
Static variables with no-initializer (C and C++) or with dynamic
initialization (C++). The executable contains no image for this
segment, and the system simply sets it all to `0` before
starting your code.
I suspect that a lot of modern systems still use something
similar.
EDIT:
One additional remark: the above refers to C++03. For existing
programs, C++11 probably doesn't change anything, but it does
add constexpr
(which means that some user defined functions
can still be static initialization) and thread local variables,
which opens up a whole new can of worms.
Preface: The word "static" has a vast number of different meanings in C++. Don't get confused.
All your objects have static storage duration. That is because they are neither automatic nor dynamic. (Nor thread-local, though thread-local is a bit like static.)
In C++, Static objects are initialized in two phases: static initialization, and dynamic initialization.
Dynamic initialization requires actual code to execute, so this happens for objects that start with a constructor call, or where the initializer is an expression that can only be evaluated at runtime.
Static initialization is when the initializer is known statically and no constructor needs to run. (Static initialization is either zero-initialization or constant-initialization.) This is the case for your int
variables with constant initializer, and you are guaranteed that those are indeed initialized in the static phase.
(Static-storage variables with dynamic initialization are also zero-initialzed statically before anything else happens.)
The crucial point is that the static initialization phase doens't "run" at all. The data is there right from the start. That means that there is no "ordering" or any other such dynamic property that concerns static initialization. The initial values are hard-coded into your program binary, if you will.
When are these four variables initialized?
As you say, this happens before program startup, i.e. before main
begins. C does not specify it further; in C++, these happen during the static initialisation phase before objects with more complicated constructors or initialisers.
Where values for initialization like 5 and 4 are stored during compilation?
Typically, the non-zero values are stored in a data segment in the program file, while the zero values are in a bss segment which just reserves enough memory for the variables. When the program starts, the data segment is loaded into memory and the bss segment is set to zero. (Of course, the language standard doesn't specify this, so a compiler could do something else, like generate code to initialise each variables before running main
).
Paraphrased from the standard:
All variables which do not have dynamic storage duration, do not have thread local storage duration, and are not local, have static storage duration. In other words, all globals have static storage duration.
Static objects with dynamic initialization are not necessarily created before the first statement in the main function. It is implementation defined as to whether these objects are created before the first statement in main, or before the first use of any function or variable defined in the same translation unit as the static variable to be initialized.
So, in your code, global_int1 and static_int1 are definitely initialized before the first statement in main because they are statically initialized. However, global_int2 and static_int2 are dynamically initialized, so their initialization is implementation defined according to the rule I mentioned above.
As for your second point, I'm not sure I understand what you mean. Could you clarify?