When are static and global variables initialized?

Posted 2019-01-06 12:20

Question:

In C++, I know that static and global objects are constructed before the main function runs. In C, however, as far as I know, there is no such initialization procedure before main.

For example, in my code:

int global_int1 = 5;
int global_int2;
static int static_int1 = 4;
static int static_int2;
  • When are these four variables initialized?
  • Where are initializer values such as 5 and 4 stored during compilation? How are they managed during initialization?

EDIT:
Clarification of 2nd question.

  • In my code I use 5 to initialize global_int1, so how does the compiler assign 5 to global_int1? For example, maybe the compiler first stores the value 5 somewhere (e.g. in a table) and fetches it when initialization begins.
  • As to the second part of that question (how the values are managed during initialization), it is really vague and I myself do not know how to interpret it yet. Sometimes it is not easy to explain a question. Please overlook it, since I have not fully mastered the question yet.

Answer 1:

By static and global objects, I presume you mean objects with static lifetime defined at namespace scope. When such objects are defined with local scope, the rules are slightly different.

Formally, C++ initializes such variables in three phases:

  1. Zero initialization
  2. Static initialization
  3. Dynamic initialization

The language also distinguishes between variables that require dynamic initialization and those that require static initialization: all static objects (objects with static lifetime) are first zero-initialized, then objects with static initialization are initialized, and then dynamic initialization occurs.

As a simple first approximation, dynamic initialization means that some code must be executed; typically, static initialization doesn't. Thus:

extern int f();

int g1 = 42;    //  static initialization
int g2 = f();   //  dynamic initialization

Another approximation would be that static initialization is what C supports (for variables with static lifetime), and dynamic initialization is everything else.

How the compiler does this depends, of course, on the initialization, but on disk based systems, where the executable is loaded into memory from disk, the values for static initialization are part of the image on disk, and loaded directly by the system from the disk. On a classical Unix system, global variables would be divided into three "segments":

text:
The code, loaded into a write protected area. Static variables with `const` types would also be placed here.
data:
Static variables with static initializers.
bss:
Static variables with no initializer (C and C++) or with dynamic initialization (C++). The executable contains no image for this segment; the system simply sets it all to `0` before starting your code.

I suspect that a lot of modern systems still use something similar.

EDIT:

One additional remark: the above refers to C++03. For existing programs, C++11 probably doesn't change anything, but it does add constexpr (which means that some user-defined functions can still be used in static initialization) and thread-local variables, which open up a whole new can of worms.



Answer 2:

Preface: The word "static" has a vast number of different meanings in C++. Don't get confused.

All your objects have static storage duration. That is because they are neither automatic nor dynamic. (Nor thread-local, though thread-local is a bit like static.)

In C++, static objects are initialized in two phases: static initialization and dynamic initialization.

  • Dynamic initialization requires actual code to execute, so this happens for objects that start with a constructor call, or where the initializer is an expression that can only be evaluated at runtime.

  • Static initialization is when the initializer is known statically and no constructor needs to run. (Static initialization is either zero-initialization or constant-initialization.) This is the case for your int variables with constant initializer, and you are guaranteed that those are indeed initialized in the static phase.

  • (Static-storage variables with dynamic initialization are also zero-initialized statically before anything else happens.)

The crucial point is that the static initialization phase doesn't "run" at all. The data is there right from the start. That means that there is no "ordering" or any other such dynamic property that concerns static initialization. The initial values are hard-coded into your program binary, if you will.



Answer 3:

When are these four variables initialized?

As you say, this happens before program startup, i.e. before main begins. C does not specify it further; in C++, these are initialised during the static initialisation phase, before any objects with more complicated constructors or initialisers.

Where values for initialization like 5 and 4 are stored during compilation?

Typically, the non-zero values are stored in a data segment in the program file, while the zero values are in a bss segment which just reserves enough memory for the variables. When the program starts, the data segment is loaded into memory and the bss segment is set to zero. (Of course, the language standard doesn't specify this, so a compiler could do something else, like generate code to initialise each variable before running main.)



Answer 4:

Paraphrased from the standard:

All variables which do not have dynamic storage duration, do not have thread local storage duration, and are not local, have static storage duration. In other words, all globals have static storage duration.

Static objects with dynamic initialization are not necessarily created before the first statement in the main function. It is implementation defined as to whether these objects are created before the first statement in main, or before the first use of any function or variable defined in the same translation unit as the static variable to be initialized.

So, in your code, global_int1 and static_int1 are definitely initialized before the first statement in main because they are statically initialized with constants. global_int2 and static_int2 have no initializer, so they are zero-initialized, which is also part of static initialization; they too are in place before main. The implementation-defined rule mentioned above applies only to variables whose initializer requires code to run.

As for your second point, I'm not sure I understand what you mean. Could you clarify?