static initialization order fiasco

Posted 2019-01-01 04:35

Question:

I was reading about SIOF in a book, and it gave this example:

//file1.cpp
extern int y;
int x=y+1;

//file2.cpp
extern int x;
int y=x+1;  

Now my question is:
In the code above, do the following things happen?

  1. While compiling file1.cpp, the compiler leaves y as it is, i.e. doesn't allocate storage for it.
  2. The compiler allocates storage for x, but doesn't initialize it.
  3. While compiling file2.cpp, the compiler leaves x as it is, i.e. doesn't allocate storage for it.
  4. The compiler allocates storage for y, but doesn't initialize it.
  5. While linking file1.o and file2.o, suppose file2.o is initialized first.
    Does x then get an initial value of 0, or does it not get initialized at all?

Answer 1:

The initialization steps are given in 3.6.2 "Initialization of non-local objects" of the C++ standard:

Step 1: x and y are zero-initialized before any other initialization takes place.

Step 2: x or y is dynamically initialized - which one is unspecified by the standard. That variable will get the value 1 since the other variable will have been zero-initialized.

Step 3: the other variable will be dynamically initialized, getting the value 2.
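The three steps can be observed in a single file, where the order of dynamic initialization is fixed by the order of definition. This is only a sketch: it puts both definitions in one translation unit, which the original example does not, and `print_values` is a hypothetical helper added for illustration.

```cpp
#include <iostream>

// Both definitions live in ONE translation unit here, so the dynamic
// initializers run in definition order: x first, then y.
extern int y;     // declaration only; y is defined two lines below
int x = y + 1;    // y has only been zero-initialized at this point
int y = x + 1;    // x has already been dynamically initialized

// Hypothetical helper (not from the original post) to show the result.
void print_values() {
    std::cout << "x=" << x << " y=" << y << '\n';
}
```

Calling `print_values()` from `main()` typically shows x=1 and y=2 on mainstream compilers. With the two definitions split across file1.cpp and file2.cpp, as in the question, either variable may end up being the one that gets 1.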



Answer 2:

SIOF is very much a runtime artifact; the compiler and linker don't have much to do with it. Consider the atexit() function, which registers functions to be called at program exit. Many CRT implementations have something similar for program initialization; let's call it atinit().

Initializing these global variables requires executing code; the value cannot be determined by the compiler. So the compiler generates snippets of machine code that evaluate the expression and assign the value. These snippets need to be executed before main() runs.

That's where atinit() comes into play. A common CRT implementation walks a list of atinit function pointers and executes the initialization snippets in order. The problem is the order in which the functions are registered in the atinit() list. While atexit() has a well-defined LIFO order, implicitly determined by the order in which the code calls atexit(), such is not the case for atinit functions. The language specification doesn't require an order across source files, and there is nothing you can do in your code to specify one. SIOF is the result.
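The mechanism can be modelled in ordinary C++. This is a toy sketch, not real CRT code: `atinit`, `init_list`, `run_init_list`, `init_x` and `init_y` are all hypothetical names, and the two `init_` functions stand in for the snippets a compiler would emit for file1.cpp and file2.cpp.

```cpp
#include <vector>

using InitFn = void (*)();

// Hypothetical registry standing in for the CRT's internal list.
std::vector<InitFn>& init_list() {
    static std::vector<InitFn> list;
    return list;
}

// Hypothetical atinit(): appends an initializer to the list.
void atinit(InitFn fn) { init_list().push_back(fn); }

// The two globals from the question, already zero-initialized.
int x = 0;
int y = 0;

void init_x() { x = y + 1; }  // stands in for the snippet from file1.cpp
void init_y() { y = x + 1; }  // stands in for the snippet from file2.cpp

// What startup code would do before calling main(): walk the list in order.
void run_init_list() {
    for (InitFn fn : init_list()) fn();
}
```

Registering `init_y` before `init_x` and then calling `run_init_list()` leaves y == 1 and x == 2; registering them the other way round swaps the values. Nothing in the two initializer functions themselves decides which outcome you get.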

One possible implementation is for the compiler to emit function pointers in a separate section. The linker merges them, producing the atinit list. If your compiler does that, then the initialization order will be determined by the order in which you link the object files. Look at the map file; you should see the atinit section if your compiler does this. It won't be called atinit, but some kind of name with "init" in it is likely. Taking a look at the CRT source code that calls main() should give insight as well.



Answer 3:

The whole point (and the reason it's called a "fiasco") is that it's impossible to say with any certainty what will happen in a case like this. Essentially, you're asking for something impossible (that two variables each be one greater than the other). Since they can't do that, what they will do is open to some question: they might produce 0/1, or 1/0, or 1/2, or 2/1, or possibly (best case) just an error message.



Answer 4:

It is compiler dependent and may be runtime dependent. A compiler may decide to lazily initialize static variables when the first variable in a file is accessed, or as each variable is accessed. Otherwise it will initialize all static variables by file at launch time, with the order usually depending on the link order of files. The file order could change based on dependencies or other, compiler dependent influences.
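For function-local statics, standard C++ does specify lazy behaviour: a dynamically initialized one is initialized the first time control passes through its declaration. The sketch below uses that to make the order explicit. The accessor functions `x()` and `y()` and the starting value 10 are assumptions for illustration, and the dependency is deliberately one-way, because re-entering the initialization of a local static recursively is undefined behaviour.

```cpp
// Hypothetical accessors replacing the two globals from the question.
int& x();
int& y();

int& x() {
    static int value = 10;        // initialized no later than the first call
    return value;
}

int& y() {
    static int value = x() + 1;   // calling x() guarantees x is ready first
    return value;
}
```

Whichever accessor is called first, `y()` yields 11 and `x()` yields 10, because the dependency is resolved at the point of use rather than by link order.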

Static variables are usually initialized to zero unless they have a constant initializer. Again, this is compiler dependent. So one of these variables will probably be zero when the other is initialized. However, since both have initializers some compilers might leave the values undefined.

I think the most likely scenario would be:

  1. Space is allocated for the variables, and both have the value 0.
  2. One variable, say x, is initialized and set to the value 1.
  3. The other, say y, is initialized and set to the value 2.
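The three steps above can be modelled with plain local variables to see both possible outcomes side by side. This is a simulation, not real static initialization; `simulate` and its `x_first` parameter are hypothetical names.

```cpp
#include <utility>

// Returns {x, y} after modelling the steps above.
// x_first selects which variable's initializer runs first.
std::pair<int, int> simulate(bool x_first) {
    int x = 0, y = 0;      // step 1: both start out as zero
    if (x_first) {
        x = y + 1;         // step 2: x sees y == 0
        y = x + 1;         // step 3: y sees x == 1
    } else {
        y = x + 1;         // step 2: y sees x == 0
        x = y + 1;         // step 3: x sees y == 1
    }
    return {x, y};
}
```

`simulate(true)` gives x=1, y=2 and `simulate(false)` gives x=2, y=1, which are the two results to expect if zero-initialization happens first and the link order decides the rest.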

You could always run it and see. It could be that some compilers would generate code that goes into an infinite loop.