void a() { ... }
void b() { ... }
struct X
{
X() { b(); }
};
void f()
{
a();
static X x;
...
}
Assume f is called multiple times from various threads (potentially contended) after the entry of main. (and of course that the only calls to a and b are those seen above)
When the above code is compiled with gcc g++ 4.6 in -std=gnu++0x mode:
Q1. Is it guaranteed that a() will be called at least once and return before b() is called? That is to ask, on the first call to f(), is the constructor of x called at the same time an automatic duration local variable (non-static) would be (and not at global static initialization time for example)?
Q2. Is it guaranteed that b() will be called exactly once? Even if two threads execute f for the first time at the same time on different cores? If yes, by which specific mechanism does the GCC generated code provide synchronization? Edit: Additionally could one of the threads calling f() obtain access to x before the constructor of X returns?
Update: I am trying to compile an example and decompile to investigate mechanism...
test.cpp:
struct X;
void ext1(int x);
void ext2(X& x);
void a() { ext1(1); }
void b() { ext1(2); }
struct X
{
X() { b(); }
};
void f()
{
a();
static X x;
ext2(x);
}
Then:
$ g++ -std=gnu++0x -c -o test.o ./test.cpp
$ objdump -d test.o -M intel > test.dump
test.dump:
test.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_Z1av>:
0: 55 push rbp
1: 48 89 e5 mov rbp,rsp
4: bf 01 00 00 00 mov edi,0x1
9: e8 00 00 00 00 call e <_Z1av+0xe>
e: 5d pop rbp
f: c3 ret
0000000000000010 <_Z1bv>:
10: 55 push rbp
11: 48 89 e5 mov rbp,rsp
14: bf 02 00 00 00 mov edi,0x2
19: e8 00 00 00 00 call 1e <_Z1bv+0xe>
1e: 5d pop rbp
1f: c3 ret
0000000000000020 <_Z1fv>:
20: 55 push rbp
21: 48 89 e5 mov rbp,rsp
24: 41 54 push r12
26: 53 push rbx
27: e8 00 00 00 00 call 2c <_Z1fv+0xc>
2c: b8 00 00 00 00 mov eax,0x0
31: 0f b6 00 movzx eax,BYTE PTR [rax]
34: 84 c0 test al,al
36: 75 2d jne 65 <_Z1fv+0x45>
38: bf 00 00 00 00 mov edi,0x0
3d: e8 00 00 00 00 call 42 <_Z1fv+0x22>
42: 85 c0 test eax,eax
44: 0f 95 c0 setne al
47: 84 c0 test al,al
49: 74 1a je 65 <_Z1fv+0x45>
4b: 41 bc 00 00 00 00 mov r12d,0x0
51: bf 00 00 00 00 mov edi,0x0
56: e8 00 00 00 00 call 5b <_Z1fv+0x3b>
5b: bf 00 00 00 00 mov edi,0x0
60: e8 00 00 00 00 call 65 <_Z1fv+0x45>
65: bf 00 00 00 00 mov edi,0x0
6a: e8 00 00 00 00 call 6f <_Z1fv+0x4f>
6f: 5b pop rbx
70: 41 5c pop r12
72: 5d pop rbp
73: c3 ret
74: 48 89 c3 mov rbx,rax
77: 45 84 e4 test r12b,r12b
7a: 75 0a jne 86 <_Z1fv+0x66>
7c: bf 00 00 00 00 mov edi,0x0
81: e8 00 00 00 00 call 86 <_Z1fv+0x66>
86: 48 89 d8 mov rax,rbx
89: 48 89 c7 mov rdi,rax
8c: e8 00 00 00 00 call 91 <_Z1fv+0x71>
Disassembly of section .text._ZN1XC2Ev:
0000000000000000 <_ZN1XC1Ev>:
0: 55 push rbp
1: 48 89 e5 mov rbp,rsp
4: 48 83 ec 10 sub rsp,0x10
8: 48 89 7d f8 mov QWORD PTR [rbp-0x8],rdi
c: e8 00 00 00 00 call 11 <_ZN1XC1Ev+0x11>
11: c9 leave
12: c3 ret
I don't see the synchronization mechanism? Or is it added at linktime?
Update2: Ok when I link it I can see it...
400973: 84 c0 test %al,%al
400975: 75 2d jne 4009a4 <_Z1fv+0x45>
400977: bf 98 20 40 00 mov $0x402098,%edi
40097c: e8 1f fe ff ff callq 4007a0 <__cxa_guard_acquire@plt>
400981: 85 c0 test %eax,%eax
400983: 0f 95 c0 setne %al
400986: 84 c0 test %al,%al
400988: 74 1a je 4009a4 <_Z1fv+0x45>
40098a: 41 bc 00 00 00 00 mov $0x0,%r12d
400990: bf a0 20 40 00 mov $0x4020a0,%edi
400995: e8 a6 00 00 00 callq 400a40 <_ZN1XC1Ev>
40099a: bf 98 20 40 00 mov $0x402098,%edi
40099f: e8 0c fe ff ff callq 4007b0 <__cxa_guard_release@plt>
4009a4: bf a0 20 40 00 mov $0x4020a0,%edi
4009a9: e8 72 ff ff ff callq 400920 <_Z4ext2R1X>
4009ae: 5b pop %rbx
4009af: 41 5c pop %r12
4009b1: 5d pop %rbp
It surrounds it with __cxa_guard_acquire and __cxa_guard_release, whatever they do.
Q1. Yes. According to C++11, 6.7/4:
so it will be initialised after the first call to
a()
.Q2. Under GCC, and any compiler that supports the C++11 thread model: yes, initialisation of local static variables is thread safe. Other compilers might not give that guarantee. The exact mechanism is an implementation detail. I believe GCC uses an atomic flag to indicate whether it's initialised, and a mutex to protect initialisation when the flag is not set, but I could be wrong. Certainly, this thread implies that it was originally implemented like that.
UPDATE: your code does indeed contain the initialisation code. You can see it more clearly if you link it, and then disassemble the program, so that you can see which functions are being called. I also used
objdump -SC
to interleave the source and demangle C++ names. It uses internal locking functions__cxa_guard_acquire
and__cxa_guard_release
, to make sure only one thread executes the initialisation code.As far as I know it is guaranteed that b is only called once. However, it is not guaranteed that the initialisation is performed thread safe, which means another thread could potentially work with a half/not initialized x. (That's kind of funny because static mutexes are basicly useless this way.)