David Hollman recently tweeted the following example (which I've slightly reduced):
struct FooBeforeBase {
    double d;
    bool b[4];
};
struct FooBefore : FooBeforeBase {
    float value;
};
static_assert(sizeof(FooBefore) > 16);
//----------------------------------------------------
struct FooAfterBase {
protected:
    double d;
public:
    bool b[4];
};
struct FooAfter : FooAfterBase {
    float value;
};
static_assert(sizeof(FooAfter) == 16);
You can examine the layout in clang on godbolt and see that the reason the size changed is that in FooBefore, the member value is placed at offset 16 (maintaining a full alignment of 8 from FooBeforeBase) whereas in FooAfter, the member value is placed at offset 12 (effectively using FooAfterBase's tail-padding).
It is clear to me that FooBeforeBase is standard-layout, but FooAfterBase is not (because its non-static data members do not all have the same access control, [class.prop]/3). But what is it about FooBeforeBase's being standard-layout that requires this respect of padding bytes?
Both gcc and clang reuse FooAfterBase's padding, ending up with sizeof(FooAfter) == 16. But MSVC does not, ending up with 24. Is there a required layout per the standard and, if not, why do gcc and clang do what they do?
There is some confusion, so just to clear up:
- FooBeforeBase is standard-layout
- FooBefore is not (both it and a base class have non-static data members, similar to E in this example)
- FooAfterBase is not (it has non-static data members of differing access)
- FooAfter is not (for both of the above reasons)
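These four classifications can be checked with the standard type traits. A small sketch, repeating the type definitions from the question so that it compiles on its own:

```cpp
#include <type_traits>

struct FooBeforeBase {
    double d;
    bool b[4];
};
struct FooBefore : FooBeforeBase {
    float value;
};

struct FooAfterBase {
protected:
    double d;
public:
    bool b[4];
};
struct FooAfter : FooAfterBase {
    float value;
};

// Only the first of the four types is standard-layout.
static_assert(std::is_standard_layout<FooBeforeBase>::value,
              "one class declares the members, all with the same access");
static_assert(!std::is_standard_layout<FooBefore>::value,
              "both it and its base declare non-static data members");
static_assert(!std::is_standard_layout<FooAfterBase>::value,
              "non-static data members of differing access");
static_assert(!std::is_standard_layout<FooAfter>::value,
              "both of the above reasons");

// All four are still trivially copyable.
static_assert(std::is_trivially_copyable<FooBeforeBase>::value, "");
static_assert(std::is_trivially_copyable<FooBefore>::value, "");
static_assert(std::is_trivially_copyable<FooAfterBase>::value, "");
static_assert(std::is_trivially_copyable<FooAfter>::value, "");
```

These checks do not depend on the ABI; only the resulting sizes and offsets do.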
The answer to this question doesn't come from the standard but rather from the Itanium ABI (which is why gcc and clang have one behavior but MSVC does something else). That ABI defines a layout, the relevant parts of which for the purposes of this question are:
and
Where the placement of members other than virtual base classes is defined as:
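In rough terms, a data member is placed at the first suitably aligned offset at or after the dsize of what has been laid out so far. A minimal sketch of that arithmetic follows. The helper names are hypothetical and not part of the ABI, the dsize values 16 and 12 are the ones this answer arrives at for the two base classes, and a 4-byte float with 4-byte alignment and an 8-byte class alignment are assumed:

```cpp
#include <cstddef>

// Round offset up to the next multiple of alignment.
constexpr std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) / alignment * alignment;
}

// Hypothetical helper: a data member starts at the base's dsize,
// rounded up to the member's own alignment.
constexpr std::size_t member_offset(std::size_t base_dsize,
                                    std::size_t member_align) {
    return align_up(base_dsize, member_align);
}

// Hypothetical helper: the derived class ends after its last member,
// rounded up to the alignment of the whole class.
constexpr std::size_t derived_sizeof(std::size_t base_dsize,
                                     std::size_t member_size,
                                     std::size_t member_align,
                                     std::size_t class_align) {
    return align_up(member_offset(base_dsize, member_align) + member_size,
                    class_align);
}

// Base with dsize 16 (tail padding counted as data): value lands at 16.
static_assert(member_offset(16, 4) == 16, "offset of FooBefore::value");
static_assert(derived_sizeof(16, 4, 4, 8) == 24, "sizeof(FooBefore)");

// Base with dsize 12 (tail padding excluded): value lands at 12.
static_assert(member_offset(12, 4) == 12, "offset of FooAfter::value");
static_assert(derived_sizeof(12, 4, 4, 8) == 16, "sizeof(FooAfter)");
```

This only models the single-base, single-member case from the question; the real ABI algorithm also handles virtual bases, bit-fields and empty-base conflicts.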
The term POD has disappeared from the C++ standard, but it means standard-layout and trivially copyable. In this question, FooBeforeBase is a POD. The Itanium ABI ignores tail padding for PODs (it counts it as part of the object's data), hence dsize(FooBeforeBase) is 16.
But FooAfterBase is not a POD (it is trivially copyable, but it is not standard-layout). As a result, tail padding is not ignored, so dsize(FooAfterBase) is just 12, and the float can go right there.
This has interesting consequences, as pointed out by Quuxplusone in a related answer: implementors also typically assume that tail padding isn't reused, which wreaks havoc on this example:
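A minimal sketch of such an example, using hypothetical types A, B and C and hypothetical function names (assumed here for illustration, not copied from the linked answer):

```cpp
#include <algorithm>
#include <cassert>

// B is not a POD (it and its base both declare data members), so under
// the Itanium ABI its tail padding may be reused by a derived class.
struct A { int m_a; };
struct B : A { int m_b1; char m_b2; };
struct C : B { short m_c; };

// Assign through a B& that refers to the base subobject of a C.
// Member-wise assignment only touches B's own members.
short m_c_after_assignment() {
    C c1{};
    c1.m_c = 4;
    B b2{};
    b2.m_a = 5; b2.m_b1 = 6; b2.m_b2 = 7;
    B& b1 = c1;
    b1 = b2;
    assert(c1.m_a == 5 && c1.m_b1 == 6 && c1.m_b2 == 7);
    return c1.m_c;  // still 4
}

// The same copy expressed through std::copy. If the library lowers this
// to a memmove of sizeof(B) bytes and m_c lives in B's tail padding,
// m_c is overwritten with whatever b2's padding bytes contain.
short m_c_after_std_copy() {
    C c1{};
    c1.m_c = 4;
    B b2{};
    b2.m_a = 5; b2.m_b1 = 6; b2.m_b2 = 7;
    B& b1 = c1;
    std::copy(&b2, &b2 + 1, &b1);
    return c1.m_c;  // may or may not still be 4, depending on the library
}
```

Whether the second function returns 4 depends on the ABI and on the standard library version, so its result is deliberately left unstated.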
Here, = does the right thing (it does not overwrite B's tail padding), but copy() has a library optimization that reduces to memmove(), which does not care about tail padding because it assumes it does not exist.
Here's a similar case as n.m.'s answer.
First, let's have a function which clears a FooBeforeBase:
This is fine: clearBase gets a pointer to FooBeforeBase, and since FooBeforeBase is standard-layout, it assumes that memsetting it is safe.
Now, if you do this:
You don't expect that clearBase will clear b.value, as b.value is not part of FooBeforeBase. But if FooBefore::value was put into the tail padding of FooBeforeBase, it would have been cleared as well.
No, reusing tail padding is not required. It is an optimization that gcc and clang perform.
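The clearBase scenario can be sketched as follows. The body of clearBase and the driver function valueAfterClearBase are assumptions reconstructed from the description, with the types repeated from the question:

```cpp
#include <cassert>
#include <cstring>

struct FooBeforeBase {
    double d;
    bool b[4];
};
struct FooBefore : FooBeforeBase {
    float value;
};

// Zero every byte of a FooBeforeBase, tail padding included.
void clearBase(FooBeforeBase* f) {
    std::memset(f, 0, sizeof(*f));
}

// Hypothetical driver: clear only the base part of a FooBefore and
// report what is left in the derived member.
float valueAfterClearBase() {
    FooBefore b{};
    b.d = 1.5;
    b.b[0] = true;
    b.value = 42.0f;
    clearBase(&b);
    assert(b.d == 0.0 && !b.b[0]);  // the base members were cleared
    return b.value;                 // untouched, since value sits past sizeof(FooBeforeBase)
}
```

Because value is placed after the full sizeof(FooBeforeBase) bytes, the memset cannot reach it; had it been placed in the base's tail padding, the same call would have zeroed it.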
Here is a concrete case which demonstrates why the second case cannot reuse the padding:
This cannot clear bob.b.value. This is undefined behavior.
If the additional data member was placed in the hole, memcpy would have overwritten it.
As is correctly pointed out in comments, the standard doesn't require that this memcpy invocation should work. However the Itanium ABI is seemingly designed with this case in mind. Perhaps the ABI rules are specified this way in order to make mixed-language programming a bit more robust, or to preserve some kind of backwards compatibility.
Relevant ABI rules can be found here.
A related answer can be found here (this question might be a duplicate of that one).
FooBefore is not std-layout either; two classes declare non-static data members (FooBefore and FooBeforeBase). Thus the compiler is allowed to arbitrarily place some data members. Hence the differences on different tool-chains arise. In a std-layout hierarchy, at most one class (either the most derived class or at most one intermediate class) shall declare non-static data members.
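The "at most one class declares data members" rule can be illustrated with a few hypothetical hierarchies (the type names below are made up for this sketch):

```cpp
#include <type_traits>

struct EmptyBase {};
struct DataBase { int x; };

// Only the most derived class declares data: standard-layout.
struct OnlyDerivedHasData : EmptyBase { int y; };

// Only the base class declares data: standard-layout.
struct OnlyBaseHasData : DataBase {};

// Both the base and the derived class declare data: not standard-layout.
struct BothHaveData : DataBase { int y; };

static_assert(std::is_standard_layout<OnlyDerivedHasData>::value, "");
static_assert(std::is_standard_layout<OnlyBaseHasData>::value, "");
static_assert(!std::is_standard_layout<BothHaveData>::value, "");
```

BothHaveData has the same shape as FooBefore in the question, which is why FooBefore fails the trait as well.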