More specifically, a class, inheriting from an empty class, containing just a union whose members include an instance of the base data-less class, takes up more memory than just the union. Why does this happen and is there any way to avoid spending the extra memory?
The following code illustrates my question:
#include <iostream>

class empty_class { };

struct big : public empty_class
{
    union
    {
        int data[3];
        empty_class a;
    };
};

struct small
{
    union
    {
        int data[3];
        empty_class a;
    };
};

int main()
{
    std::cout << sizeof(empty_class) << std::endl;
    std::cout << sizeof(big) << std::endl;
    std::cout << sizeof(small) << std::endl;
}
The output of this code, when compiled with gcc version 7.3.0 using -std=c++17 (I get the same result with c++11 and c++14), is:
1
16
12
I would expect the classes big and small to be the same size; strangely, however, big takes up more memory than small even though both seemingly contain the same data.
Also, even if the size of the array in the union is changed, the difference between the size of big and small is a constant 4 bytes.
-Edit:
It seems as though this behavior is not specific to classes with union data types. Similar behavior occurs in other similar situations where a derived class has a member with the base class type. Thanks to those who pointed this out.
This is because of what I call the "unique identity rule" of C++. Every (live) object in C++ of a particular type T must always have a different address from every other live object of type T. The compiler cannot provide a layout for a type where this rule would be violated, where two distinct subobjects with the same type T would have the same offset in the layout of their containing object.

Class big contains two subobjects of note: a base class empty_class and an anonymous union containing a member empty_class.

The empty base optimization is based on aliasing the "storage" for an empty base class with other types. Typically, this is done by giving it the same address as the parent class, which means the address will typically be the same as the first non-empty base or first member subobject.

If the compiler gave the base class empty_class the same address as the union member, then you would have two distinct subobjects of the class (big::empty_class and big::a) which have the same address but are different objects.

Such a layout would violate the unique identity rule. Therefore, the compiler cannot employ the empty base optimization here. That's also why big is not standard layout.

The union is a red herring here.

If you simplify to a big that derives from an empty class empty and also has a member of type empty, then sizeof(big) must be larger than sizeof(empty). This is because there are two objects of type empty in big and they therefore require different addresses. The empty base optimisation cannot be applied here, as it could be for a small that derives from empty and contains only an int member, where you could expect sizeof(small) to be sizeof(int).