This FAQ is about Aggregates and PODs and covers the following material:
- What are Aggregates?
- What are PODs (Plain Old Data)?
- How are they related?
- How and why are they special?
- What changes for C++11?
What changes for C++11?
Aggregates
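The standard definition of an aggregate has changed slightly, but it's still pretty much the same:
"An aggregate is an array or a class (Clause 9) with no user-provided constructors (12.1), no brace-or-equal-initializers for non-static data members (9.2), no private or protected non-static data members (Clause 11), no base classes (Clause 10), and no virtual functions (10.3)."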
Ok, what changed?
Previously, an aggregate could have no user-declared constructors, but now it can't have user-provided constructors. Is there a difference? Yes, there is, because now you can declare constructors and default them:
This is still an aggregate because a constructor (or any special member function) that is defaulted on the first declaration is not user-provided.
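The difference between "defaulted on the first declaration" and "user-provided" can be sketched as follows. The type names are hypothetical, and because C++11 has no trait for aggregates, the sketch uses `std::is_trivially_default_constructible` as a stand-in: a user-provided default constructor is never trivial, so the trait separates the two cases.

```cpp
#include <type_traits>

// Hypothetical types, for illustration only.
struct DefaultedInClass {
    DefaultedInClass() = default;  // defaulted on first declaration: not user-provided
    int x;
    int y;
};

struct DefaultedLater {
    DefaultedLater();              // declared here, defaulted below: user-provided
    int x;
    int y;
};
DefaultedLater::DefaultedLater() = default;

// A user-provided default constructor is never trivial, so the trait
// tells the two declarations apart.
static_assert(std::is_trivially_default_constructible<DefaultedInClass>::value,
              "defaulted in class: not user-provided");
static_assert(!std::is_trivially_default_constructible<DefaultedLater>::value,
              "defaulted out of class: user-provided");
```

Under the C++11 rule described here, only `DefaultedInClass` remains an aggregate.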
Now an aggregate cannot have any brace-or-equal-initializers for non-static data members. What does this mean? Well, this is just because with this new standard, we can initialize members directly in the class like this:
Using this feature makes the class no longer an aggregate because it's basically equivalent to providing your own default constructor.
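A small sketch of why an in-class initializer behaves like a hand-written default constructor. The type names are hypothetical; the checks only look at the default constructor, not at whether the class is an aggregate.

```cpp
#include <type_traits>

// Hypothetical types, for illustration only.
struct Plain {
    int a;
    int b;
};

struct WithInitializer {
    int a = 1;   // brace-or-equal-initializer
    int b = 2;
};

// The initializers have to run somewhere, so the implicit default
// constructor of WithInitializer is no longer trivial.
static_assert(std::is_trivially_default_constructible<Plain>::value, "");
static_assert(!std::is_trivially_default_constructible<WithInitializer>::value, "");

// Empty braces leave the members with their in-class values.
static_assert(WithInitializer{}.a == 1 && WithInitializer{}.b == 2, "");
```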
So, what an aggregate is didn't change much at all. It's still the same basic idea, adapted to the new features.
What about PODs?
PODs went through a lot of changes. Lots of previous rules about PODs were relaxed in this new standard, and the way the definition is provided in the standard was radically changed.
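The idea of a POD is to capture basically two distinct properties:
- It supports static initialization.
- Compiling a POD in C++ gives you the same memory layout as the equivalent struct compiled in C.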
Because of this, the definition has been split into two distinct concepts: trivial classes and standard-layout classes, because these are more useful than POD. The standard now rarely uses the term POD, preferring the more specific trivial and standard-layout concepts.
The new definition basically says that a POD is a class that is both trivial and has standard-layout, and this property must hold recursively for all non-static data members:
Let's go over each of these two properties in detail separately.
Trivial classes
Trivial is the first property mentioned above: trivial classes support static initialization. If a class is trivially copyable (a superset of trivial classes), it is OK to copy its representation from place to place with things like `memcpy` and expect the result to be the same.
The standard defines a trivial class as follows:
So, what are all those trivial and non-trivial things?
Basically this means that a copy or move constructor is trivial if it is not user-provided, the class has nothing virtual in it, and this property holds recursively for all the members of the class and for the base class.
The definition of a trivial copy/move assignment operator is very similar, simply replacing the word "constructor" with "assignment operator".
A trivial destructor also has a similar definition, with the added constraint that it can't be virtual.
And yet another similar rule exists for trivial default constructors, with the addition that a default constructor is non-trivial if the class has non-static data members with brace-or-equal-initializers, which we've seen above.
Here are some examples to clear everything up:
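A few cases in the spirit of the rules above, checked with `std::is_trivial`. All type names are hypothetical and the list is a sketch, not an exhaustive catalogue.

```cpp
#include <type_traits>

// All names below are hypothetical, for illustration only.

struct Empty {};                        // trivial: nothing to construct or copy

struct Implicit { int x; };             // trivial: every special member is implicit

struct DefaultedCtor : Implicit {       // trivial: the base is trivial and the
    DefaultedCtor() = default;          // constructor is not user-provided
    int y;
};

struct ExtraCtor {                      // trivial: the default constructor is still
    ExtraCtor() = default;              // trivial; other constructors do not matter
    explicit ExtraCtor(int v) : x(v) {}
    int x;
};

struct Virtual {                        // not trivial: has a virtual function
    virtual void f() {}
};

struct UserCtor {                       // not trivial: user-provided default constructor
    UserCtor() : z(42) {}
    int z;
};

struct MemberInit { int w = 0; };       // not trivial: brace-or-equal-initializer

struct UserDtor { ~UserDtor() {} };     // not trivial: user-provided destructor

static_assert(std::is_trivial<Empty>::value, "");
static_assert(std::is_trivial<Implicit>::value, "");
static_assert(std::is_trivial<DefaultedCtor>::value, "");
static_assert(std::is_trivial<ExtraCtor>::value, "");
static_assert(!std::is_trivial<Virtual>::value, "");
static_assert(!std::is_trivial<UserCtor>::value, "");
static_assert(!std::is_trivial<MemberInit>::value, "");
static_assert(!std::is_trivial<UserDtor>::value, "");
```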
Standard-layout
Standard-layout is the second property. The standard mentions that these are useful for communicating with other languages, and that's because a standard-layout class has the same memory layout as the equivalent C struct or union.
This is another property that must hold recursively for members and all base classes. And as usual, no virtual functions or virtual base classes are allowed. That would make the layout incompatible with C.
A relaxed rule here is that standard-layout classes must have all non-static data members with the same access control. Previously these had to be all public, but now you can make them private or protected, as long as they are all private or all protected.
When using inheritance, only one class in the whole inheritance tree can have non-static data members, and the first non-static data member cannot be of a base class type (this could break aliasing rules); otherwise, it's not a standard-layout class.
This is how the definition goes in the standard text:
And let's see a few examples.
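The rules described above can be sketched with `std::is_standard_layout`. All type names are hypothetical; the case of a first member with a base class type is left out of the checks on purpose.

```cpp
#include <type_traits>

// All names below are hypothetical, for illustration only.

struct CStyle { int x; double y; };              // standard-layout

class AllPrivate {                               // standard-layout: one access level
    int a;
    int b;
};

struct MixedAccess {                             // not standard-layout: mixed access
    int a;
private:
    int b;
};

struct EmptyBase {};
struct DerivedFromEmpty : EmptyBase { int x; };  // standard-layout: only one class has data

struct DataBase { int x; };
struct DataInBoth : DataBase { int y; };         // not standard-layout: data in two classes

struct WithVirtual {                             // not standard-layout: virtual function
    virtual void f() {}
    int x;
};

struct UserCtorOnly {                            // standard-layout, yet not trivial
    UserCtorOnly() : x(0) {}
    int x;
};

static_assert(std::is_standard_layout<CStyle>::value, "");
static_assert(std::is_standard_layout<AllPrivate>::value, "");
static_assert(!std::is_standard_layout<MixedAccess>::value, "");
static_assert(std::is_standard_layout<DerivedFromEmpty>::value, "");
static_assert(!std::is_standard_layout<DataInBoth>::value, "");
static_assert(!std::is_standard_layout<WithVirtual>::value, "");
static_assert(std::is_standard_layout<UserCtorOnly>::value, "");
```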
Conclusion
With these new rules a lot more types can be PODs now. And even if a type is not POD, we can take advantage of some of the POD properties separately (if it is only one of trivial or standard-layout).
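Types that have only one of the two properties are easy to construct. The following is a sketch with hypothetical type names; the helper `is_trivial_and_standard_layout` is also an assumption made for illustration, spelling the C++11 POD condition with the two narrower traits.

```cpp
#include <type_traits>

// All names below are hypothetical, for illustration only.

struct TrivialOnly {            // trivial, but mixed access control
    int a;
private:
    int b;
};

struct StandardLayoutOnly {     // standard-layout, but user-provided constructor
    StandardLayoutOnly() : a(0) {}
    int a;
};

struct Both { int a; int b; };  // trivial and standard-layout

// Helper: the C++11 POD condition expressed through the two properties.
template <typename T>
struct is_trivial_and_standard_layout
    : std::integral_constant<bool, std::is_trivial<T>::value &&
                                   std::is_standard_layout<T>::value> {};

static_assert(std::is_trivial<TrivialOnly>::value, "");
static_assert(!std::is_standard_layout<TrivialOnly>::value, "");

static_assert(!std::is_trivial<StandardLayoutOnly>::value, "");
static_assert(std::is_standard_layout<StandardLayoutOnly>::value, "");

static_assert(is_trivial_and_standard_layout<Both>::value, "");
static_assert(!is_trivial_and_standard_layout<TrivialOnly>::value, "");
static_assert(!is_trivial_and_standard_layout<StandardLayoutOnly>::value, "");
```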
The standard library has traits to test these properties in the header `<type_traits>`: `std::is_pod`, `std::is_trivial`, `std::is_trivially_copyable`, and `std::is_standard_layout`.
How to read:
This article is rather long. If you want to know about both aggregates and PODs (Plain Old Data) take time and read it. If you are interested just in aggregates, read only the first part. If you are interested only in PODs then you must first read the definition, implications, and examples of aggregates and then you may jump to PODs but I would still recommend reading the first part in its entirety. The notion of aggregates is essential for defining PODs. If you find any errors (even minor, including grammar, stylistics, formatting, syntax, etc.) please leave a comment, I'll edit.
What are aggregates and why they are special
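Formal definition from the C++ standard (C++03 8.5.1 §1):
"An aggregate is an array or a class (clause 9) with no user-declared constructors (12.1), no private or protected non-static data members (clause 11), no base classes (clause 10), and no virtual functions (10.3)."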
So, OK, let's parse this definition. First of all, any array is an aggregate. A class can also be an aggregate if… wait! Nothing is said about structs or unions, can't they be aggregates? Yes, they can. In C++, the term `class` refers to all classes, structs, and unions. So, a class (or struct, or union) is an aggregate if and only if it satisfies the criteria from the above definition. What do these criteria imply?
- This does not mean an aggregate class cannot have constructors; in fact, it can have a default constructor and/or a copy constructor as long as they are implicitly declared by the compiler, and not explicitly by the user.
- No private or protected non-static data members. You can have as many private and protected member functions (but not constructors), as well as as many private or protected static data members and member functions, as you like and not violate the rules for aggregate classes.
- An aggregate class can have a user-declared/user-defined copy-assignment operator and/or destructor.
- An array is an aggregate even if it is an array of non-aggregate class type.
Now let's look at some examples:
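A sketch of classes on both sides of the definition, written against the C++03 rules discussed here. All names are hypothetical.

```cpp
#include <cassert>

// All names below are hypothetical, for illustration only.

class NotAggregate1 {            // has a virtual function
public:
    virtual void f() {}
};

class NotAggregate2 {            // x is private by default in a class
    int x;
};

class NotAggregate3 {            // has a user-declared constructor
public:
    NotAggregate3(int) {}
};

class Aggregate1 {
public:
    NotAggregate1 member1;       // members may be of non-aggregate type
    int value;
    Aggregate1& operator=(const Aggregate1& rhs) {  // user-defined assignment is allowed
        value = rhs.value;
        return *this;
    }
private:
    void helper() {}             // private member functions are allowed
};

// Brace initialization works member by member, in declaration order.
inline int aggregate_demo() {
    Aggregate1 a = {NotAggregate1(), 5};
    assert(a.value == 5);
    return a.value;
}
```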
You get the idea. Now let's see how aggregates are special. They, unlike non-aggregate classes, can be initialized with curly braces `{}`. This initialization syntax is commonly known for arrays, and we just learnt that these are aggregates. So, let's start with them.
`Type array_name[n] = {a1, a2, …, am};`
- if (m == n): the ith element of the array is initialized with ai
- else if (m < n): the first m elements of the array are initialized with a1, a2, …, am and the other `n - m` elements are, if possible, value-initialized (see below for the explanation of the term)
- else if (m > n): the compiler will issue an error
- else (this is the case when n isn't specified at all, like `int a[] = {1, 2, 3};`): the size of the array (n) is assumed to be equal to m, so `int a[] = {1, 2, 3};` is equivalent to `int a[3] = {1, 2, 3};`
When an object of scalar type (`bool`, `int`, `char`, `double`, pointers, etc.) is value-initialized it means it is initialized with `0` for that type (`false` for `bool`, `0.0` for `double`, etc.). When an object of class type with a user-declared default constructor is value-initialized its default constructor is called. If the default constructor is implicitly defined then all non-static members are recursively value-initialized. This definition is imprecise and a bit incorrect but it should give you the basic idea. A reference cannot be value-initialized. Value-initialization for a non-aggregate class can fail if, for example, the class has no appropriate default constructor.
Examples of array initialization:
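The cases listed above can be sketched as follows. The class names `NoDefault` and `HasDefault` are hypothetical, and the ill-formed lines are kept as comments so that the block compiles.

```cpp
#include <cassert>

struct NoDefault {               // hypothetical: no default constructor
    NoDefault(int v) : value(v) {}
    int value;
};

struct HasDefault {              // hypothetical: user-declared default constructor
    HasDefault() : value(7) {}
    int value;
};

inline void array_examples() {
    int full[3] = {1, 2, 3};               // m == n
    int partial[5] = {1, 2};               // m < n: partial[2..4] are value-initialized to 0
    int deduced[] = {1, 2, 3};             // n omitted: the size is deduced as 3
    bool flags[4] = {};                    // m == 0: every element is false
    // int tooMany[2] = {1, 2, 3};         // error: m > n

    NoDefault a[2] = {NoDefault(1), NoDefault(2)};  // OK: m == n
    // NoDefault b[2] = {NoDefault(1)};    // error: b[1] cannot be value-initialized
    HasDefault c[3] = {HasDefault()};      // c[1] and c[2] use the default constructor

    assert(full[2] == 3);
    assert(partial[1] == 2 && partial[4] == 0);
    assert(sizeof(deduced) / sizeof(deduced[0]) == 3);
    assert(!flags[3]);
    assert(a[1].value == 2);
    assert(c[2].value == 7);
}
```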
Now let's see how aggregate classes can be initialized with braces. Pretty much the same way. Instead of the array elements we will initialize the non-static data members in the order of their appearance in the class definition (they are all public by definition). If there are fewer initializers than members, the rest are value-initialized. If it is impossible to value-initialize one of the members which were not explicitly initialized, we get a compile-time error. If there are more initializers than necessary, we get a compile-time error as well.
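A sketch of such an initialization. The names `X`, `Y`, `y`, and `d` are chosen to line up with the surrounding discussion; the listing itself is an assumption, and `check_y` is a hypothetical helper added only to make the resulting values visible.

```cpp
#include <cassert>

struct X {
    int i1;
    int i2;
};

struct Y {
    char c;
    X x;
    int i[2];
    float f;
protected:
    static double d;   // static members take no part in aggregate initialization
private:
    void g() {}        // member functions are allowed too
};

double Y::d = 0.0;     // a static member is defined (and initialized) separately

// Fewer initializers than members: f is value-initialized.
Y y = {'a', {10, 20}, {20, 30}};

// Hypothetical helper that checks each member of y.
inline void check_y() {
    assert(y.c == 'a');
    assert(y.x.i1 == 10 && y.x.i2 == 20);
    assert(y.i[0] == 20 && y.i[1] == 30);
    assert(y.f == 0.0f);
}
```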
In the above example `y.c` is initialized with `'a'`, `y.x.i1` with `10`, `y.x.i2` with `20`, `y.i[0]` with `20`, `y.i[1]` with `30` and `y.f` is value-initialized, that is, initialized with `0.0`. The protected static member `d` is not initialized at all, because it is `static`.
Aggregate unions are different in that you may initialize only their first member with braces. I think that if you are advanced enough in C++ to even consider using unions (their use may be very dangerous and must be thought of carefully), you could look up the rules for unions in the standard yourself :).
Now that we know what's special about aggregates, let's try to understand the restrictions on classes; that is, why they are there. We should understand that memberwise initialization with braces implies that the class is nothing more than the sum of its members. If a user-defined constructor is present, it means that the user needs to do some extra work to initialize the members, and therefore brace initialization would be incorrect. If virtual functions are present, it means that the objects of this class have (on most implementations) a pointer to the so-called vtable of the class, which is set in the constructor, so brace-initialization would be insufficient. You could figure out the rest of the restrictions in a similar manner as an exercise :).
So, enough about the aggregates. Now we can define a stricter set of types, to wit, PODs.
What are PODs and why they are special
Formal definition from the C++ standard (C++03 9 §4):
Wow, this one's tougher to parse, isn't it? :) Let's leave unions out (on the same grounds as above) and rephrase in a bit clearer way:
What does this definition imply? (Did I mention POD stands for Plain Old Data?)
Examples:
POD-classes, POD-unions, scalar types, and arrays of such types are collectively called POD-types.
PODs are special in many ways. I'll provide just some examples.
POD-classes are the closest to C structs. Unlike them, PODs can have member functions and arbitrary static members, but neither of these two change the memory layout of the object. So if you want to write a more or less portable dynamic library that can be used from C and even .NET, you should try to make all your exported functions take and return only parameters of POD-types.
The lifetime of objects of non-POD class type begins when the constructor has finished and ends when the destructor has finished. For POD classes, the lifetime begins when storage for the object is occupied and finishes when that storage is released or reused.
For objects of POD types it is guaranteed by the standard that when you `memcpy` the contents of your object into an array of char or unsigned char, and then `memcpy` the contents back into your object, the object will hold its original value. Do note that there is no such guarantee for objects of non-POD types. Also, you can safely copy POD objects with `memcpy`. The following example assumes T is a POD-type:
goto statement. As you may know, it is illegal (the compiler should issue an error) to make a jump via goto from a point where some variable was not yet in scope to a point where it is already in scope. This restriction applies only if the variable is of non-POD type or is declared with an initializer. In the following example `f()` is ill-formed whereas `g()` is well-formed. Note that Microsoft's compiler is too liberal with this rule; it just issues a warning in both cases.
It is guaranteed that there will be no padding in the beginning of a POD object. In other words, if a POD-class A's first member is of type T, you can safely `reinterpret_cast` from `A*` to `T*` and get the pointer to the first member and vice versa.
The list goes on and on…
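Two of the guarantees above, the `memcpy` round trip and the pointer to the first member, can be sketched as follows. The type `Packet` and both helper functions are hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical POD type.
struct Packet {
    int id;
    char tag;
    double payload;
};

// Round trip through a raw byte buffer: the object gets its original value back.
inline Packet round_trip(const Packet& original) {
    unsigned char buffer[sizeof(Packet)];
    std::memcpy(buffer, &original, sizeof(Packet));

    Packet restored;
    std::memcpy(&restored, buffer, sizeof(Packet));
    return restored;
}

// No padding before the first member: a Packet* can be viewed as an int*.
inline int* first_member(Packet* p) {
    return reinterpret_cast<int*>(p);
}

inline void pod_demo() {
    Packet p = {42, 'x', 1.5};
    Packet q = round_trip(p);
    assert(q.id == 42 && q.tag == 'x' && q.payload == 1.5);
    assert(first_member(&p) == &p.id);
}
```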
Conclusion
It is important to understand what exactly a POD is because many language features, as you see, behave differently for them.
I'll try:
That's simple: all non-static data members must all be `public`, `private`, or `protected`. You can't have some `public` and some `private`.
The reasoning for them goes to the reasoning for having a distinction between "standard layout" and "not standard layout" at all. Namely, to give the compiler the freedom to choose how to put things into memory. It's not just about vtable pointers.
Back when they standardized C++ in 98, they had to basically predict how people would implement it. While they had quite a bit of implementation experience with various flavors of C++, they weren't certain about things. So they decided to be cautious: give the compilers as much freedom as possible.
That's why the definition of POD in C++98 is so strict. It gave C++ compilers great latitude on member layout for most classes. Basically, POD types were intended to be special cases, something you specifically wrote for a reason.
When C++11 was being worked on, they had a lot more experience with compilers. And they realized that... C++ compiler writers are really lazy. They had all this freedom, but they didn't do anything with it.
The rules of standard layout are more or less codifying common practice: most compilers didn't really have to change much if anything at all to implement them (outside of maybe some stuff for the corresponding type traits).
Now, when it came to `public`/`private`, things are different. The freedom to reorder which members are `public` vs. `private` actually can matter to the compiler, particularly in debugging builds. And since the point of standard layout is that there is compatibility with other languages, you can't have the layout be different in debug vs. release.
Then there's the fact that it doesn't really hurt the user. If you're making an encapsulated class, odds are good that all of your data members will be `private` anyway. You generally don't expose public data members on fully encapsulated types. So this would only be a problem for those few users who do want to do that, who want that division.
So it's no big loss.
The reason for this one comes back to why they standardized standard layout again: common practice.
There's no common practice when it comes to having two members of an inheritance tree that actually store things. Some put the base class before the derived, others do it the other way. Which way do you order the members if they come from two base classes? And so on. Compilers diverge greatly on these questions.
Also, thanks to the zero/one/infinity rule, once you say you can have two classes with members, you can say as many as you want. This requires adding a lot of layout rules for how to handle this. You have to say how multiple inheritance works, which classes put their data before other classes, etc. That's a lot of rules, for very little material gain.
You can't make everything that doesn't have virtual functions and a default constructor standard layout.
I can't really speak to this one. I'm not educated enough in C++'s aliasing rules to really understand it. But it has something to do with the fact that the base member will share the same address as the base class itself. That is:
And that's probably against C++'s aliasing rules. In some way.
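One plausible way to picture the concern, offered as a sketch rather than as the committee's stated rationale: two distinct objects of the same type cannot share an address, so the `Base` subobject and a first member of type `Base` have to be placed apart, which conflicts with the first member living at the very start of the object. The helper names are hypothetical.

```cpp
#include <cassert>

struct Base {};                  // empty: no non-static data members

struct Derived : Base {
    Base b;                      // first member has the base class's type
    int x;
};

// The Base subobject and the member `b` are distinct objects of the same
// type, so they must live at different addresses.
inline bool subobjects_are_distinct(Derived& d) {
    Base* as_base = &d;          // address of the base class subobject
    Base* as_member = &d.b;      // address of the first data member
    return as_base != as_member;
}

inline void layout_demo() {
    Derived d;
    assert(subobjects_are_distinct(d));
}
```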
However, consider this: how useful could having the ability to do this ever actually be? Since only one class can have non-static data members, then `Derived` must be that class (since it has a `Base` as a member). So `Base` must be empty (of data). And if `Base` is empty, as well as a base class... why have a data member of it at all?
Since `Base` is empty, it has no state. So any non-static member functions will do what they do based on their parameters, not their `this` pointer.
So again: no big loss.
Changes in C++17
Download the C++17 International Standard final draft here.
Aggregates
C++17 expands and enhances aggregates and aggregate initialization. The standard library also now includes an `std::is_aggregate` type trait class. Here is the formal definition from section 11.6.1.1 and 11.6.1.2 (internal references elided):
What changed?
Trivial Classes
The definition of trivial class was reworked in C++17 to address several defects that were not addressed in C++14. The changes were technical in nature. Here is the new definition at 12.0.6 (internal references elided):
Changes:
- Before C++17, a class with all of its copy/move constructors and assignment operators defined as deleted could still be trivially copyable, and so could legally be copied with `std::memcpy`. This was a semantic contradiction, because, by defining as deleted all constructor/assignment operators, the creator of the class clearly intended that the class could not be copied/moved, yet the class still met the definition of a trivially copyable class. Hence in C++17 we have a new clause stating that a trivially copyable class must have at least one trivial, non-deleted (though not necessarily publicly accessible) copy/move constructor/assignment operator. See N4148, DR1734.
Standard-layout Classes
The definition of standard-layout was also reworked to address defect reports. Again the changes were technical in nature. Here is the text from the standard (12.0.7). As before, internal references are elided:
Changes:
Note: The C++ standards committee intended the above changes based on defect reports to apply to C++14, though the new language is not in the published C++14 standard. It is in the C++17 standard.
What will change for C++20
This is still early, so some of this answer may change in the future. In keeping with the theme of this question, the meaning and use of aggregates continue to change with every standard. There are several key changes on the horizon.
Types with user-declared constructors P1008
In C++17, this type is still an aggregate:
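A minimal type with this property, sketched under the assumption that its constructor is declared as deleted. The name `X` follows the text; `make_x` is hypothetical, and the preprocessor guard keeps the block compiling under newer standards where the rule changes.

```cpp
#include <type_traits>

// Stand-in for the type under discussion (an assumption, not the original listing).
struct X {
    X() = delete;   // user-declared, but not user-provided
};

// The deleted constructor does block ordinary default construction...
static_assert(!std::is_default_constructible<X>::value, "X() is deleted");

#if __cplusplus <= 201703L
// ...but up to C++17, X{} is aggregate initialization and never calls it.
inline X make_x() { return X{}; }
#endif
```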
And hence, `X{}` still compiles because that is aggregate initialization - not a constructor invocation. See also: When is a private constructor not a private constructor?
In C++20, the restriction will change from requiring:
"no user-provided, explicit, or inherited constructors"
to
"no user-declared or inherited constructors"
This has been adopted into the C++20 working draft. Neither the `X` here nor the `C` in the linked question will be aggregates in C++20.
Initializing aggregates from a parenthesized list of values P960
A common issue that comes up is wanting to use `emplace()`-style constructors with aggregates:
This does not work, because `emplace` will try to effectively perform the initialization `X(1, 2)`, which is not valid. The typical solution is to add a constructor to `X`, but with this proposal (currently working its way through Core), aggregates will effectively have synthesized constructors which do the right thing - and behave like regular constructors. The above code will compile as-is in C++20 (assuming this feature gets approved, which seems likely).
Class Template Argument Deduction (CTAD) for Aggregates P1021
In C++17, this does not compile:
Users would have to write their own deduction guide for all aggregate templates:
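A sketch of such a guide, assuming a hypothetical aggregate template `Point` and a C++17 (or later) compiler. The function `make_point` is likewise made up for illustration.

```cpp
#include <type_traits>

// Hypothetical aggregate template.
template <typename T>
struct Point {
    T x;
    T y;
};

// User-provided deduction guide: needed in C++17 for `Point p{1, 2};`.
template <typename T>
Point(T, T) -> Point<T>;

inline auto make_point() {
    Point p{1, 2};   // deduces Point<int> through the guide above
    return p;
}

static_assert(std::is_same<decltype(make_point()), Point<int>>::value,
              "deduced as Point<int>");
```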
But as this is in some sense "the obvious thing" to do, and is basically just boilerplate, the language will do this for you. This change was approved by Evolution in November 2018, so the above example will likely compile in C++20 (without the need for the user-provided deduction guide).
What has changed for C++14
We can refer to the Draft C++14 standard for reference.
Aggregates
This is covered in section `8.5.1` Aggregates which gives us the following definition:
The only change is now adding in-class member initializers does not make a class a non-aggregate. So the following example from C++11 aggregate initialization for classes with member in-place initializers:
was not an aggregate in C++11 but it is in C++14. This change is covered in N3605: Member initializers and aggregates, which has the following abstract:
POD stays the same
The definition for POD (plain old data) struct is covered in section `9` Classes which says:
which is the same wording as C++11.
Standard-Layout Changes for C++14
As noted in the comments, POD relies on the definition of standard-layout, and that did change for C++14, but this was via defect reports that were applied to C++14 after the fact.
There were three DRs:
So standard-layout went from this pre-C++14:
To this in C++14: