In the c++ standard, in [basic.lval]/11.6 says:
If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:[...]
- an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),[...]
This sentence is part of the strict-aliasing rule.
Can it allow us to access the inactive member of a non existing union? As in:
struct A{
int id :1;
int value :32;
};
struct Id{
int id :1;
};
union X{
A a;
Id id_;
};
void test(){
A a;
auto id = reinterpret_cast<X&>(a).id_; //UB or not?
}
Note: Bellow an explanation of what I do not grasp in the standard, and why the example above could be useful.
I wonder in what could [basic.lval]/11.6 be usefull.
[class.mfct.non-static]/2 does forbid us to call a member function of the "casted to" union or aggregate:
If a non-static member function of a class X is called for an object that is not of type X, or of a type derived from X, the behavior is undefined.
Considering that static data member access, or static member function can directly be performed using a qualified-name (a_class::a_static_member
),
the only usefull uses case of the [basic.lval]/11.6, may be to access member of the "casted to" union. I thought about using this last standard rule to implement an "optimized variant". This variant could hold either a class A object or a class B object, the two starting with a bitfield of size 1, denoting the type:
class A{
unsigned type_id_ :1;
int value :31;
public:
A():type_id_{0}{}
void bar{};
void baz{};
};
class B{
unsigned type_id_ :1;
int value :31;
public:
B():type_id_{1}{}
int value() const;
void value(int);
void bar{};
void baz{};
};
struct type_id_t{
unsigned type_id_ :1;
};
struct AB_variant{
union {
A a;
B b;
type_id_t id;};
//[...]
static void foo(AB_variant& x){
if (x.id.type_id_==0){
reinterpret_cast<A&>(x).bar();
reinterpret_cast<A&>(x).baz();
}
else if (x.id.type_id_==1){
reinterpret_cast<B&>(x).bar();
reinterpret_cast<B&>(x).baz();
}
}
};
The call to AB_variant::foo
does not invoke undefined behavior as long as its argument refers to an object of type AB_variant
thanks to the rule of pointer-interconvertibility [basic.compound]/4. The access to the inactive union member type_id_
is allowed because id
belongs to the common initial sequence of A
, B
and type_id_t
[class.mem]/25:
But what happens if I try to call it with a complete object of type A
?
A a{};
AB_variant::foo(reinterpret_cast<AB_variant&>(a));
The problem here is that I try to access an inactive member of a union that does not exist.
The two pertinent standard paragraphs are [class.mem]/25:
In a standard-layout union with an active member of struct type T1, it is permitted to read a non-static data member m of another union member of struct type T2 provided m is part of the common initial sequence of T1 and T2; the behavior is as if the corresponding member of T1 were nominated.
And [class.union]/1:
In a union, a non-static data member is active if its name refers to an object whose lifetime has begun and has not ended.
Q3: Does the expression "its name refers" signify that "an object" is actually an object built within a living union? Or could it refers to object a
because of [basic.lval]/11.6.
There are many situations, especially involving type punning and unions, where one part of the C or C++ Standard describes the behavior of some action, another part describes an overlapping class of actions as invoking UB, and the area of overlap includes some actions which should be processed consistently by all implementations as well as others that would be impractical to support on at least some implementations. Rather than trying to fully describe all cases that should be treated as defined, the authors of the Standard expected that implementations would seek to uphold the Spirit of C described in the Rationale, including the principle "Don't prevent the programmer from doing what needs to be done". This would generally lead to quality implementations giving priority to the definition of behavior when necessary to meet their customer's needs, while giving priority to the "undefinition" of behavior when that would allow optimizations that also serve their customer's needs.
The only way to treat the C or C++ Standard as defining a useful language is to recognize a category of actions whose behavior is described by one part of the Standard and classified as UB by another, and recognizing the treatment of actions in that category as a Quality of Implementation issue outside the jurisdiction of the Standard. The authors of the Standard expected compiler writers to be sensitive to their customers' needs, and thus didn't see conflicts between behavioral definitions and undefinitions as a particular problem. They thus saw no need to define terms like "object", "lvalue", "lifetime", and "access" in ways that could be applied consistently without creating such conflicts, and the definitions they created are thus not usable for purposes of deciding whether or not particular actions should be defined when such conflicts exist.
Consequently, unless or until the Standards recognize more concepts associated with objects and ways of accessing them, the question of whether a quality implementation intended to be suitable for some purpose should be expected to support a certain action will depend upon whether its authors should be expected to recognize that the action would be useful for such purpose.
[expr.ref]/4.2 defines what
E1.E2
means ifE2
is a non-static data member:This defines behavior only for the case where the first expression actually designates an object. Since in your example the first expression designates no object, the behavior is undefined by omission; see [defns.undefined] ("Undefined behavior may be expected when this document omits any explicit definition of behavior...").
You are also misinterpreting what "access" means in the strict aliasing rule. It means "read or modify the value of an object" ([defns.access]). A class member access expression naming a non-static data member neither reads nor modifies the value of any object and therefore is not an "access", and therefore there's never an "access ... through" a glvalue of "an aggregate or union type" by reason of a class member access expression.
[basic.lval]/11.6 is essentially copied from C, where it actually meant something because assigning or copying a
struct
orunion
accesses the object as a whole. It's meaningless in C++ because assignment and copying of class types are performed through special member functions that either performs memberwise copying (and so "accesses" the members individually) or operates on the object representation. See core issue 2051.