Following a question asked here earlier today and a multitude of similarly themed questions, I'm here to ask about this problem from the standard's viewpoint.
struct Base
{
    int member;
};
struct Derived : Base
{
    int another_member;
};
int main()
{
    Base* p = new Derived[10]; // (1)
    p[1].member = 42;          // (2)
    delete[] p;                // (3)
}
According to the standard, (1) is well-formed, because Derived* (which is the result of the new-expression) can be implicitly converted to Base* (C++11 draft, §4.10/3):
A prvalue of type “pointer to cv D”, where D is a class type, can be
converted to a prvalue of type “pointer to cv B”, where B is a base
class (Clause 10) of D. If B is an inaccessible (Clause 11) or
ambiguous (10.2) base class of D, a program that necessitates this
conversion is ill-formed. The result of the conversion is a pointer to
the base class subobject of the derived class object. The null pointer
value is converted to the null pointer value of the destination type.
(3) leads to undefined behaviour because of §5.3.5/3:
In the first alternative (delete object), if the static type of the
object to be deleted is different from its dynamic type, the static
type shall be a base class of the dynamic type of the object to be
deleted and the static type shall have a virtual destructor or the
behavior is undefined. In the second alternative (delete array) if the
dynamic type of the object to be deleted differs from its static type,
the behavior is undefined.
Is (2) legal according to the standard, or does it lead to an ill-formed program or undefined behaviour?
edit: Better wording
If you look at the expression p[1], p is a Base* (Base is a completely-defined type) and 1 is an int, so according to ISO/IEC 14882:2003 5.2.1 [expr.sub] this expression is valid and identical to *((p)+(1)).
From 5.7 [expr.add] / 5, when an integer is added to a pointer, the result is only well defined when the pointer points to an element of an array object and the result of the pointer arithmetic also points to an element of that array object or one past the end of the array. p, however, does not point to an element of an array object; it points at the base class sub-object of a Derived object. It is the Derived object that is an array member, not the Base sub-object.
Note that under 5.7 / 4, for the purposes of the addition operator, the Base sub-object can be treated as an array of size one, so technically you can form the address p + 1, but as a "one past the last element" pointer, it doesn't point at a Base object and attempting to read from or write to it will cause undefined behavior.
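For contrast, here is a minimal sketch of one way to stay within those rules: index the array through a pointer of its real element type, and only then convert the address of a single element to Base*. The helper name set_member_via_derived_index is hypothetical and exists only for this illustration; the array size of 10 is taken from the question.

```cpp
#include <cassert>
#include <cstddef>

struct Base { int member; };
struct Derived : Base { int another_member; };

// Hypothetical helper: writes `value` to the Base part of element `index`
// and returns what is then stored there.
int set_member_via_derived_index(std::size_t index, int value)
{
    Derived* d = new Derived[10]();  // value-initialized, so members start at 0
    assert(index < 10);

    Base* b = &d[index];             // arithmetic on Derived*, then convert one element
    b->member = value;               // b points at a real Base sub-object

    int result = d[index].member;    // read back through the Derived element
    delete[] d;                      // static and dynamic element types match
    return result;
}
```

Calling set_member_via_derived_index(1, 42) writes to the second Derived element's Base sub-object, which is presumably what (2) was meant to do.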
(3) leads to undefined behaviour, but it is not ill-formed strictly speaking. Ill-formed means that a C++ program is not constructed according to the syntax rules, diagnosable semantic rules, and the One Definition Rule.
Same for (2), it is well-formed, but it does not do what you have probably expected. According to §8.3.4/6:
Except where it has been declared for a class (13.5.5), the subscript operator [] is interpreted in such a way
that E1[E2] is identical to *((E1)+(E2)). Because of the conversion rules that apply to +, if E1 is an
array and E2 an integer, then E1[E2] refers to the E2-th member of E1. Therefore, despite its asymmetric
appearance, subscripting is a commutative operation.
So in (2) you get the address that lies sizeof(Base)*1 bytes past p, when you probably wanted the address that lies sizeof(Derived)*1 bytes past p.
The standard doesn't disallow (2), but it's dangerous nevertheless.
The problem is that doing p[1] means adding sizeof(Base) bytes to the base address p, and using the data at that memory location as an instance of Base. But chances are very high that sizeof(Base) is smaller than sizeof(Derived), so you'll be interpreting a block of memory starting in the middle of a Derived object as a Base object.
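The size mismatch can be made concrete without ever forming the problematic pointer. The following is only a sketch using the question's two structs; the function names base_step_offset and derived_element_offset are made up for this illustration, and the exact sizes depend on the implementation.

```cpp
#include <cassert>
#include <cstddef>

struct Base { int member; };
struct Derived : Base { int another_member; };

// Byte offset that indexing through a Base* would step to for `index`.
std::size_t base_step_offset(std::size_t index)
{
    return index * sizeof(Base);
}

// Byte offset at which element `index` of a Derived array actually starts.
std::size_t derived_element_offset(std::size_t index)
{
    return index * sizeof(Derived);
}
```

On a typical implementation base_step_offset(1) is smaller than derived_element_offset(1), so the location reached through the Base* falls inside the first Derived element rather than at the start of the second one.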
More information in C++ FAQ Lite 21.4.
p[1].member = 42; is well-formed. The static type of *p is Base and its dynamic type is Derived. p[1] is equivalent to *(p+1), which looks valid and appears to refer to the second element of an array of Base.
However, the elements of the array are in fact of type Derived. The code p[1].member = 42; shows that you think you are referring to an array element of type Base.