What makes the use of an invalid object (reference, pointer, whatever) undefined behaviour is lvalue-to-rvalue conversion (§4.1):
If the object to which the glvalue refers is not an object of type T and is not an object of a type derived from T, or if the object is uninitialized, a program that necessitates this conversion has undefined behavior.
Assuming we haven't overloaded operator&, the unary & operator takes an lvalue as its operand, so no conversion occurs. Having just an identifier, as in x; also requires no conversion. You will only get undefined behaviour when the reference is used as an operand in an expression that expects that operand to be an rvalue - which is the case for most operators. The point is, doing &x doesn't actually require accessing the value of x. Lvalue-to-rvalue conversion occurs with those operators that need to access its value.
I believe your code is well defined.
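To illustrate the point with something that is unambiguously well-defined, here is a minimal sketch that uses an uninitialized int instead of a dangling reference (the function name is made up for this example). Taking the address does not read the value, so no lvalue-to-rvalue conversion happens until the final return, and by then the object has been given a value through the pointer.

```cpp
// Sketch: the built-in unary & does not read the value of its operand.
int init_through_address() {
    int x;        // uninitialized: reading x here would be undefined behaviour
    int* p = &x;  // x is only used as an lvalue, its value is never accessed
    *p = 7;       // give the object a value through the pointer
    return x;     // lvalue-to-rvalue conversion happens here, now on a valid value
}
```

If `return x;` came before the assignment through `p`, the conversion would be applied to an uninitialized object, which is the case the quote above describes.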
When operator& has been overloaded, the expression &x is transformed into a function call and does not obey the rules of the built-in operators - instead it follows the rules of a function call. For &x, the translation to function call results in either x.operator&() or operator&(x). In the first case, lvalue-to-rvalue conversion will occur on x when the class member access operator is used. In the second case, the argument of operator& will be copy-initialised with x (as in T arg = x), and the behaviour of this depends on the type of the argument. For example, in the case of the argument being an lvalue reference, there is no undefined behaviour because lvalue-to-rvalue conversion does not occur.
So if operator& is overloaded for the type of x, the code may or may not be well-defined, depending on the calling of the operator& function.
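The rewriting into a function call can be sketched like this. WithMember and WithFree are hypothetical types invented for the example, and the objects are alive the whole time, so nothing here is undefined; it only shows which function ends up being called for &x in each case.

```cpp
#include <memory>

// Hypothetical type with a member overload: &a is rewritten to a.operator&().
struct WithMember {
    int calls = 0;
    WithMember* operator&() { ++calls; return this; }
};

// Hypothetical type with a non-member overload: &b is rewritten to operator&(b).
struct WithFree {
    int calls = 0;
};

// The parameter is an lvalue reference, so binding it does not read the object.
// std::addressof is used inside because writing &w here would call this
// overload again and recurse forever.
WithFree* operator&(WithFree& w) {
    ++w.calls;
    return std::addressof(w);
}

bool overloads_were_called() {
    WithMember a;
    WithFree b;
    WithMember* pa = &a;  // calls a.operator&()
    WithFree* pb = &b;    // calls operator&(b)
    return a.calls == 1 && b.calls == 1 &&
           pa == std::addressof(a) && pb == std::addressof(b);
}
```

Note that both overloads touch a data member in their bodies, so whether calling them on an invalid object is defined depends on what the overload actually does, not only on how it is called.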
You could argue that the unary & operator relies on there being at least some valid region of storage that you have the address of:
Otherwise, if the type of the expression is T, the result has type "pointer to T" and is a prvalue that is the address of the designated object
And an object is defined as being a region of storage. After the object that is referred to is destroyed, that region of storage no longer exists.
I prefer to believe that it will only result in undefined behaviour if the invalid object is actually accessed. The reference still believes it's referring to some object and it can happily give the address of it even if it doesn't exist. However, this seems to be an ill-specified part of the standard.
Aside
As an example of undefined behaviour, consider x + x. Now we hit another ill-specified part of the standard. The value category of the operands of + is not specified. It is generally inferred from §5/8 that if it is not specified, then the operator expects a prvalue:
Whenever a glvalue expression appears as an operand of an operator that expects a prvalue for that operand, the lvalue-to-rvalue (4.1), array-to-pointer (4.2), or function-to-pointer (4.3) standard conversions are applied to convert the expression to a prvalue.
Now because x is an lvalue, the lvalue-to-rvalue conversion is required and we get undefined behaviour. This makes sense because addition requires accessing the value of x so it can work out the result.
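The value categories involved can be checked at compile time. This is only a sketch with a live, initialized object (the function name is an assumption for the example); the static_asserts show the categories of the expressions, they do not by themselves prove where the conversion is applied.

```cpp
#include <type_traits>

int sum_of_live_reference() {
    int object = 1;
    int& x = object;  // x refers to a live, initialized object

    // decltype((x)) is int& because the expression x is an lvalue.
    static_assert(std::is_lvalue_reference<decltype((x))>::value,
                  "x is an lvalue");
    // The built-in + yields a prvalue, so decltype gives a non-reference type.
    static_assert(!std::is_reference<decltype(x + x)>::value,
                  "x + x is a prvalue");
    // The built-in & takes the lvalue as-is and yields a pointer.
    static_assert(std::is_same<decltype(&x), int*>::value,
                  "&x has type int*");

    return x + x;  // the value of x is read for each operand
}
```

With a valid object the addition is fine; it is the same read of x that becomes undefined once the referred-to object is gone.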
Supposing that x was initialized with a valid object, which was then destroyed, §3.8/6 applies:
Similarly, before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any glvalue that refers to the original object may be used but only in limited ways. For an object under construction or destruction, see 12.7. Otherwise, such a glvalue refers to allocated storage (3.7.4.2), and using the properties of the glvalue that do not depend on its value is well-defined. The program has undefined behavior if:
- an lvalue-to-rvalue conversion (4.1) is applied to such a glvalue,
- the glvalue is used to access a non-static data member or call a non-static member function of the object, or
- the glvalue is bound to a reference to a virtual base class (8.5.3), or
- the glvalue is used as the operand of a dynamic_cast (5.2.7) or as the operand of typeid.
So, simply taking the address is well-defined, and (referring to the neighboring paragraphs) can even be productively used to create a new object in place of the old one.
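Here is a minimal sketch of that in-place re-creation, assuming the storage stays allocated after the object's lifetime ends (Widget and the function name are hypothetical). The address is taken through the reference between the destructor call and the placement new, which is the window §3.8/6 talks about.

```cpp
#include <new>

// Hypothetical type with a non-trivial destructor, so the explicit
// destructor call below really ends the object's lifetime.
struct Widget {
    int value;
    explicit Widget(int v) : value(v) {}
    ~Widget() {}
};

int recreate_in_place() {
    // Raw storage that outlives both Widget objects created in it.
    alignas(Widget) unsigned char storage[sizeof(Widget)];

    Widget* p = new (storage) Widget(1);
    Widget& x = *p;

    x.~Widget();            // lifetime ends, storage is still allocated
    Widget* addr = &x;      // only the address is taken, the value is not read
    new (addr) Widget(42);  // create a new object in the same storage

    int result = x.value;   // x now refers to the new object
    x.~Widget();            // end the second object's lifetime before returning
    return result;
}
```

This relies on the storage not having been reused or released in between, and on the new object having the same type as the old one.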
As for not taking the address and just writing x, that really does absolutely nothing, and it is a proper subexpression of &x. So it's also OK.
I would say it is undefined behaviour, assuming "dangling reference" means "referred-to object's lifetime has ended and the storage the object occupied has been reused or released." I base my reasoning on the following standard rulings:
3.8 §3:
The properties ascribed to objects throughout this International Standard apply for a given object only during its lifetime. [ Note: In particular, before the lifetime of an object starts and after its lifetime ends there are significant restrictions on the use of the object, as described below ...]
All the cases "as described below" refer to
Before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released
1.3.24:
undefined behavior
behavior for which this International Standard imposes no requirements
[ Note: Undefined behavior may be expected when this International Standard omits any explicit definition of behavior or when a program uses an erroneous construct or erroneous data. ...]
I apply the following train of thought to the above quotes:
If the standard doesn't describe behaviour for a situation, the behaviour is undefined.
The standard only describes behaviour for objects within their lifetime, and a few special cases near the start/end of their lifetime. None of these apply to our dangling reference.
Therefore, using the dangling reference in any way has no behaviour prescribed by the standard, hence the behaviour is undefined.