Dangling references and undefined behavior

Posted 2019-01-11 19:30

Assume a dangling reference x. Is it undefined behavior to just write

&x;

or even

x;

?

3 answers
Fickle 薄情
Post #2 · 2019-01-11 19:49

What makes the use of an invalid object (reference, pointer, whatever) undefined behaviour is lvalue-to-rvalue conversion (§4.1):

If the object to which the glvalue refers is not an object of type T and is not an object of a type derived from T, or if the object is uninitialized, a program that necessitates this conversion has undefined behavior.

Assuming we haven't overloaded operator&, the unary & operator takes an lvalue as its operand, so no conversion occurs. Having just an identifier, as in x; also requires no conversion. You will only get undefined behaviour when the reference is used as an operand in an expression that expects that operand to be an rvalue - which is the case for most operators. The point is, doing &x doesn't actually require accessing the value of x. Lvalue-to-rvalue conversion occurs with those operators that need to access its value.

I believe your code is well defined.

When operator& has been overloaded, the expression &x is transformed into a function call and does not obey the rules of the built-in operators - instead it follows the rules of a function call. For &x, the translation to function call results in either x.operator&() or operator&(x). In the first case, x is used to call a non-static member function; class member access does not itself apply lvalue-to-rvalue conversion to x, but §3.8/6 lists calling a non-static member function through a glvalue whose object's lifetime has ended as undefined behaviour. In the second case, the parameter of operator& will be copy-initialised with x (as in T arg = x), and the behaviour of this depends on the type of the parameter. For example, in the case of the parameter being an lvalue reference, there is no undefined behaviour because lvalue-to-rvalue conversion does not occur.

So if operator& is overloaded for the type of x, the code may or may not be well-defined, depending on the calling of the operator& function.
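
To make the rewrite of &x into a function call concrete, here is a minimal sketch using live objects only. The type Tracked and both helper functions are hypothetical names invented for this illustration; std::addressof (from <memory>) is not mentioned above, but it is the standard way to get an address without going through an overloaded operator&.

```cpp
#include <memory>

// Hypothetical type, for illustration only.
struct Tracked {
    int calls = 0;

    // With this member overload, `&t` is rewritten to `t.operator&()`,
    // i.e. a non-static member function call on t.
    Tracked* operator&() {
        ++calls;
        return this;
    }
};

// Takes the address through the overloaded operator (a function call).
inline Tracked* via_overload(Tracked& t) { return &t; }

// std::addressof obtains the address without calling the overloaded operator.
inline Tracked* via_addressof(Tracked& t) { return std::addressof(t); }
```

Both helpers return the same pointer, but only via_overload runs user code on the object, which is why the overloaded case has to be judged by the rules for function calls rather than by the rules for the built-in operator.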

You could argue that the unary & operator relies on there being at least some valid region of storage that you have the address of:

Otherwise, if the type of the expression is T, the result has type "pointer to T" and is a prvalue that is the address of the designated object

And an object is defined as being a region of storage. After the object that is referred to is destroyed, that region of storage no longer exists.

I prefer to believe that it will only result in undefined behaviour if the invalid object is actually accessed. The reference still believes it's referring to some object and it can happily give the address of it even if it doesn't exist. However, this seems to be an ill-specified part of the standard.


Aside

As an example of undefined behaviour, consider x + x. Now we hit another ill-specified part of the standard. The value category of the operands of + is not specified. It is generally inferred from §5/8 that if it is not specified, then it expects a prvalue:

Whenever a glvalue expression appears as an operand of an operator that expects a prvalue for that operand, the lvalue-to-rvalue (4.1), array-to-pointer (4.2), or function-to-pointer (4.3) standard conversions are applied to convert the expression to a prvalue.

Now because x is an lvalue, the lvalue-to-rvalue conversion is required and we get undefined behaviour. This makes sense because addition requires accessing the value of x so it can work out the result.
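
The distinction can be sketched as two small functions. The names address_only and reads_value are hypothetical, and the sketch assumes an int with no overloaded operators; it is meant to be called with a live object, and only shows which expressions need the value of x.

```cpp
// Hypothetical helpers, named here only for illustration.

// Uses x purely as an lvalue: no lvalue-to-rvalue conversion is applied.
inline bool address_only(int& x) {
    (void)x;        // same as the statement `x;`, with a cast to silence warnings
    int* p = &x;    // built-in unary & takes an lvalue operand
    return p != nullptr;
}

// Reads the value of x: both operands of + undergo
// lvalue-to-rvalue conversion.
inline int reads_value(int& x) {
    return x + x;
}
```

If x were dangling, only the body of reads_value would contain an lvalue-to-rvalue conversion on it, which is the case the quoted wording makes undefined.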

一夜七次
Post #3 · 2019-01-11 19:54

Supposing that x was initialized with a valid object, which was then destroyed, §3.8/6 applies:

Similarly, before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any glvalue that refers to the original object may be used but only in limited ways. For an object under construction or destruction, see 12.7. Otherwise, such a glvalue refers to allocated storage (3.7.4.2), and using the properties of the glvalue that do not depend on its value is well-defined. The program has undefined behavior if:

— an lvalue-to-rvalue conversion (4.1) is applied to such a glvalue,

— the glvalue is used to access a non-static data member or call a non-static member function of the object, or

— the glvalue is bound to a reference to a virtual base class (8.5.3), or

— the glvalue is used as the operand of a dynamic_cast (5.2.7) or as the operand of typeid.

So, simply taking the address is well-defined, and (referring to the neighboring paragraphs) can even be productively used to create a new object in place of the old one.
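
A minimal sketch of that "create a new object in place" use, under the assumptions of §3.8/6: the lifetime of the old object has ended, but its storage has been neither released nor reused. Widget and recreate_in_place are hypothetical names made up for this example.

```cpp
#include <new>

// Hypothetical type with a non-trivial destructor, for illustration only.
struct Widget {
    int value;
    explicit Widget(int v) : value(v) {}
    ~Widget() { value = -1; }
};

// Ends the lifetime of the object x refers to, then builds a new Widget
// in the same storage. Only the address of x is used after destruction;
// its value is never read.
inline Widget* recreate_in_place(Widget& x, int new_value) {
    x.~Widget();          // lifetime ends, storage is still allocated
    Widget* where = &x;   // address only, no lvalue-to-rvalue conversion
    return ::new (static_cast<void*>(where)) Widget(new_value);
}
```

The caller is assumed to own suitably aligned storage that outlives both objects, for example a buffer declared as alignas(Widget) unsigned char buf[sizeof(Widget)], and remains responsible for destroying the new object.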

As for not taking the address and just writing x, that really does absolutely nothing, and it is a proper subexpression of &x. So it's also OK.

放我归山
Post #4 · 2019-01-11 20:08

First off, very interesting question.

I would say it is undefined behaviour, assuming "dangling reference" means "referred-to object's lifetime has ended and the storage the object occupied has been reused or released." I base my reasoning on the following standard rulings:

3.8 §3:

The properties ascribed to objects throughout this International Standard apply for a given object only during its lifetime. [ Note: In particular, before the lifetime of an object starts and after its lifetime ends there are significant restrictions on the use of the object, as described below ...]

All the cases "as described below" refer to

Before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released

1.3.24: undefined behavior

behavior for which this International Standard imposes no requirements [ Note: Undefined behavior may be expected when this International Standard omits any explicit definition of behavior or when a program uses an erroneous construct or erroneous data. ...]

I apply the following train of thought to the above quotes:

  1. If the standard doesn't describe behaviour for a situation, the behaviour is undefined.
  2. The standard only describes behaviour for objects within their lifetime, and a few special cases near the start/end of their lifetime. None of these apply to our dangling reference.
  3. Therefore, using the dangling reference in any way has no behaviour prescribed by the standard, hence the behaviour is undefined.