Why is taking the address of a temporary illegal?

Posted 2020-02-04 19:51

Question:

I know that the code below is illegal:

#include <string>

void doSomething(std::string *s){}
int main()
{
     doSomething(&std::string("Hello World")); // ill-formed: the operand of & is an rvalue
     return 0;
}

The reason is that we are not allowed to take the address of a temporary object. But my question is WHY?

Let us consider the following code:

class empty{};
int main()
{
      empty x = empty(); //most compilers would elide the temporary
      return 0;
}

The accepted answer here mentions

"usually the compiler consider the temporary and the copy constructed as two objects that are located in the exact same location of memory and avoid the copy."

From this statement one can conclude that the temporary was present in some memory location (hence its address could have been taken) and that the compiler decided to eliminate the temporary by creating an in-place object at the same location where the temporary was present.

Does this contradict the fact that the address of a temporary cannot be taken?

I would also like to know how return value optimization is implemented. Can someone provide a link or an article related to RVO implementation?

Answer 1:

&std::string("Hello World")

The problem with this isn't that std::string("Hello World") yields a temporary object. The problem is that the expression std::string("Hello World") is an rvalue expression that refers to a temporary object.

You cannot take the address of an rvalue because not all rvalues have addresses (and not all rvalues are objects). Consider the following:

42

This is an integer literal, which is a primary expression and an rvalue. It is not an object, and it (likely) does not have an address. &42 is nonsensical.

Yes, an rvalue may refer to an object, as is the case in your first example. The problem is that not all rvalues refer to objects.
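
For anyone who wants to check the value categories mentioned above, here is a minimal sketch, assuming a C++11 or later compiler. It relies on decltype((expr)) yielding a reference type only for lvalue expressions. The helper is_lvalue_expr and the variable named are made up for illustration and are not part of the question.

```cpp
#include <string>
#include <type_traits>

// decltype((expr)) is T& when expr is an lvalue and plain T when it is a
// prvalue (C++11 and later). The built-in & accepts only the lvalue case.
// is_lvalue_expr is a hypothetical helper, not a standard facility.
template <typename T>
constexpr bool is_lvalue_expr() { return std::is_lvalue_reference<T>::value; }

int named = 0;  // hypothetical variable used only for illustration

// An integer literal is a prvalue and does not refer to an object.
static_assert(!is_lvalue_expr<decltype((42))>(), "42 is a prvalue");

// A class temporary is also a prvalue, even though it refers to an object.
static_assert(!is_lvalue_expr<decltype((std::string("Hello World")))>(),
              "the temporary expression is a prvalue");

// A named variable is an lvalue, so &named is fine.
static_assert(is_lvalue_expr<decltype((named))>(), "named is an lvalue");
```

If the file compiles, all three checks hold; nothing is evaluated at runtime because decltype does not evaluate its operand.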



Answer 2:

Long answer:

[...] it can be concluded that the temporary was present in some memory location

By definition:

  • "temporary" stands for: temporary object
  • an object occupies a region of storage
  • all objects have an address

So it doesn't take a very elaborate proof to show that a temporary has an address. This is by definition.

OTOH, you are not just fetching the address; you are using the built-in address-of operator. The specification of the built-in address-of operator says that the operand must be an lvalue:

  • &std::string() is ill-formed because std::string() is an rvalue. At runtime, evaluating this expression creates a temporary object as a side effect, and the expression yields an rvalue that refers to the object created.
  • &(std::string() = "Hello World") is well-formed because std::string() = "Hello World" is an lvalue. By definition, an lvalue refers to an object. The object this lvalue refers to is the exact same temporary.
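
The second bullet can be tried out with a small sketch like the one below, assuming C++11 or later. length_of and assigned_temporary_length are hypothetical names chosen for the example. The pointer is only used inside the full expression that creates the temporary, which is the only span in which it is valid.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// The member operator= returns std::string&, so the assignment expression is
// an lvalue even though its left operand is a temporary.
static_assert(
    std::is_lvalue_reference<decltype((std::string() = "Hello World"))>::value,
    "assignment to a temporary yields an lvalue");

// Hypothetical helper: reads through the pointer while the pointee is alive.
std::size_t length_of(const std::string* s) { return s->size(); }

std::size_t assigned_temporary_length() {
    // The temporary lives until the end of this full expression, so the
    // pointer handed to length_of is valid for the duration of the call only.
    std::size_t n = length_of(&(std::string() = "Hello World"));
    assert(n == 11);  // "Hello World" has 11 characters
    return n;
}
```

Storing that pointer and using it after the statement ends would leave it dangling, which is presumably part of why the language does not make this convenient.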

Short answer:

This is the rule. It doesn't need the (incorrect, unsound) justifications some people are making up.



Answer 3:

$5.3.1/2 - "The result of the unary & operator is a pointer to its operand. The operand shall be an lvalue or a qualified-id."

Expressions such as

99

A() // where A is a user defined class with an accessible 
    // and unambiguous default constructor

are all Rvalues.

$3.10/2 - "An lvalue refers to an object or function. Some rvalue expressions—those of class or cv-qualified class type—also refer to objects."

And this is my guess: even though Rvalues may occupy storage (e.g. in the case of objects), the C++ standard does not allow taking their address, to maintain uniformity with the built-in types.

Here's something interesting though:

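#include <iostream>
using std::cout;

void f(const double &dbl){
   cout << &dbl;   // dbl is an lvalue here, so & is allowed
}

int main(){
   f(42);          // a temporary double is created and bound to dbl
}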

The expression '42' is an Rvalue which is bound to the 'reference to const double', and hence a temporary object of type double is created. The address of this temporary can be taken inside the function 'f'. But note that inside 'f' this is not really a temporary or an Rvalue. The moment it is given a name such as 'dbl', it is treated as an Lvalue expression inside 'f'.
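
A variant of the same idea that checks the result instead of printing an address might look like this. It is only a sketch; address_of and temporary_double_has_address are hypothetical names, not something from the standard.

```cpp
#include <cassert>

// Hypothetical helper: inside the function, dbl is an lvalue, so & applies.
const double* address_of(const double& dbl) { return &dbl; }

bool temporary_double_has_address() {
    // 42 is an int prvalue. Binding it to const double& creates a temporary
    // double that lives until the end of the full expression, so the pointer
    // is dereferenced only within that same expression.
    bool same_value = (*address_of(42) == 42.0);
    assert(same_value);
    return same_value;
}
```

Keeping the returned pointer past the end of the statement would leave it dangling, since the temporary double is destroyed there.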

Here's something on NRVO (similar)



Answer 4:

A temporary is an example of a C++ "rvalue." It is supposed to purely represent a value within its type. For example, if you write 42 in two different places in your program, the instances of 42 are indistinguishable despite probably being in different locations at different times. The reason you can't take the address is that you need to do something to specify that there should be an address, because otherwise the concept of an address is semantically unclean and unintuitive.

The language requirement that you "do something" is somewhat arbitrary, but it makes C++ programs cleaner. It would suck if people made a habit of taking addresses of temporaries. The notion of an address is intimately bound with the notion of a lifetime, so it makes sense to make "instantaneous" values lack addresses. Still, if you are careful, you can acquire an address and use it within the lifetime that the standard does allow.

There are some fallacies in the other answers here:

  • "You cannot take the address of an rvalue because not all rvalues have addresses." — Not all lvalues have addresses either. A typical local variable of type int which participates in a simple loop and is subsequently unused will likely be assigned a register but no stack location. No memory location means no address. The compiler will assign it a memory location if you take its address, though. The same is true of rvalues, which may be bound to const references. The "address of 42" may be acquired as such:

    int const *fortytwo_p = & static_cast<int const &>( 42 );
    

    Of course, the address is invalid after the ; because temporaries are temporary, and this is likely to generate extra instructions as the machine may pointlessly store 42 onto the stack.

    It's worth mentioning that C++0x cleans up the concepts by defining the prvalue to be the value of the expression, independent of storage, and the glvalue to be the storage location independent of its contents. This was probably the intent of the C++03 standard in the first place.

  • "Then you could modify the temporary, which is pointless." — Actually temporaries with side effects are useful to modify. Consider this:

    if ( istringstream( "42" ) >> my_int )
    

    This is a nice idiom for converting a number and checking that the conversion succeeded. It involves creating a temporary, calling a mutating function on it, and then destroying it. Far from pointless.
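
The istringstream idiom from the last bullet can be wrapped up as follows. This is a sketch assuming C++11 (for the explicit conversion of a stream to bool); parse_int and parse_int_demo are hypothetical names.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper wrapping the idiom: returns true and stores the parsed
// value in out when text starts with an integer.
bool parse_int(const std::string& text, int& out) {
    // The istringstream temporary is created, mutated by the member
    // operator>>, tested for success, and destroyed at the end of the
    // full expression.
    return static_cast<bool>(std::istringstream(text) >> out);
}

void parse_int_demo() {
    int value = 0;
    assert(parse_int("42", value) && value == 42);  // conversion succeeds
    assert(!parse_int("forty-two", value));         // conversion fails
}
```

The member operator>> can be called on the temporary because calling a non-const member function on a class rvalue is allowed, which is exactly the kind of useful modification the bullet describes.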



Answer 5:

It can be taken, but once the temporary ceases to exist, you have a dangling pointer left.

EDIT

For the downvoters:

const std::string &s = std::string("h");
&s;

is legal. s is a reference to a temporary. Hence, a temporary object's address can be taken.

EDIT2

Bound references are aliases to what they are bound to. Hence, a reference to a temporary is another name for that temporary. Hence, the second statement in the paragraph above holds.

OP's question is about temporaries (in terms of the words he uses), and his example is about rvalues. These are two distinct concepts.
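
To make the reference case concrete, here is a minimal sketch of the two statements from the first edit. extended_lifetime_demo is a hypothetical name. Because the temporary is bound directly to a const reference, its lifetime is extended to that of the reference, so the address stays usable within the scope.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

std::size_t extended_lifetime_demo() {
    // Binding the temporary directly to a const reference extends its
    // lifetime to the lifetime of s.
    const std::string& s = std::string("h");

    // s is an lvalue naming the temporary, so the built-in & is allowed.
    const std::string* p = &s;
    assert(p == &s);

    return p->size();  // the temporary is still alive here
}
```

Once s goes out of scope the temporary is destroyed, and any copy of p that escaped the function would be dangling.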



Answer 6:

One reason is that your example would give the method write access to the temporary, which is pointless.

The citation you provided isn't about this situation; it describes a specific optimization that is permitted in declarators with initializers.
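
For what it's worth, that optimization can be observed with an instrumented type. The sketch below is only an illustration: Tracked, make_tracked and constructed_in_place are hypothetical names. From C++17 the elision in this particular pattern is guaranteed; under earlier standards it is permitted but not required, so the check may fail there if the compiler is told not to elide.

```cpp
#include <cassert>

// Hypothetical instrumented type: counts copies and remembers the address at
// which the object was originally constructed.
struct Tracked {
    static int copies;
    const Tracked* born_at;
    Tracked() : born_at(this) {}
    Tracked(const Tracked& other) : born_at(other.born_at) { ++copies; }
};
int Tracked::copies = 0;

// Returns a prvalue; with elision the object is built directly in the
// storage the caller provides for the result.
Tracked make_tracked() { return Tracked(); }

bool constructed_in_place() {
    Tracked::copies = 0;
    Tracked x = make_tracked();
    // No copy was made, and the "temporary" and x are the same object.
    return Tracked::copies == 0 && x.born_at == &x;
}

void elision_demo() {
    // Holds under C++17; earlier standards permit but do not require it.
    assert(constructed_in_place());
}
```

When the copy is elided there is only ever one object, which is why there is no conflict with the rule about the built-in & operator: no separate temporary exists whose address could be taken.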



Answer 7:

Why is taking the address of a temporary illegal?

The scope of a temporary variable is limited to a particular method or block. As soon as the method call returns, the temporary variables are removed from memory, so returning the address of a variable that no longer exists in memory does not make sense. The address itself can still be held, but the memory at that address may now contain a garbage value.