How does dereferencing work in C++?

Posted 2019-08-08 21:19

Question:

I have trouble understanding what happens when calling &*pointer

int j=8;
int* p = &j;

When I print in my compiler I get the following

j = 8 , &j =  00EBFEAC  p = 00EBFEAC , *p = 8 , &p =  00EBFEA0 
&*p= 00EBFEAC

cout << &*p gives  &*p = 00EBFEAC which is p itself

& and * have the same operator precedence. I thought &*p would translate to &(*p) --> &(8) and expected a compiler error.

How does the compiler deduce this result?

Answer 1:

You are stumbling over something interesting: variables, strictly speaking, are not values, but refer to values. 8 is an integer value. After int i=8, i refers to an integer value. The difference is that it could refer to a different value.

In order to obtain the value, i must be dereferenced, i.e. the value stored in the memory location which i stands for must be obtained. This dereferencing is performed implicitly in C whenever a value of the type which the variable references is requested: i=8; printf("%d", i) results in the same output as printf("%d", 8). That is funny because variables are essentially aliases for addresses, while numeric literals are aliases for immediate values. In C these very different things are syntactically treated identically. A variable can stand in for a literal in an expression and will be automatically dereferenced. The resulting machine code makes that very clear. Consider the two functions below. Both have the same return type, int. But f has a variable in the return statement which must be dereferenced so that its value can be returned (in this case, it is returned in a register):

int i = 1;
int g(){ return 1; }  // literal
int f(){ return i; }  // variable

If we ignore the housekeeping code, the functions each translate into a single machine instruction. The corresponding assembler (from icc) is for g:

    movl      $1, %eax                                      #5.17

That's pretty straightforward: Put 1 in the register eax.

By contrast, f translates to

    movl      i(%rip), %eax                                 #4.17

This loads the value stored at the address formed by register rip plus the offset i into the register eax. It's refreshing to see how a variable name is just an address (offset) alias to the compiler.

The necessary dereferencing should now be obvious. It would be more logical to write return *i in order to return 1, and write return i only for functions which return references — or pointers.

In your example it is indeed illogical to a degree that

int j=8;
int* p = &j;
printf("%d\n", *p);

prints 8 (i.e., p is actually dereferenced twice), but that &(*p) yields the address of the object pointed to by p (which is the address value stored in p), and is not interpreted as &(8). The reason is that in the context of the address operator a variable (or, in this case, the lvalue obtained by dereferencing p) is not implicitly dereferenced the way it is in other contexts.
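The point about the address operator can be checked with a minimal sketch. The function names address_of_pointee and round_trip_matches are made up for illustration and are not part of the original question.

```cpp
#include <cassert>

// Hypothetical helper: returns &*p, the address of the object p points to.
int* address_of_pointee(int* p) {
    // *p is an lvalue naming the pointed-to int. Applying & to it takes that
    // object's address; no lvalue-to-rvalue conversion happens, so the stored
    // value (8 in the example) is not read.
    return &*p;
}

// Checks the claim on a local variable; returns true if the addresses agree.
bool round_trip_matches() {
    int j = 8;
    int* p = &j;
    int* q = address_of_pointee(p);
    assert(q == p);   // &*p yields the address stored in p
    return q == &j;   // which is the address of j
}

// int* bad = &8;     // would not compile: the literal 8 is not an lvalue
```

The commented-out last line is the case the question expected: taking the address of a literal is rejected, but &*p never gets that far because *p is not replaced by 8 first.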

When the attempt was made to create a logical, orthogonal language (Algol68), int i=8 indeed declared an alias for 8. In order to declare a variable, the long form would have been ref int m = loc int := 3. Consequently, what we call a pointer or reference would have had the type ref ref int, because two dereferences are actually needed to obtain an integer value.



Answer 2:

j is an int with value 8 and is stored in memory at address 00EBFEAC.

&j gives the memory address of variable j (00EBFEAC).

int* p = &j defines a variable p of type int*, that is, a variable holding the address of a memory location that contains an int. You assign it &j, the address of an int, which makes sense.

*p gives you the value associated with the address stored in p. The address stored in p points to an int, so *p gives you the value of that int, namely 8.

&p is the address where the variable p itself is stored (00EBFEA0).

&*p gives you the address of the object that the address stored in p points to, which is the value of p again. &(*p) -> &(j) -> 00EBFEAC
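The walkthrough above can also be expressed as compile-time type checks. This is only a sketch: the function name pointer_expressions_check is hypothetical, and it assumes a C++11 or later compiler for decltype and static_assert.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper: checks the type and value of each expression from
// the walkthrough, using the same j and p as the question.
bool pointer_expressions_check() {
    int j = 8;
    int* p = &j;

    static_assert(std::is_same<decltype(&j), int*>::value, "&j: address of an int");
    static_assert(std::is_same<decltype(*p), int&>::value, "*p: lvalue of type int");
    static_assert(std::is_same<decltype(&p), int**>::value, "&p: address of the pointer");
    static_assert(std::is_same<decltype(&*p), int*>::value, "&*p: address of an int");

    assert(*p == 8);      // value of the int that p points to
    return &*p == p       // &*p is the address stored in p ...
        && p == &j;       // ... which is the address of j
}
```

Note that decltype(*p) is int& rather than int: the dereference yields a reference to the object j, not a copy of 8, which is why & can still be applied to it.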



Answer 3:

Think about &j itself (or even &(j)). According to your logic, shouldn't j evaluate to 8 and result in &8, as well? Dereferencing a pointer or evaluating a variable results in an lvalue, which is a value that you can assign to or take the address of.

The L in "lvalue" refers to the left in "left hand side of the assignment", such as j = 10 or *p = 12. There are also rvalues, such as j + 10, or 8, which obviously cannot be assigned to.
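A short sketch of the lvalue/rvalue distinction, assuming the same j and p as in the question; the function name assign_through_lvalues is invented for this example.

```cpp
#include <cassert>

// Hypothetical demo: assigns through two lvalues and returns the final j.
int assign_through_lvalues() {
    int j = 8;
    int* p = &j;

    j = 10;            // j is an lvalue: it can appear on the left of =
    assert(*p == 10);

    *p = 12;           // *p is also an lvalue, naming the same object as j
    assert(j == 12);

    int* q = &*p;      // an lvalue has an address that & can take
    assert(q == &j);

    // (j + 10) = 5;   // would not compile: j + 10 is an rvalue
    return j;
}
```

Both j and *p designate the same object here, so an assignment through either one is visible through the other.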

That's just a basic explanation. In C++ there's a lot more to it, with various classes of values (but that thread might be too advanced for your current needs).