Expressions "j = ++(i | i); and j = ++(i & i); sh

Posted 2019-01-11 00:28

Question:

I was expecting that in the following code:

#include<stdio.h> 
int main(){
    int i = 10; 
    int j = 10;

    j = ++(i | i);
    printf("%d %d\n", j, i);

    j = ++(i & i);
    printf("%d %d\n", j, i);

    return 1;
}

expressions j = ++(i | i); and j = ++(i & i); will produce lvalue errors as below:

x.c: In function ‘main’:
x.c:6: error: lvalue required as increment operand
x.c:9: error: lvalue required as increment operand   

But I was surprised that the above code compiled successfully, as shown below:

~$ gcc x.c -Wall
~$ ./a.out 
11 11
12 12   

The above code runs correctly.

Other operators, as I understand it, produce an error. Even the bitwise XOR operator causes an error in j = ++(i ^ i); (other operators likewise produce an lvalue error at compilation time).

What is the reason? Is this unspecified or undefined behavior? Or are the bitwise OR and AND operators different?

compiler version:

gcc version 4.4.5 (Ubuntu/Linaro 4.4.4-14ubuntu5)

But I believe the compiler version shouldn't be a reason for non-uniform behavior. If ^ does not compile, then | and & should not either; otherwise it should work for all of them.

It's not an error with this compiler in c99 mode either: gcc x.c -Wall -std=c99.

Answer 1:

You are right that it should not compile, and on most compilers, it does not compile.
(Please specify exactly which compiler/version is NOT giving you a compiler error)

I can only hypothesize that the compiler knows the identities that (i | i) == i and (i & i) == i and is using those identities to optimize away the expression, just leaving behind the variable i.

This is just a guess, but it makes a lot of sense to me.



Answer 2:

This is a bug that has been addressed in more recent GCC versions.

It's probably because the compiler optimizes i & i to i and i | i to i. This also explains why the xor operator didn't work; i ^ i would be optimized to 0, which is not a modifiable lvalue.



Answer 3:

C11 (n1570), § 6.5.3.1 Prefix increment and decrement operators
The operand of the prefix increment or decrement operator shall have atomic, qualified, or unqualified real or pointer type, and shall be a modifiable lvalue.

C11 (n1570), § 6.3.2.1 Lvalues, arrays, and function designators
A modifiable lvalue is an lvalue that does not have array type, does not have an incomplete type, does not have a const-qualified type, and if it is a structure or union, does not have any member (including, recursively, any member or element of all contained aggregates or unions) with a const-qualified type.

C11 (n1570), § 6.3.2.1 Lvalues, arrays, and function designators
An lvalue is an expression (with an object type other than void) that potentially designates an object.

C11 (n1570), § 3. Terms, definitions, and symbols
Object: Region of data storage in the execution environment, the contents of which can represent values

As far as I know, potentially means "capable of being but not yet in existence". But (i | i) is not capable of referencing a region of data storage in the execution environment. Therefore it is not an lvalue. This seems to be a bug in an old gcc version that has since been fixed. Update your compiler!



Answer 4:

Just a follow-up to my question. I have added a detailed answer in case others find it helpful.

Why do the expressions j = ++(i | i); and j = ++(i & i); not cause an lvalue error in my code?

Because of compiler optimization: as @abelenky answered, (i | i) == i and (i & i) == i. That is exactly correct.

In my compiler (gcc version 4.4.5), any expression that involves a single variable and leaves its value unchanged is optimized into that single variable (so it is no longer treated as an expression).

For example:

j = i | i      ==> j = i
j = i & i      ==> j = i
j = i * 1      ==> j = i
j = i - i + i  ==> j = i 

==> means optimized to

To observe this, I wrote a small C program and compiled it to assembly with gcc -S.

C code (read the comments):

#include<stdio.h>
int main(){
    int i = 10; 
    int j = 10;
    j = i | i;      //==> j = i
    printf("%d %d", j, i);
    j = i & i;      //==> j = i
    printf("%d %d", j, i);
    j = i * 1;      //==> j = i
    printf("%d %d", j, i);
    j = i - i + i;  //==> j = i
    printf("%d %d", j, i);
}

assembly output: (read comments)

main:
    pushl   %ebp
    movl    %esp, %ebp
    andl    $-16, %esp
    subl    $32, %esp
    movl    $10, 28(%esp)   // i 
    movl    $10, 24(%esp)   // j

    movl    28(%esp), %eax  //j = i
    movl    %eax, 24(%esp)

    movl    $.LC0, %eax
    movl    28(%esp), %edx
    movl    %edx, 8(%esp)
    movl    24(%esp), %edx
    movl    %edx, 4(%esp)
    movl    %eax, (%esp)
    call    printf

    movl    28(%esp), %eax  //j = i
    movl    %eax, 24(%esp)

    movl    $.LC0, %eax
    movl    28(%esp), %edx
    movl    %edx, 8(%esp)
    movl    24(%esp), %edx
    movl    %edx, 4(%esp)
    movl    %eax, (%esp)
    call    printf

    movl    28(%esp), %eax  //j = i
    movl    %eax, 24(%esp)

    movl    $.LC0, %eax
    movl    28(%esp), %edx
    movl    %edx, 8(%esp)
    movl    24(%esp), %edx
    movl    %edx, 4(%esp)
    movl    %eax, (%esp)
    call    printf

    movl    28(%esp), %eax  //j = i
    movl    %eax, 24(%esp)

    movl    $.LC0, %eax
    movl    28(%esp), %edx
    movl    %edx, 8(%esp)
    movl    24(%esp), %edx
    movl    %edx, 4(%esp)
    movl    %eax, (%esp)
    call    printf  

In the above assembly code, every expression is compiled to the following code:

movl    28(%esp), %eax  
movl    %eax, 24(%esp)

which is equivalent to j = i in C. Thus j = ++(i | i); and j = ++(i & i); are optimized to j = ++i.

Note: j = (i | i); is a statement, whereas the expression (i | i) on its own is not a statement (it is a no-op) in C.

Hence my code compiled successfully.

Why do j = ++(i ^ i);, j = ++(i * i); and j = ++(i | k); produce an lvalue error on my compiler?

Because the operand either folds to a constant value or remains an unoptimized expression, and neither is a modifiable lvalue.

We can observe this in the generated assembly:

#include<stdio.h> 
int main(){
    int i = 10; 
    int j = 10;
    j = i ^ i;
    printf("%d %d\n", j, i);
    j = i - i;
    printf("%d %d\n", j, i);
    j =  i * i;
    printf("%d %d\n", j, i);
    j =  i + i;
    printf("%d %d\n", j, i);        
    return 1;
}

assembly code: (read comments)

main:
    pushl   %ebp
    movl    %esp, %ebp
    andl    $-16, %esp
    subl    $32, %esp
    movl    $10, 28(%esp)      // i
    movl    $10, 24(%esp)      // j

    movl    $0, 24(%esp)       // j = i ^ i;
                               // optimized expression i^i = 0
    movl    $.LC0, %eax
    movl    28(%esp), %edx
    movl    %edx, 8(%esp)
    movl    24(%esp), %edx
    movl    %edx, 4(%esp)
    movl    %eax, (%esp)
    call    printf

    movl    $0, 24(%esp)      //j = i - i;
                              // optimized expression i - i = 0
    movl    $.LC0, %eax
    movl    28(%esp), %edx
    movl    %edx, 8(%esp)
    movl    24(%esp), %edx
    movl    %edx, 4(%esp)
    movl    %eax, (%esp)
    call    printf

    movl    28(%esp), %eax    //j =  i * i;
    imull   28(%esp), %eax
    movl    %eax, 24(%esp)

    movl    $.LC0, %eax
    movl    28(%esp), %edx
    movl    %edx, 8(%esp)
    movl    24(%esp), %edx
    movl    %edx, 4(%esp)
    movl    %eax, (%esp)
    call    printf

    movl    28(%esp), %eax   // j =  i + i;
    addl    %eax, %eax
    movl    %eax, 24(%esp)

    movl    $.LC0, %eax
    movl    28(%esp), %edx
    movl    %edx, 8(%esp)
    movl    24(%esp), %edx
    movl    %edx, 4(%esp)
    movl    %eax, (%esp)
    call    printf

    movl    $1, %eax
    leave

Hence these expressions produce an lvalue error, because the operand is not a modifiable lvalue. The non-uniform behavior is due to compiler optimization in gcc-4.4.

Why do newer gcc compilers (and most other compilers) produce an lvalue error?

Because evaluating ++(i | i) and ++(i & i) violates the actual definition of the increment (++) operator.

According to the book "The C Programming Language" by Brian W. Kernighan and Dennis M. Ritchie, section "2.8 Increment and Decrement Operators", page 44:

The increment and decrement operators can only be applied to variables; an expression like (i+j)++ is illegal. The operand must be a modifiable lvalue of arithmetic or pointer type.

I tested on a newer gcc compiler (4.47), where it produces the error I was expecting. I also tested with the tcc compiler.

Any feedback or comments on this would be great.



Answer 5:

I don't think it is an optimization error at all, because if it were, there would be no error in the first place. If ++(i | i) is optimized to ++(i), then there should not be any error, because (i) is an lvalue.

IMHO, the compiler sees (i | i) as the result of an expression, which is obviously an rvalue, but the increment operator ++ expects an lvalue to modify, hence the error.