Violating strict-aliasing, even without any casting?

Posted 2019-06-03 12:48

Question:

I think I'm really asking: is aliasing "transitive"? If the compiler knows that A might alias B, and B might alias C, then surely it should remember that A might therefore alias C. Perhaps this "obvious" transitive logic isn't required however?

An example, for clarity. The most interesting example, to me, of a strict-aliasing issue:

// g++    -fstrict-aliasing -std=c++11 -O2
#include <iostream>

union
{   
    int i;
    short s;
} u;
int     * i = &u.i;

int main()
{   

    u.i = 1; // line 1
    *i += 1; // line 2

    short   & s =  u.s;
    s += 100; // line 3

    std::cout
        << " *i\t" <<  *i << std::endl // prints 2
        << "u.i\t" << u.i << std::endl // prints 101
        ;

    return 0;
}

g++ 5.3.0, on x86_64 (but not clang 3.5.0) gives the above output, where *i and u.i give different numbers. But they should give exactly the same number, because i is defined at int * i = &u.i; and i doesn't change.

I have a theory: When 'predicting' the value of u.i, the compiler asks which lines might affect the contents of u.i. That includes line 1 obviously. And line 2 because int* can alias an int member of a union. And line 3 also, because anything that can affect one union member (u.s) can affect another member of the same union. But when predicting *i it doesn't realise that line 3 can affect the int lvalue at *i.

Does this theory seem reasonable?

I find this example funny because I don't have any casting in it. I managed to break strict-aliasing without doing any casting.

Answer 1:

Reading from an inactive member of a union is undefined behavior in C++. (It is permitted in C99 and C11.)

So, all in all, the compiler isn't required to assume/remember anything.

Standardese:

N4140 §9.5[class.union]/1

In a union, at most one of the non-static data members can be active at any time, that is, the value of at most one of the non-static data members can be stored in a union at any time.
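
If the goal is to reinterpret part of an int's storage as a short, one approach that stays within defined behavior in C++ is to copy the bytes with std::memcpy rather than read an inactive union member. The following is only a sketch: the helper names low_bytes_as_short and with_low_bytes are made up for illustration, and which bytes of the int get touched depends on the platform's endianness.

```cpp
#include <cstring>  // std::memcpy

// Hypothetical helper (not from the question): returns the first
// sizeof(short) bytes of an int's object representation as a short.
short low_bytes_as_short(int value)
{
    static_assert(sizeof(short) <= sizeof(int), "short must fit inside int");
    short result;
    std::memcpy(&result, &value, sizeof result);  // byte copy, no aliasing violation
    return result;
}

// Hypothetical helper: overwrites the first sizeof(short) bytes of an int
// with the bytes of a short and returns the modified int.
int with_low_bytes(int value, short replacement)
{
    static_assert(sizeof(short) <= sizeof(int), "short must fit inside int");
    std::memcpy(&value, &replacement, sizeof replacement);
    return value;
}
```

Because every access goes through an object of its own declared type, the compiler has no licence to assume the write and the later read are unrelated, which is the assumption that bit the original program.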



Answer 2:

In C++, you are only allowed to read from the union member that was last written to.

Aliasing outside unions is only allowed between 'similar' types (for details please see this Q/A), and through char/unsigned char. The rule is one-directional: an object of another type may be accessed through char/unsigned char, but a char/unsigned char object may not be accessed through other types. If the latter were allowed, then all objects would have to be treated as possibly aliasing any other object, because they could be 'transitively aliased' through char/unsigned char, as you describe.

But because this is not the case, the compiler can safely assume that only objects of 'similar' types and char/unsigned char alias each other.
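
To illustrate the permitted direction, here is a minimal sketch that inspects an int through an unsigned char pointer. The function name count_nonzero_bytes is invented for this example, and the result for a given value assumes an int with no padding bits.

```cpp
#include <cstddef>  // std::size_t

// Hypothetical helper: counts the nonzero bytes in the object
// representation of an int. Accessing any object through
// unsigned char is allowed by the aliasing rules.
std::size_t count_nonzero_bytes(const int& value)
{
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
    std::size_t count = 0;
    for (std::size_t k = 0; k < sizeof(int); ++k)
    {
        if (bytes[k] != 0)
        {
            ++count;
        }
    }
    return count;
}
```

The reverse, for example reading a char array through an int*, is not covered by this exception, so the compiler does not have to assume that an int access can touch a char object.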