Casting a Struct to an Array [duplicate]

Posted 2020-04-14 07:23

Question:

This is a strict aliasing question: will the compiler's optimizations cause any ordering problems with this?

Say that I have three public floats in a struct XMFLOAT3 (not unlike this one), and I want to cast it to a float*. Will this land me in optimization trouble?

XMFLOAT3 foo = {1.0f, 2.0f, 3.0f};
auto bar = &foo.x;

bar[2] += 5.0f;
foo.z += 5.0f;
cout << foo.z;

I assume this will always print "13". But what about this code:

XMFLOAT3 foo = {1.0f, 2.0f, 3.0f};
auto bar = reinterpret_cast<float*>(&foo);

bar[2] += 5.0f;
foo.z += 5.0f;
cout << foo.z;

I believe this is legal because, according to http://en.cppreference.com/w/cpp/language/reinterpret_cast#Type_aliasing

T2 is an aggregate type or a union type which holds one of the aforementioned types as an element or non-static member (including, recursively, elements of subaggregates and non-static data members of the contained unions): this makes it safe to cast from the first member of a struct and from an element of a union to the struct/union that contains it.

Is my understanding of this correct?

Obviously this will become implementation dependent on the declaration of XMFLOAT3.

Answer 1:

The reinterpret_cast from XMFLOAT3* to float* is OK, due to:

9.2 [class.mem] paragraph 20:

If a standard-layout class object has any non-static data members, its address is the same as the address of its first non-static data member. Otherwise, its address is the same as the address of its first base class subobject (if any). [ Note: There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning, as necessary to achieve appropriate alignment. — end note ]

That means the address of the first member is the address of the struct, and there's no aliasing involved when you access *bar because you're accessing a float through an lvalue of type float, which is fine.
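As a minimal sketch of that guarantee, assuming a hypothetical stand-in struct Vec3 with three public float members (the real XMFLOAT3 declaration is not shown in this thread), the two ways of obtaining the pointer can be compared directly. The helper names first_member_shares_address and read_first are invented for illustration only.

```cpp
#include <type_traits>

// Hypothetical stand-in for XMFLOAT3; the real declaration may differ.
struct Vec3 {
    float x;
    float y;
    float z;
};

// The [class.mem] guarantee quoted above applies only to standard-layout classes.
static_assert(std::is_standard_layout<Vec3>::value,
              "Vec3 must be a standard-layout class");

// Returns true when the pointer obtained by casting the struct's address
// equals the pointer obtained by taking the address of the first member.
bool first_member_shares_address(Vec3& v) {
    float* via_cast = reinterpret_cast<float*>(&v);
    float* via_member = &v.x;
    return via_cast == via_member;
}

// Reads the first member through the cast pointer: a float object accessed
// through an lvalue of type float.
float read_first(Vec3& v) {
    return *reinterpret_cast<float*>(&v);
}
```

This only exercises the first member; whether indexing past it is acceptable is the separate layout question discussed next.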

But the cast is also unnecessary; it is equivalent to the first version:

auto bar = &foo.x;

The expression bar[2] is only OK if there is no padding between the members of the struct, or more precisely, if the layout of the data members is the same as an array float[3], in which case 3.9.2 [basic.compound] paragraph 3 says it is OK:

A valid value of an object pointer type represents either the address of a byte in memory (1.7) or a null pointer (4.10). If an object of type T is located at an address A, a pointer of type cv T* whose value is the address A is said to point to that object, regardless of how the value was obtained.

In practice there is no reason that three adjacent non-static data members of the same type would not be laid out identically to an array (and I think the Itanium ABI guarantees it), but to be safe you could add:

 static_assert(sizeof(XMFLOAT3)==sizeof(float[3]),
     "XMFLOAT3 layout must be compatible with float[3]");

Or, to be paranoid, or if there are additional members after z:

 #include <cstddef>  // for offsetof

 static_assert(offsetof(XMFLOAT3, y)==sizeof(float)
               && offsetof(XMFLOAT3, z)==sizeof(float)*2,
     "XMFLOAT3 layout must be compatible with float[3]");

Regarding the statement in the question, "Obviously this will become implementation dependent on the declaration of XMFLOAT3":

Yes, it relies on XMFLOAT3 being a standard-layout class type, and on the order and type of its data members.
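The layout checks from this answer can be gathered into one place. The following is a sketch under assumptions: Vec3 is a hypothetical stand-in for XMFLOAT3, and z_matches_index_two is an invented helper that compares byte addresses rather than indexing through a float*. It confirms only that the members are laid out like float[3] on the platform it is compiled for.

```cpp
#include <cstddef>      // offsetof
#include <type_traits>  // std::is_standard_layout

// Hypothetical stand-in for XMFLOAT3; the real declaration may differ.
struct Vec3 {
    float x;
    float y;
    float z;
};

// offsetof and the first-member address rule are only guaranteed for
// standard-layout classes.
static_assert(std::is_standard_layout<Vec3>::value,
              "Vec3 must be a standard-layout class");

// No padding anywhere: the struct is exactly as large as float[3].
static_assert(sizeof(Vec3) == sizeof(float[3]),
              "Vec3 layout must be compatible with float[3]");

// Each member sits where the corresponding array element would.
static_assert(offsetof(Vec3, x) == 0
              && offsetof(Vec3, y) == sizeof(float)
              && offsetof(Vec3, z) == sizeof(float) * 2,
              "Vec3 members must be laid out like float[3]");

// Runtime cross-check: z lives two floats past the start of the struct.
bool z_matches_index_two(const Vec3& v) {
    const unsigned char* base = reinterpret_cast<const unsigned char*>(&v);
    const unsigned char* z_addr = reinterpret_cast<const unsigned char*>(&v.z);
    return z_addr == base + 2 * sizeof(float);
}
```

If any assertion fails, the build stops before code that depends on the array-like layout can run.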



Answer 2:

It's completely valid; this has nothing to do with strict aliasing whatsoever.

Strict aliasing rules require that pointers aliasing each other have compatible types;
clearly, float* is compatible with float*.



Answer 3:

Consider a reasonably smart compiler:

XMFLOAT3 foo = {1.0f, 2.0f, 3.0f}; 
auto bar = &foo.x;

bar[2] += 5.0f;
foo.z += 5.0f; // Since no previous expression referenced .z, I know .z==8.0
cout << foo.z; // So optimize this to a hardcoded cout << 8.0f

Replacing variable accesses and operations with known results is a common optimization. Here the optimizer sees three uses of .z: the initial assignment, the increment, and the final use. It can trivially determine the values at these three points and substitute them.

Because struct members cannot overlap (unlike union members), bar, which is derived from .x, cannot overlap .z, so bar[2] cannot affect .z.

As you see, a perfectly normal optimizer can produce the "wrong" result.