Unions and strict aliasing in C11

Posted 2019-06-18 23:44

Assuming I have a union like this:

union buffer {
  struct { T* data; int count; int capacity; };
  struct { void* data; int count; int  capacity; } __type_erased;
};

Will I get into trouble if I mix reads/writes to the anonymous struct members and __type_erased members under C11 aliasing rules?

More specifically, I am interested in the behaviour that occurs if the components are accessed independently (e.g. via different pointers). To illustrate:

grow_buffer(&buffer.__type_erased);
buffer.data[buffer.count] = ...

I have read all the relevant questions I could find, but I am still not 100% clear on this, as some people seem to suggest that such behaviour is undefined while others say that it is legal. Furthermore, the information I find is a mix of C++, C99, C11 etc. rules that is quite difficult to digest. Here, I am explicitly interested in the behaviour mandated by C11 and exhibited by popular compilers (Clang, GCC).

Edit: more information

I have now performed some experiments with multiple compilers and decided to share my findings in case someone runs into a similar issue. The background of my question is that I was trying to write a user-friendly, high-performance generic dynamic array implementation in plain C. The idea is that array operations are carried out using macros, and heavy-duty operations (like growing the array) are performed using an aliased type-erased template struct. E.g., I can have a macro like this:

#define ALLOC_ONE(A)\
    (_array_ensure_size(&A.__type_erased, A.count+1), A.count++)

that grows the array if necessary and returns the index of the newly allocated item. The spec (6.5.2.3) states that accessing the same location via different union members is allowed. My interpretation is that while _array_ensure_size() is not aware of the union type, the compiler should be aware that the member __type_erased can potentially be mutated by a side effect. That is, I'd assume that this should work. However, it seems that this is a grey zone (and to be honest, the spec is really not clear about what constitutes a member access). Apple's latest Clang (clang-800.0.33.1) has no problems with it: the code compiles without warnings and runs as expected. However, when compiled with GCC 5.3.0, the code crashes with a segfault. In fact, I have a strong suspicion that GCC's behaviour is a bug. I tried making the union member mutation explicit by removing the mutable pointer ref and adopting a clear functional style, e.g.:

#define ALLOC_ONE(A) \
   (A.__type_erased = _array_ensure_size(A.__type_erased, A.count+1),\
    A.count++)

This again works with Clang, as expected, but crashes GCC again. My conclusion is that advanced type manipulation with unions is a grey area where one should tread carefully.

1 answer
ら.Afraid
#2 · 2019-06-19 00:35

The C11 standard says the following:

6.5.2.3 Structure and union members

95) If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.

So, as far as reading and writing union members is concerned, this is correct in C11. However, strict aliasing is a type-based analysis, so a naive implementation may treat these reads and writes as independent. As I understand it, modern GCC can detect accesses made through union members and avoid such errors.
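To make the footnote concrete, here is a minimal sketch of type punning done directly through union members, next to a memcpy version that does not rely on union rules at all. The names float_bits, bits_via_union and bits_via_memcpy are made up for this illustration, and the code assumes float and uint32_t have the same size (checked at compile time).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical union for illustration; not part of the original question. */
union float_bits {
    float    f;
    uint32_t u;
};

/* The sketch assumes both members have the same size. */
_Static_assert(sizeof(float) == sizeof(uint32_t),
               "float and uint32_t must have the same size");

/* Store through one member, read through the other. Both accesses name
   the union object itself, which is the case footnote 95 describes. */
uint32_t bits_via_union(float x)
{
    union float_bits fb;
    fb.f = x;
    return fb.u;
}

/* Same reinterpretation via memcpy, independent of any union rules. */
uint32_t bits_via_memcpy(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}
```

Both functions should return the same bit pattern for any input. The important detail is that every access in bits_via_union goes through the union object, not through a separately obtained pointer to one of its members.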

Also, you should remember that there are some cases with pointers to union members that are invalid:

The following is not a valid fragment (because the union type is not visible within function f):

struct t1 { int m; };
struct t2 { int m; };
int f(struct t1 *p1, struct t2 *p2)
{
  if (p1->m < 0)
    p2->m = -p2->m;
  return p1->m;
}
int g()
{
  union {
    struct t1 s1;
    struct t2 s2;
  } u;
  /* ... */
  return f(&u.s1, &u.s2);
}
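For contrast, here is a sketch of a variant in which the union type is visible inside the function and every access goes through it. The names t12, f_union and g_union are hypothetical and do not come from the standard's example. As far as I know, this is the form GCC documents as allowed under -fstrict-aliasing (access through the union type), but treat that as my reading, not a guarantee.

```c
struct t1 { int m; };
struct t2 { int m; };

/* Hypothetical named union, declared at file scope so that it is
   visible inside f_union. */
union t12 {
    struct t1 s1;
    struct t2 s2;
};

/* Every access names the union object, so the compiler can see that
   s1.m and s2.m occupy the same storage. */
int f_union(union t12 *u)
{
    if (u->s1.m < 0)
        u->s2.m = -u->s2.m;
    return u->s1.m;
}

/* Hypothetical driver: initialises the union and calls f_union. */
int g_union(int initial)
{
    union t12 u;
    u.s1.m = initial;
    return f_union(&u);
}
```

With this shape, g_union(-5) should return 5, because the store through s2 is read back through s1.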

In my opinion, using unions to read and write through different members is dangerous, and it is better to avoid it.
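One possible way to avoid the union altogether, sketched under assumptions: the type-erased helper never sees the typed struct. It takes the data pointer by value as void* and returns the (possibly moved) pointer, and the macro assigns the result back to the typed member. array_grow, struct int_buffer and this version of ALLOC_ONE are hypothetical names for this illustration, not the asker's actual code, and the growth policy (start at 8, then double) is arbitrary.

```c
#include <stdlib.h>

/* Hypothetical helper: ensures room for at least `needed` items of
   `item_size` bytes and returns the (possibly moved) storage. */
void *array_grow(void *data, int *capacity, int needed, size_t item_size)
{
    if (needed <= *capacity)
        return data;

    int new_capacity = *capacity ? *capacity : 8;
    while (new_capacity < needed)
        new_capacity *= 2;

    void *p = realloc(data, (size_t)new_capacity * item_size);
    if (p == NULL)
        abort(); /* sketch only: real code would report the error */

    *capacity = new_capacity;
    return p;
}

/* Concrete instance of the buffer for T = int. */
struct int_buffer { int *data; int count; int capacity; };

/* Grows the buffer if necessary and yields the index of the new item.
   The void* result converts implicitly to the typed pointer, so no
   object is ever accessed through two different struct types. */
#define ALLOC_ONE(A) \
    ((A).data = array_grow((A).data, &(A).capacity, (A).count + 1, \
                           sizeof *(A).data), \
     (A).count++)
```

Usage would look like `struct int_buffer b = {0}; b.data[ALLOC_ONE(b)] = 42;`. The trade-off is that the macro has to pass each field separately instead of a single pointer to a type-erased struct.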
