Is type-punning through a union unspecified in C99

Posted 2018-12-31 06:56

A number of answers to the Stack Overflow question "Getting the IEEE Single-precision bits for a float" suggest using a union for type punning (e.g., turning the bits of a float into a uint32_t):

union {
    float f;
    uint32_t u;
} un;
un.f = your_float;
uint32_t target = un.u;

However, the value of the uint32_t member of the union appears to be unspecified according to the C99 standard (at least draft n1124), where section 6.2.6.1.7 states:

When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that member but do correspond to other members take unspecified values.

At least one footnote of the C11 n1570 draft seems to imply that this is no longer the case (see footnote 95 in 6.5.2.3):

If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type punning’’). This might be a trap representation.

However, the text of section 6.2.6.1.7 is the same in the C99 draft as in the C11 draft.

Is this behavior actually unspecified under C99? Has it become specified in C11? I realize that most compilers seem to support this, but it would be nice to know if it's specified in the standard, or just a very common extension.

4 answers
余生请多指教
#2 · 2018-12-31 07:32

The behavior of type punning with union changed from C89 to C99. The behavior in C99 is the same as C11.

As Wug noted in his answer, type punning is allowed in C99 / C11. When the union members differ in size and the member read is larger than the member last stored, the value read is unspecified and could be a trap representation.

The footnote was added to C99 following Clive D.W. Feather's Defect Report #257:

Finally, one of the changes from C90 to C99 was to remove any restriction on accessing one member of a union when the last store was to a different one. The rationale was that the behaviour would then depend on the representations of the values. Since this point is often misunderstood, it might well be worth making it clear in the Standard.

[...]

To address the issue about "type punning", attach a new footnote 78a to the words "named member" in 6.5.2.3#3: 78a If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.

Clive D.W. Feather's wording was accepted for a Technical Corrigendum in the C Committee's response to Defect Report #283.

浅入江南
#3 · 2018-12-31 07:33

This has always been "iffy". As others have noted, a footnote was added to C99 via a Technical Corrigendum. It reads as follows:

If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.

However, footnotes are specified in the Foreword as non-normative:

Annexes D and F form a normative part of this standard; annexes A, B, C, E, G, H, I, J, the bibliography, and the index are for information only. In accordance with Part 3 of the ISO/IEC Directives, this foreword, the introduction, notes, footnotes, and examples are also for information only.

That is, the footnotes cannot prescribe behaviour; they should only clarify the existing text. It's an unpopular opinion, but the footnote quoted above actually fails in this regard: no such behaviour is prescribed in the normative text. Indeed, there are parts such as 6.7.2.1:

... The value of at most one of the members can be stored in a union object at any time

In conjunction with 6.5.2.3 (regarding accessing union members with the "." operator):

The value is that of the named member

I.e. if the value of only one member can be stored, the value of another member is non-existent. This strongly implies that type punning via a union should not be possible; the member access yields a non-existent value. The same text still exists in the C11 document.

However, it's clear that the purpose of adding the footnote was to allow for type-punning; it's just that the committee seemingly broke the rules on footnotes not containing normative text. To accept the footnote, you really have to disregard the section that says footnotes aren't normative, or otherwise try to figure out how to interpret the normative text in such a way that supports the conclusion of the footnote (which I have tried, and failed, to do).

The section you quote:

When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that member but do correspond to other members take unspecified values.

... has to be read carefully, though. "The bytes of the object representation that do not correspond to that member" is referring to bytes beyond the size of the member, which isn't itself an issue for type punning (except that you cannot assume writing to a union member will leave the "extra" part of any larger member untouched).

泛滥B
#4 · 2018-12-31 07:46

The original C99 specification left this unspecified.

One of the technical corrigenda to C99 (TC3, I think) added footnote 82 to correct this oversight:

If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.

That footnote is retained in the C11 standard (it's footnote 95 in C11).

有味是清欢
#5 · 2018-12-31 07:48

However, this appears to violate the C99 standard (at least draft n1124), where section 6.2.6.1.7 states some stuff. Is this behavior actually unspecified under C99?

No, you're fine.

When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that member but do correspond to other members take unspecified values.

This applies to data blocks of different sizes. I.e., if you have:

union u
{
    float f;
    double d;
};

and you assign something to f, it would change the first 4 bytes of d (assuming a 4-byte float and an 8-byte double), but the remaining 4 bytes would take unspecified values.

Unions exist primarily for type punning.
