A bug in GCC's implementation of bit-fields

Posted 2019-04-06 04:19

Working in C11, the following struct:

struct S {
  unsigned a : 4;
  _Bool    b : 1;
};

is laid out by GCC as an unsigned (4 bytes) of which 4 bits are used, followed by a _Bool (4 bytes) of which 1 bit is used, for a total size of 8 bytes.

Note that C99 and C11 specifically permit _Bool as a bit-field member. The C11 standard (and probably C99 too) also states under §6.7.2.1 'Structure and union specifiers' ¶11 that:

An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit.

So I believe that the member b above should have been packed into the storage unit allocated for the member a, resulting in a struct of total size 4 bytes.

GCC behaves correctly and packing does occur when using the same types for the two members, or when one is unsigned and the other signed, but the types unsigned and _Bool seem to be considered too distinct by GCC for it to handle them correctly.

Can someone confirm my interpretation of the standard, and that this is indeed a GCC bug?

I'm also interested in a work-around (some compiler switch, pragma, __attribute__...).

I'm using gcc 4.7.0 with -std=c11 (although other settings show the same behavior.)

2 answers
孤傲高冷的网名
#2 · 2019-04-06 04:57

Using both GCC 4.7.1 (home-built) and GCC 4.2.1 (LLVM/clang†) on Mac OS X 10.7.4 with a 64-bit compilation, this code yields 4 in -std=c99 mode:

#include <stdio.h>

int main(void)
{
    struct S
    {
        unsigned a : 4;
        _Bool    b : 1;
    };
    printf("%zu\n", sizeof(struct S));
    return 0;
}

That's half the size you're reporting on Windows. It still seems surprisingly large to me (I would expect a size of 1 byte), but the rules of the platform are what they are. Basically, the compiler is not obliged to follow the rules you'd like; it may follow the rules of the platform it runs on, and where it has the chance, it may even define those rules.

The following program has mildly dubious behaviour (because it reads u.i after u.s was last written to), but it shows that the field a is stored in the 4 least significant bits and the field b is stored in the next bit:

#include <stdio.h>

int main(void)
{
    union
    {
        struct S
        {
            unsigned a : 4;
            _Bool    b : 1;
        } s;
        int i;
    } u;
    u.i = 0;
    u.s.a = 5;
    u.s.b = 1;
    printf("%zu\n", sizeof(struct S));
    printf("%zu\n", sizeof(u));
    printf("0x%08X\n", u.i);
    u.s.a = 0xC;
    u.s.b = 1;
    printf("0x%08X\n", u.i);
    return 0;
}

Output:

4
4
0x00000015
0x0000001C

† i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.9.00)

叛逆
#3 · 2019-04-06 05:13

The described behavior is incompatible with the C99 and C11 standards, but is provided for binary compatibility with the MSVC compiler (which has unusual struct packing behavior).

Fortunately, it can be disabled either in the code with __attribute__((gcc_struct)) applied to the struct, or with the command-line switch -mno-ms-bitfields (see the documentation).
