Understanding the C11 type hierarchy

Posted 2019-01-31 15:37

Question:

I would like to fully understand the type hierarchy of the C11 language and present it graphically (a tree diagram would be perfect). The standard does not provide a figure for this; instead there are 30 points describing individual types and the relations between them. I'd like to draw it.

I started by obtaining the ISO/IEC 9899:201x Committee Draft N1570 and extracting all the essential statements from section 6.2.5 of the document. Then I rearranged that knowledge into the form of a tree. Let me present my work in two steps.

Step 1: points 1–15

The extracted knowledge (point number within section 6.2.5, followed by the production it specifies):

  • 1 types = object types + function types;
  • 4 standard signed integer types = signed char, short int, int, long int, long long int;
  • 4 signed integer types = standard signed integer types + extended signed integer types;
  • 6 standard unsigned integer types = _Bool, unsigned char, unsigned short int, unsigned int, unsigned long int, unsigned long long int;
  • 6 unsigned integer types = standard unsigned integer types + extended unsigned integer types;
  • 7 standard integer types = standard signed integer types + standard unsigned integer types;
  • 7 extended integer types = extended signed integer types + extended unsigned integer types;
  • 10 real floating types = float, double, long double;
  • 11 complex types = float _Complex, double _Complex, long double _Complex;
  • 12 floating types = real floating types + complex types;
  • 14 basic types = char + signed integer types + unsigned integer types + floating types;
  • 15 character types = char, signed char, unsigned char.

And the resulting structure:

types
    object types
    function types
basic types
    char
    signed integer types
        standard signed integer types
            signed char, short int, int, long int, long long int
        extended signed integer types
    unsigned integer types
        standard unsigned integer types
            _Bool, unsigned char, unsigned short int, unsigned int,
            unsigned long int, unsigned long long int
        extended unsigned integer types
    floating types
        real floating types
            float, double, long double
        complex types
            float _Complex, double _Complex, long double _Complex
standard integer types
    standard signed integer types
    standard unsigned integer types
extended integer types
    extended signed integer types
    extended unsigned integer types
character types
    char, signed char, unsigned char
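
The grouping above can also be probed from inside the language. The following is a minimal sketch, assuming a C11 compiler that supports `_Generic`; the macro names `BASIC_KIND` and `COMPLEX_CASES` are made up for this illustration, and the complex branch is guarded because complex types are a conditional feature in C11.

```c
#include <assert.h>
#include <string.h>

/* Complex types are optional in C11; leave them out if the implementation
   declares that it does not provide them. */
#ifndef __STDC_NO_COMPLEX__
#define COMPLEX_CASES                    \
    float _Complex:       "complex",     \
    double _Complex:      "complex",     \
    long double _Complex: "complex",
#else
#define COMPLEX_CASES
#endif

/* Hypothetical helper: maps an expression to the 6.2.5 group of its type.
   Extended integer types, if any, fall through to the default branch. */
#define BASIC_KIND(x) _Generic((x),                           \
    char:               "char",                               \
    signed char:        "standard signed integer",            \
    short:              "standard signed integer",            \
    int:                "standard signed integer",            \
    long:               "standard signed integer",            \
    long long:          "standard signed integer",            \
    _Bool:              "standard unsigned integer",          \
    unsigned char:      "standard unsigned integer",          \
    unsigned short:     "standard unsigned integer",          \
    unsigned int:       "standard unsigned integer",          \
    unsigned long:      "standard unsigned integer",          \
    unsigned long long: "standard unsigned integer",          \
    float:              "real floating",                      \
    double:             "real floating",                      \
    long double:        "real floating",                      \
    COMPLEX_CASES                                             \
    default:            "not a standard basic type")
```

The macro needs separate associations for char, signed char and unsigned char because they are three distinct types, which is why char gets its own node under basic types. Note that `BASIC_KIND('a')` selects the int branch, since character constants have type int in C, while `BASIC_KIND((char)'a')` selects the char branch.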

Step 2: points 16–24

The remaining statements:

  • 16 enumerated types;
  • 17 integer types = char + signed integer types + unsigned integer types + enumerated types;
  • 17 real types = integer types + real floating types;
  • 18 arithmetic types = integer types + floating types;
  • 20 derived types = array types, structure types, union types, function types, pointer types, atomic types;
  • 21 scalar types = arithmetic types + pointer types;
  • 21 aggregate types = array types + structure types;
  • 24 derived declarator types = array types + function types + pointer types.

And the final C11 type system structure:

types
    object types
    function types
basic types
    char
    signed integer types
        standard signed integer types
            signed char, short int, int, long int, long long int
        extended signed integer types
    unsigned integer types
        standard unsigned integer types
            _Bool, unsigned char, unsigned short int, unsigned int,
            unsigned long int, unsigned long long int
        extended unsigned integer types
    floating types
        real floating types
            float, double, long double
        complex types
            float _Complex, double _Complex, long double _Complex
standard integer types
    standard signed integer types
    standard unsigned integer types
extended integer types
    extended signed integer types
    extended unsigned integer types
character types
    char, signed char, unsigned char
real types
    integer types
        char
        signed integer types
            standard signed integer types
                signed char, short int, int, long int, long long int
            extended signed integer types
        unsigned integer types
            standard unsigned integer types
                _Bool, unsigned char, unsigned short int, unsigned int,
                unsigned long int, unsigned long long int
            extended unsigned integer types
        enumerated types
    real floating types
        float, double, long double
scalar types
    arithmetic types
        integer types
            char
            signed integer types
                standard signed integer types
                    signed char, short int, int, long int, long long int
                extended signed integer types
            unsigned integer types
                standard unsigned integer types
                    _Bool, unsigned char, unsigned short int, unsigned int,
                    unsigned long int, unsigned long long int
                extended unsigned integer types
            enumerated types
        floating types
            real floating types
                float, double, long double
            complex types
                float _Complex, double _Complex, long double _Complex
    pointer types
derived types
    array types
    structure types
    union types
    function types
    pointer types
    atomic types
aggregate types
    array types
    structure types
derived declarator types
    array types
    function types
    pointer types
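
The listing repeats the same nodes under several roots because the categories overlap rather than nest. One way to make the overlap explicit is to treat every category as a flag and every kind of type as a set of flags. The following is a rough sketch under that assumption; all identifiers (`enum category`, `enum kind`, `has_category`) are invented for illustration, and the atomic row simply follows the listing above without deciding whether an atomic integer also counts as arithmetic.

```c
#include <assert.h>

/* One flag per category from 6.2.5; the identifiers are illustrative only. */
enum category {
    CAT_OBJECT     = 1 << 0,
    CAT_FUNCTION   = 1 << 1,
    CAT_BASIC      = 1 << 2,
    CAT_CHARACTER  = 1 << 3,
    CAT_INTEGER    = 1 << 4,
    CAT_REAL       = 1 << 5,
    CAT_ARITHMETIC = 1 << 6,
    CAT_SCALAR     = 1 << 7,
    CAT_DERIVED    = 1 << 8,
    CAT_AGGREGATE  = 1 << 9,
    CAT_DECLARATOR = 1 << 10
};

/* Coarse leaves: KIND_CHARACTER covers char, signed char and unsigned char;
   KIND_OTHER_INTEGER covers the remaining signed and unsigned integer types. */
enum kind {
    KIND_CHARACTER, KIND_OTHER_INTEGER, KIND_ENUMERATED,
    KIND_REAL_FLOATING, KIND_COMPLEX, KIND_POINTER,
    KIND_ARRAY, KIND_STRUCTURE, KIND_UNION, KIND_FUNCTION, KIND_ATOMIC,
    KIND_COUNT
};

/* Shared prefixes, built from the productions in points 17, 18 and 21. */
#define ARITH   (CAT_OBJECT | CAT_SCALAR | CAT_ARITHMETIC)
#define INTEGER (ARITH | CAT_REAL | CAT_INTEGER)

static const unsigned categories[KIND_COUNT] = {
    [KIND_CHARACTER]     = INTEGER | CAT_BASIC | CAT_CHARACTER,
    [KIND_OTHER_INTEGER] = INTEGER | CAT_BASIC,
    [KIND_ENUMERATED]    = INTEGER,                      /* integer, not basic */
    [KIND_REAL_FLOATING] = ARITH | CAT_REAL | CAT_BASIC,
    [KIND_COMPLEX]       = ARITH | CAT_BASIC,            /* arithmetic, not real */
    [KIND_POINTER]       = CAT_OBJECT | CAT_SCALAR | CAT_DERIVED | CAT_DECLARATOR,
    [KIND_ARRAY]         = CAT_OBJECT | CAT_DERIVED | CAT_AGGREGATE | CAT_DECLARATOR,
    [KIND_STRUCTURE]     = CAT_OBJECT | CAT_DERIVED | CAT_AGGREGATE,
    [KIND_UNION]         = CAT_OBJECT | CAT_DERIVED,     /* not an aggregate */
    [KIND_FUNCTION]      = CAT_FUNCTION | CAT_DERIVED | CAT_DECLARATOR,
    [KIND_ATOMIC]        = CAT_OBJECT | CAT_DERIVED
};

/* Returns nonzero if kind k belongs to every category in cats. */
int has_category(enum kind k, unsigned cats)
{
    return (categories[k] & cats) == cats;
}
```

For example, `has_category(KIND_POINTER, CAT_SCALAR | CAT_DERIVED)` is true, which is exactly the kind of double membership that prevents the listing from collapsing into a single tree.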

Now I need to reduce the structure (ideally to a single tree) or find a cleverer way to represent the relations. I would like to come up with a nice cheat sheet for the C11 type system. Any ideas?

Answer 1:

The cluttered structure of C11 types that results from the second step of the question can be simplified by removing or reducing the less important nodes and by presenting some of the redundant or subsidiary information by other means.

I propose the following five-step algorithm for that:

  1. Removal of all extended integer types (strictly conforming implementation assumed);
  2. Reduction of the standard integer types (as they do not partition types any more);
  3. Grouping the structure:
    1. A scalar types vs aggregate types pair of sub-trees (represented as a tree),
    2. A basic types vs derived types pair of sub-trees (represented by coloured regions),
    3. real types and derived declarator types (represented as stroked sub-regions of these),
    4. character types (represented with different text colour);
  4. Application of a non-standard production: object types = scalar types + aggregate types;
  5. Supplementing the object types with the missing union types and atomic types.
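
As a rough illustration of what steps 1, 4 and 5 leave behind, the reduced hierarchy can be written down as a single tree with one parent per node. This is only one plausible reading of the steps, not a reproduction of the diagram; the node names and the helpers `depth` and `is_within` are hypothetical.

```c
#include <assert.h>

/* Nodes of the reduced tree (extended integer types already removed). */
enum node {
    N_TYPES, N_OBJECT, N_FUNCTION,
    N_SCALAR, N_AGGREGATE, N_UNION, N_ATOMIC,
    N_ARITHMETIC, N_POINTER, N_ARRAY, N_STRUCTURE,
    N_INTEGER, N_FLOATING,
    N_CHAR, N_SIGNED, N_UNSIGNED, N_ENUMERATED,
    N_REAL_FLOATING, N_COMPLEX,
    N_COUNT
};

/* Each node has exactly one parent; the root is its own parent. */
static const enum node parent[N_COUNT] = {
    [N_TYPES]         = N_TYPES,
    [N_OBJECT]        = N_TYPES,
    [N_FUNCTION]      = N_TYPES,
    [N_SCALAR]        = N_OBJECT,     /* step 4: non-standard production */
    [N_AGGREGATE]     = N_OBJECT,     /* step 4 */
    [N_UNION]         = N_OBJECT,     /* step 5 */
    [N_ATOMIC]        = N_OBJECT,     /* step 5 */
    [N_ARITHMETIC]    = N_SCALAR,
    [N_POINTER]       = N_SCALAR,
    [N_ARRAY]         = N_AGGREGATE,
    [N_STRUCTURE]     = N_AGGREGATE,
    [N_INTEGER]       = N_ARITHMETIC,
    [N_FLOATING]      = N_ARITHMETIC,
    [N_CHAR]          = N_INTEGER,
    [N_SIGNED]        = N_INTEGER,
    [N_UNSIGNED]      = N_INTEGER,
    [N_ENUMERATED]    = N_INTEGER,
    [N_REAL_FLOATING] = N_FLOATING,
    [N_COMPLEX]       = N_FLOATING
};

/* Number of edges between n and the root. */
int depth(enum node n)
{
    int d = 0;
    while (n != N_TYPES) {
        n = parent[n];
        d++;
    }
    return d;
}

/* Nonzero if n is the given ancestor or lies somewhere below it. */
int is_within(enum node n, enum node ancestor)
{
    while (n != ancestor && n != N_TYPES)
        n = parent[n];
    return n == ancestor;
}
```

Categories that cut across this tree (basic vs derived, real types, derived declarator types, character types) have no place in a parent table, which matches the answer's choice to show them with colours and strokes instead of branches.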

The resulting C11 type system summary looks as follows:

The grey strokes and areas are introduced only to improve the readability of the tree.

The type summary does not include the concept of type completeness, because completeness is a state observed at a particular point within a translation unit. At run time, all objects and functions are instances of a complete type. The void type is an exception but, as a no-type (or any-type in the case of a pointer), it is intentionally excluded from the diagram.
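
To show why completeness is a matter of position in the translation unit rather than a branch of the hierarchy, here is a small sketch. The names `struct point`, `samples`, `origin` and `sample_count` are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

struct point;                         /* incomplete here: size unknown */
struct point *origin(void);           /* pointers to it are already usable */
extern int samples[];                 /* incomplete array type */

struct point { int x, y; };           /* struct point is complete from here on */
int samples[4] = { 1, 2, 3, 4 };      /* the definition completes the array type */

static struct point zero = { 0, 0 };

struct point *origin(void)
{
    return &zero;
}

/* sizeof needs a complete type, so this only compiles below the definition. */
size_t sample_count(void)
{
    return sizeof samples / sizeof samples[0];
}
```

The same two types are incomplete in the first lines and complete in the later ones, so completeness describes where you look at a type, not what kind of type it is.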

The const, volatile, restrict and _Atomic keywords are type qualifiers which, unlike the type specifiers for derived types, cannot be applied recursively. Any combination of them may precede any type definition (as long as it makes sense). Including them in the diagram would therefore complicate it without adding any useful information. The apparent exception is the _Atomic(type) construct, which is taken into account as the type specifier for the atomic type, one of the derived types listed in the C11 standard.
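
A short sketch of the contrast, assuming an implementation that provides `<stdatomic.h>` (atomics are optional in C11); the identifiers `row_ptr_table`, `limit_ref`, `record_hit` and `total_weight` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Derived types nest: an array of 3 pointers to arrays of 2 ints. */
typedef int (*row_ptr_table[3])[2];

/* Qualifiers decorate one level each: a const pointer to a const int. */
static const int limit = 8;
static const int *const limit_ref = &limit;

static _Atomic(int) hits;    /* specifier form: names an atomic type */
static _Atomic int  total;   /* qualifier form */

/* Adds weight to the running total and returns the new hit count. */
int record_hit(int weight)
{
    atomic_fetch_add(&total, weight);
    return atomic_fetch_add(&hits, 1) + 1;   /* fetch_add returns the old value */
}

int total_weight(void)
{
    return atomic_load(&total);
}
```

Both spellings of `_Atomic` work with the same generic functions here, which is why the diagram only needs a single atomic types node.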