Flexible array members can lead to undefined behav

2019-03-09 20:59发布

站内文章 / C

42 0

来，给爷笑一个

女 | 书童

私信

可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

By using flexible array members (FAMs) within structure types, are we exposing our programs to the possibility of undefined behavior?
Is it possible for a program to use FAMs and still be a strictly conforming program?
Is the offset of the flexible array member required to be at the end of the struct?

The questions apply to both C99 (TC3) and C11 (TC1).

#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

int main(void) {
    struct s {
        size_t len;
        char pad;
        int array[];
    };

    struct s *s = malloc(sizeof *s + sizeof *s->array);

    printf("sizeof *s: %zu\n", sizeof *s);
    printf("offsetof(struct s, array): %zu\n", offsetof(struct s, array));

    s->array[0] = 0;
    s->len = 1;

    printf("%d\n", s->array[0]);

    free(s);
    return 0;
}

Output:

sizeof *s: 16
offsetof(struct s, array): 12
0

回答1:

The Short Answer

Yes. Common conventions of using FAMs expose our programs to the possibility of undefined behavior. Having said that, I'm unaware of any existing conforming implementation that would misbehave.
Possible, but unlikely. Even if we don't actually reach undefined behavior, we are still likely to fail strict conformance.
No. The offset of the FAM is not required to be at the end of the struct, it may overlay any trailing padding bytes.

The answers apply to both C99 (TC3) and C11 (TC1).

The Long Answer

FAMs were first introduced in C99 (TC0) (Dec 1999), and their original specification required the offset of the FAM to be at the end of the struct. The original specification was well-defined and as such couldn't lead to undefined behavior or be an issue with regards to strict conformance.

C99 (TC0) §6.7.2.1 p16 (Dec 1999)

[This document is the official standard, it is copyrighted and not freely available]

The problem was that common C99 implementations, such as GCC, didn't follow the requirement of the standard, and allowed the FAM to overlay any trailing padding bytes. Their approach was considered to be more efficient, and since for them to follow the requirement of the standard- would result with breaking backwards compatibility, the committee chose to change the specification, and as of C99 TC2 (Nov 2004) the standard no longer required the offset of the FAM to be at the end of the struct.

C99 (TC2) §6.7.2.1 p16 (Nov 2004)

[...] the size of the structure is as if the flexible array member were omitted except that it may have more trailing padding than the omission would imply.

The new specification removed the statement that required the offset of the FAM to be at the end of the struct, and it introduced a very unfortunate consequence, because the standard gives the implementation the liberty not to keep the values of any padding bytes within structures or unions in a consistent state. More specifically:

C99 (TC3) §6.2.6.1 p6

When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values.

This means that if any of our FAM elements correspond to (or overlay) any trailing padding bytes, upon storing to a member of the struct- they (may) take unspecified values. We don't even need to ponder whether this applies to a value stored to the FAM itself, even the strict interpretation that this only applies to members other than the FAM, is damaging enough.

#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

int main(void) {
    struct s {
        size_t len;
        char pad;
        int array[];
    };

    struct s *s = malloc(sizeof *s + sizeof *s->array);

    if (sizeof *s > offsetof(struct s, array)) {
        s->array[0] = 123;
        s->len = 1; /* any padding bytes take unspecified values */

        printf("%d\n", s->array[0]); /* indeterminate value */
    }

    free(s);
    return 0;
}

Once we store to a member of the struct, the padding bytes take unspecified bytes, and therefore any assumption made about the values of the FAM elements that correspond to any trailing padding bytes, is now false. Which means that any assumption leads to us failing strict conformance.

Undefined behavior

Although the values of the padding bytes are "unspecified values", the same can't be said about the type being affected by them, because an object representation which is based on unspecified values can generate a trap representation. So the only standard term which describes these two possibilities would be "indeterminate value". If the type of the FAM happens to have trap representations, then accessing it is not just a concern of an unspecified value, but undefined behavior.

But wait, there's more. If we agree that the only standard term to describe such value is as being an "indeterminate value", then even if the type of the FAM happens not to have trap representations, we've reached undefined behavior, since the official interpretation of the C standards committee is that passing indeterminate values to standard library functions is undefined behavior.

回答2:

^{This is a long answer dealing extensively with a tricky topic.}

TL;DR

I disagree with the analysis by Dror K.

The key problem is a misunderstanding of what the §6.2.1 ¶6 in the C99 and C11 standards means, and inappropriately applying it to a simple integer assignment such as:

fam_ptr->nonfam_member = 23;

This assignment is not allowed to change any padding bytes in the structure pointed at by fam_ptr. Consequently, the analysis based on the presumption that this can change padding bytes in the structure is erroneous.

Background

In principle, I'm not dreadfully concerned about the C99 standard and its corrigenda; they're not the current standard. However, the evolution of the specification of flexible array members is informative.

The C99 standard — ISO/IEC 9899:1999 — had 3 technical corrigenda:

TC1 released on 2001-09-01 (7 pages),
TC2 released on 2004-11-15 (15 pages),
TC3 released on 2007-11-15 (10 pages).

It was TC3, for example, that stated that gets() was obsolescent and deprecated, leading to it being removed from the C11 standard.

The C11 standard — ISO/IEC 9899:2011 — has one technical corrigendum, but that simply sets the value of two macros accidentally left in the form 201ymmL — the values required for __STDC_VERSION__ and __STDC_LIB_EXT1__ were corrected to the value 201112L. (You can see the TC1 — formally "ISO/IEC 9899:2011/Cor.1:2012(en) Information technology — Programming languages — C TECHNICAL CORRIGENDUM 1" — at https://www.iso.org/obp/ui/#iso:std:iso-iec:9899:ed-3:v1:cor:1:v1:en. I've not worked out how you get a download of it, but it is so simple that it really doesn't matter very much.

C99 standard on flexible array members

ISO/IEC 9899:1999 (before TC2) §6.7.2.1 ¶16:

As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member. With two exceptions, the flexible array member is ignored. First, the size of the structure shall be equal to the offset of the last element of an otherwise identical structure that replaces the flexible array member with an array of unspecified length.¹⁰⁶⁾ Second, when a . (or ->) operator has a left operand that is (a pointer to) a structure with a flexible array member and the right operand names that member, it behaves as if that member were replaced with the longest array (with the same element type) that would not make the structure larger than the object being accessed; the offset of the array shall remain that of the flexible array member, even if this would differ from that of the replacement array. If this array would have no elements, it behaves as if it had one element but the behavior is undefined if any attempt is made to access that element or to generate a pointer one past it.

¹²⁶⁾ The length is unspecified to allow for the fact that implementations may give array members different alignments according to their lengths.

(This footnote is removed in the rewrite.) The original C99 standard included one example:

¶17 EXAMPLE Assuming that all array members are aligned the same, after the declarations:
struct s { int n; double d[]; };
struct ss { int n; double d[1]; };
the three expressions:
sizeof (struct s)
offsetof(struct s, d)
offsetof(struct ss, d)
have the same value. The structure struct s has a flexible array member d.

¶18 If sizeof (double) is 8, then after the following code is executed:
struct s *s1;
struct s *s2;
s1 = malloc(sizeof (struct s) + 64);
s2 = malloc(sizeof (struct s) + 46);
and assuming that the calls to malloc succeed, the objects pointed to by s1 and s2 behave as if the identifiers had been declared as:
struct { int n; double d[8]; } *s1;
struct { int n; double d[5]; } *s2;
¶19 Following the further successful assignments:
s1 = malloc(sizeof (struct s) + 10);
s2 = malloc(sizeof (struct s) + 6);
they then behave as if the declarations were:
struct { int n; double d[1]; } *s1, *s2;
and:
double *dp;
dp = &(s1->d[0]); // valid
*dp = 42;         // valid
dp = &(s2->d[0]); // valid
*dp = 42;         // undefined behavior
¶20 The assignment:
*s1 = *s2;
only copies the member n and not any of the array elements. Similarly:
struct s t1 = { 0 };          // valid
struct s t2 = { 2 };          // valid
struct ss tt = { 1, { 4.2 }}; // valid
struct s t3 = { 1, { 4.2 }};  // invalid: there is nothing for the 4.2 to initialize
t1.n = 4;                     // valid
t1.d[0] = 4.2;                // undefined behavior

Some of this example material was removed in C11. The change was not noted (and did not need to be noted) in TC2 because the examples are not normative. But the rewritten material in C11 is informative when studied.

N983 paper identifying a problem with flexible array members

N983 from the WG14 Pre-Santa Cruz-2002 mailing is, I believe, the initial statement of a defect report. It states that some C compilers (citing three) manage to put a FAM before the padding at the end of a structure. The final defect report was DR 282.

As I understand it, this report led to the change in TC2, though I have not traced all the steps in the process. It appears that the DR is no longer available separately.

TC2 used the wording found in the C11 standard in the normative material.

C11 standard on flexible array members

So, what does the C11 standard have to say about flexible array members?

§6.7.2.1 Structure and union specifiers

¶3 A structure or union shall not contain a member with incomplete or function type (hence, a structure shall not contain an instance of itself, but may contain a pointer to an instance of itself), except that the last member of a structure with more than one named member may have incomplete array type; such a structure (and any union containing, possibly recursively, a member that is such a structure) shall not be a member of a structure or an element of an array.

This firmly positions the FAM at the end of the structure — 'the last member' is by definition at the end of the structure, and this is confirmed by:

¶15 Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared.

¶17 There may be unnamed padding at the end of a structure or union.

¶18 As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member. In most situations, the flexible array member is ignored. In particular, the size of the structure is as if the flexible array member were omitted except that it may have more trailing padding than the omission would imply. However, when a . (or ->) operator has a left operand that is (a pointer to) a structure with a flexible array member and the right operand names that member, it behaves as if that member were replaced with the longest array (with the same element type) that would not make the structure larger than the object being accessed; the offset of the array shall remain that of the flexible array member, even if this would differ from that of the replacement array. If this array would have no elements, it behaves as if it had one element but the behavior is undefined if any attempt is made to access that element or to generate a pointer one past it.

This paragraph contains the change in ¶20 of ISO/IEC 9899:1999/Cor.2:2004(E) — the TC2 for C99;

The data at the end of the main part of a structure containing a flexible array member is regular trailing padding that can occur with any structure type. Such padding can't be accessed legitimately, but can be passed to library functions etc via pointers to the structure without incurring undefined behaviour.

The C11 standard contains three examples, but the first and third are related to anonymous structures and unions rather than the mechanics of flexible array members. Remember, examples are not 'normative', but they are illustrative.

¶20 EXAMPLE 2 After the declaration:
struct s { int n; double d[]; };
the structure struct s has a flexible array member d. A typical way to use this is:
int m = /* some value */;
struct s *p = malloc(sizeof (struct s) + sizeof (double [m]));
and assuming that the call to malloc succeeds, the object pointed to by p behaves, for most purposes, as if p had been declared as:
struct { int n; double d[m]; } *p;
(there are circumstances in which this equivalence is broken; in particular, the offsets of member d might not be the same).

¶21 Following the above declaration:
struct s t1 = { 0 };          // valid
struct s t2 = { 1, { 4.2 }};  // invalid
t1.n = 4;                     // valid
t1.d[0] = 4.2;                // might be undefined behavior
The initialization of t2 is invalid (and violates a constraint) because struct s is treated as if it did not contain member d. The assignment to t1.d[0] is probably undefined behavior, but it is possible that
sizeof (struct s) >= offsetof(struct s, d) + sizeof (double)
in which case the assignment would be legitimate. Nevertheless, it cannot appear in strictly conforming code.

¶22 After the further declaration:
struct ss { int n; };
the expressions:
sizeof (struct s) >= sizeof (struct ss)
sizeof (struct s) >= offsetof(struct s, d)
are always equal to 1.

¶23 If sizeof (double) is 8, then after the following code is executed:
struct s *s1;
struct s *s2;
s1 = malloc(sizeof (struct s) + 64);
s2 = malloc(sizeof (struct s) + 46);
and assuming that the calls to malloc succeed, the objects pointed to by s1 and s2 behave, for most purposes, as if the identifiers had been declared as:
struct { int n; double d[8]; } *s1;
struct { int n; double d[5]; } *s2;
¶24 Following the further successful assignments:
s1 = malloc(sizeof (struct s) + 10);
s2 = malloc(sizeof (struct s) + 6);
they then behave as if the declarations were:
struct { int n; double d[1]; } *s1, *s2;
and:
double *dp;
dp = &(s1->d[0]); // valid
*dp = 42;         // valid
dp = &(s2->d[0]); // valid
*dp = 42;         // undefined behavior
¶25 The assignment:
*s1 = *s2;
only copies the member n; if any of the array elements are within the first sizeof (struct s) bytes of the structure, they might be copied or simply overwritten with indeterminate values.

Note that this changed between C99 and C11.

Another part of the standard describes this copying behaviour:

§6.2.6 Representation of types §6.2.6.1 General

¶6 When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values.⁵¹⁾ The value of a structure or union object is never a trap representation, even though the value of a member of the structure or union object may be a trap representation.

⁵¹⁾ Thus, for example, structure assignment need not copy any padding bits.

Illustrating the problematic FAM structure

In the C chat room, I wrote some information of which this is a paraphrase:

Consider:

struct fam1 { double d; char c; char fam[]; };

Assuming double requires 8-byte alignment (or 4-byte; it doesn't matter too much, but I'll stick with 8), then struct non_fam1a { double d; char c; }; would have 7 padding bytes after c and a size of 16. Further, struct non_fam1b { double d; char c; char nonfam[4]; }; would have 3 bytes padding after the nonfam array, and a size of 16.

The suggestion is that the start of fam in struct fam1 can be at offset 9, even though sizeof(struct fam1) is 16. So that the bytes after c are not padding (necessarily).

So, for a small enough FAM, the size of the struct plus FAM might still be less than size of struct fam.

The prototypical allocation is:

struct fam1 *fam = malloc(sizeof(struct fam1) + array_size * sizeof(char));

when the FAM is of type char (as in struct fam1). That's a (gross) over-estimate when the offset of fam is less than sizeof(struct fam1).

Dror K. pointed out:

There are macros out there for calculating the 'precise' required storage based on FAM offsets that are less than the size of the structure. Such as this one: https://gustedt.wordpress.com/2011/03/14/flexible-array-member/

Addressing the question

The question asks:

By using flexible array members (FAMs) within structure types, are we exposing our programs to the possibility of undefined behavior?

Is it possible for a program to use FAMs and still be a strictly conforming program?

Is the offset of the flexible array member required to be at the end of the struct?

The questions apply to both C99 (TC3) and C11 (TC1).

I believe that if you code correctly, the answers are "No", "Yes", "No and Yes, depending …".

Question 1

I am assuming that the intent of question 1 is "must your program inevitably be exposed to undefined behaviour if you use any FAM anywhere?" To state what I think is obvious: there are lots of ways of exposing a program to undefined behaviour (and some of those ways involve structures with flexible array members).

I do not think that simply using a FAM means that the program automatically has (invokes, is exposed to) undefined behaviour.

Question 2

Section §4 Conformance defines:

¶5 A strictly conforming program shall use only those features of the language and library specified in this International Standard.³⁾ It shall not produce output dependent on any unspecified, undefined, or implementation-defined behavior, and shall not exceed any minimum implementation limit.

³⁾ A strictly conforming program can use conditional features (see 6.10.8.3) provided the use is guarded by an appropriate conditional inclusion preprocessing directive using the related macro. …

¶7 A conforming program is one that is acceptable to a conforming implementation.⁵⁾.

⁵⁾ Strictly conforming programs are intended to be maximally portable among conforming implementations. Conforming programs may depend upon nonportable features of a conforming implementation.

I don't think there are any features of standard C which, if used in the way that the standard intends, makes the program not strictly conforming. If there are any such, they are related to locale-dependent behaviour. The behaviour of FAM code is not inherently locale-dependent.

I do not think that the use of a FAM inherently means that the program is not strictly conforming.

Question 3

I think question 3 is ambiguous between:

3A: Is the offset of the flexible array member required to be equal to the size of the structure containing the flexible array member?
3B: Is the offset of the flexible array member required to be greater than the offset of any prior member of the structure?

The answer to 3A is "No" (witness the C11 example at ¶25, quoted above).

The answer to 3B is "Yes" (witness §6.7.2.1 ¶15, quoted above).

Dissenting from Dror's answer

I need to quote the C standard and Dror's answer. I'll use [DK] to indicate the start of a quote from Dror's answer, and unmarked quotations are from the C standard.

As of 2017-07-01 18:00 -08:00, the short answer by Dror K said:

[DK]

Yes. Common conventions of using FAMs expose our programs to the possibility of undefined behavior. Having said that, I'm unaware of any existing conforming implementation that would misbehave.

I'm not convinced that simply using a FAM means that the program automatically has undefined behaviour.

[DK]

Possible, but unlikely. Even if we don't actually reach undefined behavior, we are still likely to fail strict conformance.

I'm not convinced that the use of a FAM automatically renders a program not strictly conforming.

[DK]

No. The offset of the FAM is not required to be at the end of the struct, it may overlay any trailing padding bytes.

This is the answer to my interpretation 3A, and I agree with this.

The long answer contains interpretation of the short answers above.

[DK]

The problem was that common C99 implementations, such as GCC, didn't follow the requirement of the standard, and allowed the FAM to overlay any trailing padding bytes. Their approach was considered to be more efficient, and since for them to follow the requirement of the standard- would result with breaking backwards compatibility, the committee chose to change the specification, and as of C99 TC2 (Nov 2004) the standard no longer required the offset of the FAM to be at the end of the struct.

I agree with this analysis.

[DK]

The new specification removed the statement that required the offset of the FAM to be at the end of the struct, and it introduced a very unfortunate consequence, because the standard gives the implementation the liberty not to keep the values of any padding bytes within structures or unions in a consistent state.

I agree that the new specification removed the requirement that the FAM be stored at an offset greater than or equal to the size of the structure.

I don't agree that there is a problem with the padding bytes.

The standard explicitly says that structure assignment for a structure containing a FAM effectively ignores the FAM (§6.7.2.1 ¶18). It must copy the non-FAM members. It is explicitly stated that padding bytes need not be copied at all (§6.2.6.1 ¶6 and footnote 51). And the Example 2 explicitly states (non-normatively §6.7.2.1 ¶25) that if the FAM overlaps the space defined by the structure, the data from the part of the FAM that overlaps with the end of the structure might or might not be copied.

[DK]

This means that if any of our FAM elements correspond to (or overlay) any trailing padding bytes, upon storing to a member of the struct- they (may) take unspecified values. We don't even need to ponder whether this applies to a value stored to the FAM itself, even the strict interpretation that this only applies to members other than the FAM, is damaging enough.

I don't see this as a problem. Any expectation that you can copy a structure containing a FAM using structure assignment and have the FAM array copied is is inherently flawed — the copy leaves the FAM data logically uncopied. Any program that depends on the FAM data within the scope of the structure is broken; that is a property of the (flawed) program, not the standard.

[DK]

#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

int main(void) {
    struct s {
        size_t len;
        char pad;
        int array[];
    };

    struct s *s = malloc(sizeof *s + sizeof *s->array);

    if (sizeof *s > offsetof(struct s, array)) {
        s->array[0] = 123;
        s->len = 1; /* any padding bytes take unspecified values */

        printf("%d\n", s->array[0]); /* indeterminate value */
    }

    free(s);
    return 0;
}

Ideally, of course, the code would set the named member pad to a determinate value, but that doesn't cause actually cause a problem since it is never accessed.

I emphatically disagree that the value of s->array[0] in the printf() is indeterminate; its value is 123.

The prior standard quote is (it is the same §6.2.6.1 ¶6 in both C99 and C11, though the footnote number is 42 in C99 and 51 in C11):

When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values.

Note that s->len is not an assignment to an object of structure or union type; it is an assignment to an object of type size_t. I think this may be the main source of confusion here.

If the code included:

struct s *t = malloc(sizeof(*t) + sizeof(t->array[0]));
*t = *s;
printf("t->array[0] = %d\n", t->array[0]);

then the value printed is indeed indeterminate. However, that's because copying a structure with a FAM is not guaranteed to copy the FAM. More nearly correct code would be (assuming you add #include <string.h>, of course):

struct s *t = malloc(sizeof(*t) + sizeof(t->array[0]));
*t = *s;
memmmove(t->array, s->array, sizeof(t->array[0]));
printf("t->array[0] = %d\n", t->array[0]);

Now the value printed is determinate (it is 123). Note that the condition on if (sizeof *s > offsetof(struct s, array)) is immaterial to my analysis.

Since the rest of the long answer (mainly the section identified by the heading 'undefined behavior') is based on a false inference about the possibility of the padding bytes of a structure changing when assigning to an integer member of a structure, the rest of the discussion does not need further analysis.

[DK]

Once we store to a member of the struct, the padding bytes take unspecified bytes, and therefore any assumption made about the values of the FAM elements that correspond to any trailing padding bytes, is now false. Which means that any assumption leads to us failing strict conformance.

This is based on a false premise; the conclusion is false.

回答3:

If one allows for a strictly conforming program to make use of Implementation-Defined behavior in cases where it would "work" with all legitimate behaviors (notwithstanding that almost any kind of useful output would depend upon Implementation-Defined details such as the execution character set), use of flexible array members within a strictly-conforming program should be possible provided that the program does not care whether the offset of the flexible array member coincides with the length of the structure.

Arrays are not regarded as having any padding internally, so any padding that gets added because of the FAM would precede it. If there is enough space either within or beyond a structure to accommodate members in an FAM, those members are part of the FAM. For example, given:

struct { long long x; char y; short z[]; } foo;

the size of "foo" may be padded out beyond the start of z because of alignment, but any such padding will be usable as part of z. Writing y might disturb padding that precedes z, but should not disturb any part of z itself.