Any guarantees for uninitialised variables?

Posted 2019-01-19 09:45

There are many claims that any use of uninitialised variables invokes undefined behavior (UB).
Perusing the docs, I could not verify that claim, so I would like a convincing argument clarifying this for both C and C++.
I expect the same semantics for both, but am prepared to be surprised by subtle or not so subtle differences.

Some examples of using uninitialised variables to get started. Please add others as needed to explain any corner-cases they don't cover.

void test1() {
    int x;
    printf("%d", x);
}

void test2() {
    int x;
    for(int i = 0; i < CHAR_BIT * sizeof x; ++i)
        x = x << 1;
    printf("%d", x);
}

void test3() {
    unsigned x;
    printf("%u", x); /* was format "%d" */
}

void test4() {
    unsigned x;
    for(int i = 0; i < CHAR_BIT * sizeof x; ++i)
        x = x << 1;
    printf("%u", x); /* was format "%d" */
}

4 answers
爷、活的狠高调
#2 · 2019-01-19 09:58

C

C11 6.7.9/10

If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate.

Indeterminate values are handled as follows:

C11 6.2.6.1/5

Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. If such a representation is produced by a side effect that modifies all or any part of the object by an lvalue expression that does not have character type, the behavior is undefined.50) Such a representation is called a trap representation.

There's a comment to the above normative text:

50) Thus, an automatic variable can be initialized to a trap representation without causing undefined behavior, but the value of the variable cannot be used until a proper value is stored in it.

(emphasis mine)
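As a minimal sketch of the footnote's point, the function below gives an automatic variable a proper value before it is ever read. The name zero_filled_int is hypothetical. The sketch relies on memset writing the object byte by byte as unsigned char, and on C11 6.2.6.2/5, which says that all-bits-zero represents the value zero for every integer type.

```c
#include <string.h>

/* Hypothetical helper: stores a proper value in x before the first read. */
int zero_filled_int(void)
{
    int x;                      /* indeterminate here, but not read yet */
    memset(&x, 0, sizeof x);    /* every byte is written as unsigned char */
    return x;                   /* all-bits-zero represents 0 (C11 6.2.6.2/5) */
}
```

Taking the address of x and storing into it before the first read means none of the quoted rules is triggered.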

Furthermore, left-shifting a signed int variable containing an indeterminate value can also lead to undefined behavior in case it is interpreted as a negative one:

C11 6.5.7/4

The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
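To make the signed case concrete, here is a rough sketch of the checks a caller would need before left-shifting an int that already holds a proper value. The helper name checked_shl_int is hypothetical, and the width check assumes int has no padding bits.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores value << shift in *out and returns true only
   when C11 6.5.7 defines the result for a signed int. */
bool checked_shl_int(int value, unsigned shift, int *out)
{
    /* A shift count >= the width of int is undefined (assumes no padding bits). */
    if (shift >= CHAR_BIT * sizeof(int))
        return false;
    /* A negative left operand makes the shift undefined. */
    if (value < 0)
        return false;
    /* value * 2^shift must be representable: value <= INT_MAX / 2^shift. */
    if (value > (INT_MAX >> shift))
        return false;
    *out = value << shift;
    return true;
}
```

For example, checked_shl_int(1, 3, &r) succeeds and stores 8, while a negative value or INT_MAX shifted by 1 is rejected.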

趁早两清
#3 · 2019-01-19 10:01

All four cases invoke undefined behavior in C, since the uninitialized automatic variable never has its address taken. See the other answer, which cites 6.3.2.1 p2.

By the way, sizeof x is well defined, since its operand is not actually evaluated: the result is computed at compile time from the operand's type.
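A small sketch of that point, using a hypothetical function name: the uninitialised variable appears only as a sizeof operand, and a side effect inside a sizeof operand never happens, because the operand is not evaluated (none of the operands here is a variable length array).

```c
#include <stddef.h>

/* Hypothetical demo: returns the counter after it has appeared inside a
   sizeof operand, or -1 if the sizes are not those of int. */
int sizeof_leaves_operand_unevaluated(void)
{
    int counter = 0;
    int uninitialised;                 /* its value is never read below */
    size_t a = sizeof uninitialised;   /* uses only the type int */
    size_t b = sizeof(counter++);      /* counter++ is not evaluated */
    return (a == sizeof(int) && b == sizeof(int)) ? counter : -1;
}
```

The function returns 0, since counter is never incremented.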

In the latest C++1y draft(N3936) this is clearly undefined behavior since the language on indeterminate values and undefined behavior has been clarified and it now says in section 8.5:

[...] If an indeterminate value is produced by an evaluation, the behavior is undefined except in the following cases:

and goes on to list exceptions, which apply only to unsigned narrow character types.

Previously in C++ we had to rely on the underspecified lvalue-to-rvalue conversion to prove undefined behavior, which is problematic in the general case. In this case we do have an lvalue-to-rvalue conversion, as section 5.2.2 Function call, paragraph 7 shows (emphasis mine):

When there is no parameter for a given argument, the argument is passed in such a way that the receiving function can obtain the value of the argument by invoking va_arg (18.10). [...] The lvalue-to-rvalue (4.1), array-to-pointer (4.2), and function-to-pointer (4.3) standard conversions are performed on the argument expression.

在下西门庆
#4 · 2019-01-19 10:08

In C all of them have undefined behavior, but for a reason that probably does not come immediately to mind. Accessing an object with an indeterminate value has undefined behavior if the object is "memoryless", that is, by 6.3.2.1 p2:

If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.

Otherwise, if the address is taken, there is no consensus on what indeterminate means concretely in this case. Some people expect such a value to be fixed once it is first read; others speak of something like "wobbly" (or so) values that can be different at each access.

In summary, don't do it. (But that you probably knew already.)

(And that is leaving aside the error of using "%d" for an unsigned.)

一夜七次
#5 · 2019-01-19 10:19

With respect to C, the behavior of all the examples may be undefined:

Chapter and verse

3.19.2
1 indeterminate value
either an unspecified value or a trap representation
...
6.2.6 Representations of types
6.2.6.1 General
...
5 Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. If such a representation is produced by a side effect that modifies all or any part of the object by an lvalue expression that does not have character type, the behavior is undefined.50) Such a representation is called a trap representation.
...
50) Thus, an automatic variable can be initialized to a trap representation without causing undefined behavior, but the value of the variable cannot be used until a proper value is stored in it.

In all four cases, x has automatic storage duration and is not explicitly initialized, meaning its value is indeterminate; if this indeterminate value is a trap representation, then the behavior is undefined.
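For contrast, here is a sketch of a well-defined counterpart of test4, assuming x is given a value before the loop. The function name shift_out_all_bits is hypothetical. Because x is unsigned and holds a proper value, each shift by one is defined and reduced modulo UINT_MAX + 1.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical counterpart of test4: x arrives with a proper value. */
unsigned shift_out_all_bits(unsigned x)
{
    for (size_t i = 0; i < CHAR_BIT * sizeof x; ++i)
        x <<= 1;   /* defined for unsigned: high bits are simply discarded */
    return x;      /* every original bit has been shifted out */
}
```

Whatever value is passed in, the result is 0, so printf("%u", shift_out_all_bits(1u)) has defined behavior.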

EDIT

Removed reference to appendix J, as it is non-normative.
