GOTO before local variable

Posted 2019-04-03 04:45

Question:

Does the following piece of code constitute undefined behaviour, since I am jumping before the variable declaration and using it via a pointer? If so, are there differences between the standards?

#include <stdio.h>

int main(void) {
  int *p = 0;
label1: 
  if (p) {
    printf("%d\n", *p);
    return 0;
  }
  int i = 999;
  p = &i;
  goto label1;
  return -1;
}

Answer 1:

There is no undefined behavior in your program.

The goto statement has two constraints:

(C11, 6.8.6.1p1) "The identifier in a goto statement shall name a label located somewhere in the enclosing function. A goto statement shall not jump from outside the scope of an identifier having a variably modified type to inside the scope of that identifier."

Your program violates neither of them, and there are no other "shall" requirements outside the constraints.

Note that the same holds in C99 and C90, in the sense that they impose no extra requirements. Of course, in C90 the program would not be valid because it mixes declarations and statements.
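The second constraint only matters for variably modified types such as variable length arrays. As a minimal sketch (the function name sum_vla_lengths is hypothetical, and it assumes a compiler that supports VLAs, which are optional in C11), the backward jump below is permitted because it leaves the scope of vla instead of entering it. The commented-out forward jump is the kind the constraint forbids:

```c
#include <stddef.h>

/* Hypothetical helper: sums the element counts of VLAs of length 1, 2 and 3. */
size_t sum_vla_lengths(void) {
    size_t total = 0;
    int n = 1;

    /* goto inside_scope;   a jump to a label placed after the declaration
                            of vla would violate the constraint */
again:
    if (n > 3) {
        return total;
    }
    int vla[n];                          /* scope of vla starts here */
    total += sizeof vla / sizeof vla[0]; /* element count, computed at run time */
    n++;
    goto again;                          /* allowed: jumps out of the scope of vla */
}
```

Each time the declaration is reached again, a new array with the current length n is created.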

Regarding the lifetime of the object i when it is accessed after the goto statement, C says the following (the first sentence is the relevant one here; the other sentences would matter for a trickier program):

(C11, 6.2.4p6) "For such an object that does not have a variable length array type, its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. [...] If the block is entered recursively, a new instance of the object is created each time. [...] If an initialization is specified for the object, it is performed each time the declaration or compound literal is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached."

That means i is still alive when *p is read; no object is accessed outside its lifetime.
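To check this without relying on printed output, here is a sketch of the same control flow wrapped in a function that returns the value it reads (the name read_after_backward_goto is hypothetical, not part of the question):

```c
#include <stddef.h>

/* Hypothetical helper mirroring the question's control flow,
   returning the value instead of printing it. */
int read_after_backward_goto(void) {
    int *p = NULL;
label1:
    if (p) {
        /* The name i is not in scope here, but the object it names is
           still within its lifetime, so reading it through p is fine. */
        return *p;
    }
    int i = 999;
    p = &i;
    goto label1;
}
```

The block of the function body has not been left when the goto executes, so the read through p yields 999.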



Answer 2:

I'll try to answer the question you may have been trying to ask.

Your program's behavior is well defined. (The return -1; is questionable: only 0, EXIT_SUCCESS and EXIT_FAILURE have a portable meaning as values returned from main, and any other value yields an implementation-defined termination status. But that's not what you're asking about.)

This program:

#include <stdio.h>
int main(void) {
    goto LABEL;
    int *p = 0;
    LABEL:
    if (p) {
        printf("%d\n", *p);
    }
}

does have undefined behavior. The goto transfers control to a point within the scope of p, but bypasses its initialization, so p has an indeterminate value when the if (p) test is executed.
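For contrast, one way to keep a forward jump like this well defined is to give the pointer its value with an ordinary assignment placed after the label, so there is no initializer for the goto to bypass. This is only a sketch, and the function name skip_with_assignment is made up for illustration:

```c
#include <stdio.h>

/* Hypothetical variant: returns 1 if it printed something, 0 otherwise. */
int skip_with_assignment(void) {
    goto LABEL;
    int *p;       /* jumped over, but there is no initializer to lose;
                     the object exists from block entry anyway */
LABEL:
    p = NULL;     /* an ordinary statement after the label, so it executes */
    if (p) {
        printf("%d\n", *p);
        return 1;
    }
    return 0;
}
```

Here p always holds a null pointer when it is tested, so the body of the if is never executed.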

In your program, the value of p is well defined at all times. The declaration, which is reached before the goto, sets p to 0 (a null pointer). The if (p) test is false, so the body of the if statement is not executed the first time. The goto is executed after p has been given a well defined non-null value. After the goto, the if (p) test is true, and the printf call is executed.

In your program, the lifetime of both p and i begins when the opening { of main is reached, and ends when the closing } is reached or a return statement is executed. The scope of each (i.e., the region of program text in which its name is visible) extends from its declaration to the closing }. When the goto transfers control backwards, the variable name i is out of scope, but the int object to which that name refers still exists. The name p is in scope (because it was declared earlier) and the pointer object still points to the same int object (whose name would be i if that name were visible).

Remember that scope refers to a region of program text in which a name is visible, and lifetime refers to a span of time during program execution during which an object is guaranteed to exist.
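A function call shows the same distinction from another angle. In this sketch (both function names are hypothetical), the name counter is not visible inside bump, yet the object it names is alive for the whole call and can be modified through the pointer:

```c
/* Hypothetical callee: the caller's name `counter` is out of scope here,
   but the object is still alive and reachable through p. */
static void bump(int *p) {
    *p += 1;
}

/* Hypothetical caller: its block is suspended, not ended, during the call. */
int call_keeps_object_alive(void) {
    int counter = 41;
    bump(&counter);
    return counter;
}
```

Scope decides where the name counter can be written; lifetime decides when the object may be accessed.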

Normally, if an object's declaration has an initializer, that guarantees that it has a valid value whenever its name is visible (unless some invalid value is later assigned to it). This can be bypassed with a goto or switch (but not if they're used carefully).
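As an illustration of careful use with switch, here is a sketch (the function name pick and its values are invented for the example). A declaration placed before the first case label is never reached, so an initializer on it would not run. The sketch therefore omits the initializer and assigns in every branch before reading the variable:

```c
/* Hypothetical helper: maps a selector to a value through a switch. */
int pick(int selector) {
    int result;
    switch (selector) {
        int scaled;   /* never reached by control flow: an initializer here
                         would be bypassed, so none is written */
    case 1:
        scaled = 10;  /* assigned before any read */
        result = scaled + 1;
        break;
    default:
        scaled = 20;  /* assigned before any read */
        result = scaled + 2;
        break;
    }
    return result;
}
```

Writing int scaled = 10; at that point instead would leave scaled indeterminate in every case, which is the bypass described above.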



Answer 3:

This code does not have undefined behavior. We can find a nice example in the Rationale for International Standard - Programming Languages - C, whose section 6.2.4, Storage durations of objects, says:

[...]There is a simple rule of thumb: the variable declared is created with an unspecified value when the block is entered, but the initializer is evaluated and the value placed in the variable when the declaration is reached in the normal course of execution. Thus a jump forward past a declaration leaves it uninitialized, while a jump backwards will cause it to be initialized more than once. If the declaration does not initialize the variable, it sets it to an unspecified value even if this is not the first time the declaration has been reached.

The scope of a variable starts at its declaration. Therefore, although the variable exists as soon as the block is entered, it cannot be referred to by name until its declaration is reached.

and provides the following example:

int j = 42;
{
    int i = 0;
loop:
    printf("I = %4d, ", i);
    printf("J1 = %4d, ", ++j);
    int j = i;
    printf("J2 = %4d, ", ++j);
    int k;
    printf("K1 = %4d, ", k);
    k = i * 10;
    printf("K2 = %4d, ", k);
    if (i % 2 == 0) goto skip;
    int m = i * 5;
skip:
    printf("M = %4d\n", m);
    if (++i < 5) goto loop;
}

and the output is:

 I = 0, J1 = 43, J2 = 1, K1 = ????, K2 = 0, M = ????
 I = 1, J1 = 44, J2 = 2, K1 = ????, K2 = 10, M = 5
 I = 2, J1 = 45, J2 = 3, K1 = ????, K2 = 20, M = 5
 I = 3, J1 = 46, J2 = 4, K1 = ????, K2 = 30, M = 15
 I = 4, J1 = 47, J2 = 5, K1 = ????, K2 = 40, M = 15

and it says:

where “????” indicates an indeterminate value (and any use of an indeterminate value is undefined behavior).

This example is consistent with the draft C99 standard, section 6.2.4 (Storage durations of objects), paragraph 5, which says:

For such an object that does not have a variable length array type, its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) If the block is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate. If an initialization is specified for the object, it is performed each time the declaration is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached.
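The rule that an initialization is performed each time the declaration is reached can be observed without reading any indeterminate value. The following is a small sketch of my own, not taken from the Rationale, and the function name count_reinitializations is hypothetical:

```c
/* Hypothetical helper: counts how many passes see a freshly
   initialized local after a backward goto. */
int count_reinitializations(void) {
    int i = 0;
    int hits = 0;
loop:
    ;                  /* before C23 a label must be followed by a statement */
    int fresh = 0;     /* the initializer runs every time this line is reached */
    fresh++;
    if (fresh == 1) {  /* true on every pass, because fresh was reset to 0 */
        hits++;
    }
    if (++i < 5) {
        goto loop;     /* jumps backwards, before the declaration of fresh */
    }
    return hits;
}
```

All five passes find fresh equal to 1 after the increment, so the function returns 5. If the initializer ran only once, fresh would keep growing and only the first pass would count.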