g++ optimization breaks for loops

A few days ago, I encountered what I believe to be a bug in g++ 5.3 concerning the nesting of for loops at higher -OX optimization levels. (Been experiencing it specifically for -O2 and -O3). The issue is that if you have two nested for loops, that have some internal sum to keep track of total iterations, once this sum exceeds its maximum value it prevents the outer loop from terminating. The smallest code set that I have been able to replicate this with is:

int main(){
    int sum = 0;
    //                 Value of 100 million. (2047483648 less than int32 max.)
    int maxInner = 100000000;

    int maxOuter = 30;

    // 100million * 30 = 3 billion. (Larger than int32 max)

    for(int i = 0; i < maxOuter; ++i)
    {
        for(int j = 0; j < maxInner; ++j)
        {
            ++sum;
        }
        std::cout<<"i = "<<i<<" sum = "<<sum<<std::endl;
    }
}

When this is compiled using g++ -o run.me main.cpp it runs just as expected outputting:

i = 0 sum = 100000000
i = 1 sum = 200000000
i = 2 sum = 300000000
i = 3 sum = 400000000
i = 4 sum = 500000000
i = 5 sum = 600000000
i = 6 sum = 700000000
i = 7 sum = 800000000
i = 8 sum = 900000000
i = 9 sum = 1000000000
i = 10 sum = 1100000000
i = 11 sum = 1200000000
i = 12 sum = 1300000000
i = 13 sum = 1400000000
i = 14 sum = 1500000000
i = 15 sum = 1600000000
i = 16 sum = 1700000000
i = 17 sum = 1800000000
i = 18 sum = 1900000000
i = 19 sum = 2000000000
i = 20 sum = 2100000000
i = 21 sum = -2094967296
i = 22 sum = -1994967296
i = 23 sum = -1894967296
i = 24 sum = -1794967296
i = 25 sum = -1694967296
i = 26 sum = -1594967296
i = 27 sum = -1494967296
i = 28 sum = -1394967296
i = 29 sum = -1294967296

However, when this is compiled using g++ -O2 -o run.me main.cpp, the outer loop fails to terminate. (This only occurs when maxInner * maxOuter > 2^31) While sum continually overflows, it shouldn't in any way affect the other variables. I have also tested this on Ideone.com with the test case demonstrated here: https://ideone.com/5MI5Jb

My question is thus twofold.

How is it possible for the value of sum to in some way effect the system? No decisions are based upon its value, it is merely utilized for the purposes of a counter and the std::cout statement.
What could possibly be causing the dramatically different outcomes at different optimization levels?

Thank you greatly in advance for taking the time to read and consider my question.

Note: This question differs from existing questions such as: Why does integer overflow on x86 with GCC cause an infinite loop? because the issue with that problem was an overflow for the sentinal variable. However, both sentinal variables in this question i and j never exceed the value of 100m let alone 2^31.

标签： c++ for-loop optimization nested-loops

3条回答

三岁会撩人

2楼-- · 2020-03-14 23:28

Answers

As @hvd pointed out, the problem is in your invalid code, not in the compiler.

During your program execution, the sum value overflows int range. Since int is by default signed and overflow of signed values causes undefined behavior* in C, the compiler is free to do anything. As someone noted somewhere, dragons could be flying out of your nose. The result is just undefined.

The difference -O2 causes is in testing the end condition. When the compiler optimizes your loop, it realizes that it can optimize away the inner loop, making it

int sum = 0;
for(int i = 0; i < maxOuter; i++) {
    sum += maxInner;
    std::cout<<"i = "<<i<<" sum = "<<sum<<std::endl;
}

and it may go further, transforming it to

int i = 0;
for(int sum = 0; sum < (maxInner * maxOuter); sum += maxInner) {
    i++;
    std::cout<<"i = "<<i<<" sum = "<<sum<<std::endl;
}

To be honest, I don't really know what it does, the point is, it can do just this. Or anything else, remember the dragons, your program causes undefined behavior.

Suddenly, your sum variable is used in the loop end condition. Note that for defined behavior, these optimizations are perfectly valid. If your sum was unsigned (and your maxInner and maxOuter), the (maxInner * maxOuter) value (which would also be unsigned) would be reached after maxOuter loops, because unsigned operations are defined** to overflow as expected.

Now since we're in the signed domain, the compiler is for one free to assume, that at all times sum < (maxInner * maxOuter), just because the latter overflows, and therefore is not defined. So the optimizing compiler can end up with something like

int i = 0;
for(int sum = 0;/* nothing here evaluates to true */; sum += maxInner) {
    i++;
    std::cout<<"i = "<<i<<" sum = "<<sum<<std::endl;
}

which looks like observed behavior.

*: According to the C11 standard draft, section 6.5 Expressions:

If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.

**: According to the C11 standard draft, Annex H, H.2.2:

C’s unsigned integer types are ‘‘modulo’’ in the LIA−1 sense in that overflows or out-of-bounds results silently wrap.

I did some research on the topic. I compiled the code above with gcc and g++ (version 5.3.0 on Manjaro) and got some pretty interesting things of it.

Description

To successfully compile it with gcc (C compiler, that is), I have replaced

#include <iostream>
...
std::cout<<"i = "<<i<<" sum = "<<sum<<std::endl;

with

#include <stdio.h>
...
printf("i = %d sum = %d\n", i, sum);

and wrapped this replacement with #ifndef ORIG, so I could have both versions. Then I ran 8 compilations: {gcc,g++} x {-O2, ""} x {-DORIG=1,""}. This yields following results:

Results

gcc, -O2, -DORIG=1: Won't compile, missing <iostream>. Not surprising.
gcc, -O2, "": Produces compiler warning and behaves "normally". A look in the assembly shows that the inner loop is optimized out (j being incremented by 100000000) and the outer loop variable is compared with hardcoded value -1294967296. So, GCC can detect this and do some clever things while the program is working expectably. More importantly, warning is emitted to warn user about undefined behavior.
gcc, "", -DORIG=1: Won't compile, missing <iostream>. Not surprising.
gcc, "", "": Compiles without warning. No optimizations, program runs as expected.
g++, -O2, -DORIG=1: Compiles without warning, runs in endless loop. This is OP's original code running. C++ assembly is tough to follow for me. Addition of 100000000 is there though.
g++, -O2, "": Compiles with warning. It is enough to change how the output is printed to change compiler warning emiting. Runs "normally". By the assembly, AFAIK the inner loop gets optimized out. At least there is again comparison against -1294967296 and incrementation by 100000000.
g++, "", -DORIG=1: Compiles without warning. No optimization, runs "normally".
g++, "", "": dtto

The most interesting part for me was to find out the difference upon change of printing. Actually from all the combinations, only the one used by OP produces endless-loop program, the others fail to compile, do not optimize or optimize with warning and preserve sanity.

Code

Follows example build command and my full code

$ gcc -x c -Wall -Wextra -O2 -DORIG=1 -o gcc_opt_orig  main.cpp

main.cpp:

#ifdef ORIG
#include <iostream>
#else
#include <stdio.h>
#endif

int main(){
    int sum = 0;
    //                 Value of 100 million. (2047483648 less than int32 max.)
    int maxInner = 100000000;

    int maxOuter = 30;

    // 100million * 30 = 3 billion. (Larger than int32 max)

    for(int i = 0; i < maxOuter; ++i)
    {
        for(int j = 0; j < maxInner; ++j)
        {
            ++sum;
        }
#ifdef ORIG
        std::cout<<"i = "<<i<<" sum = "<<sum<<std::endl;
#else
        printf("i = %d sum = %d\n", i, sum);
#endif
    }
}

0人赞添加讨论(0) 举报

戒情不戒烟

3楼-- · 2020-03-14 23:33

This is an optimisation that's perfectly valid for correct code. Your code isn't correct.

What GCC sees is that the only way the loop exit condition i >= maxOuter could ever be reached is if you have signed integer overflow during earlier loop iterations in your calculation of sum. The compiler assumes there isn't signed integer overflow, because signed integer overflow isn't allowed in standard C. Therefore, i < maxOuter can be optimised to just true.

This is controlled by the -faggressive-loop-optimizations flag. You should be able to get the behaviour you expect by adding -fno-aggressive-loop-optimizations to your command line arguments. But better would be making sure your code is valid. Use unsigned integer types to get guaranteed valid wraparound behaviour.

0人赞添加讨论(0) 举报

神经病院院长

4楼-- · 2020-03-14 23:49

Your code invokes undefined behaviour, since the int sum overflows. You say "this shouldn't in any way affect the other variables". Wrong. Once you have undefined behaviour, all odds are off. Anything can happen.

gcc is (in)famous for optimisations that assume there is no undefined behaviour and do let's say interesting things if undefined behaviour happens.

Solution: Don't do it.

0人赞添加讨论(0) 举报