“this” captured by lambda is incorrect. GCC compil

Posted 2020-02-27 05:46

Question:

For the last few days, I have been debugging a weird issue involving lambdas in C++. I have reduced the problem down to the following symptoms:

  • The this pointer gets corrupted inside a lambda (note: this is always captured by copy, so the lambda should get its own this pointer, which points to the App object)
  • It only occurs if a std::cout print statement is present and executed before the lambda is created. The print statement can be seemingly unrelated (e.g. printing "Hello!"). printf() exhibits the same behaviour.
  • It only occurs when cross-compiling.
  • It compiles and runs fine with the standard compiler for x86 architecture (see example).
  • If I create the lambda on the heap (and save a pointer to it inside the App object), the bug does not occur.
  • The bug does not occur if optimizations are turned off (i.e. if I set the -O0 flag). It occurs when optimization is set to -O2.

The following is the simplest, compilable code example I could come up with that causes the problem.

#include <iostream>
#include <functional>

class App {

public:

    std::function<void*()> test_;

    void Run() {

        // Enable this line, ERROR is printed
        // Disable this line, app runs o.k.
        std::cout << "This print statement causes the bug below!" << std::endl;

        test_ = [this] () {
            return this;
        };

        void* returnedThis = test_();
        if(returnedThis != this) {
            std::cout << "ERROR: 'this' returned from lambda (" << returnedThis 
                      << ") is NOT the same as 'this' (" << this << ") !?!?!?!?!"
                      << std::endl;
        } else {
            std::cout << "Program run successfully." << std::endl;
        }

    }
};

int main(void) {
    App app;
    app.Run();
}

When running on the target device, I get the following output:

This print statement causes the bug below!
ERROR: 'this' returned from lambda (0xbec92dd4) is NOT the same as 'this' (0xbec92c68) !?!?!?!?!

If I try to dereference the corrupted this, I usually get a segmentation fault, which is how I discovered the bug in the first place.
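
For comparison, here is a minimal sketch of the heap-allocated variant mentioned in the list of symptoms, which reportedly does not trigger the bug. The original heap-based code is not shown in the question, so holding the std::function in a std::unique_ptr is an assumption, as is returning a bool from Run() instead of printing the result.

```cpp
#include <functional>
#include <iostream>
#include <memory>

class App {
public:
    // Assumption: the callable lives on the heap and App keeps an owning
    // pointer to it. The question does not show the actual storage type.
    std::unique_ptr<std::function<void*()>> test_;

    // Returns true when the lambda hands back the same 'this' as the caller.
    bool Run() {
        // Same kind of unrelated print statement as in the failing example.
        std::cout << "Print statement before the lambda is created" << std::endl;

        // The std::function object is constructed on the heap.
        test_ = std::make_unique<std::function<void*()>>(
            [this]() -> void* { return this; });

        void* returnedThis = (*test_)();
        return returnedThis == this;
    }
};
```

Calling Run() on an App instance should return true on a correctly working compiler. This only illustrates the shape of the variant and says nothing about why it avoids the problem.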

Compiler Settings

arm-poky-linux-gnueabi-g++ -march=armv7-a -marm -mfpu=neon -std=c++14 \
-mfloat-abi=hard -mcpu=cortex-a9 \
--sysroot=/home/ghunter/sysroots/cortexa9hf-neon-poky-linux-gnueabi \
-O2 -pipe -g -feliminate-unused-debug-types

Linker Settings

arm-poky-linux-gnueabi-ld \
--sysroot=/home/ghunter/sysroots/cortexa9hf-neon-poky-linux-gnueabi \
-Wl,-O1 -Wl,--hash-style=gnu -Wl,--as-needed

Compiler Version

~$ arm-poky-linux-gnueabi-g++ --version

arm-poky-linux-gnueabi-g++ (GCC) 6.2.0
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Could this be a compiler bug?

Answer 1:

This seems to be a compiler bug in GCC 6.2, see:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77686

Workarounds:

  • Use the -fno-schedule-insns2 flag (as pointed out by gbmhunter in the comments).
  • Do not use -O2 optimizations or higher.
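
If changing the flags for the whole build is not an option, GCC's optimize function attribute may allow the first workaround to be limited to the affected function. The sketch below is an assumption and has not been verified against the affected toolchain. The macro name NO_SCHED_INSNS2 is hypothetical, and the GCC documentation recommends the optimize attribute for debugging rather than production use.

```cpp
#include <functional>
#include <iostream>

// Hypothetical helper macro: on GCC it applies -fno-schedule-insns2 to a
// single function; on other compilers it expands to nothing.
#if defined(__GNUC__) && !defined(__clang__)
#define NO_SCHED_INSNS2 __attribute__((optimize("no-schedule-insns2")))
#else
#define NO_SCHED_INSNS2
#endif

class App {
public:
    std::function<void*()> test_;

    // Only this member function is compiled without the second
    // instruction-scheduling pass; the rest of the build keeps its flags.
    NO_SCHED_INSNS2 bool Run() {
        std::cout << "Print statement before the lambda is created" << std::endl;

        test_ = [this]() -> void* { return this; };

        // True when the lambda returns the same 'this' as the caller sees.
        return test_() == this;
    }
};
```

Passing -fno-schedule-insns2 on the command line remains the workaround actually reported to work. The attribute form is only a narrower variation of the same idea.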


Answer 2:

Sounds like the following compiler bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77933 (which only affects code generated with -O1 optimizations or higher).