Running C++ code outside of function scope

Posted 2019-03-24 11:48

Question:

(I know that) in C++ I can declare variables outside of any function, but I can't run any code or statements there, except to initialize global/static variables.


IDEA

Is it a good idea to use the tricky code below in order to (for example) do some std::map manipulation?

Here I declare void *fakeVar and initialize it through Fake::initializer(), where I can do whatever I want!

#include <cstddef>   // NULL
#include <iostream>
#include <map>
#include <string>

std::map<std::string, int> myMap;

class Fake
{
public:
    static void* initializer()
    {
        myMap["test"]=222;
        // Do whatever with your global Variables

        return NULL;
    }
};

// myMap["Error"] = 111;                  => Error
// Fake::initializer();                   => Error
void *fakeVar = Fake::initializer();    //=> OK

int main()
{
    std::cout<<"Map size: " << myMap.size() << std::endl; // Show myMap has initialized correctly :)
}

Answer 1:

One way of solving it is to have a class with a constructor that does things, then declare a dummy variable of that class. For example:

struct Initializer
{
    Initializer()
    {
        // Do pre-main initialization here
    }
};

Initializer initializer;

You can of course have multiple such classes doing miscellaneous initialization. The order in each translation unit is specified to be top-down, but the order between translation units is not specified.
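The top-down ordering within one translation unit can be sketched as follows. This is only an illustration: initLog, FirstInitializer and SecondInitializer are hypothetical names, and the log is kept in a function-local static so that it exists no matter which initializer touches it first.

```cpp
#include <string>
#include <vector>

// Hypothetical log, used only to make the initialization order observable.
// A function-local static is constructed on first use, so it is always
// ready before any initializer writes to it.
std::vector<std::string>& initLog()
{
    static std::vector<std::string> log;
    return log;
}

struct FirstInitializer
{
    FirstInitializer() { initLog().push_back("first"); }
};

struct SecondInitializer
{
    SecondInitializer() { initLog().push_back("second"); }
};

// Both objects are defined in the same translation unit, so they are
// constructed in the order of their definitions, before main starts.
FirstInitializer firstInitializer;
SecondInitializer secondInitializer;
```

If the two definitions lived in different source files, their relative order would no longer be guaranteed, which is the pitfall mentioned above.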



Answer 2:

You don't need a fake class; you can initialize the map with a lambda:

auto myMap = []{
    std::map<std::string, int> m;
    m["test"] = 222;
    return m;
}();

Or, if it's just plain data, initialize the map:

std::map<std::string, int> myMap { { "test", 222 } };


Answer 3:

Is it a good idea to use the tricky code below in order to (for example) do some std::map manipulation?

No.

Any solution entailing mutable non-local variables is a terrible idea.



Answer 4:

Is it a good idea...?

Not really. What if someone decides to use your map in their own "tricky initialisation", but on some system or other, or for no obvious reason after a particular relink, your map ends up being initialised after their attempted use? If you instead have them call a static function that returns a reference to the map, the function can initialise the map on its first call. Make the map a static local variable inside that function and you prevent any accidental use that bypasses this protection.
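A minimal sketch of that approach, assuming an accessor named getMyMap (the name is illustrative and not from the question):

```cpp
#include <map>
#include <string>

// Hypothetical accessor: the only way to reach the map is through this function.
std::map<std::string, int>& getMyMap()
{
    // The static local is initialized the first time control reaches this
    // declaration (thread-safe since C++11), so no caller can ever observe
    // the map before it has been populated.
    static std::map<std::string, int> myMap = [] {
        std::map<std::string, int> m;
        m["test"] = 222;
        return m;
    }();
    return myMap;
}
```

Other globals can then call getMyMap() from their own initializers, in any translation unit, without caring about link order.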



Answer 5:

§ 8.5, paragraph 2 ([dcl.init]) of the C++11 standard states:

Except for objects declared with the constexpr specifier, for which see 7.1.5, an initializer in the definition of a variable can consist of arbitrary expressions involving literals and previously declared variables and functions, regardless of the variable’s storage duration

Therefore, what you're doing is perfectly allowed by the C++ standard. That said, if you need to perform "initialization operations", it might be better to just use a class constructor (e.g. a wrapper).



Answer 6:

What you've done is perfectly legal C++. So, if it works for you and is maintainable and understandable by anybody else who works with the code, it's fine. Joachim Pileborg's sample is clearer to me though.

One problem with initializing global variables like this can occur if they use each other during initialization. In that case it can be tricky to ensure that the variables are initialized in the correct order. For that reason, I prefer to create InitializeX, InitializeY, etc. functions, and explicitly call them in the correct order from the main function.

Wrong ordering can also cause problems during program exit, when globals may still try to use each other after some of them have been destroyed. Again, some explicit destruction calls in the correct order before main returns can make it clearer.
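The explicit-ordering idea could look roughly like the sketch below. All names here (settings, derived, InitializeSettings, InitializeDerived, InitializeAll, ShutdownAll) are hypothetical stand-ins for the InitializeX / InitializeY functions mentioned above.

```cpp
#include <map>
#include <string>

std::map<std::string, int> settings; // hypothetical global "X"
std::map<std::string, int> derived;  // hypothetical global "Y", depends on X

void InitializeSettings()
{
    settings["test"] = 222;
}

void InitializeDerived()
{
    // Reads settings, so InitializeSettings() must already have run.
    derived["double_test"] = settings.at("test") * 2;
}

// Intended to be the first call in main: the dependency order is explicit.
void InitializeAll()
{
    InitializeSettings();
    InitializeDerived();
}

// Intended to be called just before main returns: tear down in reverse order.
void ShutdownAll()
{
    derived.clear();
    settings.clear();
}
```

Both globals still exist before main runs, but they start out empty, so nothing depends on the order in which the language happens to construct them.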

So, go for it if it works for you, but be aware of the pitfalls. The same advice applies to pretty much every feature in C++!

You said in your question that you yourself think the code is 'tricky'. There is no need to overcomplicate things for the sake of it. So, if you have an alternative that appears less 'tricky' to you... that might be better.



Answer 7:

When I hear "tricky code", I immediately think of code smells and maintenance nightmares. To answer your question: no, it isn't a good idea. While it is valid C++ code, it is bad practice. There are other, much more explicit and meaningful alternatives. To elaborate, the fact that your initializer() method returns a void* NULL is meaningless as far as the intention of your program goes (each line of your code should have a meaningful purpose), and you now have yet another unnecessary global variable, fakeVar, which needlessly points to NULL.

Let's consider some less "tricky" alternatives:

  1. If it's extremely important that you only ever have one global instance of myMap, perhaps using the Singleton Pattern would be more fitting, and you would be able to lazily initialize the contents of myMap when they are needed. Keep in mind that the Singleton Pattern has issues of its own.

  2. Have a static function create and return the map, or put the map in a dedicated namespace for globals. For example, something along the lines of this:

    // global.h
    #include <map>
    #include <string>

    namespace Global
    {
        extern std::map<std::string, int> myMap;
    }
    
    // global.cpp
    #include "global.h"

    namespace Global
    {
        // Builds the fully populated map in one place
        std::map<std::string, int> initMap()
        {
            std::map<std::string, int> map;
            map["test"] = 222;
            return map;
        }
    
        std::map<std::string, int> myMap = initMap();
    }
    
    // main.cpp
    #include <iostream>
    #include "global.h"
    
    int main()
    {
       std::cout << Global::myMap.size() << std::endl;
       return 0;
    }
    
  3. If this is a map with specialized functionality, create your own class (best option)! While this isn't a complete example, you get the idea:

    #include <cstddef>
    #include <iostream>
    #include <map>
    #include <string>

    class MyMap
    {
    private:
        std::map<std::string, int> map;
    
    public:
    
        // The constructor performs the initialization that the "fake" variable did
        MyMap()
        {
            map["test"] = 222;
        }
    
        void put(const std::string& key, int value)
        {
            map[key] = value;
        }
    
        std::size_t size() const
        {
            return map.size();
        }
    
        // Overload operator[] and create any other methods you need
        // ...
    };
    
    MyMap myMap;
    
    int main()
    {
       std::cout << myMap.size() << std::endl;
       return 0;
    }
    


Answer 8:

In C++, you cannot have statements outside of a function. However, you can declare global objects, and the constructors (initializers) of these global objects are called automatically before main starts. In your example, fakeVar is a global pointer that gets initialized through a static member function of a class, which is absolutely fine.
Even a global object would do, provided that its constructor performs the desired initialization. For example:

class Fake
{
public:
    Fake()
    {
        myMap["test"]=222;
        // Do whatever with your global Variables
    }
};
Fake fake; 


Answer 9:

This is a case where unity builds (single translation unit builds) can be very powerful. The __COUNTER__ macro is a de facto standard among C and C++ compilers, and with it you can write arbitrary imperative code at global scope:

// At the beginning of the file...
#include <cstdint>
#include <iostream>
#include <map>
#include <string>

// This default case skips "gaps" in the specializations, in case __COUNTER__ is used for some other purpose.
template <std::uint64_t N> void global_function() { global_function<N - 1>(); }
template <> void global_function<__COUNTER__>() {} // This is the base case.

void run_global_functions();

#define global_n(N, ...) \
template <> void global_function<N>() { \
    global_function<N - 1>(); /* Recurse and call the previous specialization */ \
    __VA_ARGS__; /* Run the user code. */ \
}
#define global(...) global_n(__COUNTER__, __VA_ARGS__)

// ...

std::map<std::string, int> myMap;

global({
    myMap["test"]=222;
    // Do whatever with your global variables
})
global(myMap["Error"] = 111);

int main() {
    run_global_functions();
    std::cout << "Map size: " << myMap.size() << std::endl; // Show myMap has initialized correctly :)
}

global(std::cout << "This will be the last global code run before main!");


// ...At the end of the file

void run_global_functions() {
    global_function<__COUNTER__ - 1>();
}

This is especially powerful once you realize that you can use it to initialize static variables without a dependency on the C runtime. This means you can generate very small executables without having to eschew non-zero global variables:

// At the beginning of the file...
#include <new> // placement new
extern bool has_static_init;
#define default_construct(x) x{}; global(if (!has_static_init) new (&x) decltype(x){})
// Or if you don't want placement new:
// #define default_construct(x) x{}; global(if (!has_static_init) x = decltype(x){})

class Complicated {
public:
    int x = 42;
    Complicated() { std::cout << "Constructor!"; }
};
Complicated default_construct(my_complicated_instance); // Will be zero-initialized if the CRT is not linked into the program.

int main() {
    run_global_functions();
}

// ...At the end of the file
static bool get_static_init() {
    volatile bool result = true; // This function can't be inlined, so the CRT *must* run it.
    return result;
}
bool has_static_init = get_static_init(); // Will stay zero without CRT


Answer 10:

This answer is similar to Some programmer dude's answer, but may be considered a bit cleaner. As of C++17 (that's when std::invoke() was added), you could do something like this:

#include <functional>

auto initializer = std::invoke([]() {
    // Do initialization here...

    // The following return statement is arbitrary. Without something like it,
    // the auto will resolve to void, which will not compile:
    return true;
});