Why do we need extern "C" { #include <foo.h> } in C++?

Posted 2019-01-02 14:14

Question:

Why do we need to use:

extern "C" {
#include <foo.h>
}

Specifically:

  • When should we use it?

  • What is happening at the compiler/linker level that requires us to use it?

  • How in terms of compilation/linking does this solve the problems which require us to use it?

Answer 1:

C and C++ are superficially similar, but each compiles to quite different object code. When you include a header file with a C++ compiler, the compiler assumes the header declares C++ code. If it is actually a C header, the library behind it was compiled to the C ABI (Application Binary Interface), while the compiler generates calls that follow the C++ ABI, so the two do not match and the linker fails. That failure is preferable to silently passing C++ data to a function expecting C data.

(To get into the nitty-gritty: a C++ ABI generally "mangles" the names of functions and methods. If you call printf() without flagging the prototype as a C function, the C++ compiler will generate code that calls a mangled symbol instead, for example _Z6printfPKcz with GCC, where the characters after the name encode the parameter types.)

So: use extern "C" { ... } when including a C header that does not already handle C++. Otherwise the compiled code will not match and the linker will fail. For most headers, however, you won't need it, because most system C headers already account for being included from C++ code and wrap their own declarations in extern "C".
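As a minimal single-file sketch of this pattern: the names legacy_add, sum_three and legacy.h below are hypothetical, and the "C library" is simulated in the same file so the example stands alone. In a real project the definition would sit in a .c file built by a C compiler, and the declaration would come from the included header.

```cpp
#include <cassert>

// What a C header (the hypothetical "legacy.h") would declare.
// Inside extern "C" the declaration gets C linkage, so the compiler
// refers to the unmangled symbol "legacy_add".
extern "C" {
    int legacy_add(int a, int b);
}

// Stand-in for the C library itself. In a real build this definition
// would live in legacy.c and be compiled as C; it is given C linkage
// here only to keep the sketch self-contained.
extern "C" int legacy_add(int a, int b) {
    return a + b;
}

// Ordinary C++ code calling the C function.
int sum_three(int a, int b, int c) {
    return legacy_add(legacy_add(a, b), c);
}

// Small self-check that can be called from main().
void check_legacy_add() {
    assert(sum_three(1, 2, 3) == 6);
}
```

If the extern "C" block around the declaration were dropped while the library stayed compiled as C, the call in sum_three would refer to a mangled name and the link step would report an undefined symbol.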



Answer 2:

extern "C" determines how symbols in the generated object file should be named. If a function is declared without extern "C", the symbol name in the object file will use C++ name mangling. Here's an example.

Given test.C like so:

void foo() { }

Compiling and listing symbols in the object file gives:

$ g++ -c test.C
$ nm test.o
0000000000000000 T _Z3foov
                 U __gxx_personality_v0

The foo function is actually called "_Z3foov". This string encodes the length of the name, the name itself, and the parameter types (the trailing "v" means the function takes no parameters). If you instead write test.C like this:

extern "C" {
    void foo() { }
}

Then compile and look at symbols:

$ g++ -c test.C
$ nm test.o
                 U __gxx_personality_v0
0000000000000000 T foo

You get C linkage. The name of the "foo" function in the object file is just "foo", and it doesn't have all the fancy type info that comes from name mangling.
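To see what a mangled name stands for without running nm, the symbol can be decoded in code. The following is a sketch that assumes GCC or Clang, since it relies on abi::__cxa_demangle from <cxxabi.h>, which belongs to the Itanium C++ ABI support library and is not available with MSVC. The helper names demangle and check_demangle are made up for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <cxxabi.h>
#include <string>

// Decodes an Itanium-ABI mangled symbol into readable form.
// Symbols that do not start with "_Z" (plain C names such as "foo")
// are returned unchanged, as are symbols that fail to decode.
std::string demangle(const char* symbol) {
    if (std::strncmp(symbol, "_Z", 2) != 0) {
        return symbol;
    }
    int status = 0;
    char* readable = abi::__cxa_demangle(symbol, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return symbol;
    }
    std::string result(readable);
    std::free(readable);  // the buffer was allocated with malloc
    return result;
}

// Small self-check that can be called from main().
void check_demangle() {
    assert(demangle("_Z3foov") == "foo()");  // the C++ symbol from nm
    assert(demangle("foo") == "foo");        // the C-linkage symbol
}
```

The c++filt command-line tool does the same job for symbols copied out of nm output.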

You generally include a header within extern "C" {} if the code that goes with it was compiled with a C compiler but you're trying to call it from C++. When you do this, you're telling the compiler that all the declarations in the header will use C linkage. When you link your code, your .o files will contain references to "foo", not "_Z3fooblah", which hopefully matches whatever is in the library you're linking against.

Most modern libraries will put guards around such headers so that symbols are declared with the right linkage. For example, in a lot of the standard headers you'll find:

#ifdef __cplusplus
extern "C" {
#endif

... declarations ...

#ifdef __cplusplus
}
#endif

This makes sure that when C++ code includes the header, the symbols in your object file match what's in the C library. You should only have to put extern "C" {} around your C header if it's old and doesn't have these guards already.



Answer 3:

In C++, you can have different entities that share a name. For example here is a list of functions all named foo:

  • A::foo()
  • B::foo()
  • C::foo(int)
  • C::foo(std::string)

In order to differentiate between them all, the C++ compiler will create unique names for each in a process called name-mangling or decorating. C compilers do not do this. Furthermore, each C++ compiler may do this in a different way.

extern "C" tells the C++ compiler not to perform any name-mangling on the code within the braces. This allows you to call C functions from within C++.
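The list above can be written out as a short sketch. The return values and the c_style_foo function are invented for illustration; the point is only that the four foo functions coexist because each gets its own mangled symbol, while the function with C linkage keeps its plain name.

```cpp
#include <cassert>
#include <string>

// Two classes may each have a member named foo, and one class may
// overload foo on its parameter types.
struct A { int foo() { return 1; } };
struct B { int foo() { return 2; } };
struct C {
    int foo(int) { return 3; }
    int foo(std::string) { return 4; }
};

// With C linkage the symbol is just the function name, so a program
// can contain at most one C-linkage function called c_style_foo.
extern "C" int c_style_foo(int value) { return value + 100; }

// Small self-check that can be called from main().
void check_overloads() {
    assert(A().foo() == 1);
    assert(B().foo() == 2);
    assert(C().foo(7) == 3);
    assert(C().foo(std::string("seven")) == 4);
    assert(c_style_foo(1) == 101);
}
```

Adding a second overload of c_style_foo with C linkage, for example one taking a double, would be rejected by the compiler, because both would need the same unmangled symbol.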



Answer 4:

It has to do with the way the different compilers name symbols. A C++ compiler will mangle the name of a symbol declared in the header file, whereas a C compiler leaves it essentially unchanged, so when you try to link, you get a linker error saying there are missing symbols.

To resolve this, we tell the C++ compiler to give those declarations C linkage, so it refers to them by the same unmangled names the C compiler produced. With the names matching, the linker errors go away.



Answer 5:

When should we use it?

When you are linking C libraries into C++ object files.

What is happening at the compiler/linker level that requires us to use it?

C and C++ use different schemes for symbol naming. extern "C" tells the C++ compiler to use C's scheme for the declarations from the given library, so the linker sees matching names.

How in terms of compilation/linking does this solve the problems which require us to use it?

Using the C naming scheme allows you to reference C-style symbols. Otherwise the compiler would emit references to C++-style symbols, which the linker would not find in the C library.



Answer 6:

C and C++ have different rules about names of symbols. Symbols are how the linker knows that the call to function "openBankAccount" in one object file produced by the compiler is a reference to that function you called "openBankAccount" in another object file produced from a different source file by the same (or compatible) compiler. This allows you to make a program out of more than one source file, which is a relief when working on a large project.

In C the rule is very simple: symbols are all in a single namespace anyway. So the integer "socks" is stored as "socks" and the function count_socks is stored as "count_socks".

Linkers were built for C and other languages like C with this simple symbol naming rule. So symbols in the linker are just simple strings.

But C++ lets you have namespaces, function overloading, and various other things that conflict with such a simple rule. All six of your overloaded functions called "add" need to have different symbols, or the wrong one will be used by other object files. This is done by "mangling" (that's a technical term) the names of symbols.

When linking C++ code to C libraries or code, you need to wrap anything written in C, such as the header files for the C libraries, in extern "C". This tells your C++ compiler that these symbol names aren't to be mangled, while the rest of your C++ code of course must be mangled or it won't work.



Answer 7:

You should use extern "C" any time a C++ file includes a header declaring functions that reside in a file compiled by a C compiler. (Many standard C libraries include this check in their headers to make it simpler for the developer.)

For example, if you have a project with 3 files, util.c, util.h, and main.cpp, and both the .c and .cpp files are compiled with the C++ compiler (g++, for example), then it isn't really needed and may even cause linker errors. If your build process uses a regular C compiler for util.c, then you will need to use extern "C" when including util.h.

What is happening is that C++ encodes the parameters of the function in its name. This is how function overloading works. All that tends to happen to a C function is the addition of an underscore ("_") to the beginning of the name, and on many platforms not even that. Without extern "C" the linker will be looking for a function with a decorated name along the lines of DoSomething@@int@float() when the function's actual name is _DoSomething() or just DoSomething().

Using extern "C" solves the above problem by telling the C++ compiler to refer to the function by its C-style name instead of the C++ one.



Answer 8:

The extern "C" {} construct instructs the compiler not to perform mangling on names declared within the braces. Normally, the C++ compiler "enhances" function names so that they encode type information about arguments and the return value; this is called the mangled name. The extern "C" construct prevents the mangling.

It is typically used when C++ code needs to call a C-language library. It may also be used when exposing a C++ function (from a DLL, for example) to C clients.
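The second use, exposing C++ code to C clients, could look roughly like this. The function names count_words and count_words_impl are hypothetical, and any DLL export annotations are left out. The sketch keeps the exported signature limited to types C understands and stops exceptions from crossing the C boundary.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// C++ implementation detail: free to use std::string and streams.
static int count_words_impl(const std::string& text) {
    std::istringstream stream(text);
    std::string word;
    int count = 0;
    while (stream >> word) {
        ++count;
    }
    return count;
}

// C-callable entry point with an unmangled symbol name.
// Returns -1 on failure, because a C caller cannot catch exceptions.
extern "C" int count_words(const char* text) {
    if (text == nullptr) {
        return 0;
    }
    try {
        return count_words_impl(text);
    } catch (...) {
        return -1;
    }
}

// Small self-check that can be called from main().
void check_count_words() {
    assert(count_words("extern C in practice") == 4);
    assert(count_words("") == 0);
}
```

A C client would declare the function as int count_words(const char *text); in its own header and link against the object file or library that contains this definition.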



Answer 9:

The C++ compiler creates symbol names differently than the C compiler. So, if you are trying to call a function that resides in a C file, compiled as C code, you need to tell the C++ compiler that the symbol names it is trying to resolve do not follow its default naming scheme; otherwise the link step will fail.



Answer 10:

This is used to resolve name mangling issues. extern "C" means that the functions are in a "flat" C-style API.


