C/C++ header and implementation files: How do they

Posted 2019-01-08 09:05

This is probably a stupid question, but I've searched for quite a while here and on the web and couldn't find a clear answer.

So I'm new to programming... My question is: how does the main function know about function definitions (implementations) in a different file?

For example, say I have 3 files:

  • main.cpp
  • myfunction.cpp
  • myfunction.hpp

//main.cpp

#include "myfunction.hpp"
int main() {
  int A = myfunction( 12 );
  ...
}

-

//myfunction.cpp

#include "myfunction.hpp"
int myfunction( int x ) {
  return x * x;
}

-

//myfunction.hpp

int myfunction( int x );

-

I get how the preprocessor includes the header code, but how do the header and the main function even know that the function definition exists, much less use it?

I apologize if this isn't clear or if I'm vastly mistaken about something; I'm new here.

7 answers
放我归山
#2 · 2019-01-08 10:05

1. The principle

When you write:

int A = myfunction(12);

This is translated to:

int A = @call(myfunction, 12);

where @call can be seen as a dictionary look-up. In terms of the dictionary analogy, you can certainly know about a word (smorgasbord?) before knowing its definition. All you need is that, at runtime, the definition be in the dictionary.

2. A point on ABI

How does this @call work? Through the ABI (Application Binary Interface). The ABI specifies many things, among them how to perform a call to a given function (depending on its parameters). The call contract is simple: it says where each of the function's arguments can be found (some in the processor's registers, others on the stack).

Therefore, @call actually does:

@push 12, reg0
@invoke myfunction

And the function definition knows that its first argument (x) is located in reg0.

3. But I thought dictionaries were for dynamic languages?

And you are right, to an extent. Dynamic languages are typically implemented with a hash table for symbol lookup that is dynamically populated.

For C++, the compiler transforms a translation unit (roughly speaking, a preprocessed source file) into an object file (generally .o or .obj). Each object file contains a table of the symbols it references but whose definitions it does not contain:

.undefined
[0]: myfunction

Then the linker brings the objects together and reconciles the symbols. There are two kinds of symbols at this point:

  • those which are within the library, and can be referenced through an offset (the final address is still unknown)
  • those which are outside the library, and whose address is completely unknown until runtime.

Both can be treated in the same fashion.

.dynamic
[0]: myfunction at <undefined-address>

And then the code will reference the look-up entry:

@invoke .dynamic[0]

When the library is loaded (with dlopen on POSIX systems or LoadLibrary on Windows, for example), the runtime finally knows where the symbol is mapped in memory, and overwrites the <undefined-address> with the real address (for this run).
