LLVM translation unit

Posted 2019-04-11 18:05

Question:

I am trying to understand the high-level structure of an LLVM program. I read in a book that "programs are composed of modules, each of which corresponds to a translation unit". Can someone explain this in more detail, and what is the difference between modules and translation units (if any)? I am also interested to know which part of the code is called when translation unit starts and completes debugging information encoding?

Answer 1:

"Translation unit" is a term from the language standard. For example, this is from the C standard (C99 ISO draft):

5.1 Conceptual models; 5.1.1 Translation environment; 5.1.1.1 Program structure

A C program need not all be translated at the same time. The text of the program is kept in units called source files, (or preprocessing files) in this International Standard. A source file together with all the headers and source files included via the preprocessing directive #include is known as a preprocessing translation unit. After preprocessing, a preprocessing translation unit is called a translation unit.

So, a translation unit is a single source file (file.c) after preprocessing: all #included *.h files have been inserted, all macros expanded, all comments removed, and the file is ready for tokenizing.

A translation unit is the unit of compilation, because it does not depend on any external resource until the linking step. All headers are already part of the TU.
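To make the preprocessing step described above concrete, here is a minimal C++ sketch. The macro and function names (BUFFER_SIZE, SQUARE, squaredBufferSize, expansionText) are made up for illustration and do not come from the question or from LLVM. By the time the compiler proper sees the translation unit, the macro names have already been replaced by their expansions.

```cpp
#include <string>

// Hypothetical macros, used only to illustrate preprocessing.
#define BUFFER_SIZE 16
#define SQUARE(x) ((x) * (x))

// Two-level stringification: the argument is macro-expanded first,
// then turned into a string literal.
#define STRINGIFY(x) #x
#define EXPANDED(x) STRINGIFY(x)

// In the translation unit the body no longer mentions SQUARE or
// BUFFER_SIZE; the compiler sees a plain arithmetic expression.
int squaredBufferSize() { return SQUARE(BUFFER_SIZE); }

// Returns the text the preprocessor substituted for SQUARE(BUFFER_SIZE),
// with 16 in place of BUFFER_SIZE.
std::string expansionText() { return EXPANDED(SQUARE(BUFFER_SIZE)); }
```

To see the whole translation unit for a real file, both gcc and clang accept the -E option, which stops after preprocessing and prints the result.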

The term "module" is not defined in the C standard, but AFAIK it refers to a translation unit at later translation phases.

LLVM describes it as: http://llvm.org/docs/ProgrammersManual.html

The Module class represents the top level structure present in LLVM programs. An LLVM module is effectively either a translation unit of the original program or a combination of several translation units merged by the linker.

The Module class keeps track of a list of Functions, a list of GlobalVariables, and a SymbolTable. Additionally, it contains a few helpful member functions that try to make common operations easy.
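The relationship described in the quoted text can be sketched with a toy model. This is not the real llvm::Module API: ToyModule, addFunction, addGlobal and linkModules are hypothetical names invented here, and real linking also resolves declarations, linkage types and much more. The sketch only shows the idea that a module holds functions, globals and a symbol table, and that a linker can merge several per-translation-unit modules into one.

```cpp
#include <initializer_list>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an LLVM-style module (hypothetical, NOT llvm::Module).
struct ToyModule {
    std::string name;                            // typically the source file name
    std::vector<std::string> functions;          // names of defined functions
    std::vector<std::string> globals;            // names of defined global variables
    std::map<std::string, std::string> symbols;  // symbol name -> kind

    explicit ToyModule(std::string n) : name(std::move(n)) {}

    void addFunction(const std::string& f) {
        define(f, "function");
        functions.push_back(f);
    }

    void addGlobal(const std::string& g) {
        define(g, "global");
        globals.push_back(g);
    }

private:
    // Records a symbol; a second definition of the same name is an error.
    void define(const std::string& sym, const std::string& kind) {
        if (!symbols.emplace(sym, kind).second)
            throw std::runtime_error("duplicate symbol: " + sym);
    }
};

// Merges two modules into one, the way a linker combines translation units.
ToyModule linkModules(const ToyModule& a, const ToyModule& b) {
    ToyModule merged(a.name + "+" + b.name);
    for (const ToyModule* m : {&a, &b}) {
        for (const auto& f : m->functions) merged.addFunction(f);
        for (const auto& g : m->globals) merged.addGlobal(g);
    }
    return merged;
}
```

In this model, one ToyModule built from a.c and another built from b.c each correspond to a single translation unit, while the result of linkModules corresponds to "a combination of several translation units merged by the linker".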

About this part of your question:

I am also interested to know which part of the code is called when translation unit starts and completes debugging information encoding?

This depends on how LLVM is used. LLVM itself is a library and can be used in various ways.

For clang/LLVM (a C/C++ compiler built on libclang and LLVM), the translation unit is created after the preprocessing stage. It is parsed into an AST, then translated into LLVM assembly and stored in a Module.

For a tutorial example, here is how a Module is created: http://llvm.org/releases/2.6/docs/tutorial/JITTutorial1.html