Translation unit vs. compilation unit

Posted by JUAN425 on 2014-08-06

In short, the two terms mean the same thing. The definition of the term has stayed essentially unchanged between the C and C++ standards.

The early C standard defines it as follows:

C89 2.1.1.1

The text of the program is kept in units called source files in this Standard. A source file together with all the headers and source files included via the preprocessing directive #include, less (that is, excluding) any source lines skipped by any of the conditional inclusion preprocessing directives, is called a translation unit.

In the most recent C++ standard (at the time of writing), the definition is:

C++14 (CD N3690) 2.1 [lex.separate]/1

The text of the program is kept in units called source files in this International Standard. A source file together with all the headers and source files included via the preprocessing directive #include, less any source lines skipped by any of the conditional inclusion preprocessing directives, is called a translation unit.

The C standard does not use the term compilation unit; the term appears only in some C++ contexts, apparently to set C++ apart from C. It refers to the same thing as a translation unit.

The compiler's job is to collect a program's source file together with the header files pulled in by #include during preprocessing, drop the source lines excluded by conditional compilation directives, and assemble the result into a translation unit. It then compiles that translation unit into an object file (.o).

Note: an object file (.o file) contains object code, that is, relocatable-format machine code that is usually not directly executable. Object files are produced by the compiler and serve as input to the linker, which combines parts of the object files into an executable file or a library.
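As a rough illustration of what ends up in a translation unit, consider the sketch below. The macro USE_FAST_PATH and the functions path_name and check_selected_branch are hypothetical names chosen for this example. Only the branch selected by the conditional directive survives preprocessing, so the compiler proper sees exactly one definition of path_name.

```cpp
#include <cassert>
#include <string>

// Hypothetical configuration macro; in practice it might come from a
// header or from a -D option on the compiler command line.
#define USE_FAST_PATH 1

#if USE_FAST_PATH
// This branch is kept: it becomes part of the translation unit.
std::string path_name() { return "fast"; }
#else
// This branch is skipped by the preprocessor, so the compiler proper
// never sees a second definition of path_name().
std::string path_name() { return "slow"; }
#endif

// Sanity check that only the selected branch was compiled.
inline void check_selected_branch() { assert(path_name() == "fast"); }
```

Changing USE_FAST_PATH to 0 swaps which definition is part of the translation unit, without either definition ever conflicting with the other.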


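The following explanation walks through the steps a C++ program goes through, from preprocessing, to compilation, to linking into an executable file.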

When you invoke your C++ compiler,
the C preprocessor accepts a source file and emits a translation unit,
the C++ compiler proper accepts the translation unit
and emits assembler code,
the assembler accepts the assembler code and emits machine code and
the link editor (linker) accepts the machine code,
loads it into an executable program file
along with the required objects from library archives
and resolves all of the links.
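With GCC, each of these stages can be run on its own. The sketch below assumes the code is saved as hello.cpp and that a second object file, main.o, supplies main(); both file names, the GREETING macro, and the greeting functions are hypothetical. The commands in the leading comment use the g++ driver options that stop after preprocessing (-E), after compilation proper (-S), and after assembly (-c).

```cpp
// hello.cpp (hypothetical file name)
//
// Running the stages one at a time with the g++ driver:
//   g++ -E hello.cpp -o hello.ii     preprocess only: emits the translation unit
//   g++ -S hello.ii  -o hello.s      compile proper: emits assembler code
//   g++ -c hello.s   -o hello.o      assemble: emits relocatable machine code
//   g++ hello.o main.o -o hello      link: main.o (hypothetical) defines main()
#include <cassert>
#include <string>

// The macro is expanded during preprocessing, so the name GREETING no
// longer appears in hello.ii; only the string literal does.
#define GREETING "hello"

std::string greeting() { return GREETING; }

// Small self-check used to confirm the macro expanded as expected.
inline void check_greeting() { assert(greeting() == "hello"); }
```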

When the C preprocessor reads your source file,
it includes the header files in the translation unit,
reads and processes all of the macros
then discards all of the macros when it has finished.
It does *not* remember macro definitions
when it processes the next source files
so, if the next source file includes the same header file,
the header file will be read again
and any external function definition in that header
will be included in the next translation unit as well.
The link editor will discover multiple function definitions (a violation of the C++ one definition rule, ODR)
if it tries to link the resulting machine code files together (the link step fails with an error).
If, instead, you qualify the function definition as inline or static,
the compiler will label them as "local" links (static gives internal linkage; inline lets the linker merge identical definitions)
so the link editor will not complain.
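A minimal sketch of how this plays out in a header, assuming a hypothetical header scale.hpp whose contents are pasted twice below to mimic two #include lines in the same source file. The include guard only prevents a second copy within one translation unit. Because macros are forgotten between source files, every source file that includes the header still receives the definition, and it is the inline qualifier that keeps the link editor from complaining.

```cpp
#include <cassert>

// ---- first #include "scale.hpp": contents pasted here for illustration ----
#ifndef SCALE_HPP
#define SCALE_HPP
// inline: identical definitions may appear in several translation units.
inline int twice(int x) { return 2 * x; }
#endif

// ---- second #include "scale.hpp" in the same source file ----
#ifndef SCALE_HPP  // already defined above, so this whole copy is skipped
#define SCALE_HPP
inline int twice(int x) { return 2 * x; }
#endif

// Hypothetical helper confirming the single surviving definition works.
inline void check_twice() { assert(twice(21) == 42); }
```

Without the guard, the second copy would be a redefinition error inside this one translation unit. Without inline (or static), two source files including the header would each emit an external definition of twice, which is the multiple-definition case described above.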

Here is another answer:

I highly suggest that this be added to the FAQ.
You have Source Code files:
a.cpp
b.cpp
s.cpp

You have 1 Header file:

b.hpp

Both a.cpp and s.cpp include b.hpp.
The three source code files get compiled into object files:

a.obj b.obj s.obj

And they're passed on to the linker.

Each source file is compiled independently into its own object file.
The linker sees a function, Monkey, in a.obj AND in s.obj, hence a multiple
definition.


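So how do you get away with putting inline functions into a header file?
An inline function may be defined in every translation unit that uses it, as long
as the definitions are identical; the linker keeps a single copy instead of
reporting a conflict. A function declared static, by contrast, has internal
linkage, i.e. it isn't presented to the linker at all.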

Ok, the C/C++ compiler actually works in a series of stages, more or less like this:

  1. The pre-processor - strips out comments, expands #define macros and #include lines, etc.
  2. The compiler - parses your source code and builds assembly output from it
  3. The assembler - takes that assembly code and builds an object file out of it
  4. The linker - takes some of those object files and such, and builds an executable out of it
  5. The loader - not generally part of the compiler suite, but part of the OS. Takes your executable and actually tries to load and run it. This may be more complicated than it sounds - for instance, your program might have unresolved symbols in it even after the linker goes over it, for instance if you used a shared library (eg. .so in unix or .dll in windows) which is supposed to be linked in at run time.


A compiler's job is really to transform your code from the human-readable form (source code) to a form which means something from the computer's standpoint. This doesn't necessarily mean "executable". To be specific, your C compiler is actually supposed to turn C source code into object files. An object file is an intermediate form, where all the code has been understood and processed by the compiler, but it's not ready to run yet because it's only the code from one file. If your program is made up of multiple files, each file gets built into an object file, and there are calls between the object files that have not been resolved yet.

Hence, the object files still need to be linked together before all the symbols make sense. For example:

Suppose you have 2 files, main.c and spoo.c (plus a header for spoo), where there's a function in spoo.c called, oh, shall we say, spoo() which is called from the main.c file. Well, when you go to compile main.c and spoo.c into their respective object files (main.o and spoo.o) the compiler isn't going to look at the other source file, so when it builds main, it doesn't know what to make of the spoo function. It knows the spoo function exists, because the header declared "Hey, there's this function void spoo(void);" but it doesn't know what spoo actually does, just that there is a function called spoo. So, it basically puts in a placeholder for spoo. E.g., it puts "I'm calling void spoo(void) here, but I don't know what it is" in main.o when it compiles it.

So, how does one actually get a program that runs? The linker to the rescue! The linker is a program that, given a bunch of object files, resolves all the unresolved symbols (those place holders) in them and produces your actual executable.
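The spoo example can be sketched in a single file, as a stand-in for the three real files (the header, main.c, and spoo.c). The original only gives the signature void spoo(void); the counter spoo_calls, the body of spoo, and the helper run_main_logic are assumptions added here so that the effect of the call can be observed.

```cpp
#include <cassert>

// --- what the header for spoo would contain: a declaration only ---
void spoo(void);

// Hypothetical counter so a call to spoo() has a visible effect.
int spoo_calls = 0;

// --- what main.c would contain ---
// At this point the compiler knows only that spoo exists. In a separate
// compilation, this call would be left in main.o as an unresolved symbol.
int run_main_logic() {
    int before = spoo_calls;
    spoo();
    assert(spoo_calls == before + 1);
    return spoo_calls;
}

// --- what spoo.c would contain: the one definition the linker finds ---
void spoo(void) { ++spoo_calls; }
```

When the two parts live in separate source files, the compiler handles each file without seeing the other, and it is the linker that connects the call in main.o to the definition in spoo.o.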

What's up with g++ actually producing executables then? Well, g++ is technically a group of programs, one of which is the compiler, one of which is the linker, etc. When you invoke it on a source file, it typically assumes you want an executable, so it calls the linker for you. Nice of it, no?






