Introduction to Compilers: How Did We Talk to Machines in the Years Before Siri?

Posted by Li Yazhou (李亞洲) on 2017-08-20

Original URL: http://www.jiqizhixin.com/articles/2017-08-20-3

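A compiler translates source code into executable machine code that a computer understands, or into another programming language. This article introduces compiler tools by way of LLVM.

A compiler is just a program that translates other programs. Traditional compilers translate source code into executable machine code that the computer understands. (Some compilers translate source code into another programming language. These are called source-to-source translators or transpilers.) LLVM is a widely used compiler project made up of many modular compiler tools.

A traditional compiler design has three parts: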

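The front end translates source code into an intermediate representation (IR). clang (http://clang.llvm.org/) is the LLVM project's front end for the C family of languages.
The optimizer analyzes the IR and translates it into a more efficient form. opt is the LLVM project's optimizer tool.
The back end generates machine code by mapping the IR onto the target hardware's instruction set. llc is the LLVM project's back end tool.

LLVM IR is a low-level language that resembles assembly. However, it is not tied to any particular hardware.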

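Hello, Compiler

Below is a simple C program that prints the string "Hello, Compiler!". A programmer can read C syntax, but the computer cannot make sense of it as written. I will walk through the three phases of compilation to turn the following program into one the machine can execute.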

// compile_me.c
// Wave to the compiler. The world can wait.
#include <stdio.h>
int main() {
  printf("Hello, Compiler!\n");
  return 0;
}

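Front End

As mentioned above, clang is LLVM's front end for the C family of languages. Clang consists of a C preprocessor, a lexer, a parser, a semantic analyzer, and an IR generator.

The C preprocessor modifies the source code before it is translated to IR. The preprocessor pulls in external files, such as #include <stdio.h> above. It replaces that line with the entire contents of the C standard library header stdio.h, which contains the declaration of the printf function. To see the output of the preprocessor, run: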

clang -E compile_me.c -o preprocessed.i

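The lexer (also called a scanner or tokenizer) converts a string of characters into a string of words. Each word, or token, is assigned to a syntactic category according to its properties: punctuation, keyword, identifier, constant, or comment.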

[Figure: lexical analysis of compile_me.c]

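The parser determines whether the stream of words produced by the lexer forms valid sentences in the source language. After analyzing the grammar of the token stream, the parser outputs an abstract syntax tree (AST). Nodes in a Clang AST represent declarations, statements, and types.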

[Figure: the AST of compile_me.c]

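The semantic analyzer traverses the AST and determines whether the sentences have a valid meaning. This phase checks for type errors. If the main function in compile_me.c returned "zero" instead of 0, the semantic analyzer would report an error, because "zero" is not of type int.

The IR generator translates the AST to IR.

Run the clang front end on compile_me.c to generate LLVM IR: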

clang -S -emit-llvm -o llvm_ir.ll compile_me.c

The main function in llvm_ir.ll:

; llvm_ir.ll

@.str = private unnamed_addr constant [18 x i8] c"Hello, Compiler!\0A\00", align 1

define i32 @main() {
  %1 = alloca i32, align 4 ; <- memory allocated on the stack
  store i32 0, i32* %1, align 4
  %2 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([18 x i8], [18 x i8]* @.str, i32 0, i32 0))
  ret i32 0
}

declare i32 @printf(i8*, ...)

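Optimizer

The optimizer's job is to improve the code's efficiency based on its understanding of the program's runtime behavior. It takes IR as input and produces optimized IR as output. LLVM's optimizer tool, opt, optimizes for processor speed with the flag -O2 (capital letter O, digit 2) and for the size of the generated output with the flag -Os (capital letter O, lowercase s).

Compare the LLVM IR produced by the front end above with the result of running the optimizer on it: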

opt -O2 -S llvm_ir.ll -o optimized.ll

The main function in optimized.ll:

; optimized.ll

@str = private unnamed_addr constant [17 x i8] c"Hello, Compiler!\00"

define i32 @main() {
  %puts = tail call i32 @puts(i8* getelementptr inbounds ([17 x i8], [17 x i8]* @str, i64 0, i64 0))
  ret i32 0
}

declare i32 @puts(i8* nocapture readonly)

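In the optimized version, main does not allocate memory on the stack, because it does not use any memory. The optimized code also calls puts instead of printf, because none of printf's formatting functionality is used. Of course, the optimizer knows more than when to use puts in place of printf. It also unrolls loops and inlines the results of simple calculations. Consider the following program, which adds two numbers and prints the result: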

// add.c
#include <stdio.h>

int main() {
  int a = 5, b = 10, c = a + b;
  printf("%i + %i = %i\n", a, b, c);
}

Unoptimized LLVM IR:

@.str = private unnamed_addr constant [14 x i8] c"%i + %i = %i\0A\00", align 1

define i32 @main() {
  %1 = alloca i32, align 4 ; <- allocate stack space for var a
  %2 = alloca i32, align 4 ; <- allocate stack space for var b
  %3 = alloca i32, align 4 ; <- allocate stack space for var c
  store i32 5, i32* %1, align 4  ; <- store 5 at memory location %1
  store i32 10, i32* %2, align 4 ; <- store 10 at memory location %2
  %4 = load i32, i32* %1, align 4 ; <- load the value at memory address %1 into register %4
  %5 = load i32, i32* %2, align 4 ; <- load the value at memory address %2 into register %5
  %6 = add nsw i32 %4, %5 ; <- add the values in registers %4 and %5. put the result in register %6
  store i32 %6, i32* %3, align 4 ; <- put the value of register %6 into memory address %3
  %7 = load i32, i32* %1, align 4 ; <- load the value at memory address %1 into register %7
  %8 = load i32, i32* %2, align 4 ; <- load the value at memory address %2 into register %8
  %9 = load i32, i32* %3, align 4 ; <- load the value at memory address %3 into register %9
  %10 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([14 x i8], [14 x i8]* @.str, i32 0, i32 0), i32 %7, i32 %8, i32 %9)
  ret i32 0
}

declare i32 @printf(i8*, ...)

Optimized LLVM IR:

@.str = private unnamed_addr constant [14 x i8] c"%i + %i = %i\0A\00", align 1

define i32 @main() {
  %1 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([14 x i8], [14 x i8]* @.str, i64 0, i64 0), i32 5, i32 10, i32 15)
  ret i32 0
}

declare i32 @printf(i8* nocapture readonly, ...)

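The optimized main function is essentially the last two instructions of the unoptimized version (the call to printf and ret i32 0), with the values of the variables inlined. opt computed the addition itself, because all of the variables were constants. Pretty cool, right?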

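Back End

LLVM's back end tool is llc. It turns LLVM IR input into machine code in three phases:

Instruction selection is the mapping of IR instructions to the instruction set of the target machine. This step uses an infinite namespace of virtual registers.
Register allocation is the mapping of virtual registers to real registers on the target architecture. My CPU has an x86 architecture, which means only 16 registers are available. However, the compiler will use as few registers as possible.
Instruction scheduling is the reordering of operations to reflect the performance constraints of the target machine.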

Running the following command will produce some machine code!

llc -o compiled-assembly.s optimized.ll

_main:
	pushq	%rbp
	movq	%rsp, %rbp
	leaq	L_str(%rip), %rdi
	callq	_puts
	xorl	%eax, %eax
	popq	%rbp
	retq
L_str:
	.asciz	"Hello, Compiler!"

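This is an x86 assembly language program, written in a form that both the computer and the programmer can read. It may look cryptic, but at last someone understands me.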
