Understanding How Babel Compiles JS Code and the Abstract Syntax Tree (AST)

Posted by 龍恩0707 on 2017-11-20


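1. What does Babel do?
Many browsers do not yet support ES6 code. Babel transpiles ES6 code into ES5 code that every browser can understand. That is Babel's job.
2. How does Babel work?
Babel's compilation process is much like that of most other compilers and can be divided into three stages.
1. Parse: parse the code string into an abstract syntax tree.
2. Transform: apply transformations to the abstract syntax tree.
3. Generate: generate a code string again from the transformed abstract syntax tree.
For example, the presets and plugins we configure in .babelrc take effect in the second stage.
In short, the flow is: code string -> parse -> AST -> transform -> new AST -> generate -> code string.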
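
To make the three stages concrete, here is a minimal sketch of a toy pipeline. It is not Babel's implementation: the function names parseDecl, transformDecl, generateDecl and compile are hypothetical, the node shape is simplified, and the "language" is limited to a single numeric variable declaration.

```javascript
// Stage 1 (parse): turn a code string into a tiny AST.
// Assumption: only inputs such as `const answer = 42;` are supported.
function parseDecl(code) {
  const m = /^\s*(let|const|var)\s+([A-Za-z_$][\w$]*)\s*=\s*(\d+)\s*;?\s*$/.exec(code);
  if (!m) {
    throw new Error('Unsupported input: ' + code);
  }
  return {
    type: 'VariableDeclaration',
    kind: m[1],
    name: m[2],
    init: { type: 'NumericLiteral', value: Number(m[3]) },
  };
}

// Stage 2 (transform): return a new AST in which let/const become var.
function transformDecl(ast) {
  return Object.assign({}, ast, { kind: 'var' });
}

// Stage 3 (generate): print the AST back as a code string.
function generateDecl(ast) {
  return ast.kind + ' ' + ast.name + ' = ' + ast.init.value + ';';
}

// The whole pipeline: parse -> transform -> generate.
function compile(code) {
  return generateDecl(transformDecl(parseDecl(code)));
}

console.log(compile('const answer = 42;')); // var answer = 42;
```

Only the middle function changes the program. In Babel that middle stage is where presets and plugins do their work.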

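3. What is an abstract syntax tree (AST)?
A JavaScript program is, at bottom, a sequence of characters. To a human reader each character carries meaning: matched pairs such as [], {} and (), paired quotes such as '' and "", and code indentation all make a program easier to read. None of that helps the computer. In memory these characters are just numeric values, and from them alone the computer cannot answer higher-level questions such as how many variables a program contains.
So we need a representation that the computer can work with, and that representation is the abstract syntax tree.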
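
As an illustration of why a tree helps, the sketch below answers the "how many variables" question by looking at a hand-written tree instead of raw characters. The node shapes and the names sampleAst and declaredVariables are assumptions made up for this example; they are simplified and are not the node format Babel uses.

```javascript
// Hand-written tree for a program like:
//   var width = 3; var height = 4; alert(width);
// (simplified, hypothetical node shapes)
const sampleAst = {
  type: 'Program',
  body: [
    { type: 'VariableDeclaration', name: 'width', init: { type: 'Literal', value: 3 } },
    { type: 'VariableDeclaration', name: 'height', init: { type: 'Literal', value: 4 } },
    {
      type: 'ExpressionStatement',
      expression: {
        type: 'CallExpression',
        caller: { type: 'Identifier', name: 'alert' },
        arguments: [{ type: 'Identifier', name: 'width' }],
      },
    },
  ],
};

// With a tree, listing the top-level variables is a simple filter.
function declaredVariables(ast) {
  return ast.body
    .filter(function (node) { return node.type === 'VariableDeclaration'; })
    .map(function (node) { return node.name; });
}

console.log(declaredVariables(sampleAst)); // [ 'width', 'height' ]
```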

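4. How is the abstract syntax tree produced?
As described above, Babel's first step is parsing: turning the code string into an abstract syntax tree. The abstract syntax tree is therefore produced during parsing. Parsing itself can be split into two steps:
4-1 Tokenization: split the whole code string into an array of syntax units (tokens).
4-2 Syntactic analysis: on top of the tokenization result, analyze the relationships between the syntax units.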

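Tokenization:
First, what is a syntax unit? A syntax unit is the smallest unit that carries real meaning in the syntax being parsed. The simplest way to think of it is as a word in natural language.
Take this sentence: "The 2022 Asian Games will be held in Hangzhou". We can split it into its smallest units: 2022, Asian Games, will, be held, in, Hangzhou. That is tokenization. These are the smallest units:
if we split them any further, the pieces no longer mean anything.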

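So which syntax units does JS code have? Roughly the following:
1. Whitespace. Consecutive spaces, line breaks and indentation carry no meaning unless they are inside a string, so a run of whitespace can be grouped into a single syntax unit.
2. Comments. Line comments and block comments matter to the people who write and maintain the code, but the computer only needs to know that this is a comment and does not care what it says, so a comment is treated as one indivisible syntax unit.
3. Strings. A string's content takes part in computation or display, so each string is one syntax unit.
4. Numbers. JS supports hexadecimal, decimal and octal notation as well as scientific notation, so each number is also one syntax unit.
5. Identifiers. A run of consecutive characters not enclosed in quotes, made up of letters, _, $ and digits (but not starting with a digit). This covers built-in constants such as true and false and keywords such as if, return and function.
6. Operators: +, -, *, /, >, < and so on.
7. Others, such as parentheses, square brackets, braces, semicolons, colons, dots and so on.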
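
The categories above can be sketched as a table of regular expressions. This is only an illustration under stated assumptions: the names TOKEN_KINDS and classifyToken are hypothetical, the patterns are deliberately simplified (no template strings, no multi-character operators), and the function classifies a piece of text that has already been cut out of the source.

```javascript
// One simplified pattern per category from the list above (assumed, not exhaustive).
const TOKEN_KINDS = [
  ['whitespace', /^\s+$/],
  ['comment', /^(\/\/.*|\/\*[\s\S]*\*\/)$/],
  ['string', /^("(\\.|[^"\\])*"|'(\\.|[^'\\])*')$/],
  ['number', /^(0[xX][0-9a-fA-F]+|0[oO][0-7]+|\d+(\.\d+)?([eE][+-]?\d+)?)$/],
  ['identifier', /^[A-Za-z_$][A-Za-z0-9_$]*$/],
  ['operator', /^[+\-*\/<>]$/],
  ['punctuation', /^[()\[\]{};:.,]$/],
];

// Return the first category whose pattern matches the whole text.
function classifyToken(text) {
  for (const [kind, pattern] of TOKEN_KINDS) {
    if (pattern.test(text)) {
      return kind;
    }
  }
  return 'unknown';
}

console.log(classifyToken('alert')); // identifier
console.log(classifyToken('0x1F'));  // number
console.log(classifyToken('"aa"'));  // string
```

Keywords such as if come out as identifier here, which matches how the tokenizer later in this article treats them.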

How is code tokenized in practice?
Take the following code:

if (1 > 0) {
  alert("aa");
}

The tokens we want are:

'if' ' ' '(' '1' ' ' '>' ' ' '0' ')' ' ' '{' '\n  ' 'alert' '(' '"aa"' ')' ';' '\n' '}'

To get there we walk through the code one character at a time and branch on what we see, as in the following code:

<!DOCTYPE html>
<html>
  <head>
    <title>Tokenization</title>
  </head>
  <body>
    <script>
      function tokenizeCode(code) {
        var tokens = [];  // result array
        for (var i = 0; i < code.length; i++) {
          // Read one character at a time, starting from index 0
          var currentChar = code.charAt(i);
          if (currentChar === ';') {
            tokens.push({
              type: 'sep',
              value: currentChar
            });
            // This character has been handled, move on to the next one
            continue;
          }
          if (currentChar === '(' || currentChar === ')') {
            tokens.push({
              type: 'parens',
              value: currentChar
            });
            continue;
          }
          if (currentChar === '{' || currentChar === '}') {
            tokens.push({
              type: 'brace',
              value: currentChar
            });
            continue;
          }
          if (currentChar === '>' || currentChar === '<') {
            tokens.push({
              type: 'operator',
              value: currentChar
            });
            continue;
          }
          if (currentChar === '"' || currentChar === '\'') {
            // A single or double quote marks the start of a string
            var token = {
              type: 'string',
              value: currentChar
            };
            tokens.push(token);
            var closer = currentChar;

            // Whether the next character is escaped
            var escaped = false;
            // Loop forward to find the end of the string
            for(i++; i < code.length; i++) {
              currentChar = code.charAt(i);
              // Append the current character to the string content first
              token.value += currentChar;
              if (escaped) {
                // The previous character was a backslash: reset the flag and
                // give this character no special treatment
                escaped = false;
              } else if (currentChar === '\\') {
                // A backslash sets the escaped flag, so the next character is not treated specially
                escaped = true;
              } else if (currentChar === closer) {
                break;
              }
            }
            continue;
          }

          // Numbers
          if (/[0-9]/.test(currentChar)) {
            // The number starts with a character from 0 to 9
            var token = {
              type: 'number',
              value: currentChar
            };
            tokens.push(token);
            // Keep going while the next character is still a digit (0 to 9) or a decimal point
            for (i++; i < code.length; i++) {
              currentChar = code.charAt(i);
              if (/[0-9\.]/.test(currentChar)) {
                // Multiple decimal points and other radixes are not handled for now
                token.value += currentChar;
              } else {
                // The character is not part of the number: step i back by 1
                // so the outer loop reads it again
                i--;
                break;
              }
            }
            continue;
          }
          // Identifiers start with a letter, $ or _
          if (/[a-zA-Z\$\_]/.test(currentChar)) {
            var token = {
              type: 'identifier',
              value: currentChar
            };
            tokens.push(token);
            // Keep going while the next character is a letter, digit, $ or _
            for (i++; i < code.length; i++) {
              currentChar = code.charAt(i);
              if (/[a-zA-Z0-9\$\_]/.test(currentChar)) {
                token.value += currentChar;
              } else {
                i--;
                break;
              }
            }
            continue;
          }

          // Group consecutive whitespace characters together
          if (/\s/.test(currentChar)) {
            var token = {
              type: 'whitespace',
              value: currentChar
            };
            tokens.push(token);
            // Keep going while the next character is whitespace
            for (i++; i < code.length; i++) {
              currentChar = code.charAt(i);
              if (/\s/.test(currentChar)) {
                token.value += currentChar;
              } else {
                i--;
                break;
              }
            }
            continue;
          }
          // More character checks ......
          // Throw on any character we do not understand
          throw new Error('Unexpected ' + currentChar);
        }
        return tokens;
      }
      var tokens = tokenizeCode(`
        if (1 > 0) {
          alert("aa");
        }
      `);
      console.log(tokens);
    </script>
  </body>
</html>

The printed result is as follows:

/*
  [
    {type: "whitespace", value: "\n"},
    {type: "identifier", value: "if"},
    {type: "whitespace", value: " "},
    {type: "parens", value: "("},
    {type: "number", value: "1"},
    {type: "whitespace", value: " "},
    {type: "operator", value: ">"},
    {type: "whitespace", value: " "},
    {type: "number", value: "0"},
    {type: "parens", value: ")"},
    {type: "whitespace", value: " "},
    {type: "brace", value: "{"},
    {type: "whitespace", value: "\n"},
    {type: "identifier", value: "alert"},
    {type: "parens", value: "("},
    {type: "string", value: "'aa'"},
    {type: "parens", value: ")"},
    {type: "sep", value: ";"},
    {type: "whitespace", value: "\n"},
    {type: "brace", value: "}"},
    {type: "whitespace", value: "\n"}
  ]
*/
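
The string branch of the tokenizer is the part most easily misread, so here is the same escape-tracking idea isolated into a small helper. This is a sketch, not part of the original code: the name readString and the returned { value, end } shape are assumptions, and unlike the tokenizer above it throws when the closing quote is missing.

```javascript
// Scan one string literal starting at code[start], which must be ' or ".
// Returns the raw literal (quotes included) and the index of the closing quote.
function readString(code, start) {
  const closer = code.charAt(start);
  if (closer !== '"' && closer !== "'") {
    throw new Error('Not at the start of a string literal');
  }
  let value = closer;
  let escaped = false; // true when the previous character was an unescaped backslash
  for (let i = start + 1; i < code.length; i++) {
    const ch = code.charAt(i);
    value += ch;
    if (escaped) {
      // This character was escaped, so it cannot close the string
      escaped = false;
    } else if (ch === '\\') {
      escaped = true;
    } else if (ch === closer) {
      return { value: value, end: i };
    }
  }
  throw new Error('Unterminated string literal');
}

// The source text here is: say("a\"b") ok
const result = readString('say("a\\"b") ok', 4);
console.log(result.value); // "a\"b"
console.log(result.end);   // 9
```

The escaped quote at index 7 does not end the string, because the backslash before it set the flag.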


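Syntactic analysis:

Syntactic analysis combines the tokens into a structure: it decides what an ambiguous token finally means, how the tokens relate to one another, and where one statement ends and the next begins. We now run this analysis on the tokenizer output above. See the following code: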

<!DOCTYPE html>
<html>
  <head>
    <title>Syntactic analysis</title>
  </head>
  <body>
    <script>
      var parse = function(tokens) {
        let i = -1;     // current position in the token list
        let curToken;   // current token
        // Read the next statement
        function nextStatement () {

          // Save the current i; we come back here if no matching case is found
          stash();
          
          // Read the next token
          nextToken();
          if (curToken.type === 'identifier' && curToken.value === 'if') {
            // Parse an if statement
            const statement = {
              type: 'IfStatement',
            };
            // if must be followed by (
            nextToken();
            if (curToken.type !== 'parens' || curToken.value !== '(') {
              throw new Error('Expected ( after if');
            }

            // The expression that follows is the if test
            statement.test = nextExpression();

            // The test must be followed by )
            nextToken();
            if (curToken.type !== 'parens' || curToken.value !== ')') {
              throw new Error('Expected ) after if test expression');
            }

            // The next statement runs when the if condition holds
            statement.consequent = nextStatement();

            // Peek at the next token: else means there is also logic for
            // the case where the if condition does not hold
            stash();
            nextToken();
            if (curToken.type === 'identifier' && curToken.value === 'else') {
              commit();
              statement.alternative = nextStatement();
            } else {
              rewind();
              statement.alternative = null;
            }
            commit();
            return statement;
          }

          if (curToken.type === 'brace' && curToken.value === '{') {
            // A leading { means a code block; object literal syntax is ignored for now
            const statement = {
              type: 'BlockStatement',
              body: [],
            };
            while (i < tokens.length) {
              // Check whether the next token is }
              stash();
              nextToken();
              if (curToken.type === 'brace' && curToken.value === '}') {
                // } marks the end of the block
                commit();
                break;
              }
              // Go back to the saved position and add the next statement to body
              rewind();
              statement.body.push(nextStatement());
            }
            // The block statement is fully parsed, return the result
            commit();
            return statement;
          }
          
          // No special statement marker found, go back to the start of the statement
          rewind();

          // Try to parse a single-expression statement
          const statement = {
            type: 'ExpressionStatement',
            expression: nextExpression(),
          };
          if (statement.expression) {
            nextToken();
            if (curToken.type !== 'EOF' && curToken.type !== 'sep') {
              throw new Error('Missing ; at end of expression');
            }
            return statement;
          }
        }
        // Read the next expression
        function nextExpression () {
          nextToken();
          if (curToken.type === 'identifier') {
            const identifier = {
              type: 'Identifier',
              name: curToken.value,
            };
            stash();
            nextToken();
            if (curToken.type === 'parens' && curToken.value === '(') {
              // An identifier immediately followed by ( is a function call expression
              const expr = {
                type: 'CallExpression',
                caller: identifier,
                arguments: [],
              };

              stash();
              nextToken();
              if (curToken.type === 'parens' && curToken.value === ')') {
                // If the next token is ) right away, there are no arguments
                commit();
              } else {
                // Read the function call arguments
                rewind();
                while (i < tokens.length) {
                  // Add the next expression to arguments
                  expr.arguments.push(nextExpression());
                  nextToken();
                  // Stop at )
                  if (curToken.type === 'parens' && curToken.value === ')') {
                    break;
                  }
                  // Arguments must be separated by ,
                  if (curToken.type !== 'comma' && curToken.value !== ',') {
                    throw new Error('Expected , between arguments');
                  }
                }
              }
              commit();
              return expr;
            }
            rewind();
            return identifier;
          }
          if (curToken.type === 'number' || curToken.type === 'string') {
            // A number or string means this is a literal expression
            const literal = {
              type: 'Literal',
              value: eval(curToken.value),
            };
            // But if the next token is an operator, this is a binary expression
            stash();
            nextToken();
            if (curToken.type === 'operator') {
              commit();
              return {
                type: 'BinaryExpression',
                left: literal,
                right: nextExpression(),
              };
            }
            rewind();
            return literal;
          }
          if (curToken.type !== 'EOF') {
            throw new Error('Unexpected token ' + curToken.value);
          }
        }
        // Move the read pointer forward, skipping whitespace automatically
        function nextToken () {
          do {
            i++;
            curToken = tokens[i] || { type: 'EOF' };
          } while (curToken.type === 'whitespace');
        }
        // Stack of saved positions, used whenever parsing needs to return to an earlier position
        const stashStack = [];
        function stash () {
          // Save the current position
          stashStack.push(i);
        }
        function rewind () {
          // Parsing failed, go back to the last saved position
          i = stashStack.pop();
          curToken = tokens[i];
        }
        function commit () {
          // Parsing succeeded, no need to go back
          stashStack.pop();
        }
        const ast = {
          type: 'Program',
          body: [],
        };
        // Parse the top-level statements one by one
        while (i < tokens.length) {
          const statement = nextStatement();
          if (!statement) {
            break;
          }
          ast.body.push(statement);
        }
        return ast;
      };
      var ast = parse([
          {type: "whitespace", value: "\n"},
          {type: "identifier", value: "if"},
          {type: "whitespace", value: " "},
          {type: "parens", value: "("},
          {type: "number", value: "1"},
          {type: "whitespace", value: " "},
          {type: "operator", value: ">"},
          {type: "whitespace", value: " "},
          {type: "number", value: "0"},
          {type: "parens", value: ")"},
          {type: "whitespace", value: " "},
          {type: "brace", value: "{"},
          {type: "whitespace", value: "\n"},
          {type: "identifier", value: "alert"},
          {type: "parens", value: "("},
          {type: "string", value: "'aa'"},
          {type: "parens", value: ")"},
          {type: "sep", value: ";"},
          {type: "whitespace", value: "\n"},
          {type: "brace", value: "}"},
          {type: "whitespace", value: "\n"}
                ]);
      console.log(ast);
    </script>
  </body>
</html>

The resulting ast value is:

{
  "type": "Program",
  "body": [
    {
      "type": "IfStatement",
      "test": {
        "type": "BinaryExpression",
        "left": {
          "type": "Literal",
          "value": 1
        },
        "right": {
          "type": "Literal",
          "value": 0
        }
      },
      "consequent": {
        "type": "BlockStatement",
        "body": [
          {
            "type": "ExpressionStatement",
            "expression": {
              "type": "CallExpression",
              "caller": {
                "type": "Identifier",
                "name": "alert"
              },
              "arguments": [
                {
                  "type": "Literal",
                  "value": "aa"
                }
              ]
            }
          }
        ]
      },
      "alternative": null
    }
  ]
}
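
Once the tree exists, other code can walk it. The sketch below is a generic depth-first walk that lists every node type in visiting order; the helper name collectNodeTypes is an assumption, and ifAst is simply the tree printed above, copied into a variable.

```javascript
// The tree from the output above
const ifAst = {
  type: 'Program',
  body: [{
    type: 'IfStatement',
    test: {
      type: 'BinaryExpression',
      left: { type: 'Literal', value: 1 },
      right: { type: 'Literal', value: 0 },
    },
    consequent: {
      type: 'BlockStatement',
      body: [{
        type: 'ExpressionStatement',
        expression: {
          type: 'CallExpression',
          caller: { type: 'Identifier', name: 'alert' },
          arguments: [{ type: 'Literal', value: 'aa' }],
        },
      }],
    },
    alternative: null,
  }],
};

// Depth-first walk: record node.type, then visit every property value.
function collectNodeTypes(node, out) {
  out = out || [];
  if (Array.isArray(node)) {
    node.forEach(function (child) { collectNodeTypes(child, out); });
  } else if (node && typeof node === 'object') {
    if (typeof node.type === 'string') {
      out.push(node.type);
    }
    Object.keys(node).forEach(function (key) { collectNodeTypes(node[key], out); });
  }
  return out;
}

console.log(collectNodeTypes(ifAst).join(' > '));
```

A walk of this kind is the basis of the transform stage: a plugin visits nodes of the types it cares about and rewrites them.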

Now let's walk through what the code above does.

First, parse is called with the output of the tokenization step as its argument:

var ast = parse([
  {type: "whitespace", value: "\n"},
  {type: "identifier", value: "if"},
  {type: "whitespace", value: " "},
  {type: "parens", value: "("},
  {type: "number", value: "1"},
  {type: "whitespace", value: " "},
  {type: "operator", value: ">"},
  {type: "whitespace", value: " "},
  {type: "number", value: "0"},
  {type: "parens", value: ")"},
  {type: "whitespace", value: " "},
  {type: "brace", value: "{"},
  {type: "whitespace", value: "\n"},
  {type: "identifier", value: "alert"},
  {type: "parens", value: "("},
  {type: "string", value: "'aa'"},
  {type: "parens", value: ")"},
  {type: "sep", value: ";"},
  {type: "whitespace", value: "\n"},
  {type: "brace", value: "}"},
  {type: "whitespace", value: "\n"}
]);

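The following variables and helper functions are set up first:
let i = -1; // current position in the token list
let curToken; // current token

function nextStatement() {
// ... lots of code
}
function nextExpression() {
// ... lots of code
}
function nextToken() {
// ... lots of code
}
// Stack of saved positions, used whenever parsing needs to return to an earlier position
const stashStack = [];

function rewind () {
// ... lots of code
}
function commit () {
// ... lots of code
}
The code that actually starts the parsing is: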

const ast = {
  type: 'Program',
  body: [],
};
// Parse the top-level statements one by one
while (i < tokens.length) {
  const statement = nextStatement();
  if (!statement) {
    break;
  }
  ast.body.push(statement);
}
return ast;

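The ast object is defined first, with the top-level type Program and an empty body. The code then loops while i is less than tokens.length. The first iteration calls nextStatement(), which begins by saving the current value of i, so stashStack becomes [-1]:
// Save the current i; we come back here if no matching case is found
stash();
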
nextToken() then reads the first non-whitespace token. It is if, so the IfStatement branch of nextStatement is taken (the block that starts with if (curToken.type === 'identifier' && curToken.value === 'if') in the full listing above). That branch checks that ( follows if and then calls nextExpression, also shown in full in the listing above, to read the test expression.

Inside nextExpression, nextToken() is called and curToken becomes {type: "number", value: "1"}.
That satisfies the second if condition (number or string), so literal is defined first:
const literal = {
type: 'Literal',
value: eval(curToken.value),
};
that is,
const literal = {
type: 'Literal',
value: 1,
};
stash() is then called to save the current position:
const stashStack = [];
function stash () {
// Save the current position
stashStack.push(i);
}
So stashStack becomes [-1, 4]. nextToken() is called next, and curToken becomes
{type: "operator", value: ">"}, which satisfies the condition if (curToken.type === 'operator') {
so the function returns
return {
type: 'BinaryExpression',
left: {
type: 'Literal',
value: 1
},
right: nextExpression(),
};
The value of right comes from calling nextExpression again, recursively. Before returning, commit() is called. Its code is:
function commit () {
// Parsing succeeded, no need to go back
stashStack.pop();
}
commit() uses the array's pop method to remove the last element, so stashStack becomes [-1].
At that moment i is 6, because nextToken() moved it there from 4. The recursive call to nextExpression calls nextToken() again,
so i becomes 8 and curToken = {type: "number", value: "0"}. As before, the second if
branch is entered, and this time literal is:
const literal = {
type: 'Literal',
value: 0
};
stash() is called, so stashStack = [-1, 8]. nextToken() then gives curToken = {type: "parens", value: ")"}. The operator check fails, so rewind() is called and literal is returned.

The rewind method is:
function rewind () {
// Parsing failed, go back to the last saved position
i = stashStack.pop();
curToken = tokens[i];
};
stashStack was [-1, 8], so pop() returns 8. i is 8 again and curToken = {type: "number", value: "0"}.
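
The stash / rewind / commit trio is a small backtracking mechanism, and it can be sketched on its own. The factory name createCursor and its method names are assumptions for illustration; the article's parser keeps the same state in plain variables instead.

```javascript
// A token cursor with save points (hypothetical helper, same idea as the parser's i and stashStack)
function createCursor(tokens) {
  let i = -1;       // current position
  const stack = []; // saved positions
  return {
    // Advance to the next non-whitespace token, or EOF past the end
    next: function () {
      do {
        i++;
      } while (tokens[i] && tokens[i].type === 'whitespace');
      return tokens[i] || { type: 'EOF' };
    },
    stash: function () { stack.push(i); },    // remember where we are
    rewind: function () { i = stack.pop(); }, // give up and go back
    commit: function () { stack.pop(); },     // keep going, forget the save point
    position: function () { return i; },
    depth: function () { return stack.length; },
  };
}

const cursor = createCursor([
  { type: 'number', value: '1' },
  { type: 'whitespace', value: ' ' },
  { type: 'operator', value: '>' },
  { type: 'whitespace', value: ' ' },
  { type: 'number', value: '0' },
]);
cursor.next();                    // reads 1, position 0
cursor.stash();                   // save position 0
console.log(cursor.next().value); // > (a look ahead)
cursor.rewind();                  // back to position 0
console.log(cursor.position());   // 0
```

Every stash() must be matched by exactly one rewind() or commit(); otherwise stale positions pile up on the stack.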
In the end the outer call returns this:
return {
type: 'BinaryExpression',
left: {
type: 'Literal',
value: 1
},
right: {
type: 'Literal',
value: 0
}
};
Therefore statement.test = {
type: 'BinaryExpression',
left: {
type: 'Literal',
value: 1
},
right: {
type: 'Literal',
value: 0
}
}
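
Note that the BinaryExpression node built by the parser in this article stores left and right but not the operator itself, so > cannot be recovered from the tree. A minimal sketch of a variant that records it follows. The function name parseComparison and the operator field are assumptions that are not part of the original code, and the sketch handles only number operator number.

```javascript
// Parse `number` or `number operator number` from a token array (hypothetical helper).
function parseComparison(tokens) {
  // Drop whitespace up front instead of skipping it while reading
  const significant = tokens.filter(function (t) { return t.type !== 'whitespace'; });
  let pos = 0;

  function parseOperand() {
    const token = significant[pos++];
    if (!token || token.type !== 'number') {
      throw new Error('Expected a number');
    }
    return { type: 'Literal', value: Number(token.value) };
  }

  const left = parseOperand();
  const next = significant[pos];
  if (!next || next.type !== 'operator') {
    return left; // a lone literal
  }
  pos++; // consume the operator
  return {
    type: 'BinaryExpression',
    operator: next.value, // the extra field this sketch adds
    left: left,
    right: parseOperand(),
  };
}

const node = parseComparison([
  { type: 'number', value: '1' },
  { type: 'whitespace', value: ' ' },
  { type: 'operator', value: '>' },
  { type: 'whitespace', value: ' ' },
  { type: 'number', value: '0' },
]);
console.log(node.operator); // >
```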
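Next, look at this code in nextStatement:
// The next statement runs when the if condition holds
statement.consequent = nextStatement();
This is a recursive call to nextStatement. The ) just consumed is at position 9, so stash() makes stashStack = [-1, 9], and nextToken() moves i to 11, where curToken = {type: "brace", value: "{"}.
That enters the second if branch: if (curToken.type === 'brace' && curToken.value === '{') {
statement is defined first:
// A leading { means a code block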
const statement = {
type: 'BlockStatement',
body: [],
};
while (i < tokens.length) {
// Check whether the next token is }
stash();
nextToken();
if (curToken.type === 'brace' && curToken.value === '}') {
// } marks the end of the block
commit();
break;
}
// Go back to the saved position and add the next statement to body
rewind();
statement.body.push(nextStatement());
}
// The block statement is fully parsed, return the result
commit();
return statement;

With the code above, i = 11 on entering the while loop. stash() saves the current position, so stashStack = [-1, 9, 11]. After nextToken(),
curToken = {type: "identifier", value: "alert"}. It is not }, so rewind() restores i = 11 and stashStack = [-1, 9].
nextStatement is then called again and its result is pushed into statement.body. Inside that call, stash() saves 11 again and nextToken() moves to position 13, so
curToken = {type: "identifier", value: "alert"}. Neither of the if conditions above matches, so rewind() goes back to i = 11 and the following variable is defined:
// Try to parse a single-expression statement
const statement = {
type: 'ExpressionStatement',
expression: nextExpression(),
};
This calls nextExpression, the same function shown in full in the listing above.

nextExpression calls nextToken() first, so curToken = {type: "identifier", value: "alert"} again and the first if branch is entered. identifier becomes:
const identifier = {
type: 'Identifier',
name: 'alert',
};
stash() is called and stashStack becomes [-1, 9, 13]. nextToken() is called next, so curToken becomes
{type: "parens", value: "("}, which satisfies the condition
if (curToken.type === 'parens' && curToken.value === '(') {
so expr is defined as follows:
// An identifier immediately followed by ( is a function call expression
const expr = {
type: 'CallExpression',
caller: identifier,
arguments: [],
};
stash() is then called again, so stashStack becomes [-1, 9, 13, 14]. After nextToken(), curToken becomes
{type: "string", value: "'aa'"}. The check if (curToken.type === 'parens' && curToken.value === ')')
fails, so rewind() goes back to position 14, where curToken = {type: "parens", value: "("}, and stashStack is [-1, 9, 13] again.
Execution then continues with the following code:
// Read the function call arguments
rewind();
while (i < tokens.length) {
// Add the next expression to arguments
expr.arguments.push(nextExpression());
nextToken();
// Stop at )
if (curToken.type === 'parens' && curToken.value === ')') {
break;
}
// Arguments must be separated by ,
if (curToken.type !== 'comma' && curToken.value !== ',') {
throw new Error('Expected , between arguments');
}
}
The principle is the same as above, so the remaining steps are not traced one by one here. Readers can work through them on their own.

That is the main idea behind the syntactic analysis step.
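
This article stops after the parse stage. To round out the picture, here is a minimal sketch of what a generate stage could look like for the node types used in this article. It is an illustration and not Babel's generator: the function name generate is hypothetical, and it assumes BinaryExpression nodes carry an operator field, which the parser in this article does not record.

```javascript
// Turn an AST back into a code string (sketch; node types as used in this article)
function generate(node) {
  switch (node.type) {
    case 'Program':
      return node.body.map(generate).join('\n');
    case 'IfStatement': {
      const head = 'if (' + generate(node.test) + ') ' + generate(node.consequent);
      return node.alternative ? head + ' else ' + generate(node.alternative) : head;
    }
    case 'BlockStatement':
      return '{ ' + node.body.map(generate).join(' ') + ' }';
    case 'ExpressionStatement':
      return generate(node.expression) + ';';
    case 'CallExpression':
      return generate(node.caller) + '(' + node.arguments.map(generate).join(', ') + ')';
    case 'BinaryExpression':
      // Assumption: node.operator exists (added by hand in the sample below)
      return generate(node.left) + ' ' + node.operator + ' ' + generate(node.right);
    case 'Identifier':
      return node.name;
    case 'Literal':
      return JSON.stringify(node.value);
    default:
      throw new Error('Unknown node type: ' + node.type);
  }
}

const program = {
  type: 'Program',
  body: [{
    type: 'IfStatement',
    test: {
      type: 'BinaryExpression',
      operator: '>',
      left: { type: 'Literal', value: 1 },
      right: { type: 'Literal', value: 0 },
    },
    consequent: {
      type: 'BlockStatement',
      body: [{
        type: 'ExpressionStatement',
        expression: {
          type: 'CallExpression',
          caller: { type: 'Identifier', name: 'alert' },
          arguments: [{ type: 'Literal', value: 'aa' }],
        },
      }],
    },
    alternative: null,
  }],
};

console.log(generate(program)); // if (1 > 0) { alert("aa"); }
```

The original whitespace and line breaks are not reproduced, because the parser discards whitespace tokens before the tree is built.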

For more detail, see the version of the code with Chinese annotations on GitHub.

