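Unveiling the Mysterious JavaScript AST: How to Use the Babel AST Four-Piece Toolkit

Author: Zhou Mingliang, JD Retail

Before We Begin

The previous article in this series introduced some basic concepts and applications:

Parsers
The abstract syntax tree (AST)
What an AST is used for in JS
AST applications in practice

With that foundation and a few routine code-transformation use cases in mind, let's look in detail at how to transform code with an AST.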

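How to Use the Babel AST Four-Piece Toolkit

As mentioned in the previous article, many tools can parse code into an AST. The JS community has settled on a common naming convention for AST nodes; what differs between tools is mainly the level of detail, and some expose extra methods or properties.

So pick whichever tool you prefer. Here we go with an old friend: babel.

Getting to Know Babel

With front-end frameworks appearing one after another, most developers have heard of babel. If you have not, its documentation is a good place to learn more.

After all, traces of it show up in virtually every framework.

Babel JS official website
Babel JS GitHub

As the most widely used JS compiler, it converts code written in ECMAScript 2015+ syntax into backwards-compatible JavaScript so that the code can run in current and older browsers or other environments.

This backwards compatibility and code conversion is built on parsing code and transforming it. Next, let's see how to use the four core packages that @babel/core relies on: @babel/parser, @babel/traverse, @babel/types, and @babel/generator.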

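1. @babel/parser

@babel/parser is the core code parser. It performs lexical analysis and syntax analysis and produces the AST described earlier.

Suppose we need to read the code in a React project's index.tsx file. We can use the following code:

const fs = require("fs")
const { parse } = require("@babel/parser")

// Read the file; passing 'utf8' makes readFileSync return a string rather than a Buffer
const fileCode = fs.readFileSync('./code/app/index.tsx', 'utf8');
// Parse the source into an AST object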
const codeAST = parse(fileCode, {
  // parse in strict mode and allow module declarations
  sourceType: "module",
  plugins: [
    // enable jsx and typescript syntax
    "jsx",
    "typescript",
  ],
});

React code is not the only thing we can parse. Vue syntax can be parsed too, using its own parser, such as @vue/compiler-dom.

Passing different options lets us parse many kinds of code. For a plain .js file, no plugins are needed:

const codeAST = parse(fileCode, {
  // parse in strict mode and allow module declarations
  sourceType: "module"
});

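The parse call above gives us a standard AST object. Its structure was analyzed in detail in the previous article, so we will not go through it again here. For example:

// Source code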
const me = "我"
function write() {
  console.log("文章")
}

// The resulting AST object
const codeAST = {
  "type": "File",
  "errors": [],
  "program": {
    "type": "Program",
    "sourceType": "module",
    "interpreter": null,
    "body": [
      {
        "type": "VariableDeclaration",
        "declarations": [
          {
            "type": "VariableDeclarator",
            "id": {
              "type": "Identifier",
              "name": "me"
            },
            "init": {
              "type": "StringLiteral",
              "extra": {
                "rawValue": "我",
                "raw": "\"我\""
              },
              "value": "我"
            }
          }
        ],
        "kind": "const"
      },
      {
        "type": "FunctionDeclaration",
        "id": {
          "type": "Identifier",
          "name": "write"
        },
        "generator": false,
        "async": false,
        "params": [],
        "body": {
          "type": "BlockStatement",
          "body": [
            {
              "type": "ExpressionStatement",
              "expression": {
                "type": "CallExpression",
                "callee": {
                  "type": "MemberExpression",
                  "object": {
                    "type": "Identifier",
                    "name": "console"
                  },
                  "computed": false,
                  "property": {
                    "type": "Identifier",
                    "name": "log"
                  }
                },
                "arguments": [
                  {
                    "type": "StringLiteral",
                    "extra": {
                      "rawValue": "文章",
                      "raw": "\"文章\""
                    },
                    "value": "文章"
                  }
                ]
              }
            }
          ]
        }
      }
    ]
  }
}

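2. @babel/traverse

Once we have a standard AST object, operating on it means walking the tree. That is the job of @babel/traverse.

For example, once we have the AST we can traverse it like this:

const { default: traverse } = require('@babel/traverse');

// Called when entering a node
const onEnter = pt => {
  // Work to do on entry
  console.log(pt)
}
// Called when exiting a node
const onExit = pe => {
  // Work to do on exit
}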
traverse(codeAST, { enter: onEnter, exit: onExit })

So what does pt look like when we print it for the first node visited?

// Some irrelevant fields omitted
<ref *1> NodePath {
  contexts: [
    TraversalContext {
      queue: [Array],
      priorityQueue: [],
      ...
    }
  ],
  state: undefined,
  opts: {
    enter: [ [Function: onStartVist] ],
    exit: [ [Function: onEndVist] ],
    _exploded: true,
    _verified: true
  },
  _traverseFlags: 0,
  skipKeys: null,
  parentPath: null,
  container: Node {
    type: 'File',
    errors: [],
    program: Node {
      type: 'Program',
      sourceType: 'module',
      interpreter: null,
      body: [Array],
      directives: []
    },
    comments: []
  },
  listKey: undefined,
  key: 'program',
  node: Node {
    type: 'Program',
    sourceType: 'module',
    interpreter: null,
    body: [ [Node], [Node] ],
    directives: []
  },
  type: 'Program',
  parent: Node {
    type: 'File',
    errors: [],
    program: Node {
      type: 'Program',
      sourceType: 'module',
      interpreter: null,
      body: [Array],
      directives: []
    },
    comments: []
  },
  hub: undefined,
  data: null,
  context: TraversalContext {
    queue: [ [Circular *1] ],
    priorityQueue: [],
    ...
  },
  scope: Scope {
    uid: 0,
    path: [Circular *1],
    block: Node {
      type: 'Program',
      sourceType: 'module',
      interpreter: null,
      body: [Array],
      directives: []
    },
    ...
  }
}

That is a lot of output for a single visit. Let's trim it down and look only at the key parts:

// Visit 1
<ref *1> NodePath {
  listKey: undefined,
  key: 'program',
  node: Node {
    type: 'Program',
    sourceType: 'module',
    interpreter: null,
    body: [ [Node], [Node] ],
    directives: []
  },
  type: 'Program',
}

As we can see, traversal goes straight to the program node. The corresponding AST node:

  program: {
    type: 'Program',
    sourceType: 'module',
    interpreter: null,
    body: [
      [Node],
      [Node]
    ],
  },

Printing the nodes that follow shows that traversal then moves into program.body, starting with its first element.

// Visit 2
<ref *2> NodePath {
  listKey: 'body',
  key: 0,
  node: Node {
    type: 'VariableDeclaration',
    declarations: [ [Node] ],
    kind: 'const'
  },
  type: 'VariableDeclaration',
}

// Visit 3
<ref *1> NodePath {
  listKey: 'declarations',
  key: 0,
  node: Node {
    type: 'VariableDeclarator',
    id: Node {
      type: 'Identifier',
      name: 'me'
    },
    init: Node {
      type: 'StringLiteral',
      extra: [Object],
      value: '我'
    }
  },
  type: 'VariableDeclarator',
}

// Visit 4
<ref *1> NodePath {
  listKey: undefined,
  key: 'id',
  node: Node {
    type: 'Identifier',
    name: 'me'
  },
  type: 'Identifier',
}

// Visit 5
<ref *1> NodePath {
  listKey: undefined,
  key: 'init',
  node: Node {
    type: 'StringLiteral',
    extra: { rawValue: '我', raw: "'我'" },
    value: '我'
  },
  type: 'StringLiteral',
}

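The key properties of a NodePath:

node: the current node
parentPath: the path of the parent node
scope: the scope
parent: the parent node
type: the type of the current node

The pattern is now clear: traverse takes the node on the current path and visits its children level by level until every node in the AST has been visited.

Be sure to distinguish between the NodePath and Node types. In the code above, pt is a NodePath, while pt.node is a Node.

Note also that besides enter there is an exit method: every node that is entered is also exited, so we get two chances to see each node.

If traversal finishes without finding the node we are looking for, we can still do extra work at that point, such as adding the missing code nodes. The complete visiting sequence looks like this: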

enter > Program
- enter > node.body[0]
  - enter > node.declarations[0]
    - enter > node.id
    - exit < node.id
    - enter > node.init
    - exit < node.init
  - exit < node.declarations[0]
- exit < node.body[0]
- enter > node.body[1]
  - ...
  - ...
- exit < node.body[1]
exit < Program
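
The enter/exit order can be reproduced without Babel. The following is a simplified sketch, not Babel's implementation: the walk function and the hand-written program object are hypothetical, and children are discovered with Object.values, whereas @babel/traverse uses a per-type list of child keys.

```javascript
// Hand-written AST for `const me = "我"`, trimmed to the fields the walker needs.
const program = {
  type: "Program",
  body: [
    {
      type: "VariableDeclaration",
      kind: "const",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "me" },
          init: { type: "StringLiteral", value: "我" },
        },
      ],
    },
  ],
};

// Anything that is an object with a string `type` is treated as a node.
const isNode = (value) =>
  value !== null && typeof value === "object" && typeof value.type === "string";

// Depth-first walk: enter a node, walk its children, then exit it.
function walk(node, visitor, depth = 0) {
  visitor.enter(node, depth);
  for (const value of Object.values(node)) {
    const children = Array.isArray(value) ? value : [value];
    for (const child of children) {
      if (isNode(child)) walk(child, visitor, depth + 1);
    }
  }
  visitor.exit(node, depth);
}

const log = [];
walk(program, {
  enter: (node, depth) => log.push(`${"  ".repeat(depth)}enter ${node.type}`),
  exit: (node, depth) => log.push(`${"  ".repeat(depth)}exit ${node.type}`),
});
console.log(log.join("\n"));
```

The printed log nests the same way as the sequence above: Identifier and StringLiteral are each entered and exited before VariableDeclarator is exited, and Program is exited last.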

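3. @babel/types

With the groundwork laid, we have parsed the code into an AST and reached the nodes we care about by traversing it, so we can start transforming. @babel/types provides a set of validator methods (isXxx) along with builder methods that create AST nodes from plain values.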

For example, suppose we want to make this change:

// Before
const me = "我"
function write() {
  console.log("文章")
}

// After
let you = "你"
function write() {
  console.log("文章")
}

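First, let's work out what changed:

The declaration keyword changes from const to let
The variable name changes from me to you
The variable value changes from "我" to "你"

There are two ways to make the replacement:

Approach 1: replace the whole thing, that is, swap the entire program.body[0] node for a new node.
Approach 2: replace piece by piece, that is, update each part of the node in turn: program.body[0].kind, program.body[0].declarations[0].id, and program.body[0].declarations[0].init.

With @babel/types we can do it as follows. Compare the two: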

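const bbt = require('@babel/types');
const { default: traverse } = require('@babel/traverse');

// Called when entering a node. The two approaches are alternatives; use one or the other.
const onEnter = p => {
  // Approach 1: replace the whole node.
  // Checking for `const` keeps the new `let` node from matching again if it is revisited.
  if (bbt.isVariableDeclaration(p.node) && p.node.kind === 'const' && p.listKey === 'body') {
    // Replace it with a newly built node
    p.replaceWith(
      bbt.variableDeclaration('let', [
        bbt.variableDeclarator(bbt.identifier('you'), bbt.stringLiteral('你')),
      ]),
    );
  }
  // Approach 2: update the individual nodes one by one
  if (bbt.isVariableDeclaration(p.node) && p.listKey === 'body') {
    // Change the declaration keyword
    p.node.kind = 'let';
  }
  if (bbt.isIdentifier(p.node) && p.node.name === 'me') {
    // Change the variable name
    p.node.name = 'you';
  }
  if (bbt.isStringLiteral(p.node) && p.node.value === '我') {
    // Change the string content
    p.node.value = '你';
  }
};
traverse(codeAST, { enter: onEnter });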

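Both replacing a whole node and updating individual property values work, and both produce the expected result.

We also do not have to handle every node. To visit only nodes of a particular type, such as VariableDeclaration, define the visitor like this:

traverse(codeAST, {
  VariableDeclaration: function(p) {
    // Only runs for nodes of type VariableDeclaration
    p.node.kind = 'let';
  }
});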

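@babel/types provides a large number of methods; see the official documentation. For the methods available on the paths that @babel/traverse passes to visitors, see the TS definitions in the babel__traverse/index.d.ts file.

A commonly used method is p.stop(), which ends the traversal early. There are also methods for adding, removing, updating, and querying nodes, which you can explore on your own. It is simply a tree structure, so from any node we can reach its siblings, parent, and children.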

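4. @babel/generator

Once the transformation is done, we need to turn the AST back into code, and that is what @babel/generator is for. Taking things apart without putting them back together is husky behavior [doge]. A complete engineer can take things apart and put them back together.

Enough talk, here is the code:

const fs = require('fs-extra');
const { default: generate } = require('@babel/generator');

// Generate code from the AST
const codeIns = generate(codeAST, { retainLines: true, jsescOption: { minimal: true } });

// Write the generated code to a file
fs.writeFileSync('./code/app/index.js', codeIns.code);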

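There are quite a few options; refer to the documentation and configure them to suit your needs.

One worth calling out is jsescOption: { minimal: true }, which keeps Chinese text as-is rather than converting it to Unicode escape sequences.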

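Babel AST in Practice

By now you should be able to get hands-on.

Not yet? Then go through steps 1 to 4 once more. Experiment and tweak things bit by bit; once you start enjoying it, AST transformations become simple and nothing to be afraid of.

Here is an exercise to try:

// Before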
const me = "我"
function write() {
  console.log("文章")
}

// After
const you = "你"
function write() {
  console.log("文章")
}
console.log(you, write())

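Give it a try and see how a few simple AST operations can make this change. Writing articles takes effort, so please remember to like, favorite, and share.

ASTs are used very widely. As a recap, what can an AST do?

Code transformation, such as ES6 to ES5, TypeScript to JS, Taro's multi-platform compilation, CSS preprocessors, and so on.
Template compilation, such as React JSX syntax, Vue template syntax, and so on.
Code preprocessing, such as syntax checking (ESLint), code formatting (Prettier), code obfuscation and minification (uglifyjs), and so on.
Low-code platforms, where components are dragged and dropped and the generated code is transformed through an AST before being run.

Coming Up Next

"Unveiling the Mysterious JavaScript AST: Hand-Writing a Simple JavaScript Compiler"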