Front-End Development and Compiler Theory: Writing a JS Interpreter in JS

Published 2018-12-12

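Mention compiler theory and what usually comes to mind is the dry coursework and obscure concepts of undergraduate days. For front-end developers it seems remote, and our understanding of it often stops at the term "abstract syntax tree (AST)". But that is only the beginning. Applying compiler theory even lets us write, in JS itself, an interpreter that can run JS code.

Project repository: https://github.com/jrainlau/c…

Live demo: https://codepen.io/jrainlau/p…

I. Why write a JS interpreter in JS

Anyone who has worked on mini programs knows that the mini-program runtime forbids new Function, eval, and similar methods, so dynamic code held in a string cannot be executed directly. Many other platforms also restrict these built-in ways of running dynamic code. Does that leave us with no options? Not quite: we can write an interpreter in JS and let JS execute JS.

Before starting, let us briefly review a few concepts from compiler theory.

II. What is a compiler

Any discussion of compiler theory has to involve the compiler. Put simply, once a piece of code has passed through the compiler's lexical analysis and syntax analysis stages, the result is a tree-shaped structure called the "abstract syntax tree (AST)". Each node of the tree corresponds to a fragment of the code with a particular meaning.

Take this snippet, for example: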

const a = 1
console.log(a)

After the compiler has processed it, the AST looks like this:

{
  "type": "Program",
  "start": 0,
  "end": 26,
  "body": [
    {
      "type": "VariableDeclaration",
      "start": 0,
      "end": 11,
      "declarations": [
        {
          "type": "VariableDeclarator",
          "start": 6,
          "end": 11,
          "id": {
            "type": "Identifier",
            "start": 6,
            "end": 7,
            "name": "a"
          },
          "init": {
            "type": "Literal",
            "start": 10,
            "end": 11,
            "value": 1,
            "raw": "1"
          }
        }
      ],
      "kind": "const"
    },
    {
      "type": "ExpressionStatement",
      "start": 12,
      "end": 26,
      "expression": {
        "type": "CallExpression",
        "start": 12,
        "end": 26,
        "callee": {
          "type": "MemberExpression",
          "start": 12,
          "end": 23,
          "object": {
            "type": "Identifier",
            "start": 12,
            "end": 19,
            "name": "console"
          },
          "property": {
            "type": "Identifier",
            "start": 20,
            "end": 23,
            "name": "log"
          },
          "computed": false
        },
        "arguments": [
          {
            "type": "Identifier",
            "start": 24,
            "end": 25,
            "name": "a"
          }
        ]
      }
    }
  ],
  "sourceType": "module"
}

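Common JS compilers (more precisely, parsers) include babylon and acorn; interested readers can try them out on the AST explorer website.

As you can see, the resulting AST records in detail the type, start position, and other information for every meaningful piece of the code. Apart from the root Program node, the body of this snippet contains two nodes, VariableDeclaration and ExpressionStatement, and each of those contains child nodes of its own.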
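
To get a feel for this nesting, here is a minimal sketch that walks a hand-written fragment of the AST above and lists the node types in depth-first order. The collectTypes helper is hypothetical and not part of the project; it only assumes that every node is an object with a string type property.

```javascript
// A hand-written fragment of the AST shown above (positions omitted for brevity).
const ast = {
  type: 'Program',
  body: [
    {
      type: 'VariableDeclaration',
      kind: 'const',
      declarations: [
        {
          type: 'VariableDeclarator',
          id: { type: 'Identifier', name: 'a' },
          init: { type: 'Literal', value: 1, raw: '1' }
        }
      ]
    }
  ]
}

// Hypothetical helper: visit every object that has a string `type`, depth-first.
function collectTypes (node, out = []) {
  if (Array.isArray(node)) {
    node.forEach(child => collectTypes(child, out))
  } else if (node && typeof node === 'object') {
    if (typeof node.type === 'string') out.push(node.type)
    Object.values(node).forEach(child => collectTypes(child, out))
  }
  return out
}

const types = collectTypes(ast)
console.log(types.join(' > '))
// Program > VariableDeclaration > VariableDeclarator > Identifier > Literal
```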

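Because the AST records the code's semantic information in such detail, tools such as Babel, Webpack, Sass, and Less can process code in very smart ways.

III. What is an interpreter

Just as a translator can not only read a foreign language but also render it, with some artistic polish, into their mother tongue, a tool that turns code into an AST is called a "compiler", and a tool that translates the AST into a target language and executes it is called an "interpreter".

In compiler courses we considered this question: how do you get a computer to evaluate the arithmetic expression 1+2+3?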

1 + 2 + 3

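When a machine executes it, the machine code might look something like this (written here as stack-machine style instructions):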

PUSH 1
PUSH 2
ADD
PUSH 3
ADD

The program that executes this machine code is the interpreter.
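
As an illustration only, an interpreter for these five instructions can be sketched in a few lines. The [opcode, operand] instruction format and the run function are assumptions made for this example; real machine code is of course far richer.

```javascript
// Hypothetical instruction format: [opcode, operand?]
const program = [
  ['PUSH', 1],
  ['PUSH', 2],
  ['ADD'],
  ['PUSH', 3],
  ['ADD']
]

function run (instructions) {
  const stack = []
  for (const [op, operand] of instructions) {
    switch (op) {
      case 'PUSH':
        stack.push(operand)
        break
      case 'ADD': {
        // Pop the two topmost values and push their sum.
        const right = stack.pop()
        const left = stack.pop()
        stack.push(left + right)
        break
      }
      default:
        throw new Error(`Unknown instruction "${op}"`)
    }
  }
  return stack.pop()
}

const result = run(program)
console.log(result) // 6
```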

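In this article we will not produce anything as complicated as machine code; we will simply use JS, inside its own runtime, to interpret the AST of JS code. Because the interpreter is written in JS, we can freely rely on the language's own features, such as this binding and the new keyword, without handling them specially, which makes the interpreter very simple to implement.

With the basic concepts reviewed, we can start building.

IV. The node iterator

Looking at the AST above, every node has a type property, and different node types need to be handled differently. The program that handles these nodes is the "node handler" (nodeHandler).

Define a node handler: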

const nodeHandler = {
  Program () {},
  VariableDeclaration () {},
  ExpressionStatement () {},
  MemberExpression () {},
  CallExpression () {},
  Identifier () {}
}

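The concrete implementation of the node handlers is discussed in detail later, so we will not expand on it here.

With the node handler in place, we need to walk every node in the AST and call the node handler recursively until the whole syntax tree has been processed.

Define a node iterator (NodeIterator):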

class NodeIterator {
  constructor (node) {
    this.node = node
    this.nodeHandler = nodeHandler
  }

  traverse (node) {
    // Look up the handler function that matches the node type
    const _eval = this.nodeHandler[node.type]
    // Throw if there is no handler for this type
    if (!_eval) {
      throw new Error(`canjs: Unknown node type "${node.type}".`)
    }
    // Run the handler
    return _eval(node)
  }
}
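
The dispatch-by-type idea can be tried in isolation. The sketch below is a simplified stand-in for the class above, not the project's code: the handlers table, the traverse function, and the BinaryExpression handler are assumptions introduced for this example, and scope is ignored entirely.

```javascript
// Hypothetical, scope-free handlers: each receives the node plus a `traverse`
// callback so that it can evaluate its children.
const handlers = {
  Literal: (node) => node.value,
  BinaryExpression: (node, traverse) => {
    const left = traverse(node.left)
    const right = traverse(node.right)
    if (node.operator === '+') return left + right
    throw new Error(`Unsupported operator "${node.operator}"`)
  }
}

function traverse (node) {
  const handler = handlers[node.type]
  if (!handler) {
    throw new Error(`Unknown node type "${node.type}".`)
  }
  return handler(node, traverse)
}

// ESTree-style AST for (1 + 2) + 3
const expr = {
  type: 'BinaryExpression',
  operator: '+',
  left: {
    type: 'BinaryExpression',
    operator: '+',
    left: { type: 'Literal', value: 1 },
    right: { type: 'Literal', value: 2 }
  },
  right: { type: 'Literal', value: 3 }
}

const sum = traverse(expr)
console.log(sum) // 6
```

Each handler only knows about its own node type; recursion through traverse does the rest.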

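In theory this design is enough for the node iterator, but on closer inspection something important is missing: scope handling.

Go back to the node handler's VariableDeclaration() method, which handles variable declaration nodes such as const a = 1. Suppose its code looks like this: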

  VariableDeclaration (node) {
    for (const declaration of node.declarations) {
      const { name } = declaration.id
      const value = declaration.init ? traverse(declaration.init) : undefined
      // Here is the problem: we have the variable's name and value, but where do we store them?
      // ...
    }
  },

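The problem is that once a variable declaration node has been processed, the variable should be stored somewhere. According to the rules of JS, it belongs in a scope. In our JS interpreter, that scope can be modelled as a scope object.

Rewrite the node iterator to give it a scope object: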

class NodeIterator {
  constructor (node, scope = {}) {
    this.node = node
    this.scope = scope
    this.nodeHandler = nodeHandler
  }

  traverse (node, options = {}) {
    const scope = options.scope || this.scope
    const nodeIterator = new NodeIterator(node, scope)
    const _eval = this.nodeHandler[node.type]
    if (!_eval) {
      throw new Error(`canjs: Unknown node type "${node.type}".`)
    }
    return _eval(nodeIterator)
  }

  createScope (blockType = 'block') {
    return new Scope(blockType, this.scope)
  }
}

The VariableDeclaration() handler can then store variables through scope:

  VariableDeclaration (nodeIterator) {
    const kind = nodeIterator.node.kind
    for (const declaration of nodeIterator.node.declarations) {
      const { name } = declaration.id
      const value = declaration.init ? nodeIterator.traverse(declaration.init) : undefined
      // Define the variable in the scope
      // If this is a block scope and the variable is declared with var, define it on the parent scope
      if (nodeIterator.scope.type === 'block' && kind === 'var') {
        nodeIterator.scope.parentScope.declare(name, value, kind)
      } else {
        nodeIterator.scope.declare(name, value, kind)
      }
    }
  },

Scope handling is arguably the hardest part of the whole JS interpreter. Next we will take a closer look at it.

V. Scope handling

Consider this situation:

const a = 1
{
  const b = 2
  console.log(a)
}
console.log(b)

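Running it prints the value of a and then throws: Uncaught ReferenceError: b is not defined

This snippet is all about scope. A block scope or function scope can read variables from its parent scope, but not the other way round, so a scope cannot simply be an empty object; it needs dedicated handling.

Define a base class Scope: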

class Scope {
  constructor (type, parentScope) {
    // Scope type: distinguishes function scope ("function") from block scope ("block")
    this.type = type
    // Parent scope
    this.parentScope = parentScope
    // Global scope (the injected standard library)
    this.globalDeclaration = standardMap
    // Variables declared in the current scope
    this.declaration = Object.create(null)
  }

  /*
   * get/set read and write the variable called `name`.
   * Following JS lookup rules, the current scope is searched first, then the
   * parent scopes, and finally the global scope. If nothing is found, throw.
   */
  get (name) {
    if (this.declaration[name]) {
      return this.declaration[name]
    } else if (this.parentScope) {
      return this.parentScope.get(name)
    } else if (this.globalDeclaration[name]) {
      return this.globalDeclaration[name]
    }
    throw new ReferenceError(`${name} is not defined`)
  }

  set (name, value) {
    if (this.declaration[name]) {
      // Go through SimpleValue.set so that assignment to a const is rejected
      this.declaration[name].set(value)
    } else if (this.parentScope) {
      // Delegate to the parent scope, which continues the search
      this.parentScope.set(name, value)
    } else {
      throw new ReferenceError(`${name} is not defined`)
    }
  }

  /**
   * Call the declaration method that matches the variable's kind
   */
  declare (name, value, kind = 'var') {
    if (kind === 'var') {
      return this.varDeclare(name, value)
    } else if (kind === 'let') {
      return this.letDeclare(name, value)
    } else if (kind === 'const') {
      return this.constDeclare(name, value)
    } else {
      throw new Error(`canjs: Invalid Variable Declaration Kind of "${kind}"`)
    }
  }

  varDeclare (name, value) {
    let scope = this
    // While the current scope is not a function scope and has a parent, move up to the parent
    while (scope.parentScope && scope.type !== 'function') {
      scope = scope.parentScope
    }
    // Declare on the scope we ended up at, not on `this`
    scope.declaration[name] = new SimpleValue(value, 'var')
    return scope.declaration[name]
  }

  letDeclare (name, value) {
    // Redeclaration is not allowed
    if (this.declaration[name]) {
      throw new SyntaxError(`Identifier ${name} has already been declared`)
    }
    this.declaration[name] = new SimpleValue(value, 'let')
    return this.declaration[name]
  }

  constDeclare (name, value) {
    // Redeclaration is not allowed
    if (this.declaration[name]) {
      throw new SyntaxError(`Identifier ${name} has already been declared`)
    }
    this.declaration[name] = new SimpleValue(value, 'const')
    return this.declaration[name]
  }
}

A class called SimpleValue is used here to wrap variable values, mainly so that constants can be handled:

class SimpleValue {
  constructor (value, kind = '') {
    this.value = value
    this.kind = kind
  }

  set (value) {
    // Reassigning a const variable is forbidden
    if (this.kind === 'const') {
      throw new TypeError('Assignment to constant variable')
    } else {
      this.value = value
    }
  }

  get () {
    return this.value
  }
}
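
A short usage sketch shows the effect of the kind check. The class is repeated so that the example runs on its own; the counter and pi variables are made up for the illustration.

```javascript
class SimpleValue {
  constructor (value, kind = '') {
    this.value = value
    this.kind = kind
  }

  set (value) {
    if (this.kind === 'const') {
      throw new TypeError('Assignment to constant variable')
    }
    this.value = value
  }

  get () {
    return this.value
  }
}

// A let-like value can be updated freely.
const counter = new SimpleValue(0, 'let')
counter.set(counter.get() + 1)

// A const-like value rejects every later assignment.
const pi = new SimpleValue(3.14, 'const')
let constError = null
try {
  pi.set(3)
} catch (err) {
  constError = err
}

console.log(counter.get())                   // 1
console.log(constError instanceof TypeError) // true
console.log(pi.get())                        // 3.14
```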

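The key to handling scope lies in how JS itself looks up variables: the current scope first, then the parent scopes, and the global scope last. Conversely, when the VariableDeclaration() handler meets the var keyword inside a block scope, it has to define the variable in the parent scope as well, because var is not block-scoped. This is what is commonly called "global variable pollution".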
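
The lookup order and the way var escapes a block can be demonstrated with a stripped-down scope. MiniScope below is a hypothetical simplification written for this example: it stores raw values instead of SimpleValue wrappers and has no global standard library.

```javascript
class MiniScope {
  constructor (type, parent = null) {
    this.type = type
    this.parent = parent
    this.vars = Object.create(null)
  }

  // Walk outwards until the name is found.
  get (name) {
    if (name in this.vars) return this.vars[name]
    if (this.parent) return this.parent.get(name)
    throw new ReferenceError(`${name} is not defined`)
  }

  // let/const stay in the current scope.
  declareLexical (name, value) {
    this.vars[name] = value
  }

  // var climbs to the nearest function scope (or the root).
  declareVar (name, value) {
    let scope = this
    while (scope.parent && scope.type !== 'function') {
      scope = scope.parent
    }
    scope.vars[name] = value
  }
}

const globalScope = new MiniScope('function')
const blockScope = new MiniScope('block', globalScope)

globalScope.declareLexical('a', 1)
blockScope.declareLexical('b', 2)
blockScope.declareVar('c', 3)

console.log(blockScope.get('a'))  // 1, found via the parent
console.log(globalScope.get('c')) // 3, the var escaped the block

let lookupError = null
try {
  globalScope.get('b')
} catch (err) {
  lookupError = err
}
console.log(lookupError.message) // b is not defined
```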

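Injecting the JS standard library

Attentive readers will have noticed that when the Scope base class was defined, its global scope globalDeclaration was assigned a standardMap object. That object is the JS standard library.

Put simply, the JS standard library is the set of methods and properties that come with the language and its runtime, such as the commonly used setTimeout and console.log. For the interpreter to be able to run them too, we need to inject the standard library: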

const standardMap = {
  console: new SimpleValue(console)
}

This injects the console object into the interpreter's global scope, so it can be used directly.
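
The same pattern extends to other globals. The sketch below builds a standardMap from a whitelist; the exposedGlobals name and the particular selection are assumptions for illustration, and SimpleValue is reduced to the two fields needed here.

```javascript
class SimpleValue {
  constructor (value, kind = '') {
    this.value = value
    this.kind = kind
  }
}

// Hypothetical whitelist; extend or shrink it to suit the host environment.
const exposedGlobals = { console, Math, JSON, parseInt, setTimeout }

const standardMap = {}
for (const [name, value] of Object.entries(exposedGlobals)) {
  standardMap[name] = new SimpleValue(value)
}

console.log(Object.keys(standardMap).join(','))
console.log(standardMap.Math.value.max(1, 5)) // 5
```

Keeping the list explicit means the interpreted code can only reach what the host has chosen to expose.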

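VI. Node handlers

With the node iterator and scope handling done, we can write the node handlers. As the name suggests, node handlers exist to process AST nodes; the VariableDeclaration() method mentioned repeatedly above is one of them. The sections below walk through some of the key handlers.

Before writing the handlers we need a helper for detecting the return, break, and continue keywords in JS statements.

The keyword helper `Signal`

Define a Signal base class: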

class Signal {
  constructor (type, value) {
    this.type = type
    this.value = value
  }

  static Return (value) {
    return new Signal('return', value)
  }

  static Break (label = null) {
    return new Signal('break', label)
  }

  static Continue (label) {
    return new Signal('continue', label)
  }

  static isReturn(signal) {
    return signal instanceof Signal && signal.type === 'return'
  }

  static isContinue(signal) {
    return signal instanceof Signal && signal.type === 'continue'
  }

  static isBreak(signal) {
    return signal instanceof Signal && signal.type === 'break'
  }

  static isSignal (signal) {
    return signal instanceof Signal
  }
}

With it, keywords inside statements can be detected and handled; it will prove very useful below.
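
To see why such a signal object is useful, consider a minimal sketch in which a "block" is just a list of functions. The runBlock helper and the function-based statements are assumptions made for this example; only the Signal idea comes from the article.

```javascript
class Signal {
  constructor (type, value) {
    this.type = type
    this.value = value
  }
  static Break () { return new Signal('break') }
  static isSignal (s) { return s instanceof Signal }
  static isBreak (s) { return s instanceof Signal && s.type === 'break' }
}

// Hypothetical stand-in for a block: statements are plain functions here.
function runBlock (statements) {
  for (const statement of statements) {
    const result = statement()
    // Stop at the first signal and let the caller decide what to do with it.
    if (Signal.isSignal(result)) return result
  }
}

const log = []
// Roughly: for (let i = 0; ; i++) { log.push(i); if (i === 2) break; log.push('after') }
for (let i = 0; ; i++) {
  const signal = runBlock([
    () => { log.push(i) },
    () => (i === 2 ? Signal.Break() : undefined),
    () => { log.push('after') }
  ])
  if (Signal.isBreak(signal)) break
}

console.log(log.join(',')) // 0,after,1,after,2
```

The block does not know it is inside a loop; it only bubbles the signal up, and the loop reacts to it.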

1. Variable declaration handler: `VariableDeclaration()`

One of the most frequently used node handlers; it registers variables in the correct scope.

  VariableDeclaration (nodeIterator) {
    const kind = nodeIterator.node.kind
    for (const declaration of nodeIterator.node.declarations) {
      const { name } = declaration.id
      const value = declaration.init ? nodeIterator.traverse(declaration.init) : undefined
      // Define the variable in the scope
      // A var inside a block scope leaks out of the block, so declare it on the parent scope
      if (nodeIterator.scope.type === 'block' && kind === 'var') {
        nodeIterator.scope.parentScope.declare(name, value, kind)
      } else {
        nodeIterator.scope.declare(name, value, kind)
      }
    }
  },

2. Identifier handler: `Identifier()`

Dedicated to reading an identifier's value from the scope.

  Identifier (nodeIterator) {
    if (nodeIterator.node.name === 'undefined') {
      return undefined
    }
    return nodeIterator.scope.get(nodeIterator.node.name).value
  },

3. Literal handler: `Literal()`

Returns the value of a literal node.

  Literal (nodeIterator) {
    return nodeIterator.node.value
  }

4. Call expression handler: `CallExpression()`

Handles call expression nodes, such as func() and console.log().

  CallExpression (nodeIterator) {
    // Traverse the callee to get the function
    const func = nodeIterator.traverse(nodeIterator.node.callee)
    // Evaluate the arguments
    const args = nodeIterator.node.arguments.map(arg => nodeIterator.traverse(arg))

    // For a method call such as obj.fn(), use the object as `this`
    let value
    if (nodeIterator.node.callee.type === 'MemberExpression') {
      value = nodeIterator.traverse(nodeIterator.node.callee.object)
    }
    // Return the result of calling the function
    return func.apply(value, args)
  },
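
The reason the handler re-evaluates the callee's object is that a function reference on its own has lost its receiver. The plain-JS sketch below illustrates this; the person object is made up for the example.

```javascript
const person = {
  name: 'Ada',
  say (greeting) {
    return `${greeting}, ${this.name}`
  }
}

// What the handler has after traversing the callee: a bare function reference.
const func = person.say
const args = ['Hello']

// Re-attaching the receiver, as `func.apply(value, args)` does in the handler.
const withReceiver = func.apply(person, args)
console.log(withReceiver) // Hello, Ada

// With a different receiver, the same function sees a different `this`.
const otherReceiver = func.apply({}, args)
console.log(otherReceiver) // Hello, undefined
```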

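5. Member expression handler: `MemberExpression()`

Unlike the call expression handler above, a member expression is a property access such as person.say or console.log, without the call itself.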

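  MemberExpression (nodeIterator) {
    // Get the object, e.g. console
    const obj = nodeIterator.traverse(nodeIterator.node.object)
    // Get the property name, e.g. log; for computed access such as obj[key], evaluate the key
    const { property, computed } = nodeIterator.node
    const name = computed ? nodeIterator.traverse(property) : property.name
    // Return the member, e.g. console.log
    return obj[name]
  }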

6. Block statement handler: `BlockStatement()`

A very frequently used handler, dedicated to block statement nodes such as the bodies of functions, loops, and try...catch.

  BlockStatement (nodeIterator) {
    // Create a block scope first
    let scope = nodeIterator.createScope('block')

    // First pass: hoist var declarations and function declarations
    for (const node of nodeIterator.node.body) {
      if (node.type === 'VariableDeclaration' && node.kind === 'var') {
        for (const declaration of node.declarations) {
          // A declaration without an initializer (var x) has init === null
          const initial = declaration.init ? declaration.init.value : undefined
          scope.declare(declaration.id.name, initial, node.kind)
        }
      } else if (node.type === 'FunctionDeclaration') {
        nodeIterator.traverse(node, { scope })
      }
    }

    // Second pass: run the statements and pick up keyword signals (return, break, continue)
    for (const node of nodeIterator.node.body) {
      if (node.type === 'FunctionDeclaration') {
        continue
      }
      const signal = nodeIterator.traverse(node, { scope })
      if (Signal.isSignal(signal)) {
        return signal
      }
    }
  }

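Note that this handler contains two for...of loops. The first takes care of hoisting: it registers the block's var declarations and function declarations ahead of time. The second executes the remaining statements and watches for keyword signals, such as break and continue inside a loop body or return inside a function body.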
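
The first pass mirrors what JS itself does. The plain-JS sketch below, with made-up names demo, helper, and counter, shows the native hoisting behaviour that the handler is imitating.

```javascript
function demo () {
  const seen = []

  // The function declaration is usable before the line that defines it.
  seen.push(typeof helper)
  // The var binding already exists, but its value has not been assigned yet.
  seen.push(typeof counter)

  var counter = 1
  function helper () {}

  seen.push(typeof counter)
  return seen
}

const hoisting = demo()
console.log(hoisting.join(',')) // function,undefined,number
```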

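7. Function declaration handler: `FunctionDeclaration()`

Declares a variable in the scope with the same name as the function, whose value is the function being defined: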

  FunctionDeclaration (nodeIterator) {
    // Reuse the function expression handler to build the function itself
    const fn = nodeHandler.FunctionExpression(nodeIterator)
    nodeIterator.scope.varDeclare(nodeIterator.node.id.name, fn)
    return fn
  }

8. Function expression handler: `FunctionExpression()`

Used to define a function:

  FunctionExpression (nodeIterator) {
    const node = nodeIterator.node
    /**
     * 1. A function needs its own function scope, which may inherit from the parent scope
     * 2. Register `this`, `arguments`, and the formal parameters in that scope
     * 3. Check for the return keyword
     * 4. Define the function's name and length
     */
    const fn = function () {
      const scope = nodeIterator.createScope('function')
      scope.constDeclare('this', this)
      scope.constDeclare('arguments', arguments)

      node.params.forEach((param, index) => {
        const name = param.name
        scope.varDeclare(name, arguments[index])
      })

      const signal = nodeIterator.traverse(node.body, { scope })
      if (Signal.isReturn(signal)) {
        return signal.value
      }
    }

    Object.defineProperties(fn, {
      name: { value: node.id ? node.id.name : '' },
      length: { value: node.params.length }
    })

    return fn
  }
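
The Object.defineProperties call matters because the wrapper function has no formal parameters of its own, so without it every interpreted function would report a length of 0. A small standalone sketch, using a made-up add function as the target:

```javascript
// A wrapper with no formal parameters, like `fn` in the handler above.
const fn = function () {
  return arguments.length
}
const before = [fn.name, fn.length]

// name and length are read-only but configurable, so defineProperties can replace them.
Object.defineProperties(fn, {
  name: { value: 'add' },
  length: { value: 2 }
})
const after = [fn.name, fn.length]

console.log(before) // [ 'fn', 0 ]
console.log(after)  // [ 'add', 2 ]
```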

9. this expression handler: `ThisExpression()`

This handler relies directly on JS's own behaviour: it simply reads the this keyword out of the scope.

  ThisExpression (nodeIterator) {
    const value = nodeIterator.scope.get('this')
    return value ? value.value : null
  }

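10. new expression handler: `NewExpression()`

Like the this expression, this one reuses JS's own features: after obtaining the function and its arguments, it uses bind to produce a constructor with the arguments pre-filled, instantiates it with new, and returns the result.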

  NewExpression (nodeIterator) {
    const func = nodeIterator.traverse(nodeIterator.node.callee)
    const args = nodeIterator.node.arguments.map(arg => nodeIterator.traverse(arg))
    return new (func.bind(null, ...args))
  }
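
The bind trick can be checked in plain JS. The sketch below uses a made-up Point constructor and compares the handler's approach with two equivalent spellings available in modern JavaScript.

```javascript
function Point (x, y) {
  this.x = x
  this.y = y
}

const args = [1, 2]

// bind pre-fills the arguments; the bound `this` (null) is ignored under `new`.
const viaBind = new (Point.bind(null, ...args))()
// Two equivalent spellings.
const viaSpread = new Point(...args)
const viaReflect = Reflect.construct(Point, args)

console.log(viaBind.x, viaBind.y)     // 1 2
console.log(viaBind instanceof Point) // true
console.log(viaSpread.x === viaReflect.x && viaSpread.y === viaReflect.y) // true
```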

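11. for loop handler: `ForStatement()`

The three parts of a for loop correspond to the node's init, test, and update properties. Run the node handler on each of these three properties and drop the results back into a native JS for loop.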

  ForStatement (nodeIterator) {
    const node = nodeIterator.node
    let scope = nodeIterator.scope
    if (node.init && node.init.type === 'VariableDeclaration' && node.init.kind !== 'var') {
      scope = nodeIterator.createScope('block')
    }

    for (
      node.init && nodeIterator.traverse(node.init, { scope });
      node.test ? nodeIterator.traverse(node.test, { scope }) : true;
      node.update && nodeIterator.traverse(node.update, { scope })
    ) {
      const signal = nodeIterator.traverse(node.body, { scope })
      
      if (Signal.isBreak(signal)) {
        break
      } else if (Signal.isContinue(signal)) {
        continue
      } else if (Signal.isReturn(signal)) {
        return signal
      }
    }
  }

for...in, while, and do...while loops are handled in much the same way, so they are not repeated here.
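
As a rough idea of what "much the same way" could look like, here is a sketch of a while handler in the style of ForStatement. It is not taken from the project: the handler body is an assumption, and the stub iterator treats nodes as plain functions so that the example runs on its own.

```javascript
class Signal {
  constructor (type, value) {
    this.type = type
    this.value = value
  }
  static isBreak (s) { return s instanceof Signal && s.type === 'break' }
  static isContinue (s) { return s instanceof Signal && s.type === 'continue' }
  static isReturn (s) { return s instanceof Signal && s.type === 'return' }
}

// Sketch of a while-loop handler: evaluate test, run body, react to signals.
function WhileStatement (nodeIterator) {
  const node = nodeIterator.node
  while (nodeIterator.traverse(node.test)) {
    const signal = nodeIterator.traverse(node.body)
    if (Signal.isBreak(signal)) {
      break
    } else if (Signal.isContinue(signal)) {
      continue
    } else if (Signal.isReturn(signal)) {
      return signal
    }
  }
}

// Stub iterator for the demo: "nodes" are plain functions that get called.
let i = 0
const visited = []
const stubIterator = {
  node: {
    test: () => i < 10,
    body: () => {
      visited.push(i++)
      if (i === 3) return new Signal('break')
    }
  },
  traverse: (node) => node()
}

WhileStatement(stubIterator)
console.log(visited.join(',')) // 0,1,2
```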

12. if statement handler: `IfStatement()`

Handles if statements, including if, if...else, and if...else if...else.

  IfStatement (nodeIterator) {
    if (nodeIterator.traverse(nodeIterator.node.test)) {
      return nodeIterator.traverse(nodeIterator.node.consequent)
    } else if (nodeIterator.node.alternate) {
      return nodeIterator.traverse(nodeIterator.node.alternate)
    }
  }

switch statements and ternary expressions are handled in a similar way.
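
For instance, a handler for the ternary operator might look like the sketch below. ESTree calls this node ConditionalExpression, with test, consequent, and alternate properties; the handler body and the makeIterator stub are assumptions made for this example.

```javascript
// Sketch of a handler for `test ? consequent : alternate`.
function ConditionalExpression (nodeIterator) {
  const { test, consequent, alternate } = nodeIterator.node
  return nodeIterator.traverse(test)
    ? nodeIterator.traverse(consequent)
    : nodeIterator.traverse(alternate)
}

// Stub iterator for the demo: it only understands Literal nodes.
function makeIterator (node) {
  return {
    node,
    traverse: (child) => child.value
  }
}

// AST for: 0 ? 'yes' : 'no'
const ternary = {
  type: 'ConditionalExpression',
  test: { type: 'Literal', value: 0 },
  consequent: { type: 'Literal', value: 'yes' },
  alternate: { type: 'Literal', value: 'no' }
}

const picked = ConditionalExpression(makeIterator(ternary))
console.log(picked) // no
```

Unlike IfStatement, the alternate branch is always present, so no existence check is needed.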

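---

The handlers above are some of the more important ones. ES5 has many more node types to handle; see the project source for the details.

VII. Defining the entry point

After all the steps above, the interpreter is able to handle ES5 code. What remains is to assemble these loose parts and define a convenient way for users to call it.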

const { Parser } = require('acorn')
const NodeIterator = require('./iterator')
const Scope = require('./scope')

class Canjs {
  constructor (code = '', extraDeclaration = {}) {
    this.code = code
    this.extraDeclaration = extraDeclaration
    this.ast = Parser.parse(code)
    this.nodeIterator = null
    this.init()
  }

  init () {
    // Create the global scope; its type is a function scope
    const globalScope = new Scope('function')
    // Define extra global variables, beyond the standard library, from the arguments
    Object.keys(this.extraDeclaration).forEach((key) => {
      globalScope.addDeclaration(key, this.extraDeclaration[key])
    })
    this.nodeIterator = new NodeIterator(null, globalScope)
  }

  run () {
    return this.nodeIterator.traverse(this.ast)
  }
}

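Here we define a base class named Canjs that accepts JS code as a string and, optionally, extra variables beyond the standard library. Calling run() returns the execution result.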
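
To show how the pieces fit together without depending on acorn or the project's modules, here is a heavily reduced end-to-end sketch. Everything in it is an assumption made for illustration: the AST is written by hand, the scope is a plain Map instead of the Scope class, and an injected fake console collects the output.

```javascript
// Minimal handlers; each receives the node and a Map acting as the scope.
const handlers = {
  Program (node, scope) {
    let last
    for (const statement of node.body) last = evaluate(statement, scope)
    return last
  },
  VariableDeclaration (node, scope) {
    for (const { id, init } of node.declarations) {
      scope.set(id.name, init ? evaluate(init, scope) : undefined)
    }
  },
  ExpressionStatement: (node, scope) => evaluate(node.expression, scope),
  Literal: (node) => node.value,
  Identifier (node, scope) {
    if (!scope.has(node.name)) throw new ReferenceError(`${node.name} is not defined`)
    return scope.get(node.name)
  },
  MemberExpression: (node, scope) => evaluate(node.object, scope)[node.property.name],
  CallExpression (node, scope) {
    const func = evaluate(node.callee, scope)
    const args = node.arguments.map(arg => evaluate(arg, scope))
    const receiver = node.callee.type === 'MemberExpression'
      ? evaluate(node.callee.object, scope)
      : undefined
    return func.apply(receiver, args)
  }
}

function evaluate (node, scope) {
  const handler = handlers[node.type]
  if (!handler) throw new Error(`Unknown node type "${node.type}".`)
  return handler(node, scope)
}

// Hand-written AST for: const a = 1; console.log(a)
const ast = {
  type: 'Program',
  body: [
    {
      type: 'VariableDeclaration',
      kind: 'const',
      declarations: [{
        type: 'VariableDeclarator',
        id: { type: 'Identifier', name: 'a' },
        init: { type: 'Literal', value: 1 }
      }]
    },
    {
      type: 'ExpressionStatement',
      expression: {
        type: 'CallExpression',
        callee: {
          type: 'MemberExpression',
          object: { type: 'Identifier', name: 'console' },
          property: { type: 'Identifier', name: 'log' }
        },
        arguments: [{ type: 'Identifier', name: 'a' }]
      }
    }
  ]
}

// Inject a fake console so that the result can be observed from outside.
const output = []
const globals = new Map([
  ['console', { log: (...args) => output.push(args.join(' ')) }]
])
evaluate(ast, globals)
console.log(output) // [ '1' ]
```

Passing in an object that the interpreted code writes to is also one simple way of getting results back out of the interpreter.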

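VIII. What next

At this point the JS interpreter is complete and runs ES5 code well (though there may be undiscovered bugs). In the current implementation, however, everything runs in something like a sandbox and cannot affect the outside world. There are two possible ways to get results out. The first is to pass in a global variable, apply the effects to it, and use it to carry the results out. The second is to have the interpreter support export syntax so that whatever an export statement declares is returned; interested readers can explore this themselves.

Finally, this JS interpreter is open source on my GitHub; you are welcome to come and discuss.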

https://github.com/jrainlau/c…

