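In-Depth Reading: Function Memoization

1 Introduction

Function memoization is an important concept. At its core, it trades space (cache storage) for time (skipping the computation).

For pure functions without side effects, memoizing in the right scenarios pays off. Let's follow the article at https://whatthefork.is/memoiz... to understand function memoization in depth.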

2 Overview

Suppose there is a function getChanceOfRain that fetches the weather, and every call takes 100ms to compute:

import { getChanceOfRain } from "magic-weather-calculator";
function showWeatherReport() {
  let result = getChanceOfRain(); // Let the magic happen
  console.log("The chance of rain tomorrow is:", result);
}

showWeatherReport(); // (!) Triggers the calculation
showWeatherReport(); // (!) Triggers the calculation
showWeatherReport(); // (!) Triggers the calculation

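This clearly wastes computing resources. Once the weather has been calculated, there is no need to calculate it again; later calls should reuse the cached result of the previous call, which saves a lot of computation. So we can write a memoizedGetChanceOfRain function that caches the result: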

import { getChanceOfRain } from "magic-weather-calculator";
let isCalculated = false;
let lastResult;
// We added this function!
function memoizedGetChanceOfRain() {
  if (isCalculated) {
    // No need to calculate it again.
    return lastResult;
  }
  // Gotta calculate it for the first time.
  let result = getChanceOfRain();
  // Remember it for the next time.
  lastResult = result;
  isCalculated = true;
  return result;
}
function showWeatherReport() {
  // Use the memoized function instead of the original function.
  let result = memoizedGetChanceOfRain();
  console.log("The chance of rain tomorrow is:", result);
}

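Each call checks the cache first; if nothing is cached, it calls the original function and records the result. Now, when we call it several times, every call except the first returns immediately from the cache: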

showWeatherReport(); // (!) Triggers the calculation
showWeatherReport(); // Uses the calculated result
showWeatherReport(); // Uses the calculated result
showWeatherReport(); // Uses the calculated result

However, this does not work once the function takes arguments, because the cache ignores them:

function showWeatherReport(city) {
  // The naive memoized function above ignores the city argument.
  let result = memoizedGetChanceOfRain(city); // Pass the city
  console.log("The chance of rain tomorrow is:", result);
}

showWeatherReport("Tokyo"); // (!) Triggers the calculation
showWeatherReport("London"); // Uses the calculated answer (wrong: it is Tokyo's result)

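Because there can be many possible arguments, there are three approaches:

1. Cache only the last result

Caching only the last result uses the least storage and never returns a wrong answer, but the cache is invalidated as soon as the argument changes: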

import { getChanceOfRain } from "magic-weather-calculator";
let lastCity;
let lastResult;
function memoizedGetChanceOfRain(city) {
  if (city === lastCity) {
    // Notice this check!
    // Same parameters, so we can reuse the last result.
    return lastResult;
  }
  // Either we're called for the first time,
  // or we're called with different parameters.
  // We have to perform the calculation.
  let result = getChanceOfRain(city);
  // Remember both the parameters and the result.
  lastCity = city;
  lastResult = result;
  return result;
}
function showWeatherReport(city) {
  // Pass the parameters to the memoized function.
  let result = memoizedGetChanceOfRain(city);
  console.log("The chance of rain tomorrow is:", result);
}

showWeatherReport("Tokyo"); // (!) Triggers the calculation
showWeatherReport("Tokyo"); // Uses the calculated result
showWeatherReport("Tokyo"); // Uses the calculated result
showWeatherReport("London"); // (!) Triggers the calculation
showWeatherReport("London"); // Uses the calculated result

In the extreme case, this is the same as having no cache at all:

showWeatherReport("Tokyo"); // (!) Triggers the calculation
showWeatherReport("London"); // (!) Triggers the calculation
showWeatherReport("Tokyo"); // (!) Triggers the calculation
showWeatherReport("London"); // (!) Triggers the calculation
showWeatherReport("Tokyo"); // (!) Triggers the calculation

2. Cache all results

The second approach is to cache every result, simply storing them in a Map:

// Remember the last result *for every city*.
let resultsPerCity = new Map();
function memoizedGetChanceOfRain(city) {
  if (resultsPerCity.has(city)) {
    // We already have a result for this city.
    return resultsPerCity.get(city);
  }
  // We're called for the first time for this city.
  let result = getChanceOfRain(city);
  // Remember the result for this city.
  resultsPerCity.set(city, result);
  return result;
}
function showWeatherReport(city) {
  // Pass the parameters to the memoized function.
  let result = memoizedGetChanceOfRain(city);
  console.log("The chance of rain tomorrow is:", result);
}

showWeatherReport("Tokyo"); // (!) Triggers the calculation
showWeatherReport("London"); // (!) Triggers the calculation
showWeatherReport("Tokyo"); // Uses the calculated result
showWeatherReport("London"); // Uses the calculated result
showWeatherReport("Tokyo"); // Uses the calculated result
showWeatherReport("Paris"); // (!) Triggers the calculation

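The downside is unbounded memory growth: when there are too many possible arguments, memory usage keeps climbing, and in the worst case it hits browser limits or crashes the page.

3. Other caching strategies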

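Between caching only the last item and caching everything there are other options. For example, LRU (least recently used) keeps only the most recently used entries and evicts the least recently used one when the cache is full. Another option, to make garbage collection easier, is to use WeakMap instead of Map (this only works when the keys are objects).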
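As a rough illustration of the LRU idea, the sketch below bounds the cache size with a Map, relying on the fact that a Map iterates its keys in insertion order. The names memoizeLRU, maxSize, and getChanceOfRainStub are hypothetical and do not come from the original article; this is one possible approach rather than a definitive implementation.

```javascript
// Hypothetical helper: memoize a single-argument function with a bounded LRU cache.
function memoizeLRU(fn, maxSize) {
  // A Map iterates its keys in insertion order, so the first key is the
  // least recently used one as long as we re-insert keys on every hit.
  const cache = new Map();
  return function memoizedFn(arg) {
    if (cache.has(arg)) {
      const value = cache.get(arg);
      // Move the entry to the end to mark it as most recently used.
      cache.delete(arg);
      cache.set(arg, value);
      return value;
    }
    const result = fn(arg);
    cache.set(arg, result);
    if (cache.size > maxSize) {
      // Evict the least recently used entry (the first key in the Map).
      const oldestKey = cache.keys().next().value;
      cache.delete(oldestKey);
    }
    return result;
  };
}

// Stub standing in for an expensive calculation; it counts real invocations.
let lruCalls = 0;
function getChanceOfRainStub(city) {
  lruCalls += 1;
  return city.length / 10; // made-up number
}

const lruGetChanceOfRain = memoizeLRU(getChanceOfRainStub, 2);
lruGetChanceOfRain("Tokyo"); // calculates
lruGetChanceOfRain("London"); // calculates
lruGetChanceOfRain("Tokyo"); // cache hit, Tokyo becomes most recent
lruGetChanceOfRain("Paris"); // calculates, evicts London
lruGetChanceOfRain("Tokyo"); // still cached
lruGetChanceOfRain("London"); // calculates again, evicts Paris
console.log(lruCalls); // 4
```

With a capacity of 2, alternating between two cities stays fully cached, while a third city only pushes out whichever entry was touched longest ago.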

Finally, the article points out a pitfall of memoization: the function must be pure. Consider the following case:

// Inside the magical npm package
function getChanceOfRain() {
  // Show the input box!
  let city = prompt("Where do you live?");
  // ... calculation ...
}
// Our code
function showWeatherReport() {
  let result = getChanceOfRain();
  console.log("The chance of rain tomorrow is:", result);
}

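getChanceOfRain asks the user for input on every call and returns a result based on it, so caching produces wrong results. Having part of the function's input come from the user is a side effect, and we cannot memoize functions that have side effects.

This is sometimes also the point of splitting a function: extract the side-effect-free part of an impure function, and that part can then be memoized on its own: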

// If this function only calculates things,
// we would call it "pure".
// It is safe to memoize this function.
function getChanceOfRain(city) {
  // ... calculation ...
}
// This function is "impure" because
// it shows a prompt to the user.
function showWeatherReport() {
  // The prompt is now here
  let city = prompt("Where do you live?");
  let result = getChanceOfRain(city);
  console.log("The chance of rain tomorrow is:", result);
}
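To make the split concrete, here is a minimal sketch in which only the pure part is memoized while the impure part stays outside the cache. Since prompt() exists only in browsers, the sketch injects a hypothetical askCity function instead; the stub body, the made-up probabilities, and the pureCalls counter are assumptions added purely so the behaviour can be observed.

```javascript
// Stub for the pure calculation; the counter exists only to observe calls.
let pureCalls = 0;
function getChanceOfRain(city) {
  pureCalls += 1;
  return city === "London" ? 0.8 : 0.3; // made-up numbers
}

// Memoize only the pure part, keyed by city.
const resultsPerCity = new Map();
function memoizedGetChanceOfRain(city) {
  if (!resultsPerCity.has(city)) {
    resultsPerCity.set(city, getChanceOfRain(city));
  }
  return resultsPerCity.get(city);
}

// The impure part stays outside the cache. `askCity` is a hypothetical
// stand-in for prompt(), injected so the sketch also runs outside a browser.
function showWeatherReport(askCity) {
  const city = askCity();
  const result = memoizedGetChanceOfRain(city);
  console.log("The chance of rain tomorrow is:", result);
  return result;
}

const answers = ["London", "Tokyo", "London"];
const reports = answers.map((answer) => showWeatherReport(() => answer));
console.log(reports, pureCalls); // [ 0.8, 0.3, 0.8 ] 2
```

The user is still asked every time, but the calculation runs only once per distinct city.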

Finally, we can abstract the memoization logic into a higher-order function:

function memoize(fn) {
  let isCalculated = false;
  let lastResult;
  return function memoizedFn() {
    // Return the generated function!
    if (isCalculated) {
      return lastResult;
    }
    let result = fn();
    lastResult = result;
    isCalculated = true;
    return result;
  };
}

This makes it easy to create new memoized functions:

let memoizedGetChanceOfRain = memoize(getChanceOfRain);
let memoizedGetNextEarthquake = memoize(getNextEarthquake);
let memoizedGetCosmicRaysProbability = memoize(getCosmicRaysProbability);

isCalculated and lastResult are both stored in the closure created by memoize, so they cannot be accessed from outside.
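The following sketch checks that behaviour: each call to memoize creates its own closure, so the memoized functions do not share state, and the flags are not exposed as properties. The stub functions and the counters rainCalls and quakeCalls are hypothetical stand-ins for the real calculations.

```javascript
// Same idea as the memoize above: cache the result of a zero-argument function.
function memoize(fn) {
  let isCalculated = false;
  let lastResult;
  return function memoizedFn() {
    if (isCalculated) {
      return lastResult;
    }
    lastResult = fn();
    isCalculated = true;
    return lastResult;
  };
}

// Hypothetical stubs that count how often they really run.
let rainCalls = 0;
let quakeCalls = 0;
const getChanceOfRain = () => {
  rainCalls += 1;
  return 0.4;
};
const getNextEarthquake = () => {
  quakeCalls += 1;
  return "unknown";
};

const memoizedGetChanceOfRain = memoize(getChanceOfRain);
const memoizedGetNextEarthquake = memoize(getNextEarthquake);

memoizedGetChanceOfRain(); // calculates
memoizedGetChanceOfRain(); // served from its own closure
memoizedGetNextEarthquake(); // calculates, unaffected by the rain cache

console.log(rainCalls, quakeCalls); // 1 1
console.log(memoizedGetChanceOfRain.isCalculated); // undefined
```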

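3 In-Depth Reading

Implementing memoization with a general-purpose higher-order function

The examples in the original article are fairly simple and do not cover functions with multiple arguments. Let's analyze the source code of Lodash's memoize function: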

function memoize(func, resolver) {
  if (
    typeof func != "function" ||
    (resolver != null && typeof resolver != "function")
  ) {
    throw new TypeError(FUNC_ERROR_TEXT);
  }
  var memoized = function () {
    var args = arguments,
      key = resolver ? resolver.apply(this, args) : args[0],
      cache = memoized.cache;

    if (cache.has(key)) {
      return cache.get(key);
    }
    var result = func.apply(this, args);
    memoized.cache = cache.set(key, result) || cache;
    return result;
  };
  memoized.cache = new (memoize.Cache || MapCache)();
  return memoized;
}

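The original article mentions that caching strategies come in many forms. Lodash reduces the strategy question to the cache key and leaves it to the user to manage. Look at this line: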

key = resolver ? resolver.apply(this, args) : args[0];

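In other words, the cache key defaults to the first argument the function is called with, and a resolver can instead take the arguments and turn them into a new cache key.

The arguments are also forwarded when the original function is invoked: func.apply(this, args).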
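To see what the default key means for a multi-argument function, here is a simplified stand-in for the logic quoted above. It is not the Lodash source: simpleMemoize, add, and the JSON.stringify-based resolver are assumptions made for illustration, and such a resolver is only reasonable when the arguments are JSON-serializable.

```javascript
// Simplified stand-in for the Lodash logic quoted above (not the real source).
function simpleMemoize(func, resolver) {
  const cache = new Map();
  return function memoized(...args) {
    // Default key: the first argument only, as in the quoted code.
    const key = resolver ? resolver.apply(this, args) : args[0];
    if (cache.has(key)) {
      return cache.get(key);
    }
    const result = func.apply(this, args);
    cache.set(key, result);
    return result;
  };
}

const add = (a, b) => a + b;

// Without a resolver only `a` is used as the key, so (1, 3) hits the
// entry that was stored for (1, 2).
const naiveAdd = simpleMemoize(add);
const naiveFirst = naiveAdd(1, 2);
const naiveSecond = naiveAdd(1, 3);

// A hypothetical resolver that builds the key from all arguments.
const keyedAdd = simpleMemoize(add, (...args) => JSON.stringify(args));
const keyedFirst = keyedAdd(1, 2);
const keyedSecond = keyedAdd(1, 3);

console.log(naiveFirst, naiveSecond); // 3 3
console.log(keyedFirst, keyedSecond); // 3 4
```

The first pair shows why a function whose result depends on more than its first argument needs a resolver.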

Finally, the cache is no longer hard-coded to a Map either. Users can replace the cache constructor through _.memoize.Cache, for example with WeakMap:

_.memoize.Cache = WeakMap;
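What a WeakMap-backed cache implies can be sketched without Lodash. In the example below, memoizeByObject and sumValues are hypothetical names: entries are keyed by object reference, so an equal-looking new object is a cache miss, and primitive keys such as strings are rejected.

```javascript
// Hypothetical helper: memoize a function whose single argument is an object.
function memoizeByObject(fn) {
  // Entries can be garbage-collected once the key object is unreachable.
  const cache = new WeakMap();
  return function memoizedFn(obj) {
    if (cache.has(obj)) {
      return cache.get(obj);
    }
    const result = fn(obj);
    cache.set(obj, result);
    return result;
  };
}

let sumCalls = 0;
const sumValues = (obj) => {
  sumCalls += 1;
  return Object.values(obj).reduce((total, n) => total + n, 0);
};

const memoizedSum = memoizeByObject(sumValues);
const data = { a: 1, b: 2 };
const firstSum = memoizedSum(data); // calculates
const secondSum = memoizedSum(data); // same reference: cache hit
const thirdSum = memoizedSum({ a: 1, b: 2 }); // new reference: calculates again

console.log(firstSum, secondSum, thirdSum, sumCalls); // 3 3 3 2

// Primitive keys such as strings are rejected by WeakMap.
let rejected = false;
try {
  new WeakMap().set("Tokyo", 1);
} catch (error) {
  rejected = error instanceof TypeError;
}
console.log(rejected); // true
```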

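When memoization is not a good fit

Memoization is not a good fit in these two cases:

Functions that are rarely called.
Functions that are already fast.

A rarely called function does not need a cache to run faster, and the cache would just occupy memory for a long time. As for functions that are already fast, most simple computations fall into this group: caching brings no noticeable speedup, and if the results are large, the cache only consumes storage.

This matters especially when object references are involved, as in the following example: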

function addName(obj, name) {
  // Return a shallow copy of obj with the name key added.
  return {
    ...obj,
    name,
  };
}

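Adding a key to obj is very fast by itself, but adding a cache brings two drawbacks:

If obj is very large, the closure keeps the complete obj structure alive, which can double the memory usage.
If obj is modified in a mutable way, an ordinary memoized function still returns the old result (because the object reference has not changed), which is wrong.

Forcing a deep comparison of the objects would avoid these edge cases, but performance would drop sharply.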
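The stale-result problem can be reproduced with a small sketch. memoizeByReference and describeUser are hypothetical names used only for this demonstration; the cache is keyed by object reference, which is what a plain Map does for object keys.

```javascript
// Hypothetical reference-keyed memoize, for demonstration only.
function memoizeByReference(fn) {
  const cache = new Map();
  return (obj) => {
    if (!cache.has(obj)) {
      cache.set(obj, fn(obj));
    }
    return cache.get(obj);
  };
}

const describeUser = (user) => `${user.name} (${user.age})`;
const memoizedDescribe = memoizeByReference(describeUser);

const user = { name: "Ann", age: 30 };
const before = memoizedDescribe(user);

user.age = 31; // in-place mutation: the reference stays the same
const afterMutation = memoizedDescribe(user);

const afterCopy = memoizedDescribe({ ...user }); // new reference: recalculated

console.log(before); // Ann (30)
console.log(afterMutation); // Ann (30), stale
console.log(afterCopy); // Ann (31)
```

Treating the data as immutable and creating a new object for every change keeps a reference-keyed cache correct, at the cost of one cache entry per version of the object.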

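4 Summary

Function memoization is very useful, but it does not suit every scenario, so don't go to the extreme of memoizing every function. Reserve it for functions that are expensive to compute, likely to be called repeatedly with the same arguments, and pure.

Discussion thread: 精讀《函式快取》· Issue #261 · dt-fe/weekly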

Copyright notice: free to share, non-commercial, no derivatives, attribution required (Creative Commons 3.0 license)
