Industry | Google open-sources deeplearn.js: hardware-accelerated machine learning on the web

Published by 機器之心 (Jiqizhixin) on 2017-08-08

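deeplearn.js is an open-source, WebGL-accelerated JavaScript library for machine intelligence. It provides efficient machine learning building blocks that let us train neural networks in the browser or run pre-trained models in inference mode. It offers an API for constructing differentiable data flow graphs, as well as a set of math functions that can be used directly.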

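The code examples in this document are written in TypeScript. For vanilla JavaScript, you may need to remove TypeScript-specific syntax such as type annotations; const and let are standard JavaScript since ES2015 and can stay.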

Core concepts

NDArrays

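The central unit of data in deeplearn.js is the NDArray. An NDArray consists of a set of floating point values shaped into an array with an arbitrary number of dimensions. Each NDArray has a shape attribute that defines its shape. The library provides sugar subclasses for low-rank NDArrays: Scalar, Array1D, Array2D, Array3D and Array4D.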

Example usage with a 2x3 matrix:

const shape = [2, 3];  // 2 rows, 3 columns
const a = Array2D.new(shape, [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]);
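
The comment "2 rows, 3 columns" together with the six values suggests that the flat value list is read row by row. As a minimal sketch under that assumption (row-major layout is not confirmed by the article, and valueAt is a hypothetical helper, not a deeplearn.js function), the mapping from a (row, column) pair to a position in the flat list could look like this:

```typescript
// Sketch only: row-major layout is an assumption, and valueAt is a
// hypothetical helper, not part of deeplearn.js.
type Shape2D = [number, number];

// Returns the value at (row, col) from a flat, row-major list of values.
function valueAt(values: number[], shape: Shape2D, row: number, col: number): number {
  const [rows, cols] = shape;
  if (values.length !== rows * cols) {
    throw new Error(`expected ${rows * cols} values, got ${values.length}`);
  }
  if (row < 0 || row >= rows || col < 0 || col >= cols) {
    throw new Error(`index (${row}, ${col}) is out of bounds`);
  }
  return values[row * cols + col];
}

const exampleShape: Shape2D = [2, 3];
const exampleValues = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0];
const firstRowLast = valueAt(exampleValues, exampleShape, 0, 2);    // 3
const secondRowFirst = valueAt(exampleValues, exampleShape, 1, 0);  // 10
```

Under this reading, the first three values form the first row and the last three form the second.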

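An NDArray can store its data either on the GPU as a WebGLTexture, where each pixel stores one floating point value, or on the CPU as a vanilla JavaScript TypedArray. Most of the time the user should not need to think about storage, because it is an implementation detail.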

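If the NDArray data is stored on the CPU, it is uploaded to a texture automatically the first time a GPU math operation is called. If you call NDArray.getValues() on a GPU-resident NDArray, the library downloads the texture to the CPU and then deletes the texture.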

NDArrayMath

The library provides an NDArrayMath base class, which defines a set of math functions that operate on NDArrays.

NDArrayMathGPU

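When the NDArrayMathGPU implementation is used, these math operations enqueue shader programs to be executed on the GPU. Unlike in NDArrayMathCPU, the operations do not block, but the user can synchronize the CPU with the GPU by calling get() or getValues() on an NDArray, as described below.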

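These shaders read from and write to the WebGLTextures owned by NDArrays. When math operations are chained, the textures can stay in GPU memory (they are not downloaded to the CPU between operations), which is critical for performance.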

As an example, take the mean squared difference between two matrices (more details about math.scope, keep and track follow below):

const math = new NDArrayMathGPU();

math.scope((keep, track) => {
  const a = track(Array2D.new([2, 2], [1.0, 2.0, 3.0, 4.0]));
  const b = track(Array2D.new([2, 2], [0.0, 2.0, 4.0, 6.0]));

  // Non-blocking math calls.
  const diff = math.sub(a, b);
  const squaredDiff = math.elementWiseMul(diff, diff);
  const sum = math.sum(squaredDiff);
  // Track the directly constructed Scalar so it is cleaned up with the scope.
  const size = track(Scalar.new(a.size));
  const average = math.divide(sum, size);

  // Blocking call to actually read the values from average. Waits until the
  // GPU has finished executing the operations before returning values.
  console.log(average.get());  // average is a Scalar so we use .get()
});

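Note: NDArray.get() and NDArray.getValues() are blocking calls. After chaining a series of math functions there is no need to register callbacks; simply call getValues() to synchronize the CPU and the GPU.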
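
To make the arithmetic of the example easy to check by hand, here is a plain-array reference for the same computation. It is only a sketch: meanSquaredDifference is a hypothetical helper that runs on ordinary JavaScript arrays and does not use deeplearn.js or the GPU.

```typescript
// Plain-array reference for the mean squared difference; no GPU involved.
// meanSquaredDifference is a hypothetical helper, not a deeplearn.js function.
function meanSquaredDifference(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error('inputs must be non-empty and have the same length');
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return sum / a.length;
}

// Same values as the two 2x2 matrices in the example, flattened.
const msd = meanSquaredDifference([1.0, 2.0, 3.0, 4.0], [0.0, 2.0, 4.0, 6.0]);
// diff = [1, 0, -1, -2], squared = [1, 0, 1, 4], sum = 6, so msd is 1.5
```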

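Tip: avoid calling get() or getValues() between GPU math operations unless you are debugging. Doing so forces a texture download, and subsequent NDArrayMathGPU calls then have to re-upload the data to a new texture.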
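
The cost of reading values between operations can be illustrated with a toy model that simply counts transfers. Everything below is an assumption made for illustration: ToyArray and its counters are hypothetical and only mimic the upload-on-first-use and download-then-delete behavior described in the text; they are not deeplearn.js classes.

```typescript
// Toy model of an array that lives either on the "CPU" or on the "GPU".
// ToyArray is hypothetical; it only counts simulated transfers.
class ToyArray {
  static uploads = 0;
  static downloads = 0;
  private onGpu = false;

  constructor(private values: number[]) {}

  // Called by every simulated GPU operation: upload lazily on first use.
  ensureOnGpu(): void {
    if (!this.onGpu) {
      ToyArray.uploads++;
      this.onGpu = true;
    }
  }

  // Simulated blocking read: download the texture, then drop it.
  getValues(): number[] {
    if (this.onGpu) {
      ToyArray.downloads++;
      this.onGpu = false;
    }
    return this.values.slice();
  }

  // Simulated GPU operation: the result stays GPU-resident.
  square(): ToyArray {
    this.ensureOnGpu();
    const result = new ToyArray(this.values.map((v) => v * v));
    result.onGpu = true;
    return result;
  }
}

// Chained operations: one upload at the start, one download at the end.
const chained = new ToyArray([1, 2, 3]).square().square().getValues();
const chainedTransfers = ToyArray.uploads + ToyArray.downloads;  // 2

// Reading in between forces an extra download and a re-upload.
ToyArray.uploads = 0;
ToyArray.downloads = 0;
const intermediate = new ToyArray([1, 2, 3]).square();
intermediate.getValues();
const interrupted = intermediate.square().getValues();
const interruptedTransfers = ToyArray.uploads + ToyArray.downloads;  // 4
```

Both variants compute the same values; only the number of simulated transfers differs.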

math.scope()

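When we perform math operations, we need to wrap them in a math.scope() function closure, as in the example above. The results of math operations inside the scope are disposed at the end of the scope, unless they are the value returned from the scope.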

Two functions are passed to the function closure: keep() and track().

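keep() ensures that the NDArray passed to it is kept and is not cleaned up automatically when the scope ends.
track() tracks any NDArray that we construct directly inside the scope. When the scope ends, every manually tracked NDArray is cleaned up. Results of math.method() functions, like the results of many other core library functions, are cleaned up automatically, so we do not have to track them manually.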

const math = new NDArrayMathGPU();

let output;

// You must have an outer scope, but don't worry, the library will throw an
// error if you don't have one.
math.scope((keep, track) => {
  // CORRECT: By default, math won't track NDArrays that are constructed
  // directly. You can call track() on the NDArray for it to get tracked and
  // cleaned up at the end of the scope.
  const a = track(Scalar.new(2));

  // INCORRECT: This is a texture leak!!
  // math doesn't know about b, so it can't track it. When the scope ends, the
  // GPU-resident NDArray will not get cleaned up, even though b goes out of
  // scope. Make sure you call track() on NDArrays you create.
  const b = Scalar.new(2);

  // CORRECT: By default, math tracks all outputs of math functions.
  const c = math.neg(math.exp(a));

  // CORRECT: d is tracked by the parent scope.
  const d = math.scope((keep, track) => {
    // CORRECT: e will get cleaned up when this inner scope ends.
    const e = track(Scalar.new(3));

    // CORRECT: The result of this math function is tracked. Since it is the
    // return value of this scope, it will not get cleaned up with this inner
    // scope. However, the result will be tracked automatically in the parent
    // scope.
    return math.elementWiseMul(e, e);
  });

  // CORRECT, BUT BE CAREFUL: The output of math.tanh will be tracked
  // automatically, however we can call keep() on it so that it will be kept
  // when the scope ends. That means if you are not careful about calling
  // output.dispose() some time later, you might introduce a texture memory
  // leak. A better way to do this would be to return this value as a return
  // value of a scope so that it gets tracked in a parent scope.
  output = keep(math.tanh(d));
});
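
The cleanup rules in the example above can be mimicked with a small, library-free sketch. ToyResource, ToyScopes and op() are hypothetical names invented for this illustration; this is one possible way to implement such rules and is not the deeplearn.js implementation.

```typescript
// Hypothetical stand-in for a GPU-backed array that must be freed by hand.
class ToyResource {
  disposed = false;
  constructor(readonly label: string) {}
  dispose(): void {
    this.disposed = true;
  }
}

type Marker = (r: ToyResource) => ToyResource;
interface Frame {
  tracked: ToyResource[];
  kept: ToyResource[];
}

// Toy scope manager. This is NOT the deeplearn.js implementation.
class ToyScopes {
  private frames: Frame[] = [];

  scope(fn: (keep: Marker, track: Marker) => ToyResource | void): ToyResource | void {
    const frame: Frame = {tracked: [], kept: []};
    this.frames.push(frame);
    const keep: Marker = (r) => {
      frame.kept.push(r);
      return r;
    };
    const track: Marker = (r) => {
      frame.tracked.push(r);
      return r;
    };
    const result = fn(keep, track);
    this.frames.pop();
    const parent = this.frames[this.frames.length - 1];
    frame.tracked.forEach((r) => {
      if (frame.kept.indexOf(r) >= 0) {
        return;  // kept: the caller must dispose it later
      }
      if (r === result) {
        if (parent) {
          parent.tracked.push(r);  // returned: now owned by the parent scope
        }
        return;
      }
      r.dispose();
    });
    return result;
  }

  // Stand-in for a math function: its output is tracked automatically.
  op(label: string): ToyResource {
    const current = this.frames[this.frames.length - 1];
    if (!current) {
      throw new Error('op() must be called inside a scope');
    }
    const r = new ToyResource(label);
    current.tracked.push(r);
    return r;
  }
}

const scopes = new ToyScopes();
const seen: Record<string, ToyResource> = {};

scopes.scope((keep, track) => {
  seen['a'] = track(new ToyResource('a'));  // tracked by hand
  seen['b'] = new ToyResource('b');         // never tracked: the leak case
  seen['c'] = scopes.op('c');               // op output, tracked automatically
  const d = scopes.scope(() => scopes.op('d'));  // returned from the inner scope
  seen['d'] = d as ToyResource;
  seen['output'] = keep(scopes.op('output'));    // kept past the end of the scope
});
```

After the outer scope ends, a, c and d have been disposed, while b (never tracked) and output (kept) have not, which mirrors the CORRECT and INCORRECT cases in the comments above.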

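Technical detail: when a WebGL texture goes out of scope in JavaScript, the browser's garbage collector does not clean it up automatically. This means that when we are done with a GPU-resident NDArray, it must be disposed manually. If we forget to call ndarray.dispose() when we are done with an NDArray, we leak texture memory, which can cause serious performance problems. If we use math.scope(), any NDArray created by a math.method() call, and any NDArray returned as the result of a scope, is cleaned up automatically.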

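If we want to manage memory manually instead of using math.scope(), we can construct the NDArrayMath object with safeMode = false. This is not recommended, but it is useful for NDArrayMathCPU, because CPU-resident memory is cleaned up automatically by the JavaScript garbage collector.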

NDArrayMathCPU

With the CPU implementation, these math operations are blocking and are executed immediately on the underlying TypedArrays with vanilla JavaScript.

Training

Differentiable data flow graphs in deeplearn.js use a deferred execution model, just like TensorFlow. Users construct a graph, then train or run inference on it, supplying input NDArrays through FeedEntrys.

Note: NDArrayMath and NDArrays are sufficient for inference. A graph is only needed if we want to train.

Graphs and Tensors

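The Graph object is the core class for constructing data flow graphs. A Graph object does not actually hold NDArray data; it only holds the connectivity between operations.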

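The Graph class exposes differentiable operations as top-level member functions. When we call a graph method to add an operation, we get back a Tensor object, which holds only connectivity and shape information.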

Here is an example of a graph that multiplies an input by a variable:

const g = new Graph();

// Placeholders are input containers. This is the container for where we will
// feed an input NDArray when we execute the graph.
const inputShape = [3];
const inputTensor = g.placeholder('input', inputShape);

const labelShape = [1];
const labelTensor = g.placeholder('label', labelShape);

// Variables are containers that hold a value that can be updated from training.
// Here we initialize the multiplier variable randomly.
const multiplier = g.variable('multiplier', Array2D.randNormal([1, 3]));

// Top level graph methods take Tensors and return Tensors.
const outputTensor = g.matmul(multiplier, inputTensor);
const costTensor = g.meanSquaredCost(outputTensor, labelTensor);

// Tensors, like NDArrays, have a shape attribute.
console.log(outputTensor.shape);
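
To illustrate what "deferred execution" and "connectivity and shape information only" mean, the following library-free sketch builds a tiny graph whose nodes carry no values and computes a result only when asked. ToyTensor, ToyGraph and evaluate() are hypothetical names invented for this illustration, and the variable is initialized with fixed values instead of random ones so that the result can be checked by hand.

```typescript
// Toy deferred-execution graph. ToyTensor, ToyGraph and evaluate() are
// hypothetical names used for illustration; they are not deeplearn.js APIs.
interface ToyTensor {
  id: number;
  shape: number[];
  kind: 'placeholder' | 'variable' | 'matmul';
  inputs: ToyTensor[];
}

class ToyGraph {
  private nextId = 0;
  // Variable values live beside the graph; tensors hold shape and links only.
  readonly variableValues: {[id: number]: number[]} = {};

  placeholder(shape: number[]): ToyTensor {
    return {id: this.nextId++, shape, kind: 'placeholder', inputs: []};
  }

  variable(shape: number[], initial: number[]): ToyTensor {
    const t: ToyTensor = {id: this.nextId++, shape, kind: 'variable', inputs: []};
    this.variableValues[t.id] = initial;
    return t;
  }

  // Matrix [rows, cols] times vector [cols] gives a vector [rows].
  matmul(matrix: ToyTensor, vector: ToyTensor): ToyTensor {
    if (matrix.shape.length !== 2 || vector.shape.length !== 1 ||
        matrix.shape[1] !== vector.shape[0]) {
      throw new Error('shape mismatch');
    }
    return {
      id: this.nextId++,
      shape: [matrix.shape[0]],
      kind: 'matmul',
      inputs: [matrix, vector],
    };
  }
}

// Nothing is computed until evaluate() is called with placeholder values.
function evaluate(g: ToyGraph, t: ToyTensor, feeds: {[id: number]: number[]}): number[] {
  if (t.kind === 'placeholder') {
    const fed = feeds[t.id];
    if (!fed) {
      throw new Error(`no value fed for placeholder ${t.id}`);
    }
    return fed;
  }
  if (t.kind === 'variable') {
    return g.variableValues[t.id];
  }
  const [matrix, vector] = t.inputs;
  const m = evaluate(g, matrix, feeds);
  const v = evaluate(g, vector, feeds);
  const cols = vector.shape[0];
  const out: number[] = [];
  for (let r = 0; r < matrix.shape[0]; r++) {
    let acc = 0;
    for (let c = 0; c < cols; c++) {
      acc += m[r * cols + c] * v[c];
    }
    out.push(acc);
  }
  return out;
}

const toyGraph = new ToyGraph();
const toyInput = toyGraph.placeholder([3]);
const toyMultiplier = toyGraph.variable([1, 3], [1, 2, 3]);  // fixed, not random
const toyOutput = toyGraph.matmul(toyMultiplier, toyInput);  // shape [1], no values yet
const toyResult = evaluate(toyGraph, toyOutput, {[toyInput.id]: [1, 2, 3]});
// 1*1 + 2*2 + 3*3 = 14
```

Building toyOutput only records that a [1, 3] variable is multiplied by a [3] placeholder; the number 14 appears only once a value is fed for the placeholder.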

Session and FeedEntry

A Session object is what drives the execution of a graph. A FeedEntry (similar to TensorFlow's feed_dict) supplies the data for a run by feeding a value to a Tensor from a given NDArray.

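A quick note on batching: deeplearn.js does not yet implement batching as an outer dimension of operations. This means that every top-level graph operation, like every math function, operates on a single example. Batching still matters, because weight updates should be based on the average of the gradients over a batch. deeplearn.js simulates batching by using an InputProvider in the training FeedEntrys to supply inputs, rather than using NDArrays directly. The InputProvider is called once for each item in a batch. The library also provides InCPUMemoryShuffledInputProviderBuilder, used in the training code below, which shuffles a set of inputs while keeping them in sync with each other.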
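
The idea of simulating a batch by calling a provider once per item and averaging the per-example gradients can be sketched without the library. The names below (ToyProvider, cyclingProvider, exampleGradient, trainBatch) are hypothetical, the provider interface is an assumption rather than the library's InputProvider interface, and the model is a simple dot product with a squared-error cost chosen only to keep the arithmetic short.

```typescript
// Hypothetical provider: hands out one example per call, cycling forever.
interface ToyProvider {
  next(): number[];
}

function cyclingProvider(items: number[][]): ToyProvider {
  let i = 0;
  return {
    next: () => {
      const item = items[i % items.length];
      i++;
      return item;
    },
  };
}

// Gradient of (w . x - label)^2 with respect to w, for a single example.
function exampleGradient(w: number[], x: number[], label: number): number[] {
  const prediction = w.reduce((acc, wi, k) => acc + wi * x[k], 0);
  const error = prediction - label;
  return x.map((xk) => 2 * error * xk);
}

// One simulated batch: call the providers once per item, average the
// per-example gradients, then apply a single SGD update.
function trainBatch(
    w: number[], xs: ToyProvider, labels: ToyProvider, batchSize: number,
    learningRate: number): number[] {
  const meanGrad = w.map(() => 0);
  for (let n = 0; n < batchSize; n++) {
    const grad = exampleGradient(w, xs.next(), labels.next()[0]);
    grad.forEach((gk, k) => {
      meanGrad[k] += gk / batchSize;
    });
  }
  return w.map((wk, k) => wk - learningRate * meanGrad[k]);
}

const toyXs = cyclingProvider([[1, 0, 0], [0, 1, 0]]);
const toyLabels = cyclingProvider([[1], [2]]);
const updatedWeights = trainBatch([0, 0, 0], toyXs, toyLabels, 2, 0.5);
// Per-example gradients are [-2, 0, 0] and [0, -4, 0]; their mean is
// [-1, -2, 0], so the updated weights are [0.5, 1, 0].
```

Each operation here still sees a single example; the batch exists only in the loop that averages the gradients before the weight update.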

Training with the Graph object from above:

const learningRate = .001;
const batchSize = 2;

const math = new NDArrayMathGPU();
const session = new Session(g, math);
const optimizer = new SGDOptimizer(learningRate);

const inputs: Array1D[] = [
  Array1D.new([1.0, 2.0, 3.0]),
  Array1D.new([10.0, 20.0, 30.0]),
  Array1D.new([100.0, 200.0, 300.0])
];

const labels: Array1D[] = [
  Array1D.new([2.0, 6.0, 12.0]),
  Array1D.new([20.0, 60.0, 120.0]),
  Array1D.new([200.0, 600.0, 1200.0])
];

// Shuffles inputs and labels and keeps them mutually in sync.
const shuffledInputProviderBuilder =
   new InCPUMemoryShuffledInputProviderBuilder([inputs, labels]);
const [inputProvider, labelProvider] =
   shuffledInputProviderBuilder.getInputProviders();

// Maps tensors to InputProviders.
const feedEntries: FeedEntry[] = [
  {tensor: inputTensor, data: inputProvider},
  {tensor: labelTensor, data: labelProvider}
];

// Wrap session.train in a scope so the cost gets cleaned up automatically.
math.scope(() => {
  // Train takes a cost tensor to minimize. Trains one batch. Returns the
  // average cost as a Scalar.
  const cost = session.train(
      costTensor, feedEntries, batchSize, optimizer, CostReduction.MEAN);

  console.log('last average cost: ' + cost.get());
});
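
Shuffling inputs and labels while keeping them "mutually in sync" can be done by applying one shared permutation to both lists. The sketch below shows that idea on plain arrays; shuffledIndices and shuffleInSync are hypothetical helpers, and this is not necessarily how the library's shuffled input provider builder works internally.

```typescript
// Fisher-Yates shuffle of the indices 0..n-1. `random` is injectable so the
// result can be made reproducible; it must return values in [0, 1).
function shuffledIndices(n: number, random: () => number = Math.random): number[] {
  const indices: number[] = [];
  for (let i = 0; i < n; i++) {
    indices.push(i);
  }
  for (let i = n - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = indices[i];
    indices[i] = indices[j];
    indices[j] = tmp;
  }
  return indices;
}

// Applies one shared permutation to every list, so that item k of each
// shuffled list still belongs to the same example.
function shuffleInSync<T>(lists: T[][]): T[][] {
  const n = lists.length > 0 ? lists[0].length : 0;
  lists.forEach((list) => {
    if (list.length !== n) {
      throw new Error('all lists must have the same length');
    }
  });
  const order = shuffledIndices(n);
  return lists.map((list) => order.map((k) => list[k]));
}

// Same inputs and labels as in the training example, as plain arrays.
const [shuffledInputs, shuffledLabels] = shuffleInSync([
  [[1, 2, 3], [10, 20, 30], [100, 200, 300]],
  [[2, 6, 12], [20, 60, 120], [200, 600, 1200]],
]);
```

Whatever order the shuffle produces, the label at position k still belongs to the input at position k.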

After training, we can run inference through the graph:


// Wrap session.eval in a scope so the intermediate values get cleaned up
// automatically.
math.scope((keep, track) => {
  const testInput = track(Array1D.new([1.0, 2.0, 3.0]));

  // session.eval can take NDArrays as input data.
  const testFeedEntries: FeedEntry[] = [
    {tensor: inputTensor, data: testInput}
  ];

  const testOutput = session.eval(outputTensor, testFeedEntries);

  console.log('inference output:');
  console.log(testOutput.shape);
  console.log(testOutput.getValues());
});

For details, see the documentation tutorials: https://pair-code.github.io/deeplearnjs/docs/tutorials/
Original article: https://pair-code.github.io/deeplearnjs/docs/tutorials/intro.html
