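When to Use

Neural Network Runtime acts as a bridge between AI inference engines and acceleration chips. It exposes a compact set of Native APIs that let an inference engine run end-to-end inference on an acceleration chip.

This guide uses the single-operator Add model shown in Figure 1 to walk through the Neural Network Runtime development workflow. The Add operator has two inputs, one parameter, and one output. The activation parameter specifies the type of activation function applied by the Add operator.

Figure 1 Add single-operator network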
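To make the operator's behavior concrete, the following is a minimal CPU reference sketch of what an element-wise Add with an optional fused activation computes. It is an illustration only, not how Neural Network Runtime or any driver implements the operator. The `FusedActivation` enum and the `ReferenceAdd` function are hypothetical names introduced for this sketch, and ReLU is used as an assumed example of a non-trivial activation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the activation parameter; this is not the NNRt enum.
enum class FusedActivation { None, Relu };

// Element-wise Add followed by an optional activation, as a plain CPU reference.
std::vector<float> ReferenceAdd(const std::vector<float>& a,
                                const std::vector<float>& b,
                                FusedActivation activation)
{
    assert(a.size() == b.size());  // both inputs share one shape in this sketch
    std::vector<float> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        float sum = a[i] + b[i];
        if (activation == FusedActivation::Relu) {
            sum = std::max(sum, 0.0f);  // clamp negative sums to zero
        }
        out[i] = sum;
    }
    return out;
}
```

With `FusedActivation::None` the result is the plain sum of the two inputs, which corresponds to the case where no activation function is attached to the operator.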

HarmonyOS: Development Guide for Connecting Neural Network Runtime to an AI Inference Framework

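Environment Preparation

Environment Requirements

The Neural Network Runtime component has the following environment requirements:

● Development environment: Ubuntu 18.04 or later.

● Target device: a standard device as defined by HarmonyOS, whose built-in hardware accelerator driver has been connected to Neural Network Runtime through the HDI interface.

Because Neural Network Runtime is exposed through Native APIs, Neural Network Runtime applications must be compiled with the Native development kit.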

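Environment Setup

1. Open a terminal on the Ubuntu build server.

2. Copy the downloaded Native development kit archive to the current user's home directory.

3. Run the following command to extract the Native development kit archive.

unzip native-linux-{version}.zip

The extracted contents are as follows (the directory contents may change between versions; refer to the latest Native API release):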


native/
├── build                          // Cross-compilation toolchain
├── build-tools                    // Build tools
├── docs
├── llvm
├── nativeapi_syscap_config.json
├── ndk_system_capability.json
├── NOTICE.txt
├── oh-uni-package.json
└── sysroot                        // Native API header files and libraries

Available APIs

This section lists the APIs commonly used in the Neural Network Runtime development workflow, grouped by category.

Structures

Model Construction APIs

Model Compilation APIs

Inference Execution APIs

Device Management APIs

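How to Develop

Neural Network Runtime development consists of three stages: model construction, model compilation, and inference execution. The following steps use the single-operator Add model to show how to call the Neural Network Runtime APIs when developing an application.

1. Create the sample application file.

First, create the source file for the Neural Network Runtime sample application. Run the following commands to create the nnrt_example/ directory and an nnrt_example.cpp source file inside it.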


mkdir ~/nnrt_example && cd ~/nnrt_example
touch nnrt_example.cpp

2. Import Neural Network Runtime.

Add the following code at the top of nnrt_example.cpp to include the Neural Network Runtime module.


#include <cstdint>
#include <iostream>
#include <vector>

#include "neural_network_runtime/neural_network_runtime.h"

// Constant that specifies the byte length of the input and output data
const size_t DATA_LENGTH = 4 * 12;
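The value 4 * 12 is the size of one float32 element (4 bytes) multiplied by the 12 elements of a tensor with shape [1, 2, 2, 3]. As a minimal sketch of that arithmetic, the helper below derives the byte length from a shape instead of hard-coding it. `Float32ByteLength` is a hypothetical name used only for this illustration, and the sketch assumes a fully specified shape with positive dimensions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of bytes needed to hold a float32 tensor with the given shape.
std::size_t Float32ByteLength(const std::vector<int32_t>& dims)
{
    std::size_t elements = 1;
    for (int32_t d : dims) {
        assert(d > 0);  // this sketch only handles fully specified shapes
        elements *= static_cast<std::size_t>(d);
    }
    return elements * sizeof(float);
}
```

For the shape used in this guide, `Float32ByteLength({1, 2, 2, 3})` yields 48 on platforms where `float` is 4 bytes, matching DATA_LENGTH.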

3. Construct the model.

Use the Neural Network Runtime APIs to construct the sample single-operator Add model.


OH_NN_ReturnCode BuildModel(OH_NNModel** pModel)
{
    // Create the model instance used for model construction
    OH_NNModel* model = OH_NNModel_Construct();
    if (model == nullptr) {
        std::cout << "Create model failed." << std::endl;
        return OH_NN_MEMORY_ERROR;
    }

    // Hand the model to the caller right away, so that the caller can destroy it even if a later step fails
    *pModel = model;

    // Add the first input tensor of the Add operator: float32, shape [1, 2, 2, 3]
    int32_t inputDims[4] = {1, 2, 2, 3};
    OH_NN_Tensor input1 = {OH_NN_FLOAT32, 4, inputDims, nullptr, OH_NN_TENSOR};
    OH_NN_ReturnCode ret = OH_NNModel_AddTensor(model, &input1);
    if (ret != OH_NN_SUCCESS) {
        std::cout << "BuildModel failed, add Tensor of first input failed." << std::endl;
        return ret;
    }

    // Add the second input tensor of the Add operator: float32, shape [1, 2, 2, 3]
    OH_NN_Tensor input2 = {OH_NN_FLOAT32, 4, inputDims, nullptr, OH_NN_TENSOR};
    ret = OH_NNModel_AddTensor(model, &input2);
    if (ret != OH_NN_SUCCESS) {
        std::cout << "BuildModel failed, add Tensor of second input failed." << std::endl;
        return ret;
    }

    // Add the parameter tensor of the Add operator. It specifies the activation function type; its data type is int8.
    int32_t activationDims = 1;
    int8_t activationValue = OH_NN_FUSED_NONE;
    OH_NN_Tensor activation = {OH_NN_INT8, 1, &activationDims, nullptr, OH_NN_ADD_ACTIVATIONTYPE};
    ret = OH_NNModel_AddTensor(model, &activation);
    if (ret != OH_NN_SUCCESS) {
        std::cout << "BuildModel failed, add Tensor of activation failed." << std::endl;
        return ret;
    }

    // Set the activation type to OH_NN_FUSED_NONE, meaning no activation function is added to the operator.
    // The activation tensor was the third tensor added, so its index is 2.
    ret = OH_NNModel_SetTensorData(model, 2, &activationValue, sizeof(int8_t));
    if (ret != OH_NN_SUCCESS) {
        std::cout << "BuildModel failed, set value of activation failed." << std::endl;
        return ret;
    }

    // Add the output tensor of the Add operator: float32, shape [1, 2, 2, 3]
    OH_NN_Tensor output = {OH_NN_FLOAT32, 4, inputDims, nullptr, OH_NN_TENSOR};
    ret = OH_NNModel_AddTensor(model, &output);
    if (ret != OH_NN_SUCCESS) {
        std::cout << "BuildModel failed, add Tensor of output failed." << std::endl;
        return ret;
    }

    // Specify the input, parameter, and output indices of the Add operator
    uint32_t inputIndicesValues[2] = {0, 1};
    uint32_t paramIndicesValues = 2;
    uint32_t outputIndicesValues = 3;
    OH_NN_UInt32Array paramIndices = {&paramIndicesValues, 1};
    OH_NN_UInt32Array inputIndices = {inputIndicesValues, 2};
    OH_NN_UInt32Array outputIndices = {&outputIndicesValues, 1};

    // Add the Add operator to the model instance
    ret = OH_NNModel_AddOperation(model, OH_NN_OPS_ADD, &paramIndices, &inputIndices, &outputIndices);
    if (ret != OH_NN_SUCCESS) {
        std::cout << "BuildModel failed, add operation failed." << std::endl;
        return ret;
    }

    // Specify the input and output indices of the model instance
    ret = OH_NNModel_SpecifyInputsAndOutputs(model, &inputIndices, &outputIndices);
    if (ret != OH_NN_SUCCESS) {
        std::cout << "BuildModel failed, specify inputs and outputs failed." << std::endl;
        return ret;
    }

    // Finish constructing the model instance
    ret = OH_NNModel_Finish(model);
    if (ret != OH_NN_SUCCESS) {
        std::cout << "BuildModel failed, error happened when finishing model construction." << std::endl;
        return ret;
    }

    return OH_NN_SUCCESS;
}
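In BuildModel, the literal indices 0, 1, 2, and 3 follow the order in which the tensors were added to the model: first input, second input, activation parameter, output. As a minimal sketch of how that bookkeeping could be made explicit in application code, the helper below records tensor names in insertion order and returns the matching index. `TensorIndexBook` is a hypothetical class written for this illustration. It is not part of Neural Network Runtime and does not call any of its APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: remembers the order in which tensors were added,
// so indices can be looked up by name instead of being hard-coded.
class TensorIndexBook {
public:
    // Records a tensor and returns the index it receives (0, 1, 2, ...).
    uint32_t Add(const std::string& name)
    {
        names_.push_back(name);
        return static_cast<uint32_t>(names_.size() - 1);
    }

    // Returns the index of a previously recorded tensor.
    uint32_t IndexOf(const std::string& name) const
    {
        for (uint32_t i = 0; i < names_.size(); ++i) {
            if (names_[i] == name) {
                return i;
            }
        }
        assert(false && "tensor name was never added");
        return 0;
    }

private:
    std::vector<std::string> names_;
};
```

An application could call `Add` each time it adds a tensor to the model and later use `IndexOf("activation")` in place of the literal 2.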

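4. Query the acceleration chips connected to Neural Network Runtime.

Neural Network Runtime can connect to multiple kinds of acceleration chips through the HDI interface. Before compiling a model, query the acceleration chips that Neural Network Runtime is connected to on the current device. Each acceleration chip has a unique ID, and the compilation stage uses this device ID to select the chip on which the model is compiled.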



void GetAvailableDevices(std::vector<size_t>& availableDevice)
{
    availableDevice.clear();

    // Get the IDs of the available hardware
    const size_t* devices = nullptr;
    uint32_t deviceCount = 0;
    OH_NN_ReturnCode ret = OH_NNDevice_GetAllDevicesID(&devices, &deviceCount);
    if (ret != OH_NN_SUCCESS) {
        std::cout << "GetAllDevicesID failed, get no available device." << std::endl;
        return;
    }

    for (uint32_t i = 0; i < deviceCount; i++) {
        availableDevice.emplace_back(devices[i]);
    }
}

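5. Compile the model on the specified device.

Neural Network Runtime describes the topology of an AI model with an abstract model representation. Before the model can run on an acceleration chip, the compilation module provided by Neural Network Runtime must pass this abstract representation down to the chip driver layer, where it is converted into a format that can be used directly for inference.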


OH_NN_ReturnCode CreateCompilation(OH_NNModel* model, const std::vector<size_t>& availableDevice, OH_NNCompilation** pCompilation)
{
    // Create the compilation instance, which passes the model to the underlying hardware for compilation
    OH_NNCompilation* compilation = OH_NNCompilation_Construct(model);
    if (compilation == nullptr) {
        std::cout << "CreateCompilation failed, error happened when creating compilation." << std::endl;
        return OH_NN_MEMORY_ERROR;
    }

    // Hand the compilation to the caller right away, so that the caller can destroy it even if a later step fails
    *pCompilation = compilation;

    // Set compilation options such as the target hardware, cache path, performance mode, priority, and float16 low-precision computation

    // Compile the model on the first device
    OH_NN_ReturnCode ret = OH_NNCompilation_SetDevice(compilation, availableDevice[0]);
    if (ret != OH_NN_SUCCESS) {
        std::cout << "CreateCompilation failed, error happened when setting device." << std::endl;
        return ret;
    }

    // Cache the compilation result in the /data/local/tmp directory, with the version set to 1
    ret = OH_NNCompilation_SetCache(compilation, "/data/local/tmp", 1);
    if (ret != OH_NN_SUCCESS) {
        std::cout << "CreateCompilation failed, error happened when setting cache path." << std::endl;
        return ret;
    }

    // Finish the compilation settings and compile the model
    ret = OH_NNCompilation_Build(compilation);
    if (ret != OH_NN_SUCCESS) {
        std::cout << "CreateCompilation failed, error happened when building compilation." << std::endl;
        return ret;
    }

    return OH_NN_SUCCESS;
}

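6. Create the executor.

After the model is compiled, call the execution module of Neural Network Runtime to create an inference executor. During the execution stage, setting model inputs, getting model outputs, and triggering inference are all done through the executor.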

OH_NNExecutor* CreateExecutor(OH_NNCompilation* compilation)
{
    // Create the executor instance
    OH_NNExecutor* executor = OH_NNExecutor_Construct(compilation);
    return executor;
}

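7. Run inference and print the result.

Use the APIs of the execution module to pass the input data to the executor, trigger one inference run, and get the model's inference result.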


OH_NN_ReturnCode Run(OH_NNExecutor* executor)
{
    // Construct sample data
    float input1[12] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};
    float input2[12] = {11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22};

    int32_t inputDims[4] = {1, 2, 2, 3};
    OH_NN_Tensor inputTensor1 = {OH_NN_FLOAT32, 4, inputDims, nullptr, OH_NN_TENSOR};
    OH_NN_Tensor inputTensor2 = {OH_NN_FLOAT32, 4, inputDims, nullptr, OH_NN_TENSOR};

    // Set the inputs for execution

    // Set the first input; its data comes from input1
    OH_NN_ReturnCode ret = OH_NNExecutor_SetInput(executor, 0, &inputTensor1, input1, DATA_LENGTH);
    if (ret != OH_NN_SUCCESS) {
        std::cout << "Run failed, error happened when setting first input." << std::endl;
        return ret;
    }

    // Set the second input; its data comes from input2
    ret = OH_NNExecutor_SetInput(executor, 1, &inputTensor2, input2, DATA_LENGTH);
    if (ret != OH_NN_SUCCESS) {
        std::cout << "Run failed, error happened when setting second input." << std::endl;
        return ret;
    }

    // Set the output buffer; after OH_NNExecutor_Run finishes, the result is stored in output
    float output[12];
    ret = OH_NNExecutor_SetOutput(executor, 0, output, DATA_LENGTH);
    if (ret != OH_NN_SUCCESS) {
        std::cout << "Run failed, error happened when setting output buffer." << std::endl;
        return ret;
    }

    // Run the computation
    ret = OH_NNExecutor_Run(executor);
    if (ret != OH_NN_SUCCESS) {
        std::cout << "Run failed, error doing execution." << std::endl;
        return ret;
    }

    // Print the output
    for (uint32_t i = 0; i < 12; i++) {
        std::cout << "Output index: " << i << ", value is: " << output[i] << "." << std::endl;
    }

    return OH_NN_SUCCESS;
}

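8. Build the end-to-end construction, compilation, and execution flow.

Steps 3 to 7 implement model construction, compilation, and execution, wrapped in five functions (BuildModel, GetAvailableDevices, CreateCompilation, CreateExecutor, and Run) to keep the code modular. The following sample code chains these functions into a complete Neural Network Runtime development flow.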



int main()
{
    OH_NNModel* model = nullptr;
    OH_NNCompilation* compilation = nullptr;
    OH_NNExecutor* executor = nullptr;
    std::vector<size_t> availableDevices;

    // Model construction stage
    OH_NN_ReturnCode ret = BuildModel(&model);
    if (ret != OH_NN_SUCCESS) {
        std::cout << "BuildModel failed." << std::endl;
        OH_NNModel_Destroy(&model);
        return -1;
    }

    // Get the devices available for execution
    GetAvailableDevices(availableDevices);
    if (availableDevices.empty()) {
        std::cout << "No available device." << std::endl;
        OH_NNModel_Destroy(&model);
        return -1;
    }

    // Model compilation stage
    ret = CreateCompilation(model, availableDevices, &compilation);
    if (ret != OH_NN_SUCCESS) {
        std::cout << "CreateCompilation failed." << std::endl;
        OH_NNModel_Destroy(&model);
        OH_NNCompilation_Destroy(&compilation);
        return -1;
    }

    // Create the inference executor for the model
    executor = CreateExecutor(compilation);
    if (executor == nullptr) {
        std::cout << "CreateExecutor failed, no executor is created." << std::endl;
        OH_NNModel_Destroy(&model);
        OH_NNCompilation_Destroy(&compilation);
        return -1;
    }

    // Use the executor created in the previous step to run one inference
    ret = Run(executor);
    if (ret != OH_NN_SUCCESS) {
        std::cout << "Run failed." << std::endl;
        OH_NNModel_Destroy(&model);
        OH_NNCompilation_Destroy(&compilation);
        OH_NNExecutor_Destroy(&executor);
        return -1;
    }

    // Release the allocated resources
    OH_NNModel_Destroy(&model);
    OH_NNCompilation_Destroy(&compilation);
    OH_NNExecutor_Destroy(&executor);

    return 0;
}
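The main function above repeats the same Destroy calls on every error path. One possible way to reduce that repetition is to wrap each handle in a `std::unique_ptr` with a custom deleter. The sketch below shows only the pattern: `FakeHandle`, `FakeHandle_Destroy`, and `RunPipeline` are hypothetical stand-ins so that the example compiles without the Neural Network Runtime headers. The stand-in Destroy function takes a pointer to a pointer, like the Destroy calls used in this guide.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for an opaque runtime handle such as a model or executor.
struct FakeHandle {
    int id;
};

// Counts how many handles were released; for demonstration only.
int g_destroyed = 0;

// Stand-in with the same pointer-to-pointer shape as the Destroy calls above.
void FakeHandle_Destroy(FakeHandle** handle)
{
    if (handle == nullptr || *handle == nullptr) {
        return;
    }
    delete *handle;
    *handle = nullptr;
    ++g_destroyed;
}

// Deleter that adapts the pointer-to-pointer Destroy function to unique_ptr.
struct FakeHandleDeleter {
    void operator()(FakeHandle* handle) const
    {
        FakeHandle_Destroy(&handle);
    }
};

using FakeHandlePtr = std::unique_ptr<FakeHandle, FakeHandleDeleter>;

// Returns -1 on a simulated failure; the handles are released on every path.
int RunPipeline(bool failInTheMiddle)
{
    FakeHandlePtr model(new FakeHandle{1});
    FakeHandlePtr compilation(new FakeHandle{2});
    assert(model != nullptr && compilation != nullptr);
    if (failInTheMiddle) {
        return -1;  // no explicit Destroy calls are needed here
    }
    FakeHandlePtr executor(new FakeHandle{3});
    return 0;
}
```

Because `unique_ptr` members are destroyed in reverse order of declaration, the executor stand-in is released first, then the compilation, then the model. Whether a particular release order is required by the real APIs is not stated in this guide, so treat that as something to confirm against the API reference.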

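Debugging and Verification

1. Prepare the build configuration file for the sample application.

Create a CMakeLists.txt file that adds the build configuration for nnrt_example.cpp, the sample application file from the development steps. A simple CMakeLists.txt example follows: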

cmake_minimum_required(VERSION 3.16)
project(nnrt_example C CXX)

add_executable(nnrt_example
    ./nnrt_example.cpp
)

target_link_libraries(nnrt_example
    neural_network_runtime.z
)

2. Build the sample application.

Run the following commands to create a build/ directory under the current directory and compile nnrt_example.cpp there, producing the nnrt_example binary.


mkdir build && cd build
cmake -DCMAKE_TOOLCHAIN_FILE={path to the cross-compilation toolchain}/build/cmake/ohos.toolchain.cmake -DOHOS_ARCH=arm64-v8a -DOHOS_PLATFORM=OHOS -DOHOS_STL=c++_static ..
make

3. Run the following commands to push the sample to the device and execute it.


# Push the compiled `nnrt_example` binary to the device.
hdc_std file send ./nnrt_example /data/local/tmp/.

# Make the sample binary executable.
hdc_std shell "chmod +x /data/local/tmp/nnrt_example"

# Run the sample.
hdc_std shell "/data/local/tmp/nnrt_example"

If the sample runs correctly, it prints the following output.

Output index: 0, value is: 11.
Output index: 1, value is: 13.
Output index: 2, value is: 15.
Output index: 3, value is: 17.
Output index: 4, value is: 19.
Output index: 5, value is: 21.
Output index: 6, value is: 23.
Output index: 7, value is: 25.
Output index: 8, value is: 27.
Output index: 9, value is: 29.
Output index: 10, value is: 31.
Output index: 11, value is: 33.
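Rather than comparing the printed lines by eye, the output buffer can also be checked in code. The sketch below computes the expected values for the sample data in Run (input1[i] = i and input2[i] = 11 + i) and compares two float buffers within a tolerance. `AllClose` and `ExpectedSampleOutput` are hypothetical helper names, and the default tolerance of 1e-5 is an assumption chosen for this illustration. Results from a hardware accelerator, especially with low-precision modes enabled, may need a looser bound.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns true when every element of actual is within tolerance of expected.
bool AllClose(const std::vector<float>& actual,
              const std::vector<float>& expected,
              float tolerance = 1e-5f)
{
    assert(tolerance >= 0.0f);
    if (actual.size() != expected.size()) {
        return false;
    }
    for (std::size_t i = 0; i < actual.size(); ++i) {
        if (std::fabs(actual[i] - expected[i]) > tolerance) {
            return false;
        }
    }
    return true;
}

// Expected result for the sample inputs: input1[i] = i, input2[i] = 11 + i.
std::vector<float> ExpectedSampleOutput()
{
    std::vector<float> expected(12);
    for (std::size_t i = 0; i < expected.size(); ++i) {
        expected[i] = static_cast<float>(i) + static_cast<float>(11 + i);
    }
    return expected;
}
```

In Run, the contents of the output array could be copied into a `std::vector<float>` and passed to `AllClose` together with `ExpectedSampleOutput()`.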

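4. Check the model cache (optional).

If the HDI service that Neural Network Runtime connects to in the debugging environment supports model caching, the generated cache files can be found in the /data/local/tmp directory after nnrt_example has run.

Note

The model's IR must be passed down to the hardware driver layer, where the HDI service compiles the unified IR graph into a hardware-specific computational graph. This compilation is very time-consuming. Neural Network Runtime supports computational graph caching: the graph compiled by the HDI service can be cached in device storage. The next time the same model is compiled on the same acceleration chip, specifying the cache path lets Neural Network Runtime load the computational graph directly from the cache files, which reduces compilation time.

Check the cache files in the cache directory: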


ls /data/local/tmp

The command prints the following:


# 0.nncache  cache_info.nncache

If the cache is no longer needed, delete it manually. The following command removes the cache files.


rm /data/local/tmp/*nncache
