Building the GPU version of TensorFlow with CUDA 11.1 on Ubuntu 18.04 LTS

Posted by SameWorld on 2020-11-16

Original URL: https://blog.csdn.net/baidu_26678247/article/details/109727892

Google's official tensorflow-gpu 2.3 build for Python 3.8 supports only CUDA 10.1 and cuDNN 7. To use CUDA 11.1 and cuDNN 8, TensorFlow has to be rebuilt from source.

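1. Build preparation

The build environment used here:

Operating system: Ubuntu 18.04 LTS
Python version: 3.6
CPU: Intel i5 / Intel i10 (affects build speed)
GPU: GTX 1050 Ti / RTX 3080
RAM: 64 GB

The author built successfully on both the i5 + 1050 Ti machine and the i10 + 3080 machine.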

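Before building, make sure the host already has the GPU driver, CUDA 11.1, cuDNN 8, and TensorRT 7.2 installed. Pick the versions that suit your hardware: CUDA 10 does not seem to support RTX 30-series cards, which need CUDA 11 or later.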
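
As a quick pre-flight check, the sketch below looks for the two NVIDIA command-line tools this tutorial relies on later (nvidia-smi and nvcc). The helper name find_tools is made up for illustration, and the check only says whether the programs are on PATH, not whether their versions are suitable.

```python
import shutil

# Command-line tools used later in this tutorial; extend the tuple as needed.
REQUIRED_TOOLS = ("nvidia-smi", "nvcc")


def find_tools(names=REQUIRED_TOOLS):
    """Map each tool name to its full path, or to None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


for tool, path in find_tools().items():
    print(f"{tool}: {path or 'not found on PATH'}")
```

A tool reported as not found either has not been installed yet or its directory has not been added to PATH (section 2.3 covers this for nvcc).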

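This tutorial performs the build on a remote host over SSH, which the author recommends. Instructions for setting up SSH access are easy to find online.

2. Installing the base environment

If the GPU driver, CUDA 11.1, cuDNN 8, and TensorRT 7.2 are already installed, skip this section.

2.1 Downloading the files

Downloading the GPU driver

Get the driver from NVIDIA's official website: select the options for your GPU model, then click [Search].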

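The results appear below the search form. Choose the latest version and click [Download] to open the download page, then download that driver. (NVIDIA's site can be slow from mainland China, so buttons may take a while to respond; be patient.)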

Downloading CUDA

Open NVIDIA's CUDA download page and choose the CUDA version you need. The author used 11.1.1; click the version number to open its download page.

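The author built on Ubuntu 18.04, so the selections were [Linux] > [x86_64] > [Ubuntu] > [18.04] > [runfile (local)]; adjust these to your system. Once everything is selected, the download steps appear below.

The page lists two commands: the first downloads the CUDA installer and the second runs it. For now, run only the first: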

wget https://developer.download.nvidia.com/compute/cuda/11.1.1/local_installers/cuda_11.1.1_455.32.00_linux.run

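If wget is not installed, install it with sudo apt install wget.

Downloading cuDNN

Open the cuDNN download page. Downloading cuDNN requires an NVIDIA account: log in if you have one, otherwise register. After logging in you are redirected to the download page (if it loads slowly, wait; if it fails to load, refresh).

Tick [I Agree To the Terms of the cuDNN Software License Agreement] to reveal the available versions.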

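[Note] Choose the build that matches your CUDA version. The author chose cuDNN v8.0.5, the latest at the time, built for CUDA 11.1.

After choosing the cuDNN version, download "cuDNN Library for Linux".

The download is a compressed archive.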

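Downloading TensorRT

Open the TensorRT download page and click [Download Now]. This also requires a login; if you logged in during the previous step you go straight to the download page. Click [TensorRT 7] (or another version).

Tick [I Agree To the Terms of the NVIDIA TensorRT License Agreement] and pick the exact version. The author chose 7.2.1, the latest at the time. Then choose the package that matches your Linux distribution and CUDA version; the author chose [TensorRT 7.2.1 for Ubuntu 18.04 and CUDA 11.1 TAR package]. Be sure to download the TAR package: the rest of this tutorial installs from it, and the other package types are installed differently.

The download is again a compressed archive.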

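Collecting the files

It is convenient to put all the downloaded files in one directory. The author created a directory named install under the current user's home directory and placed everything there. (If you downloaded the files on Windows, copy them to the Ubuntu machine over FTP or with a USB drive.) The directory contents: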

$ ls -l
total 6135424
-rwxrwxr-x 1 sworld sworld 3498245611 Nov 14 12:29 cuda_11.1.0_455.23.05_linux.run
-rw-rw-r-- 1 sworld sworld 1548325637 Nov 14 04:36 cudnn-11.1-linux-x64-v8.0.5.39.tgz
-rwxrwxr-x 1 sworld sworld  168953614 Oct 24 13:13 NVIDIA-Linux-x86_64-455.28.run
-rw-r--r-- 1 root   root   1024005281 Nov 16 07:01 TensorRT-7.2.1.6.Ubuntu-18.04.x86_64-gnu.cuda-11.1.cudnn8.0.tar.gz

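2.2 Installing the GPU driver

The CUDA installer can also install a driver, though possibly not the latest one. If you prefer to install the driver together with CUDA, skip this step and go straight to the CUDA installation.

Before installing the driver, check the BIOS and turn off the following options:

Under Security in the boot settings, check whether UEFI is enabled; if it is, disable it (important).
Under Boot in the boot settings, check whether Secure Boot is enabled; if it is, disable it (important).

Install the following dependencies: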

sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
sudo apt-get install --no-install-recommends libboost-all-dev
sudo apt-get install libopenblas-dev liblapack-dev libatlas-base-dev
sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev

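If a graphical desktop is running, stop the display manager first (the commands below assume lightdm; a default Ubuntu 18.04 desktop uses gdm3, in which case substitute that service name):

sudo service lightdm stop
or
sudo /etc/init.d/lightdm stop

You also need to disable nouveau. Edit /etc/modprobe.d/blacklist.conf and append the following: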

blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off

Finally, rebuild the initramfs and reboot:

sudo update-initramfs -u
reboot

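Installing the driver

Make the installer executable and run it:

sudo chmod a+x NVIDIA-Linux-x86_64-455.38.run
sudo ./NVIDIA-Linux-x86_64-455.38.run

Follow the prompts to install the driver. NVIDIA-Linux-x86_64-455.38.run is the driver file; use the name of the file you actually downloaded (the directory listing in section 2.1 shows a 455.28 file).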

Checking the driver

After installation, run nvidia-smi on the command line to view GPU status. Output like the following means the driver is installed:

Sun Nov 15 19:15:50 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.38       Driver Version: 455.38       CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 3080    Off  | 00000000:17:00.0 Off |                  N/A |
|  0%   52C    P8    11W / 320W |      5MiB / 10018MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 3080    Off  | 00000000:65:00.0  On |                  N/A |
|  0%   49C    P8    11W / 320W |    173MiB / 10014MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1139      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A      1338      G   /usr/bin/gnome-shell                0MiB |
|    1   N/A  N/A      1139      G   /usr/lib/xorg/Xorg                106MiB |
|    1   N/A  N/A      1338      G   /usr/bin/gnome-shell               64MiB |
+-----------------------------------------------------------------------------+
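
If you want to check the driver version from a script rather than by eye, the banner line can be parsed. This is a minimal sketch using the banner text shown above; the function name parse_smi_header is made up for illustration. Note that the "CUDA Version" field in this banner is the highest CUDA version the driver supports, not necessarily the toolkit version installed on the machine.

```python
import re

# Banner line copied from the nvidia-smi output above.
HEADER = "| NVIDIA-SMI 455.38       Driver Version: 455.38       CUDA Version: 11.1     |"


def parse_smi_header(line):
    """Extract the driver version and supported CUDA version from the banner line."""
    match = re.search(r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", line)
    if match is None:
        return None
    return {"driver": match.group(1), "cuda": match.group(2)}


print(parse_smi_header(HEADER))
```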

Uninstalling an old driver

To reinstall the GPU driver, first remove the old one with the following commands:

sudo apt-get remove --purge 'nvidia*'
sudo apt-get autoremove
sudo chmod +x NVIDIA-Linux-x86_64-xx.xx.run
sudo ./NVIDIA-Linux-x86_64-xx.xx.run --uninstall

Here NVIDIA-Linux-x86_64-xx.xx.run is the installer of the old driver.

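2.3 Installing CUDA

Installing CUDA

Make the installer executable and run it (use the name of the file you downloaded; the author's file was cuda_11.1.0_455.23.05_linux.run):

sudo chmod a+x cuda_11.1.0_455.23.05_linux.run
sudo ./cuda_11.1.0_455.23.05_linux.run

The following screen appears. Type accept and press Enter: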

x  End User License Agreement                                                  x
x  -                                                                           x
x  NVIDIA Software License Agreement and CUDA Supplement to                    x
x  Software License Agreement.                                                 x
x                                                                              x
x  Preface                                                                     x
x  -                                                                           x
x  The Software License Agreement in Chapter 1 and the Supplement              x
x  in Chapter 2 contain license terms and conditions that govern               x
x  the use of NVIDIA software. By accepting this agreement, you                x
x  agree to comply with all the terms and conditions applicable                x
x  to the product(s) included herein.                                          x
x                                                                              x
x  NVIDIA Driver                                                               x
x                                                                              x
x Do you accept the above EULA? (accept/decline/quit):                         x
x                                                                              x

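The selection screen appears next. If no GPU driver is installed yet, use the Up/Down keys to move to [Install] and press Enter to start the installation.

If a driver is already installed, move to [Driver], press Space to deselect it, then choose [Install] and wait for the CUDA installation to finish.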

x CUDA Installer                                                               x
x - [ ] Driver                                                                 x
x      [ ] 455.23.05                                                           x
x + [X] CUDA Toolkit 11.1                                                      x
x   [X] CUDA Samples 11.1                                                      x
x   [X] CUDA Demo Suite 11.1                                                   x
x   [X] CUDA Documentation 11.1                                                x
x   Options                                                                    x
x   Install                                                                    x
x                                                                              x
x                                                                              x
x                                                                              x
x                                                                              x
x Up/Down: Move | Left/Right: Expand | 'Enter': Select | 'A': Advanced options x

After installation, set the environment variables by editing ~/.bashrc. If other users need CUDA, repeat the same steps under their accounts:

nano ~/.bashrc

Append at the end:

export CUDA_HOME=/usr/local/cuda
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${CUDA_HOME}/lib64
export PATH=${CUDA_HOME}/bin:${PATH}

Then reload the environment variables:

source ~/.bashrc

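CUDA is installed under /usr/local/cuda-xxx, where xxx is the version number. The installer also creates a symbolic link at /usr/local/cuda, so you can put that path in the environment variables and leave them unchanged when you switch CUDA versions later.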
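
One thing to watch with the export lines is that sourcing ~/.bashrc repeatedly keeps adding the same entries. The sketch below shows the same PATH and LD_LIBRARY_PATH logic in a form that adds each directory only once. It works on a plain dictionary rather than the real environment, and the helper name with_cuda_paths is made up for illustration.

```python
def with_cuda_paths(env, cuda_home="/usr/local/cuda"):
    """Return a copy of env with CUDA's bin and lib64 directories added once."""
    updated = dict(env)
    updated["CUDA_HOME"] = cuda_home

    def add(var, entry, prepend):
        # Split the existing value, dropping empty entries left by stray colons.
        parts = [p for p in updated.get(var, "").split(":") if p]
        if entry not in parts:
            parts = [entry] + parts if prepend else parts + [entry]
        updated[var] = ":".join(parts)

    # Mirrors the export lines above: bin goes first on PATH, lib64 last on LD_LIBRARY_PATH.
    add("PATH", f"{cuda_home}/bin", prepend=True)
    add("LD_LIBRARY_PATH", f"{cuda_home}/lib64", prepend=False)
    return updated


demo = with_cuda_paths({"PATH": "/usr/bin:/bin"})
print(demo["PATH"])             # /usr/local/cuda/bin:/usr/bin:/bin
print(demo["LD_LIBRARY_PATH"])  # /usr/local/cuda/lib64
```

Because the paths go through the /usr/local/cuda symbolic link, the result stays valid when the link is later pointed at a different CUDA version.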

Checking CUDA

With the environment variables set, run nvcc -V to check the CUDA installation:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Tue_Sep_15_19:10:02_PDT_2020
Cuda compilation tools, release 11.1, V11.1.74
Build cuda_11.1.TC455_06.29069683_0

Output like the above means the installation succeeded; here the author's CUDA version is 11.1.
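
The release number can also be extracted programmatically, for example to confirm that the toolkit is at least 11.1 before starting a long build. A minimal sketch, using the nvcc output shown above as sample input; the function name cuda_release is made up for illustration.

```python
import re

# Sample text copied from the nvcc -V output above.
NVCC_OUTPUT = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Tue_Sep_15_19:10:02_PDT_2020
Cuda compilation tools, release 11.1, V11.1.74
Build cuda_11.1.TC455_06.29069683_0
"""


def cuda_release(nvcc_text):
    """Return the CUDA release as a (major, minor) tuple, or None if not found."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_text)
    return (int(match.group(1)), int(match.group(2))) if match else None


release = cuda_release(NVCC_OUTPUT)
print(release)             # (11, 1)
print(release >= (11, 1))  # True
```

To run it against a live system, the sample text could be replaced by the output of subprocess.run(["nvcc", "-V"], capture_output=True, text=True).stdout.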

Uninstalling CUDA

To uninstall CUDA, go to its bin directory:

cd /usr/local/cuda/bin

Then run:

sudo ./cuda-uninstaller

Use Space to select every item, then press Enter to uninstall:

x CUDA Uninstaller                                                             x
x   [X] CUDA_Samples_11.1                                                      x
x   [X] CUDA_Demo_Suite_11.1                                                   x
x   [X] CUDA_Documentation_11.1                                                x
x   [X] CUDA_Toolkit_11.1                                                      x
x   Done                                                                       x
x                                                                              x
x                                                                              x
x                                                                              x
x Up/Down: Move | 'Enter': Select                                              x

After uninstalling, remove the leftovers with:

sudo rm -R /usr/local/cuda*

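2.4 Installing cuDNN

Extract the downloaded cuDNN archive:

sudo tar -zxvf cudnn-11.1-linux-x64-v8.0.5.39.tgz

This creates a directory named cuda in the current directory, containing the cuDNN files. Copy them into the CUDA directory (cp -P keeps the library symbolic links as links instead of duplicating the files):

sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo cp cuda/include/cudnn*.h /usr/local/cuda/include/
sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*

After installation, check the cuDNN version with: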

cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
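
The grep prints the three version macros from cudnn_version.h. The sketch below parses the same macros into a tuple. The sample header text is an assumption written to match the author's cuDNN 8.0.5 (a real header contains more than these three lines), and the function name cudnn_version is made up for illustration.

```python
import re

# Assumed excerpt of cudnn_version.h for cuDNN 8.0.5; read the real file in practice.
SAMPLE_HEADER = """
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 0
#define CUDNN_PATCHLEVEL 5
"""


def cudnn_version(header_text):
    """Return (major, minor, patch) from cudnn_version.h text, or None if a macro is missing."""
    values = []
    for name in ("MAJOR", "MINOR", "PATCHLEVEL"):
        match = re.search(rf"^#define\s+CUDNN_{name}\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            return None
        values.append(int(match.group(1)))
    return tuple(values)


print(cudnn_version(SAMPLE_HEADER))  # (8, 0, 5)
```

On an installed system, the text would come from reading /usr/local/cuda/include/cudnn_version.h.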

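2.5 Installing TensorRT

Installing TensorRT

CUDA (and the cuDNN files copied into it) lives under /usr/local/ by default, so to keep everything in one place TensorRT goes there too. Move the TensorRT archive to that path (note the trailing /; use the name of the file you actually downloaded):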

sudo mv TensorRT-7.2.1.6.Ubuntu-18.04.x86_64-gnu.cuda-11.1.cudnn8.0.tar.gz /usr/local/

Change to /usr/local and extract the archive; this creates a TensorRT-x.x.x.x directory next to it:

cd /usr/local
sudo tar -zxvf TensorRT-7.2.1.6.Ubuntu-18.04.x86_64-gnu.cuda-11.1.cudnn8.0.tar.gz

Create a symbolic link:

sudo ln -s /usr/local/TensorRT-7.2.1.6 /usr/local/tensorrt

Add the environment variable by editing ~/.bashrc. If other users need TensorRT, repeat the steps below under their accounts:

nano ~/.bashrc

Append the following at the end:

export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/tensorrt/lib

Reload the environment variables:

source ~/.bashrc

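Installing the Python interface

To program against the Python API, install the wheel below. Choose the .whl file that matches your Python version; the author's Python version is 3.6: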

cd /usr/local/tensorrt/python
sudo pip3 install tensorrt-7.2.1.6-cp36-none-linux_x86_64.whl
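
The cp36 part of the file name is the tag for CPython 3.6. The sketch below derives the tag from the running interpreter and builds the expected file name. The naming pattern is inferred from the single file name used above, so treat it as an assumption: whether a wheel for your Python version actually ships in the package has to be checked in /usr/local/tensorrt/python. The function name tensorrt_wheel_name is made up for illustration.

```python
import sys


def tensorrt_wheel_name(trt_version, python_version=None):
    """Build the wheel file name expected for a given (major, minor) Python version."""
    # Fall back to the interpreter running this script when no version is given.
    major, minor = (python_version or sys.version_info)[:2]
    return f"tensorrt-{trt_version}-cp{major}{minor}-none-linux_x86_64.whl"


print(tensorrt_wheel_name("7.2.1.6", (3, 6)))  # tensorrt-7.2.1.6-cp36-none-linux_x86_64.whl
print(tensorrt_wheel_name("7.2.1.6"))          # name for the current interpreter
```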

Installing the UFF converter

Install the .whl file in this directory:

cd /usr/local/tensorrt/uff
sudo pip3 install uff-0.6.9-py2.py3-none-any.whl

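Installing graphsurgeon

graphsurgeon is a library for customizing UFF-encoded networks, for example inserting or removing a layer of a neural network. Install the .whl file in this directory: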

cd /usr/local/tensorrt/graphsurgeon
sudo pip3 install graphsurgeon-0.4.5-py2.py3-none-any.whl

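Verifying the installation

Check that the TensorRT installation directory is complete: running tree -d there should show directories such as lib, include, and data.
Then run the sampleMNIST example:

# Enter the sample directory
cd /usr/local/tensorrt/samples/sampleMNIST
# Build
sudo make
# Run (absolute path; a leading "./" would make it relative to the current directory)
/usr/local/tensorrt/bin/sample_mnist

If the commands above print no ERROR, the installation succeeded.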

3. Building TensorFlow-GPU

TensorFlow is built with Bazel. Before building, here is a recap of the environment configured so far:

CUDA 11.1
cuDNN 8.0.5
TensorRT 7.2.1

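3.1 Installing Bazel

[Note] At the time of writing, building TensorFlow requires Bazel 3.1.0. If the build asks for a different Bazel version, follow the command it prints.

Open the Bazel releases page on GitHub, find the 3.1.0 release, and click [Assets].

Download the file named bazel-x.x.x-installer-linux-x86_64.sh, where x.x.x is the version number: right-click it, copy the link, and fetch it with wget: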

wget https://github.com/bazelbuild/bazel/releases/download/3.1.0/bazel-3.1.0-installer-linux-x86_64.sh

Make the installer executable and run it:

sudo chmod a+x bazel-3.1.0-installer-linux-x86_64.sh
./bazel-3.1.0-installer-linux-x86_64.sh --user

Add the environment variables (repeat this for any other user who needs Bazel).

Edit ~/.bashrc:

nano ~/.bashrc

Append the following at the end:

export BAZEL_HOME=/home/sworld
export PATH=${PATH}:${BAZEL_HOME}/bin

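[Note] BAZEL_HOME is the current user's home directory. The author's user is sworld, so the home directory is /home/sworld.

Apply the environment variables:

source ~/.bashrc

Finally, run bazel to check the installation. Output like the following means it succeeded: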

$ bazel
                                                           [bazel release 3.1.0]
Usage: bazel <command> <options> ...

Available commands:
  analyze-profile     Analyzes build profile data.
  aquery              Analyzes the given targets and queries the action graph.
# ... (omitted)

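3.2 Build preparation

Getting the source code

Go to the TensorFlow project on GitHub and clone it to the local Linux host with git. If git is not installed, install it with:

sudo apt install git

Clone the project with one of the two commands below, then enter the project directory:

# Clone from the official repository
git clone https://github.com/tensorflow/tensorflow.git
# Or clone through the mirror (recommended in mainland China, where the official URL is slow)
git clone https://hub.fastgit.org/tensorflow/tensorflow.git
# Enter the project directory
cd tensorflow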

Configuring the build options

Start the configuration with:

./configure

Step 1: choose the Python interpreter. The default is python3; just press Enter:

You have bazel 3.1.0 installed.
Please specify the location of python. [Default is /usr/bin/python3]:

Step 2: choose the Python library path; just press Enter:

Found possible Python library paths:
  /usr/lib/python3/dist-packages
  /usr/local/lib/python3.6/dist-packages
Please input the desired Python library path to use.  Default is [/usr/lib/python3/dist-packages]

Step 3: ROCm support. Enter N and press Enter:

Do you wish to build TensorFlow with ROCm support? [y/N]: N
No ROCm support will be enabled for TensorFlow.

Step 4: CUDA support, that is, GPU support. Enter y and press Enter:

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Step 5: TensorRT support. Enter y and press Enter:

Do you wish to build TensorFlow with TensorRT support? [y/N]: y
TensorRT support will be enabled for TensorFlow.

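Step 6: configure the CUDA version. A message about files that could not be found may appear at this point and can be ignored. Since CUDA 11.1 is in use, enter 11.1 and press Enter: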

Could not find any NvInferVersion.h matching version '' in any subdirectory:
        ''
        'include'
        'include/cuda'
        'include/*-linux-gnu'
        'extras/CUPTI/include'
        'include/cuda/CUPTI'
        'local/cuda/extras/CUPTI/include'
of:
        '/lib'
        '/lib/x86_64-linux-gnu'
        '/lib32'
        '/libx32'
        '/usr'
        '/usr/lib'
        '/usr/lib/x86_64-linux-gnu'
        '/usr/lib/x86_64-linux-gnu/libfakeroot'
        '/usr/lib32'
        '/usr/libx32'
        '/usr/local/cuda'
        '/usr/local/cuda-10.1/targets/x86_64-linux/lib'
Asking for detailed CUDA configuration...

Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 10]: 11.1

Step 7: enter the cuDNN version. The installed version is 8.0.5, so entering 8 is enough; press Enter:

Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7]: 8

Step 8: enter the TensorRT version. The installed version is 7.2.1, so entering 7 is enough; press Enter:

Please specify the TensorRT version you want to use. [Leave empty to default to TensorRT 6]: 7

Step 9: enter the NCCL version. NCCL was not set up locally, so just press Enter to use the default:

Please specify the locally installed NCCL version you want to use. [Leave empty to use http://github.com/nvidia/nccl]:

Step 10, the key step: enter the install paths of CUDA, cuDNN, and TensorRT. If you installed them as described above, enter the line below as is; otherwise replace the CUDA and TensorRT paths with your actual install paths:

/lib,/lib/x86_64-linux-gnu,/usr,/usr/lib/x86_64-linux-gnu/libfakeroot,/usr/local/cuda,/usr/local/cuda/targets/x86_64-linux/lib,/usr/local/tensorrt

Enter it and press Enter:

Please specify the comma-separated list of base paths to look for CUDA libraries and headers. [Leave empty to use the default]: /lib,/lib/x86_64-linux-gnu,/usr,/usr/lib/x86_64-linux-gnu/libfakeroot,/usr/local/cuda,/usr/local/cuda/targets/x86_64-linux/lib,/usr/local/tensorrt


Found CUDA 11.1 in:
    /usr/local/cuda-11.1/targets/x86_64-linux/lib
    /usr/local/cuda-11.1/targets/x86_64-linux/include
Found cuDNN 8 in:
    /usr/local/cuda-11.1/targets/x86_64-linux/lib
    /usr/local/cuda-11.1/targets/x86_64-linux/include
Found TensorRT 7 in:
    /usr/local/TensorRT-7.2.1.6/targets/x86_64-linux-gnu/lib
    /usr/local/TensorRT-7.2.1.6/include
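
The base-path answer in step 10 is just a comma-separated list of directories. As a sketch, the snippet below assembles such a list from candidate directories, skipping duplicates and directories that do not exist on the machine. The candidates are the ones used in this article; the function name base_paths and the injectable exists argument are made up for illustration.

```python
import os

# Candidate directories taken from step 10 of this article.
CANDIDATES = [
    "/lib",
    "/lib/x86_64-linux-gnu",
    "/usr",
    "/usr/lib/x86_64-linux-gnu/libfakeroot",
    "/usr/local/cuda",
    "/usr/local/cuda/targets/x86_64-linux/lib",
    "/usr/local/tensorrt",
]


def base_paths(candidates, exists=os.path.isdir):
    """Join the existing, de-duplicated candidate directories with commas."""
    kept = []
    for path in candidates:
        if path not in kept and exists(path):
            kept.append(path)
    return ",".join(kept)


# Prints only the directories present on this machine.
print(base_paths(CANDIDATES))
```

If /usr/local/cuda or /usr/local/tensorrt is missing from the printed list, the corresponding symbolic link from section 2 has probably not been created.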

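Step 11: choose the compute capability. It depends on the GPU, and newer, more capable cards generally have a higher compute capability. Look up your card at https://developer.nvidia.com/cuda-gpus.

The author's GPU is an RTX 3080, whose compute capability is 8.6, so enter 8.6 and press Enter: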

Please specify a list of comma-separated CUDA compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus. Each capability can be specified as "x.y" or "compute_xy" to include both virtual and binary GPU code, or as "sm_xy" to only include the binary code.
Please note that each additional compute capability significantly increases your build time and binary size, and that TensorFlow only supports compute capabilities >= 3.5 [Default is: 3.5,7.0]: 8.6
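
The prompt accepts three spellings per capability: "x.y", "compute_xy", or "sm_xy". The sketch below shows how an "x.y" value maps to the other two names, following the wording of the prompt itself. The function names are made up for illustration.

```python
def capability_names(capability):
    """Map an "x.y" compute capability to its compute_xy and sm_xy spellings."""
    major, minor = capability.strip().split(".")
    digits = f"{int(major)}{int(minor)}"
    return {
        "virtual_and_binary": f"compute_{digits}",  # same effect as entering "x.y"
        "binary_only": f"sm_{digits}",
    }


def parse_capability_list(text):
    """Parse a comma-separated answer such as the prompt's default "3.5,7.0"."""
    return [capability_names(item) for item in text.split(",") if item.strip()]


print(capability_names("8.6"))
print(parse_capability_list("3.5,7.0"))
```

As the prompt warns, every extra capability in the list lengthens the build and enlarges the binary, so listing only the cards you actually have keeps the build shorter.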

Step 12: whether to use clang as the CUDA compiler. Enter N and press Enter:

Do you want to use clang as CUDA compiler? [y/N]: N
nvcc will be used as CUDA compiler.

Step 13: choose which gcc nvcc should use as the host compiler. The default is fine; press Enter:

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:

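Step 14: specify the optimization flags used when the bazel option --config=opt is given. The default (-march=native) optimizes the generated code for this machine's CPU type. If you are building TensorFlow for a different CPU type, consider a more specific optimization flag; see the GCC manual for examples.

The default is fine; press Enter: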

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]:

Step 15: Android build configuration. Enter N and press Enter:

Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: N
Not configuring the WORKSPACE for Android builds

Configuration is now complete, and the following build help is printed:

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
	--config=mkl         	# Build with MKL support.
	--config=mkl_aarch64 	# Build with oneDNN support for Aarch64.
	--config=monolithic  	# Config for mostly static monolithic build.
	--config=numa        	# Build with NUMA support.
	--config=dynamic_kernels	# (Experimental) Build kernels into separate shared objects.
	--config=v2          	# Build TensorFlow 2.x instead of 1.x.
Preconfigured Bazel build configs to DISABLE default on features:
	--config=noaws       	# Disable AWS S3 filesystem support.
	--config=nogcp       	# Disable GCP support.
	--config=nohdfs      	# Disable HDFS support.
	--config=nonccl      	# Disable NVIDIA NCCL support.
Configuration finished

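3.3 Starting the build

Some preconfigured build configs can be added to the bazel build command, for example:

--config=mkl: support for Intel® MKL-DNN.
--config=monolithic: configuration for a mostly static, monolithic build.
--config=v1: build TensorFlow 1.x instead of 2.x.

[Note] Starting with TensorFlow 1.6, binaries use AVX instructions, which may not run on older CPUs.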

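GitHub acceleration

If cloning from GitHub is already fast for you, skip this step; from mainland China it usually is not.

There are many ways to speed up GitHub access; articles on GitHub acceleration describe the alternatives.

This article uses a Cloudflare Workers proxy: prepend an accelerator URL to the GitHub link, for example:

# Prepend https://g.ioiox.com/ to the download link. The author verified that this URL worked at the time of writing, but it may not keep working; substitute any accelerator you have tested yourself
git clone https://g.ioiox.com/https://github.com/stilleshan/ServerStatus.git
wget https://g.ioiox.com/https://github.com/stilleshan/ServerStatus/archive/master.zip
wget https://g.ioiox.com/https://raw.githubusercontent.com/stilleshan/ServerStatus/master/Dockerfile

Building TensorFlow downloads some files from GitHub, so a number of URLs in the TensorFlow project need the same change:

Step 1: edit the WORKSPACE file in the tensorflow directory. The author uses nano; any editor works as long as you know how to search for a keyword in it: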

sudo nano WORKSPACE

In nano, press Ctrl+W and search for

https://github.com/

Find https://github.com/bazelbuild/rules_closure/archive/... and prepend the following to the link:

https://g.ioiox.com/

Then save.

Step 2: edit the tensorflow/workspace.bzl file in the tensorflow directory:

sudo nano tensorflow/workspace.bzl

Likewise, search for every https://github.com/ and prepend:

https://g.ioiox.com/

Then save. About 50 places need changing.
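
Editing around 50 URLs by hand is error-prone, so the rewrite can be scripted. The sketch below prefixes every https://github.com/ URL in a piece of text and is safe to run twice, because URLs that already carry the prefix are skipped. The sample line uses a hypothetical repository path, and the function name add_mirror_prefix is made up for illustration; back up the files before rewriting them.

```python
import re

# Accelerator prefix used in this article; replace it with one you have tested.
MIRROR_PREFIX = "https://g.ioiox.com/"


def add_mirror_prefix(text, prefix=MIRROR_PREFIX):
    """Prefix every GitHub URL in text, skipping URLs that are already prefixed."""
    # The negative lookbehind keeps the rewrite idempotent.
    pattern = re.compile(r"(?<!" + re.escape(prefix) + r")https://github\.com/")
    return pattern.sub(prefix + "https://github.com/", text)


# Hypothetical line in the style of a Bazel workspace file.
sample = 'urls = ["https://github.com/example/project/archive/v1.tar.gz"]'
once = add_mirror_prefix(sample)
print(once)
print(add_mirror_prefix(once) == once)  # True: a second pass changes nothing
```

To apply it to a file, read the file's text, pass it through add_mirror_prefix, and write the result back, for example for WORKSPACE and tensorflow/workspace.bzl.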

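If a download stalls during the build, search the whole project for the corresponding URL and change it the same way. Besides the project itself, you can also search the .cache/bazel directory under your home directory; it is hidden, so use ls -a to see it.

Instead of prepending an accelerator URL, you can also replace

https://github.com/

with: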

https://hub.fastgit.org/

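Building the pip package

Build with:

bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

The command first downloads the dependencies and then compiles. Download time depends on your network speed; build time depends on the host's performance.

The build takes roughly two hours or more, so be patient or do something else in the meantime.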

Some examples of build commands follow.

TensorFlow 2.x

The tensorflow:master repository builds 2.x by default.

bazel build //tensorflow/tools/pip_package:build_pip_package

[Note] For GPU support, enable CUDA with cuda=Y during ./configure.

TensorFlow 1.x

To build TensorFlow 1.x from the master branch, use bazel build --config=v1 to create the TensorFlow 1.x package.

bazel build --config=v1 //tensorflow/tools/pip_package:build_pip_package

CPU only

Build the CPU-only TensorFlow package builder with bazel:

bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package

GPU support

To build the TensorFlow package builder with GPU support, run:

bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

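Bazel build options

See the Bazel command-line reference for details on the build options.

Building TensorFlow from source uses a lot of RAM. If your system is memory-constrained, limit Bazel's RAM usage with --local_ram_resources=2048.

The official TensorFlow packages are built with a GCC 7.3 toolchain that complies with the manylinux2010 package standard.

For GCC 5 and later, build with --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" for compatibility with the older ABI. ABI compatibility ensures that custom ops built against the official TensorFlow pip package continue to work with the GCC 5 built package.

Building the package

The bazel build command creates an executable named build_pip_package, the program that builds the pip package. Run it as shown below to build a .whl package in the /tmp/tensorflow_pkg directory.

To build from a release branch, use: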

./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

To build from the master branch, use --nightly_flag to get the right dependencies:

./bazel-bin/tensorflow/tools/pip_package/build_pip_package --nightly_flag /tmp/tensorflow_pkg

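Although CUDA and non-CUDA configurations can be built in the same source tree, it is recommended to run bazel clean when switching between the two in the same tree.

Installing the package

The file name of the generated .whl depends on the TensorFlow version and your platform. Install it with pip3 install, for example: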

pip3 install /tmp/tensorflow_pkg/tensorflow-<version>-<tags>.whl

4. Testing TensorFlow-GPU

Listing available GPUs

After installation, use the following code to list the GPUs currently available:

from tensorflow.python.client import device_lib

def get_available_gpus():
  local_device_protos = device_lib.list_local_devices()
  return [x.name for x in local_device_protos if x.device_type == 'GPU']

print(get_available_gpus())

Output similar to the following means the build and installation succeeded and the GPU version of TensorFlow is ready to use:

2020-11-16 19:15:24.036370: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2020-11-16 19:15:25.867856: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2020-11-16 19:15:25.868498: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2020-11-16 19:15:27.721411: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1724] Found device 0 with properties: 
pciBusID: 0000:17:00.0 name: GeForce RTX 3080 computeCapability: 8.6
coreClock: 1.71GHz coreCount: 68 deviceMemorySize: 9.78GiB deviceMemoryBandwidth: 707.88GiB/s
2020-11-16 19:15:27.721909: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1724] Found device 1 with properties: 
pciBusID: 0000:65:00.0 name: GeForce RTX 3080 computeCapability: 8.6
coreClock: 1.71GHz coreCount: 68 deviceMemorySize: 9.78GiB deviceMemoryBandwidth: 707.88GiB/s
# ... (omitted)
2020-11-16 19:15:28.330945: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1271]      0 1 
2020-11-16 19:15:28.330950: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1284] 0:   N N 
2020-11-16 19:15:28.330953: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1284] 1:   N N 
2020-11-16 19:15:28.332812: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1410] Created TensorFlow device (/device:GPU:0 with 9071 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3080, pci bus id: 0000:17:00.0, compute capability: 8.6)
2020-11-16 19:15:28.334046: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1410] Created TensorFlow device (/device:GPU:1 with 9068 MB memory) -> physical GPU (device: 1, name: GeForce RTX 3080, pci bus id: 0000:65:00.0, compute capability: 8.6)
['/device:GPU:0', '/device:GPU:1']

Using GPU acceleration

With the GPU version of TensorFlow installed and configured, models automatically run on an available GPU. To choose which GPU to use, set os.environ["CUDA_VISIBLE_DEVICES"]:

import os
# Select GPU 0; set this before TensorFlow initializes its GPU devices
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import tensorflow as tf
from tensorflow.keras import layers

# Build the model
model = tf.keras.Sequential()
model.add(layers.Dense(16, activation='relu', input_shape=(10,)))
model.add(layers.Dense(1, activation='sigmoid'))
# Choose the optimizer and its learning rate
optimizer = tf.keras.optimizers.SGD(0.2)
# Compile the model
model.compile(loss='binary_crossentropy', optimizer=optimizer)
# Print the model summary
model.summary()

The code above uses GPU "0". After running it, use nvidia-smi to check GPU usage.

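To use several GPUs, for example GPUs "0" and "1" together, set os.environ["CUDA_VISIBLE_DEVICES"] = "0,1". Alternatively, use tf.distribute.MirroredStrategy, the distributed training strategy TensorFlow provides for tf.keras, for multi-GPU training on a single machine: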

import tensorflow as tf
from tensorflow.keras import layers

strategy = tf.distribute.MirroredStrategy()

# The optimizer and the model must be built and compiled inside scope()
with strategy.scope():
    model = tf.keras.Sequential()
    model.add(layers.Dense(16, activation='relu', input_shape=(10,)))
    model.add(layers.Dense(1, activation='sigmoid'))

    optimizer = tf.keras.optimizers.SGD(0.2)
    model.compile(loss='binary_crossentropy', optimizer=optimizer)

model.summary()

Running the MNIST example

import os
# os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

import tensorflow as tf

mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test,  y_test, verbose=2)