[PyTorch] Complete Installation Guide for NVIDIA Drivers, CUDA Toolkit, and cuDNN

Posted by UnderTurrets on 2024-08-25

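Contents
  • How GPU, NVIDIA Graphics Drivers, CUDA, CUDA Toolkit, and cuDNN relate
  • Deciding what you need
    • Using PyTorch only
    • Using third-party extensions built on torch
  • Installing NVIDIA Graphics Drivers (optional)
    • Preface
    • Linux
      • Method 1: install through the graphical interface (recommended)
      • Method 2: download the file manually and install from the command line (not recommended)
    • Windows
      • Method 1: automatic install with GeForce Experience
      • Method 2: manual install
    • Verifying the installation
  • Installing CUDA Toolkit
    • Checking the graphics driver version
    • Linux
    • Windows
    • Verifying the installation
    • Switching versions
      • Linux
      • Windows
    • Uninstalling CUDA Toolkit on Linux
  • Installing PyTorch
    • Checking the CUDA version supported by the graphics driver
    • Downloading PyTorch
  • Installing cuDNN
    • Linux
      • Method 1: download and extract the tar archive (recommended)
      • Method 2: install from the deb package (not recommended)
    • Windows
    • Verifying the installation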


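How GPU, NVIDIA Graphics Drivers, CUDA, CUDA Toolkit, and cuDNN relate

  • GPU: the physical graphics card.
  • NVIDIA Graphics Drivers: the driver for the physical graphics card.
  • CUDA: a general-purpose parallel computing architecture from NVIDIA. It is both a parallel computing platform and a programming model, and it lets the GPU tackle complex computational problems. The driver-side part of CUDA is bundled with NVIDIA Graphics Drivers, so it needs no separate installation.
  • CUDA Toolkit: contains the CUDA runtime API, nvcc (the compiler for CUDA code; CUDA has its own language, and that code must be compiled before it can run), debugging tools, and more. In short, think of CUDA Toolkit as the toolbox for developing CUDA programs. You have to download and install it yourself. The CUDA Toolkit installer also lets you choose whether to install NVIDIA Graphics Drivers along with it, which can save a step.
  • cuDNN: a GPU-accelerated library built on CUDA Toolkit and designed specifically for the basic operations of deep neural networks. You have to download and install it yourself, although "installing" really just means moving a few library files to the right paths.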
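
To see which of these layers is already present on a machine, a quick check is whether each layer's command-line tool is on PATH. The sketch below is only an illustration and relies on the fact that nvidia-smi ships with the driver and nvcc ships with CUDA Toolkit. The function name detect_cuda_tools is made up for this example.

```python
import shutil


def detect_cuda_tools(which=shutil.which):
    """Report which command-line tools of the stack are on PATH.

    nvidia-smi ships with the driver; nvcc ships with CUDA Toolkit.
    The `which` parameter exists only so the lookup can be faked in tests.
    """
    return {
        "driver (nvidia-smi)": which("nvidia-smi") is not None,
        "toolkit (nvcc)": which("nvcc") is not None,
    }


if __name__ == "__main__":
    for component, found in detect_cuda_tools().items():
        print(f"{component}: {'found' if found else 'not found'}")
```

A tool missing from PATH does not prove that the component is absent (PATH may simply be unset), so treat a "not found" as a hint rather than a verdict.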

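Deciding what you need

Using PyTorch only

If you only use torch, you do not need to install CUDA Toolkit or cuDNN. The graphics driver is enough, because conda or pip takes care of everything else.

The installation order is: NVIDIA Graphics Drivers -> PyTorch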
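
Once the driver and PyTorch are both in place, a short report like the one below can confirm that the PyTorch build sees the GPU. It is a minimal sketch: the function name torch_cuda_report is hypothetical, and the import is wrapped so the snippet still runs on a machine where torch is not installed yet.

```python
def torch_cuda_report():
    """Summarize what the installed PyTorch build can see; torch is optional here."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version the PyTorch package was built with; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        # True only if a usable GPU and driver are visible to PyTorch.
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(torch_cuda_report())
```

If cuda_available is False while built_with_cuda is set, the likely suspects are the driver or a PyTorch build made for a newer CUDA version than the driver supports.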

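Using third-party extensions built on torch

You need to install CUDA Toolkit.

This matters when you install certain third-party extensions built on torch, such as tiny-cuda-nn, nvdiffrast, and simple-knn. If CUDA Toolkit is not installed, torch/utils/cpp_extension.py fails with an error like this: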

File "....../torch/utils/cpp_extension.py", line 1076, in CUDAExtension
	library_dirs += library_paths(cuda=True)
File "....../torch/utils/cpp_extension.py", line 1203, in library_paths
	if (not os.path.exists(_join_cuda_home(lib_dir)) and
File "....../torch/utils/cpp_extension.py", line 2416, in _join_cuda_home
	raise OSError('CUDA_HOME environment variable is not set. '
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.

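The error means that the CUDA install root could not be found. CUDA_HOME can only point to a valid location once CUDA Toolkit is installed.

This does not matter when you only use PyTorch, because the PyTorch package already bundles everything it needs and does not rely on a CUDA environment variable. Now that we want to install other extensions, though, the problem has to be solved.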
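
To check in advance whether an extension build is likely to hit this error, the lookup can be approximated in a few lines. The sketch below is an assumption-laden imitation, not PyTorch's actual code: it tries CUDA_HOME, then CUDA_PATH, then the directory two levels above nvcc, then the conventional /usr/local/cuda link. The function name find_cuda_home and its parameters are invented for this example.

```python
import os
import shutil


def find_cuda_home(env=None, which=shutil.which, exists=os.path.exists):
    """Guess the CUDA Toolkit root, or return None if nothing is found.

    The search order is an approximation chosen for illustration.
    `env`, `which`, and `exists` can be replaced to test without a real install.
    """
    env = os.environ if env is None else env
    for name in ("CUDA_HOME", "CUDA_PATH"):
        if env.get(name):
            return env[name]
    nvcc = which("nvcc")
    if nvcc:
        # .../cuda-11.8/bin/nvcc -> .../cuda-11.8
        return os.path.dirname(os.path.dirname(nvcc))
    if exists("/usr/local/cuda"):
        return "/usr/local/cuda"
    return None


if __name__ == "__main__":
    print(find_cuda_home() or "no CUDA Toolkit found")
```

A result of None suggests that CUDA Toolkit still needs to be installed, or that CUDA_HOME needs to be exported by hand.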

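Deep learning researchers run into such extensions all the time, so I recommend installing CUDA Toolkit right away and letting its installer install the graphics driver as well.

The installation order is therefore: NVIDIA Graphics Drivers (optional, since the CUDA Toolkit installer can install them) -> CUDA Toolkit -> PyTorch -> cuDNN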

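Installing NVIDIA Graphics Drivers (optional)

Preface

The CUDA Toolkit installer can install NVIDIA Graphics Drivers for you, so this step can be skipped entirely. I describe it anyway.

Linux

Method 1: install through the graphical interface (recommended)

After switching to a suitable package mirror, update and upgrade the system. The upgrade is mandatory. Otherwise the NVIDIA driver you install will not take effect, and the next time you reboot into Linux even the graphical interface will fail to come up!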

sudo apt update
sudo apt upgrade

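Install the required dependencies: gcc, g++, and cmake. This is mandatory as well, for the same reason: without them the NVIDIA driver will not take effect, and the graphical interface will fail to come up after the next reboot!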

sudo apt install gcc cmake
sudo apt install g++

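Then download and install the driver directly from the graphical interface.

Method 2: download the file manually and install from the command line (not recommended)

I have not used this method and do not recommend it. If the graphical interface works, there is no reason to take the long way around.

Windows

Method 1: automatic install with GeForce Experience

Download GeForce Experience from the NVIDIA website. Once it is installed, you can download the latest driver from inside the application.

Method 2: manual install

On the same page, search manually for the driver that matches your graphics card model, then download and install it.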

Verifying the installation

nvidia-smi

If you see output similar to the following, NVIDIA Graphics Drivers are installed:

Sat Jan 27 14:35:37 2024   
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
| N/A   35C    P5    23W / 115W |   1320MiB /  8192MiB |     13%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                           
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      3719      G   /usr/lib/xorg/Xorg                489MiB |
|    0   N/A  N/A      3889      G   /usr/bin/gnome-shell               53MiB |
|    0   N/A  N/A      4218    C+G   fantascene-dynamic-wallpaper      406MiB |
|    0   N/A  N/A      8052      G   gnome-control-center                2MiB |
|    0   N/A  N/A      8397      G   ...--variations-seed-version      282MiB |
|    0   N/A  N/A     13242      G   ...RendererForSitePerProcess       59MiB |
|    0   N/A  N/A     47273      G   ...--variations-seed-version       18MiB |
+-----------------------------------------------------------------------------+
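
The two values that matter later in this guide, the driver version and the highest supported CUDA version, both sit in the header line of this output. As a small sketch, they can be pulled out with a regular expression. The function name parse_nvidia_smi_header is made up here, and the pattern assumes the header layout shown above.

```python
import re

# Matches the header line of the table printed by nvidia-smi.
HEADER = re.compile(
    r"Driver Version:\s*(?P<driver>[\d.]+)\s+CUDA Version:\s*(?P<cuda>[\d.]+)"
)


def parse_nvidia_smi_header(text):
    """Return (driver_version, max_cuda_version) as strings, or None if absent."""
    match = HEADER.search(text)
    if match is None:
        return None
    return match.group("driver"), match.group("cuda")


sample = "| NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0     |"
print(parse_nvidia_smi_header(sample))  # ('525.147.05', '12.0')
```

On a real machine the text could come from subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout instead of the sample string.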

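Installing CUDA Toolkit

Checking the graphics driver version

CUDA Toolkit has version requirements for the graphics driver you just installed. The exact requirements are listed in NVIDIA's documentation. As of January 2024 they were as follows:

[Screenshot: driver version requirements for each CUDA Toolkit release]

If you skipped the driver installation step, simply download the latest CUDA Toolkit; it will install a matching graphics driver along with it.

If you have already installed a graphics driver, run the following command to check its version: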

nvidia-smi

The Driver Version field in the header shows my version, 525.147.05:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0     |
+-----------------------------------------------------------------------------+

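Linux

Go to the official website and pick a suitable version, then select the options that match your machine step by step.

Once the selection is complete you should see a page like the one below. Follow the steps given there one by one.

[Screenshot: the download page after making the selections]

The official instructions offer three installation methods. I tried the second one first; its steps are: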

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda

The last command, sudo apt-get -y install cuda, then failed with the following error:

han@ASUS-TUF-Gaming-F15-FX507ZR:~$ sudo apt-get -y install cuda
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 libnvidia-common-525 : Conflicts: libnvidia-common
 libnvidia-common-545 : Conflicts: libnvidia-common
 nvidia-kernel-common-525 : Conflicts: nvidia-kernel-common
 nvidia-kernel-common-545 : Conflicts: nvidia-kernel-common
E: Error, pkgProblemResolver::Resolve generated breaks, this may be caused by held packages.

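According to the message, I already have libnvidia-common-525 and nvidia-kernel-common-525 installed, so libnvidia-common and nvidia-kernel-common cannot be installed alongside them, and the existing packages would have to be replaced. In theory there are two ways to solve this:

  1. Replace the packages
sudo apt-get remove libnvidia-common-525 nvidia-kernel-common-525
sudo apt-get install libnvidia-common nvidia-kernel-common
  2. Drop apt and install with aptitude
sudo aptitude install cuda

I tried neither of them and switched to another installation method from the official website instead: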

wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
sudo sh cuda_11.8.0_520.61.05_linux.run

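Running this .run file first unpacks for a while, so be patient. An installation wizard then appears. Accept the prompts until you reach this screen:

[Screenshot: the installer's component selection screen]

  • Note 1: if you skipped the driver installation, tick the first entry, Driver, here. I had already installed the driver, so I left Driver unticked and went on with the installation.
  • Note 2: ticking Kernel Objects here makes the installation fail. I do not know why yet; it may be because a graphics driver was already installed. In any case, do not tick Kernel Objects.

During the installation that followed choosing Install, I also got an error saying that dkms was not installed. I ran sudo apt install dkms and tried again, and this time the installation succeeded with the following summary: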

===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-11.8/

Please make sure that
 -   PATH includes /usr/local/cuda-11.8/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-11.8/lib64, or, add /usr/local/cuda-11.8/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-11.8/bin
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 520.00 is required for CUDA 11.8 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run --silent --driver

Logfile is /var/log/cuda-installer.log

As the summary asks, add the environment variables:

echo 'export LD_LIBRARY_PATH="/usr/local/cuda/lib64:$LD_LIBRARY_PATH"' >> ~/.bashrc
echo 'export PATH="/usr/local/cuda/bin:$PATH"' >> ~/.bashrc

The installation is now complete.
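
To double-check that a new shell really picked up both settings, the two variables can be inspected programmatically. This is a rough sketch for Linux only; the function name missing_cuda_paths is hypothetical, and /usr/local/cuda is the link used in the commands above.

```python
import os


def missing_cuda_paths(env=None, cuda_root="/usr/local/cuda"):
    """Return the entries the installer summary asks for that are not set yet.

    The result maps a variable name to the directory that is missing from it.
    An empty dict means both PATH and LD_LIBRARY_PATH look fine.
    """
    env = os.environ if env is None else env
    wanted = {
        "PATH": f"{cuda_root}/bin",
        "LD_LIBRARY_PATH": f"{cuda_root}/lib64",
    }
    missing = {}
    for variable, directory in wanted.items():
        # Linux separates entries with ':'; ignore a trailing slash on entries.
        entries = [e.rstrip("/") for e in env.get(variable, "").split(":")]
        if directory not in entries:
            missing[variable] = directory
    return missing


if __name__ == "__main__":
    print(missing_cuda_paths() or "PATH and LD_LIBRARY_PATH both include CUDA")
```

Remember that ~/.bashrc is only read by new shells, so run the check in a freshly opened terminal.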

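Windows

Windows is simpler. Go to the official website, pick a suitable version, download the exe installer, and follow the wizard.

  • Note: as on Linux, tick or untick the graphics driver depending on whether you have already installed one.

The environment variables are set automatically, so nothing needs to be configured by hand. The installer adds the two environment variables CUDA_PATH_V11_8 and CUDA_PATH:

CUDA_PATH_V11_8=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8
CUDA_PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8

The installer also adds the following two entries to the Path environment variable: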

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\libnvvp

Verifying the installation

Open a new terminal and run:

nvcc -V

You should see output similar to this:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
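
When several toolkits are installed, it helps to confirm programmatically which release the nvcc on PATH belongs to. The following sketch extracts it from the "release" line of the output above; the function name parse_nvcc_release is made up for this example.

```python
import re


def parse_nvcc_release(text):
    """Return the toolkit release as a (major, minor) tuple, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = "Cuda compilation tools, release 11.8, V11.8.89"
print(parse_nvcc_release(sample))  # (11, 8)
```

Returning integers rather than a string keeps later comparisons numeric, so that for example 11.10 would sort after 11.8.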

Switching versions

Linux

The cuda symbolic link lives in /usr/local/. List the directory to see it:

ls -l /usr/local/

You should see output similar to this:

han@ASUS-TUF-Gaming-F15-FX507ZR:~$ ls -l /usr/local/
total 40
lrwxrwxrwx  1 root root   21  1月 27 16:43 cuda -> /usr/local/cuda-11.8/
drwxr-xr-x 17 root root 4096  1月 27 16:44 cuda-11.8
drwxr-xr-x  2 root root 4096  8月  9  2022 etc
drwxr-xr-x  2 root root 4096  8月  9  2022 games
drwxr-xr-x  2 root root 4096  8月  9  2022 include
drwxr-xr-x  2 root root 4096  1月 27 16:38 kernelobjects
drwxr-xr-x  3 root root 4096  1月 22 15:26 lib
lrwxrwxrwx  1 root root    9  1月 22 14:10 man -> share/man
drwxr-xr-x  3 root root 4096  1月 23 21:52 Qt-5.6.3
drwxr-xr-x  2 root root 4096  8月  9  2022 sbin
drwxr-xr-x  8 root root 4096  1月 23 22:09 share
drwxr-xr-x  2 root root 4096  8月  9  2022 src

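The cuda link currently points to the cuda-11.8 that I just installed. One machine can hold several CUDA Toolkit versions, and switching between them only requires changing this symbolic link.

If I also had CUDA Toolkit 12.0 installed, I could switch to it with:

sudo ln -snf /usr/local/cuda-12.0/ /usr/local/cuda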
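
The same switch can be scripted. The sketch below repoints a cuda link at a chosen cuda-<version> directory by creating a temporary link and renaming it over the old one. The function name switch_cuda is hypothetical, the directory layout is the one shown above, and on a real system the call would need root privileges because /usr/local belongs to root. The demo therefore runs in a throwaway directory.

```python
import os
import tempfile


def switch_cuda(version, root="/usr/local"):
    """Point <root>/cuda at <root>/cuda-<version> and return the new target."""
    target = os.path.join(root, f"cuda-{version}")
    if not os.path.isdir(target):
        raise FileNotFoundError(f"{target} is not installed")
    link = os.path.join(root, "cuda")
    temporary = link + ".new"
    if os.path.lexists(temporary):
        os.remove(temporary)
    os.symlink(target, temporary)
    # Renaming over the old link swaps it in a single step.
    os.replace(temporary, link)
    return os.readlink(link)


if __name__ == "__main__":
    # Demonstrate in a temporary directory instead of touching /usr/local.
    with tempfile.TemporaryDirectory() as demo_root:
        for v in ("11.8", "12.0"):
            os.mkdir(os.path.join(demo_root, f"cuda-{v}"))
        print(switch_cuda("11.8", demo_root))
        print(switch_cuda("12.0", demo_root))
```

If cuda happens to be a real directory rather than a link, the rename fails instead of overwriting it, which is a useful safety net.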

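Windows

Change the CUDA_PATH environment variable that the installer added so that it points to the corresponding directory:

CUDA_PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.0

Then change the two entries that the installer added to the Path environment variable: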

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.0\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.0\libnvvp

Uninstalling CUDA Toolkit on Linux

The summary printed at the end of the installation already names the uninstaller:

To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-11.8/bin

So a command like the following is all it takes (adjust the version number to yours):

sudo /usr/local/cuda-11.8/bin/cuda-uninstaller

Installing PyTorch

Checking the CUDA version supported by the graphics driver

The same command again:

nvidia-smi

The CUDA Version field in the header shows the highest CUDA version my driver supports, 12.0:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0     |
+-----------------------------------------------------------------------------+

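Downloading PyTorch

Open the PyTorch website and run the install command that matches your environment. Pay particular attention to the highest CUDA version that your graphics driver supports: the CUDA version of the PyTorch build you pick must not exceed it. For example, my driver supports up to CUDA 12.0, so I chose the CUDA 11.8 build.

You can of course install with conda instead.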
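
The selection rule can be written down as a short sketch: among the CUDA builds on offer, take the newest one that does not exceed what the driver supports. The function names and the list of offered builds are assumptions for illustration; read the real options from the PyTorch website.

```python
def version_tuple(text):
    """'11.8' -> (11, 8), so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def pick_cuda_build(driver_max, offered):
    """Return the newest offered CUDA build that does not exceed driver_max."""
    usable = [v for v in offered if version_tuple(v) <= version_tuple(driver_max)]
    if not usable:
        return None
    return max(usable, key=version_tuple)


# The offered list is hypothetical; the driver maximum is the 12.0 from this guide.
print(pick_cuda_build("12.0", ["11.8", "12.1"]))  # 11.8
```

Comparing tuples of integers avoids the trap of string comparison, where "11.10" would sort before "11.8".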

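Installing cuDNN

cuDNN has requirements on the CUDA version that is already installed. Go to the official website and pick a suitable version. The page looks like this:

[Screenshot: the cuDNN download page]

Download it. The installation steps are described in the official documentation.

Linux

Following the official documentation, install the dependency first: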

sudo apt-get install zlib1g

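Method 1: download and extract the tar archive (recommended)

After downloading, extract the archive:

tar -xvf cudnn-linux-*-archive.tar.xz

Then copy the files and set their permissions, and you are done: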

sudo cp cudnn-*-archive/include/cudnn*.h /usr/local/cuda/include 
sudo cp -P cudnn-*-archive/lib/libcudnn* /usr/local/cuda/lib64 
sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*

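Method 2: install from the deb package (not recommended)

Installing from the deb package is actually more involved.

  1. After downloading, install the local repository package with dpkg:
sudo dpkg -i cudnn-local-repo-*.deb
  2. Import the GPG key and refresh the package index:
sudo cp /var/cudnn-local-repo-*/cudnn-local-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
  3. Then install the packages:
sudo apt-get install libcudnn8=x.x.x.x-1+cudaX.Y
sudo apt-get install libcudnn8-dev=x.x.x.x-1+cudaX.Y
sudo apt-get install libcudnn8-samples=x.x.x.x-1+cudaX.Y

Replace x.x.x.x and X.Y with your own cuDNN and CUDA versions.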

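Windows

On Windows the only option is to extract the archive. Download and unzip the zip file, then copy the library files as follows:

  1. Copy bin\cudnn*.dll to C:\Program Files\NVIDIA\CUDNN\v8.x\bin
  2. Copy include\cudnn*.h to C:\Program Files\NVIDIA\CUDNN\v8.x\include
  3. Copy lib\cudnn*.lib to C:\Program Files\NVIDIA\CUDNN\v8.x\lib

Then edit the PATH environment variable and add one entry: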

C:\Program Files\NVIDIA\CUDNN\v8.x\bin

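Verifying the installation

Run /usr/local/cuda/extras/demo_suite/deviceQuery, which checks that the driver and the CUDA runtime can see the GPU. You should see output similar to this: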

./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA GeForce RTX 3070 Laptop GPU"
  CUDA Driver Version / Runtime Version          12.0 / 11.8
  CUDA Capability Major/Minor version number:    8.6
  Total amount of global memory:                 7952 MBytes (8337752064 bytes)
  (40) Multiprocessors, (128) CUDA Cores/MP:     5120 CUDA Cores
  GPU Max Clock rate:                            1560 MHz (1.56 GHz)
  Memory Clock rate:                             7001 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                 4194304 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.0, CUDA Runtime Version = 11.8, NumDevs = 1, Device0 = NVIDIA GeForce RTX 3070 Laptop GPU
Result = PASS

Run /usr/local/cuda/extras/demo_suite/bandwidthTest. You should see output similar to this:

[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: NVIDIA GeForce RTX 3070 Laptop GPU
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			12499.4

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			12843.0

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			384586.5

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
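
The two samples above exercise the driver and the CUDA runtime. To also confirm that the cuDNN headers were copied into place, one option is to read the version macros from the cuDNN header. The sketch below assumes a cuDNN 8.x layout, where the macros CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL live in cudnn_version.h under /usr/local/cuda/include. The function name parse_cudnn_version and the sample numbers are made up for illustration.

```python
import re


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of cudnn_version.h, or None."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"#define\s+{name}\s+(\d+)", header_text)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


# Made-up header excerpt for illustration; the numbers are not from this guide.
sample = """
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 9
#define CUDNN_PATCHLEVEL 7
"""
print(parse_cudnn_version(sample))  # (8, 9, 7)
```

On a real system the text would come from reading /usr/local/cuda/include/cudnn_version.h; a result of None means the header is missing or has a different layout.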

This article was published with OpenWrite, a platform for posting one blog article to multiple sites.
