Fixing an inplace-abn error: ImportError: libcudart.so.9.0: cannot open shared object file: No such file or dir

Posted by MichaelToLearn on 2020-12-04

Fixing the inplace-abn error

The error message is as follows:

Traceback (most recent call last):
  File "train.py", line 14, in <module>
    from unet import UNet
  File "/data3/yuechen/new/pytorch_unet/unet/__init__.py", line 1, in <module>
    from .unet_model import UNet
  File "/data3/yuechen/new/pytorch_unet/unet/unet_model.py", line 5, in <module>
    from .unet_parts import *
  File "/data3/yuechen/new/pytorch_unet/unet/unet_parts.py", line 7, in <module>
    from inplace_abn import InPlaceABN, InPlaceABNSync
  File "/data3/yuechen/software/anaconda3/envs/unet10/lib/python3.6/site-packages/inplace_abn/__init__.py", line 1, in <module>
    from .abn import ABN, InPlaceABN, InPlaceABNSync
  File "/data3/yuechen/software/anaconda3/envs/unet10/lib/python3.6/site-packages/inplace_abn/abn.py", line 8, in <module>
    from .functions import inplace_abn, inplace_abn_sync
  File "/data3/yuechen/software/anaconda3/envs/unet10/lib/python3.6/site-packages/inplace_abn/functions.py", line 8, in <module>
    from . import _backend
ImportError: libcudart.so.9.0: cannot open shared object file: No such file or directory

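The cause: the 209 machine has CUDA 10.0 installed, but for some reason a CUDA 9.0 dependency got pulled in here. The inplace-abn project page says 10.0 is supported, so I am not sure why this happened.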
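To make the mismatch explicit, the runtime version the compiled extension asks for can be read straight out of the error text and compared with the installed toolkit. The following is a minimal sketch, not part of inplace-abn: `cudart_version_from_error` is a hypothetical helper name, and the `nvcc --version` check only runs if `nvcc` happens to be on `PATH`.

```shell
# Hypothetical helper: extract the CUDA runtime version requested by a
# "libcudart.so.X.Y: cannot open shared object file" message.
cudart_version_from_error() {
    # Reads the error text from stdin and prints the version, e.g. "9.0".
    sed -n 's/.*libcudart\.so\.\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

msg='ImportError: libcudart.so.9.0: cannot open shared object file: No such file or directory'
wanted=$(printf '%s\n' "$msg" | cudart_version_from_error)
echo "extension was built against CUDA runtime: $wanted"

# Compare with the toolkit that is actually installed (skipped when nvcc
# is not available on this machine).
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version | tail -n 2
fi
```

If the version printed from the error differs from the one `nvcc` reports, the extension was compiled against a different toolkit than the one in use.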

Attempt 1 (failed)

Try fixing it directly by adding a directory of CUDA 9.0 files to the library path:

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/data3/yuechen/new/cuda-9.0-files/lib64"

Attempt 2 (solved)

Set CUDA_HOME

export CUDA_HOME=/usr/local/cuda-10.0

Attempt 3 (failed)

Also point LD_LIBRARY_PATH at the 10.0 version

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda-10.0/lib64/"
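A likely reason this does not help: the already-compiled extension asks for the file name libcudart.so.9.0, and a CUDA 10.0 lib64 directory ships libcudart.so.10.0 instead, so adding it to the path cannot satisfy the request. The sketch below checks which LD_LIBRARY_PATH entries actually contain a given library file. `dirs_with_lib` is a hypothetical helper name, and it only inspects LD_LIBRARY_PATH, not the other places the dynamic loader searches (such as the ldconfig cache).

```shell
# Hypothetical helper: print every directory in a colon-separated search
# path that contains the named library file.
dirs_with_lib() (
    lib=$1
    path=$2
    set -f          # no glob expansion while splitting the path
    IFS=:           # split on colons; the subshell keeps this change local
    for dir in $path; do
        if [ -n "$dir" ] && [ -e "$dir/$lib" ]; then
            printf '%s\n' "$dir"
        fi
    done
)

hits=$(dirs_with_lib libcudart.so.9.0 "${LD_LIBRARY_PATH:-}")
if [ -n "$hits" ]; then
    echo "libcudart.so.9.0 found in:"
    echo "$hits"
else
    echo "libcudart.so.9.0 is not in any LD_LIBRARY_PATH directory"
fi
```

Running the same check with libcudart.so.10.0 shows whether the 10.0 runtime is reachable, which is what a rebuilt extension would need.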

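Attempt 4 (succeeded)

This module has to be compiled, and its installation does not automatically check which CUDA version you have, so CUDA_HOME must be set manually before compiling. If the module is already installed, follow the steps below: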

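# Uninstall the library
pip uninstall inplace-abn
# Clear the pip cache so the previously built package is not reused
rm -r ~/.cache/pip
# Set CUDA_HOME to the installed CUDA version
export CUDA_HOME=/usr/local/cuda-10.0
# Reinstall, which recompiles the extension
pip install inplace-abn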
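Before running the reinstall step, it can be worth confirming that CUDA_HOME really points at a toolkit root. The following is a minimal sketch, assuming the build looks for the compiler at $CUDA_HOME/bin/nvcc, as PyTorch's extension builder does; `cuda_home_ok` is a hypothetical helper name.

```shell
# Hypothetical preflight check: succeed only if the given directory looks
# like a CUDA toolkit root, i.e. it contains an executable bin/nvcc.
cuda_home_ok() {
    [ -n "$1" ] && [ -x "$1/bin/nvcc" ]
}

if cuda_home_ok "${CUDA_HOME:-}"; then
    echo "CUDA_HOME=$CUDA_HOME looks usable"
else
    echo "CUDA_HOME is unset or has no bin/nvcc: ${CUDA_HOME:-<unset>}"
fi
```

If the check fails, fixing CUDA_HOME first avoids rebuilding against whichever toolkit the build would otherwise find.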
