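A few small pitfalls when converting PyTorch to TensorRT ("tensorrt engine set up failed")

Posted by 钢之炼丹术师 on 2024-05-19

Original URL: https://www.cnblogs.com/algorithmSpace/p/18200236

The CSDN blog migration failed, so I exported the markdown by hand and imported it into cnblogs.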

1. Version mismatch

[E] [TRT] Layer:Where_51's output can not be used as shape tensor.
[E] [TRT] Network validation failed.
[E] Engine creation failed.
[E] Engine set up failed.

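This is actually caused by a mismatch between the PyTorch and TensorRT versions: my TensorRT is 7.0, which should be paired with PyTorch 1.4, but I had used 1.7.

So the weights have to be loaded with 1.7, saved again in the old format, and then exported to ONNX from a PyTorch 1.4 environment: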

import torch
# YoloBody is the author's YOLOv4-tiny model class; its import is not shown in the original post.


def main():
    model_onnx_path = "yolov4tiny.onnx"

    # model = torch.hub.load('mateuszbuda/brain-segmentation-pytorch', 'unet',
    #                        in_channels=3, out_channels=1, init_features=32, pretrained=True)
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = YoloBody(3, 12).to(device)
    dummy_input = torch.randn(1, 3, 416, 416, device=device)

    # Step 1 (PyTorch 1.7 environment): load the weights
    state_dict = torch.load('logs/Epoch120-Total_Loss0.5324-Val_Loss0.8735.pth', map_location=device)
    model.load_state_dict(state_dict)
    # Save them in a format that PyTorch 1.4 can read
    torch.save(model.state_dict(), 'logs/for_onnx.pth', _use_new_zipfile_serialization=False)

    # Step 2: switch the Python interpreter to the torch 1.4 environment, reload the weights,
    # and export the ONNX file that matches PyTorch 1.4
    state_dict = torch.load('logs/for_onnx.pth', map_location=device)
    model.load_state_dict(state_dict)
    model.train(False)

    inputs = ['input_1']
    outputs = ['output_1', 'output_2']
    # Not passed to the export below (dynamic_axes=None), so the exported shapes are static
    dynamic_axes = {'input_1': {0: 'batch'}, 'output_1': {0: 'batch'}}
    torch.onnx.export(model,
                      dummy_input,
                      model_onnx_path,
                      export_params=True,
                      opset_version=11,
                      do_constant_folding=True,
                      input_names=inputs, output_names=outputs,
                      dynamic_axes=None)

Only an ONNX file produced this way could be converted by TensorRT.
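To tell which format a given checkpoint is in without loading it, one heuristic is to test whether the file is a zip archive. As far as I know, torch.save has written a zip archive by default since PyTorch 1.6, while `_use_new_zipfile_serialization=False` produces the older non-zip format. This is a minimal sketch under that assumption; the function name is hypothetical and not part of PyTorch.

```python
import zipfile


def uses_zipfile_serialization(checkpoint_path):
    """Heuristic: True if the checkpoint file looks like a zip archive.

    Assumption: zip archive == newer torch.save format (PyTorch >= 1.6),
    anything else == legacy format that older PyTorch versions can read.
    """
    return zipfile.is_zipfile(checkpoint_path)
```

Calling it on 'logs/for_onnx.pth' after the re-save step should return False if the legacy format was actually written.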

Then convert it with the tools from TensorRT OSS.

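I recommend turning on --verbose. The conversion is slow, and watching the log scroll by is reassuring; otherwise you stare at the screen, assume it has hung, and start to panic.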
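The exact command is not preserved in this post. As a rough sketch, assuming the tool in question is trtexec (the "Engine set up failed" line in the log above looks like its output), the call could be assembled as below. `--onnx`, `--saveEngine` and `--verbose` are trtexec options; the helper name and the engine file name are hypothetical.

```python
def build_trtexec_command(onnx_path, engine_path, verbose=True):
    """Assemble a trtexec command line as a list of arguments.

    Assumption: trtexec is on PATH; this only builds the command, it does not run it.
    """
    cmd = ["trtexec", f"--onnx={onnx_path}", f"--saveEngine={engine_path}"]
    if verbose:
        # Prints detailed build logs so a slow conversion does not look frozen
        cmd.append("--verbose")
    return cmd


command = build_trtexec_command("yolov4tiny.onnx", "yolov4tiny.trt")
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where TensorRT is installed.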

2. Upsample scale problem

[5] Assertion failed: ctx->tensors().count(inputName)

![[output/attachments/d697b05ccbc0cd8ef1baff4f153b9024_MD5.png]]

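In YOLO's upsampling stage, exporting from PyTorch with opset 11 adds a Constant node inside the upsample layer, so the TensorRT conversion fails. I tried the method mentioned in the post "pytorch 經 onnx 使用 TensorRT 部署轉換踩坑記錄" (pitfall notes on deploying PyTorch through ONNX with TensorRT), but it had no effect.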

![[output/attachments/95e6ae1c7b9192edb8b6674198bb354e_MD5.png]]

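After trying several versions of PyTorch and ONNX, the upsample problem was still there. In the end I followed the implementation at https://github.com/Tianxiaomo/pytorch-YOLOv4: at inference time it does not call torch's own interpolation function but uses a hand-written replacement, and with that the model was converted to TensorRT successfully.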

import torch.nn as nn
import torch.nn.functional as F


class Upsample(nn.Module):
    def __init__(self):
        super(Upsample, self).__init__()

    def forward(self, x, target_size, inference=False):
        assert (x.data.dim() == 4)
        # target_size is (N, C, H_out, W_out); only H_out and W_out are used

        if inference:
            # Nearest-neighbour upsampling built from view/expand instead of F.interpolate:
            # insert two size-1 axes, repeat along them, then fold them into H and W
            return x.view(x.size(0), x.size(1), x.size(2), 1, x.size(3), 1).\
                    expand(x.size(0), x.size(1), x.size(2), target_size[2] // x.size(2), x.size(3), target_size[3] // x.size(3)).\
                    contiguous().view(x.size(0), x.size(1), target_size[2], target_size[3])
        else:
            return F.interpolate(x, size=(target_size[2], target_size[3]), mode='nearest')
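To see why the view/expand path can stand in for nearest-neighbour interpolation, here is a small worked example on a 2x2 feature map with an integer scale of 2. It is only an illustration of the indexing trick, with variable names of my own choosing, not code from the referenced repository.

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[[[1., 2.], [3., 4.]]]])  # shape (N=1, C=1, H=2, W=2)
n, c, h, w = x.shape
scale = 2

# Add a size-1 axis after H and after W, then repeat each pixel along both new axes
expanded = x.view(n, c, h, 1, w, 1).expand(n, c, h, scale, w, scale)
# Fold the repeat axes back into H and W
upsampled = expanded.contiguous().view(n, c, h * scale, w * scale)

reference = F.interpolate(x, size=(h * scale, w * scale), mode="nearest")

print(upsampled[0, 0].tolist())
# [[1.0, 1.0, 2.0, 2.0], [1.0, 1.0, 2.0, 2.0], [3.0, 3.0, 4.0, 4.0], [3.0, 3.0, 4.0, 4.0]]
print(torch.equal(upsampled, reference))  # True
```

The equivalence only holds when the output size is an integer multiple of the input size, which is the case for the 2x upsampling used here.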

3. Data type error

Unsupported ONNX data type: DOUBLE (2)

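The change above introduces a division, which makes TensorRT see a double-precision value; the Cast node in the figure below is the culprit. Oddly, the original code from that repository does not trigger the problem, but my own model does.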

![[output/attachments/20922cff4c9b9fce378baf949d42b34f_MD5.png]]

According to https://github.com/onnx/onnx-tensorrt/issues/400#issuecomment-730240546:

![[output/attachments/960f5b584c8b980e4c9713b43a46a8a4_MD5.png]]

I tried it, but it did not help.

To fix the problem, simply replace target_size[3] // x.size(3) with its result, 2. With that, the conversion succeeded.
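One way to apply that fix is to keep the scale as a plain Python integer on the module, so no division on tensor sizes happens during export. The sketch below assumes a fixed 2x upsampling; the class name and the `scale` parameter are hypothetical, and I have not verified the resulting ONNX graph against TensorRT.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FixedScaleUpsample(nn.Module):
    """Nearest-neighbour upsampling with a hard-coded integer scale (assumed to be 2)."""

    def __init__(self, scale=2):
        super().__init__()
        self.scale = int(scale)  # plain Python int, not derived from tensor sizes

    def forward(self, x, inference=False):
        n, c, h, w = x.size()
        s = self.scale
        if inference:
            # Same view/expand trick as above, but with a constant repeat factor
            return (x.view(n, c, h, 1, w, 1)
                     .expand(n, c, h, s, w, s)
                     .contiguous()
                     .view(n, c, h * s, w * s))
        return F.interpolate(x, scale_factor=s, mode="nearest")


up = FixedScaleUpsample(scale=2)
feature_map = torch.randn(1, 4, 13, 13)
out = up(feature_map, inference=True)
print(tuple(out.shape))  # (1, 4, 26, 26)
```

Compared with the original signature, this variant drops target_size entirely, which only works if every upsample layer in the model uses the same known factor.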
