Shared by 磐創AI

Source | OpenCV學堂

Author | gloomyfish

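Hello everyone. This article takes a detailed look at torchvision, one of the most important components of PyTorch. It bundles common datasets, model architectures with pretrained weight files, common image transforms, and training support for computer vision tasks, which makes it a very handy tool for transfer learning in PyTorch. This article shows how to use torchvision's pretrained ResNet50 model for image classification.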

Models

The torchvision.models package contains many common base model architectures, including:

AlexNet
VGG
ResNet
SqueezeNet
DenseNet
Inception v3
GoogLeNet
ShuffleNet v2
MobileNet v2
ResNeXt
Wide ResNet
MNASNet

Here I chose ResNet50, a base network trained on ImageNet, for image classification. The model is downloaded and loaded as follows:

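import torchvision
from torchvision import transforms

# Download the ImageNet-pretrained weights, switch to inference mode, move to the GPU.
# Newer torchvision releases (0.13+) deprecate pretrained=True in favour of
# weights=torchvision.models.ResNet50_Weights.IMAGENET1K_V1.
model = torchvision.models.resnet50(pretrained=True).eval().cuda()

# Standard ImageNet preprocessing for PIL images.
tf = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]
    )])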
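To make the last two steps of this pipeline concrete, here is a minimal NumPy sketch of what ToTensor followed by Normalize does to an image that has already been resized. The helper name `to_normalized_chw` and the constant names are hypothetical and not part of torchvision; the input is assumed to be an HxWx3 uint8 array in RGB order.

```python
import numpy as np

# ImageNet channel statistics in RGB order, the same values passed to Normalize.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def to_normalized_chw(rgb_image):
    """Convert an HxWx3 uint8 RGB image to a normalized 3xHxW float32 array.

    Hypothetical helper that mirrors ToTensor() followed by Normalize(mean, std).
    """
    scaled = rgb_image.astype(np.float32) / 255.0          # [0, 255] -> [0, 1]
    normalized = (scaled - IMAGENET_MEAN) / IMAGENET_STD   # broadcasts over the channel axis
    return np.ascontiguousarray(normalized.transpose(2, 0, 1))  # HWC -> CHW


# A mid-gray image stands in for a real, already resized photo.
dummy = np.full((224, 224, 3), 128, dtype=np.uint8)
chw = to_normalized_chw(dummy)
print(chw.shape, chw.dtype)
```

Adding a leading batch axis (for example `chw[None]`) gives the 1x3x224x224 layout that the network expects.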

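Image classification with the model

First, load the ImageNet class labels, which are needed at the end to display the predicted class as text. Then preprocess the input image, run the ResNet50 model to get a prediction, decode the prediction, and draw the label text on the image. The complete code is as follows: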

import cv2 as cv
import numpy as np
import torch

# ImageNet class names, one per line, in class-index order.
with open('imagenet_classes.txt') as f:
    labels = [line.strip() for line in f]

src = cv.imread("D:/images/space_shuttle.jpg")  # aeroplane.jpg
image = cv.resize(src, (224, 224))
# OpenCV loads images as BGR; the mean/std below are in RGB order, so convert first.
image = cv.cvtColor(image, cv.COLOR_BGR2RGB)
image = np.float32(image) / 255.0
image -= np.float32([0.485, 0.456, 0.406])
image /= np.float32([0.229, 0.224, 0.225])
image = image.transpose((2, 0, 1))  # HWC -> CHW
input_x = torch.from_numpy(image).unsqueeze(0)  # add the batch dimension
print(input_x.size())
with torch.no_grad():  # inference only, no gradients needed
    pred = model(input_x.cuda())  # model is the ResNet50 loaded earlier
pred_index = torch.argmax(pred, 1).cpu().numpy()
print(pred_index)
print("current predict class name : %s" % labels[pred_index[0]])
cv.putText(src, labels[pred_index[0]], (50, 50), cv.FONT_HERSHEY_SIMPLEX, 1.0, (0, 0, 255), 2)
cv.imshow("input", src)
cv.waitKey(0)
cv.destroyAllWindows()

The result is shown below:

[Figure: Easy PyTorch: image classification with ResNet50]
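The listing above reports only the single best class via argmax. The network outputs raw scores (logits), so a softmax is needed to read them as probabilities. The sketch below shows one possible way to decode the top few classes; the function name `top_k_predictions` and the toy labels are illustrative assumptions, not part of the original code.

```python
import numpy as np


def top_k_predictions(logits, labels, k=5):
    """Return the k most likely (label, probability) pairs for one image.

    Hypothetical helper: applies a softmax to the raw scores, then sorts.
    """
    logits = np.asarray(logits, dtype=np.float64).ravel()
    shifted = logits - logits.max()               # subtract the max for numerical stability
    probs = np.exp(shifted) / np.exp(shifted).sum()
    order = np.argsort(probs)[::-1][:k]           # indices of the k largest probabilities
    return [(labels[i], float(probs[i])) for i in order]


# Toy scores and labels stand in for the 1000-class ImageNet output.
toy_labels = ["cat", "dog", "space shuttle"]
toy_logits = [1.0, 2.0, 5.0]
for name, p in top_k_predictions(toy_logits, toy_labels, k=2):
    print(f"{name}: {p:.3f}")
```

With the variables from the listing above, the call would look like `top_k_predictions(pred.cpu().numpy(), labels)`.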

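ONNX export support

Almost all models in torchvision can be converted to ONNX format, and the converted models are supported by the OpenCV DNN module. This makes it easy to export a torchvision model to ONNX and call it from OpenCV DNN. First convert the model to ONNX, which can be done directly with torch.onnx.export (if you are not sure how, see the earlier examples in this series). The code for calling the converted model through OpenCV DNN is as follows: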

import cv2 as cv
import numpy as np

with open('imagenet_classes.txt') as f:
    labels = [line.strip() for line in f]
net = cv.dnn.readNetFromONNX("resnet.onnx")
src = cv.imread("D:/images/messi.jpg")  # aeroplane.jpg
image = cv.resize(src, (224, 224))
# Convert BGR to RGB so the channels match the RGB-ordered mean/std.
image = cv.cvtColor(image, cv.COLOR_BGR2RGB)
image = np.float32(image) / 255.0
image -= np.float32([0.485, 0.456, 0.406])
image /= np.float32([0.229, 0.224, 0.225])
# The image is already scaled, normalized and in RGB order,
# so no extra scale factor, mean subtraction or channel swap is applied here.
blob = cv.dnn.blobFromImage(image, 1.0, (224, 224), (0, 0, 0), False)
net.setInput(blob)
probs = net.forward()  # class scores, shape (1, 1000)
index = np.argmax(probs)
cv.putText(src, labels[index], (50, 50), cv.FONT_HERSHEY_SIMPLEX, 1.0, (0, 0, 255), 2)
cv.imshow("input", src)
cv.waitKey(0)
cv.destroyAllWindows()

The result matches the figure above, so it is not repeated here.
