[tf.keras] Loading a pretrained AlexNet model in tf.keras

Posted by wuliytTaotao on 2019-05-29

Original article: https://www.cnblogs.com/wuliytTaotao/p/10942877.html

The pretrained models for tf.keras all live under 'tensorflow.python.keras.applications'. In tensorflow 1.10, the available pretrained models are:

DenseNet121, DenseNet169, DenseNet201, InceptionResNetV2, InceptionV3, MobileNet, NASNetLarge, NASNetMobile, ResNet50, VGG16, VGG19, Xception.

After a long search, it turns out that keras does not ship a pretrained AlexNet...

So this post presents a way to import a pretrained model from another framework (such as PyTorch), using AlexNet as the example.

Exporting the model parameters from PyTorch

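One thing to be clear about first: when two models have the same structure, importing the parameters is all it takes to reproduce the model. So the task is to export the pretrained parameters from PyTorch and load them with keras.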
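
To make "importing the parameters" concrete: the two frameworks store the same weights with different axis orders, so a converter has to transpose them along the way. The following is a minimal NumPy sketch, assuming PyTorch's `Conv2d` layout (out, in, kH, kW) and `Linear` layout (out, in) versus Keras's `Conv2D` layout (kH, kW, in, out) and `Dense` layout (in, out). The helper names are hypothetical, and this is not MMdnn's actual implementation.

```python
import numpy as np


def conv_torch_to_keras(w):
    """Reorder a conv kernel from (out_ch, in_ch, kH, kW) to (kH, kW, in_ch, out_ch)."""
    return np.transpose(w, (2, 3, 1, 0))


def dense_torch_to_keras(w):
    """Reorder a fully connected weight from (out_features, in_features) to (in, out)."""
    return np.transpose(w, (1, 0))


# Dummy arrays shaped like the first conv layer and the last linear layer
# of the torchvision AlexNet (values are placeholders, not real weights).
conv1_torch = np.zeros((64, 3, 11, 11), dtype=np.float32)
fc3_torch = np.zeros((1000, 4096), dtype=np.float32)

conv1_keras = conv_torch_to_keras(conv1_torch)
fc3_keras = dense_torch_to_keras(fc3_torch)

print(conv1_keras.shape)  # (11, 11, 3, 64)
print(fc3_keras.shape)    # (4096, 1000)
```

The first dense layer after the flatten step needs extra care in any hand-written converter, because PyTorch flattens feature maps in channel-first order while Keras flattens them in channel-last order.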

This calls for a Microsoft project: MMdnn. MMdnn converts models between deep learning frameworks, and I use it here to convert AlexNet (PyTorch to Keras).

Step 0: Set up the environment

Versions that must match:
- PyTorch: 0.4.0 (if another version causes problems, fall back to 0.4.0)

Versions that need not match:
- numpy: 1.14.5

Step 1: Install MMdnn

$ pip3 install mmdnn

For other installation methods, see the MMdnn GitHub page.

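Step 2: Obtain a PyTorch model that stores the full structure and parameters (pth file)

When PyTorch saves a model, it can save either the whole model or only the model's parameters. Both end up in a pth file.

The pth file that mmdnn works on must contain the model structure (see the MMdnn FAQ for details), whereas the pretrained AlexNet in PyTorch stores only the parameters.

The following program produces a pretrained AlexNet pth file that contains both the model structure and the weights: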

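import torch
import torchvision

# Build AlexNet and load the pretrained weights into it
m = torchvision.models.alexnet(pretrained=True)
# Save the whole model object (structure + weights), not just the state_dict
torch.save(m, './alexnet.pth')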

For other models, such as resnet101, the following command directly yields a pretrained model with both structure and weights:

$ mmdownload -f pytorch -n resnet101 -o ./

(Do not use the command above to obtain alexnet.pth: that file contains only the weights and no structure, so the next step fails with "AttributeError: 'collections.OrderedDict' object has no attribute 'state_dict'".)
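
Before running the conversion, it can help to check which kind of pth file is at hand. The sketch below assumes that a weights-only file loads as a dict (an `OrderedDict` state dict) and that a full model exposes a `state_dict` method, which is consistent with the error message quoted above. The function names are hypothetical.

```python
from collections import OrderedDict


def checkpoint_kind(obj):
    """Classify an object returned by torch.load()."""
    if isinstance(obj, dict):  # OrderedDict is a dict subclass
        return "weights-only"
    if hasattr(obj, "state_dict"):
        return "full-model"
    return "unknown"


def inspect_pth(path):
    """Load a pth file and report what it contains."""
    import torch  # imported lazily so checkpoint_kind() works without PyTorch
    return checkpoint_kind(torch.load(path))


# A state dict stand-in is enough to show the weights-only case.
print(checkpoint_kind(OrderedDict(weight=[1.0])))  # weights-only
```

Calling `inspect_pth('./alexnet.pth')` on the file saved earlier should report a full model; a weights-only result means the file has to be loaded into a model and saved again before conversion.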

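Step 3: Export the PyTorch model's parameters and save them to an hdf5 file

Run the following three commands in order. The result is a 'keras_alexnet.h5' file, which is the pretrained weight file that keras can load.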

$ mmtoir -f pytorch -d alexnet --inputShape 3,227,227 -n alexnet.pth
IR network structure is saved as [alexnet.json].
IR network structure is saved as [alexnet.pb].
IR weights are saved as [alexnet.npy].
$ mmtocode -f keras --IRModelPath alexnet.pb --IRWeightPath alexnet.npy --dstModelPath keras_alexnet.py
Using TensorFlow backend.
Parse file [alexnet.pb] with binary format successfully.
Target network code snippet is saved as [keras_alexnet.py].
$ python3 -m mmdnn.conversion.examples.keras.imagenet_test -n keras_alexnet.py -w alexnet.npy --dump keras_alexnet.h5
Using TensorFlow backend.
Keras model file is saved as [keras_alexnet.h5], generated by [keras_alexnet.py.py] and [alexnet.npy].
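
The three commands above can also be driven from a single Python script, which makes it easier to repeat the conversion for another model. This is only a sketch: the flags are copied from the commands shown above, while the function names and the file-naming scheme derived from `name` are assumptions.

```python
import subprocess


def build_commands(name="alexnet", input_shape="3,227,227"):
    """Return the three conversion commands as argument lists."""
    ir_pb, ir_npy = f"{name}.pb", f"{name}.npy"
    code_py, out_h5 = f"keras_{name}.py", f"keras_{name}.h5"
    return [
        ["mmtoir", "-f", "pytorch", "-d", name,
         "--inputShape", input_shape, "-n", f"{name}.pth"],
        ["mmtocode", "-f", "keras", "--IRModelPath", ir_pb,
         "--IRWeightPath", ir_npy, "--dstModelPath", code_py],
        ["python3", "-m", "mmdnn.conversion.examples.keras.imagenet_test",
         "-n", code_py, "-w", ir_npy, "--dump", out_h5],
    ]


def run_all(commands):
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


# Print the commands for review; call run_all(build_commands()) to execute them.
for cmd in build_commands():
    print(" ".join(cmd))
```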

Problems you may run into

AttributeError: 'Conv2d' object has no attribute 'padding_mode'

Solution: This is a PyTorch version problem. It occurs with 1.1.0; falling back to 0.4.0 fixes it.

$ pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --upgrade torch==0.4.0 torchvision==0.2.0

ValueError: Object arrays cannot be loaded when allow_pickle=False

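Solution: Change the numpy version, for example to 1.14.5 as listed in step 0. Newer numpy releases make np.load default to allow_pickle=False, which triggers this error.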

AttributeError: 'collections.OrderedDict' object has no attribute 'state_dict'

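Solution: The pth file contains only the model parameters and not the model structure. Load it in PyTorch, then save a pth file that contains both the structure and the parameters.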
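
The allow_pickle error above can be reproduced without MMdnn. The sketch below assumes the IR weight file is a pickled Python dict saved with `np.save`, which is consistent with the error message; the dict contents and file name are made up for the demonstration.

```python
import os
import tempfile

import numpy as np

# Stand-in for an IR weight file: a dict saved with np.save is stored as a
# pickled object array.
weights = {"conv1": {"weights": np.ones((2, 2), dtype=np.float32)}}

path = os.path.join(tempfile.mkdtemp(), "demo_weights.npy")
np.save(path, weights)

# Loading with pickling disabled fails with the ValueError quoted above.
try:
    np.load(path, allow_pickle=False)
    loaded_without_pickle = True
except ValueError as err:
    loaded_without_pickle = False
    print("ValueError:", err)

# Enabling pickling explicitly restores the dict.
restored = np.load(path, allow_pickle=True).item()
print(sorted(restored.keys()))  # ['conv1']
```

Only enable `allow_pickle=True` for files you created yourself or otherwise trust, since unpickling can execute arbitrary code.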

Verifying the pretrained AlexNet exported from PyTorch

The test images, the code, and the generated keras_alexnet.h5 file are all on a cloud drive: (link: https://pan.baidu.com/s/1TCbSHn5DC7pPIk-0dnbmgg password: 8njp).

import torch
import torchvision
import cv2
import numpy as np

from torch.autograd import Variable

import tensorflow as tf
from tensorflow.keras import layers,regularizers


filename_test = 'data/dog2.png'

img = cv2.imread(filename_test)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Data preprocessing
img = cv2.resize(img, (227, 227))
img = img / 255.0
img = np.reshape(img, (1, 227, 227, 3))
# Normalization: this is the preprocessing used by PyTorch's pretrained AlexNet; see https://pytorch.org/docs/stable/torchvision/models.html
mean = np.array([0.485, 0.456, 0.406]).reshape([1, 1, 1, 3])
std = np.array([0.229, 0.224, 0.225]).reshape([1, 1, 1, 3])
img = (img - mean) / std

# PyTorch
# PyTorch expects channels first (NCHW), whereas Keras uses channels last (NHWC)
img_tmp = np.transpose(img, (0, 3, 1, 2))

model = torchvision.models.alexnet(pretrained=True)

# torch.save(model, './model/alexnet.pth')
model = model.double()
model.eval()

y = model(Variable(torch.tensor(img_tmp)))
# Predicted class index
print(np.argmax(y.detach().numpy()))


# Keras
def get_AlexNet(num_classes=1000, drop_rate=0.5, regularizer_rate=0.01):
    """
    Structure of the pretrained AlexNet as implemented in PyTorch; the filter depths are (64, 192, 384, 256, 256).
    Returns the inputs and outputs of AlexNet.
    """
    inputs = layers.Input(shape=[227, 227, 3])

    conv1 = layers.Conv2D(64, (11, 11), strides=(4, 4), padding='valid', activation='relu')(inputs)

    pool1 = layers.MaxPooling2D(pool_size=(3, 3), strides=(2, 2))(conv1)

    conv2 = layers.Conv2D(192, (5, 5), strides=(1, 1), padding='same', activation='relu')(pool1)

    pool2 = layers.MaxPooling2D(pool_size=(3, 3), strides=(2, 2))(conv2)

    conv3 = layers.Conv2D(384, (3, 3), strides=(1, 1), padding='same', activation='relu')(pool2)

    conv4 = layers.Conv2D(256, (3, 3), strides=(1, 1), padding='same', activation='relu')(conv3)

    conv5 = layers.Conv2D(256, (3, 3), strides=(1, 1), padding='same', activation='relu')(conv4)

    pool3 = layers.MaxPooling2D(pool_size=(3, 3), strides=(2, 2))(conv5)

    flat = layers.Flatten()(pool3)

    dense1 = layers.Dense(4096, activation='relu', kernel_regularizer=regularizers.l2(regularizer_rate))(flat)
    dense1 = layers.Dropout(drop_rate)(dense1)
    dense2 = layers.Dense(4096, activation='relu', kernel_regularizer=regularizers.l2(regularizer_rate))(dense1)
    dense2 = layers.Dropout(drop_rate)(dense2)
    outputs = layers.Dense(num_classes, activation='softmax', kernel_regularizer=regularizers.l2(regularizer_rate))(dense2)

    return inputs, outputs

inputs, outputs = get_AlexNet()
model2 = tf.keras.Model(inputs, outputs)
model2.load_weights('./keras_alexnet.h5')
# Predicted class index
print(np.argmax(model2.predict(img)))

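For the class that each predicted index stands for, see the blog post "ImageNet影像庫1000個類別名稱（中文註釋不斷更新）" (names of the 1000 ImageNet classes, with Chinese annotations).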
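
When comparing the two models, keep in mind that the torchvision AlexNet returns raw logits, while the Keras model defined above ends in a softmax layer. The raw numbers therefore differ, but the ranking of classes should not, because softmax preserves order. A minimal NumPy sketch with made-up logits and hypothetical helper names:

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)


def top_k(scores, k=5):
    """Indices of the k highest scores, best first."""
    return np.argsort(scores.ravel())[::-1][:k]


# Made-up logits standing in for a PyTorch output of shape (1, num_classes).
logits = np.array([[0.1, 2.0, -1.0, 3.5, 0.7, 1.2]])
# What a softmax output layer would produce for the same logits.
probs = softmax(logits)

print(top_k(logits, k=3))  # [3 1 5]
print(top_k(probs, k=3))   # [3 1 5]
```

Comparing the top-5 indices of both models on a few images is a stricter check than comparing the single argmax.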

Notes

The number of convolution filters in PyTorch's pretrained AlexNet differs from the original paper: the filter counts are (64, 192, 384, 256, 256). For details, see GitHub - pytorch: vision/torchvision/models/alexnet.py
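
With these filter counts, the feature-map sizes for a 227x227 input can be checked with the usual formula floor((n + 2p - k) / s) + 1. The following is a minimal sketch; the kernel, stride, and padding values are assumptions taken from torchvision's alexnet.py rather than from this post, and the helper names are hypothetical.

```python
def conv_out(size, kernel, stride, pad):
    """Output size of a convolution or pooling layer along one spatial axis."""
    return (size + 2 * pad - kernel) // stride + 1


def alexnet_shapes(size=227):
    """Return (name, height, width, channels) after each conv/pool layer."""
    # (name, kernel, stride, pad, out_channels); None keeps the channel count.
    layers = [
        ("conv1", 11, 4, 2, 64),
        ("pool1", 3, 2, 0, None),
        ("conv2", 5, 1, 2, 192),
        ("pool2", 3, 2, 0, None),
        ("conv3", 3, 1, 1, 384),
        ("conv4", 3, 1, 1, 256),
        ("conv5", 3, 1, 1, 256),
        ("pool3", 3, 2, 0, None),
    ]
    shapes, channels = [], 3
    for name, kernel, stride, pad, out_channels in layers:
        size = conv_out(size, kernel, stride, pad)
        channels = out_channels or channels
        shapes.append((name, size, size, channels))
    return shapes


for row in alexnet_shapes():
    print(row)
```

The last row is ('pool3', 6, 6, 256), so the flattened vector has 6 * 6 * 256 = 9216 elements. The Keras code in this post uses padding='valid' for the first convolution, which gives 55x55 instead of 56x56 there, but both sizes reduce to 27x27 after the first pooling layer.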

PyTorch's explanation is that its pretrained AlexNet uses the architecture from Krizhevsky, A. (2014). One weird trick for parallelizing convolutional neural networks. arXiv preprint arXiv:1404.5997. Even so, PyTorch's architecture still differs from that paper: the fourth convolutional layer has 384 filters in the paper but 256 in PyTorch.

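The AlexNet implemented in caffe, by contrast, contains the original LRN layers. My feeling is that once the LRN layers are removed, its pretrained weights can no longer be used directly.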

References

GitHub - microsoft/MMdnn
ImageNet影像庫1000個類別名稱（中文註釋不斷更新） (names of the 1000 ImageNet classes, with Chinese annotations) -- 徐小妹
