[tf.keras] Using optimizers defined in tensorflow with tf.keras

Posted by wuliytTaotao on 2019-06-06

Original URL: https://www.cnblogs.com/wuliytTaotao/p/10986952.html

My tensorflow + keras versions:

print(tf.VERSION)    # '1.10.0'
print(tf.keras.__version__)    # '2.1.6-tf'

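tf.keras does not implement AdamW, that is, Adam with weight decay. The paper "Decoupled Weight Decay Regularization" points out that, when Adam is used, weight decay is not equivalent to L2 regularization. For details, see "當前訓練神經網路最快的方式：AdamW優化演算法+超級收斂" (The fastest way to train neural networks today: the AdamW optimizer plus super-convergence) or "L2正則=Weight Decay？並不是這樣" (L2 regularization = weight decay? Not quite), both listed under References.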
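To make the difference concrete, here is a minimal pure-Python sketch of one update step on a single scalar weight. The function names are hypothetical, and the decoupled variant assumes the decay is applied as `w -= wd * w` without scaling by the learning rate; it illustrates the idea and is not the code of any library.

```python
import math

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One plain Adam update for a scalar weight; returns (new_w, m, v)."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)  # bias-corrected second moment
    return w - lr * m_hat / (math.sqrt(v_hat) + eps), m, v

def step_with_l2(w, grad, m, v, t, wd, lr=0.001):
    # L2 regularization: the penalty gradient wd * w is added to the loss
    # gradient, so Adam's adaptive denominator rescales it as well.
    return adam_step(w, grad + wd * w, m, v, t, lr=lr)

def step_with_decoupled_decay(w, grad, m, v, t, wd, lr=0.001):
    # Decoupled weight decay (assumed form): Adam sees only the loss gradient,
    # and the weight is shrunk directly, outside the adaptive scaling.
    new_w, m, v = adam_step(w, grad, m, v, t, lr=lr)
    return new_w - wd * w, m, v

w_l2, _, _ = step_with_l2(1.0, 0.5, 0.0, 0.0, 1, wd=0.01)
w_decoupled, _, _ = step_with_decoupled_decay(1.0, 0.5, 0.0, 0.0, 1, wd=0.01)
print(w_l2, w_decoupled)
```

On the first step Adam moves the weight by roughly `lr` whatever the gradient magnitude, so the L2 penalty barely changes the result (about 0.999), while the decoupled decay removes a further `wd * w` (about 0.989).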

keras does not provide an AdamW optimizer, but tensorflow does, so the simplest route is to use the tensorflow optimizer from tf.keras.

As shown below:

import tensorflow as tf
from tensorflow.contrib.opt import AdamWOptimizer

mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

# adam = tf.train.AdamOptimizer()

# adam with weight decay
adamw = AdamWOptimizer(weight_decay=1e-4)

model.compile(optimizer=adamw,
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=10, validation_split=0.1)
print(model.evaluate(x_test, y_test))

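Used only as above, this works fine. But adding certain callbacks from tf.keras.callbacks, such as tf.keras.callbacks.ReduceLROnPlateau(), may raise AttributeError: 'TFOptimizer' object has no attribute 'lr'.

The following code raises AttributeError: 'TFOptimizer' object has no attribute 'lr' because it adds tf.keras.callbacks.ReduceLROnPlateau(); the other two callbacks do not trigger the exception. When a native tensorflow optimizer is passed to model.compile(), tf.keras wraps it in a TFOptimizer, which has no lr attribute, while ReduceLROnPlateau reads and updates model.optimizer.lr. The example uses tf.train.AdamOptimizer, but the same happens with AdamWOptimizer.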

import tensorflow as tf
from tensorflow.contrib.opt import AdamWOptimizer

mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

# Save the model weights based on val_acc; new weights are saved only when val_acc improves
ck_callback = tf.keras.callbacks.ModelCheckpoint('checkpoints/weights-improvement-{epoch:02d}-{val_acc:.2f}.hdf5', monitor='val_acc', mode='max',
                                                verbose=1, save_best_only=True, save_weights_only=True)
# Monitor training with tensorboard
tb_callback = tf.keras.callbacks.TensorBoard(log_dir='logs')
# If the monitored val_loss has not decreased for `patience` epochs, reduce the learning rate: lr = factor * lr_old
lr_callback = tf.keras.callbacks.ReduceLROnPlateau(patience=3)

adam = tf.train.AdamOptimizer()

# adam with weight decay
# adamw = AdamWOptimizer(weight_decay=1e-4)

model.compile(optimizer=adam,
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=10, validation_split=0.1, callbacks=[ck_callback, tb_callback, lr_callback])
print(model.evaluate(x_test, y_test))

The fix is shown below:

import tensorflow as tf
from tensorflow.contrib.opt import AdamWOptimizer
from tensorflow.keras import backend as K
from tensorflow.python.keras.optimizers import TFOptimizer

mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

# Save the model weights based on val_acc; new weights are saved only when val_acc improves
ck_callback = tf.keras.callbacks.ModelCheckpoint('checkpoints/weights-improvement-{epoch:02d}-{val_acc:.2f}.hdf5', monitor='val_acc', mode='max',
                                                verbose=1, save_best_only=True, save_weights_only=True)
# Monitor training with tensorboard
tb_callback = tf.keras.callbacks.TensorBoard(log_dir='logs')
# If the monitored val_loss has not decreased for `patience` epochs, reduce the learning rate: lr = factor * lr_old
lr_callback = tf.keras.callbacks.ReduceLROnPlateau(patience=3)

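# Keep the learning rate in a backend variable so that a callback can read and update it
learning_rate = K.variable(0.001)

# adam = tf.train.AdamOptimizer(learning_rate=learning_rate)
# # In tensorflow 1.10, TFOptimizer is found in tensorflow.python.keras.optimizers, not in tensorflow.keras.optimizers
# adam = TFOptimizer(adam)
# adam.lr = learning_rate

# adam with weight decay
# Pass the same variable to the optimizer; otherwise the callback would only change
# the wrapper's lr attribute and the rate actually used for training would stay fixed
adamw = AdamWOptimizer(weight_decay=1e-4, learning_rate=learning_rate)
adamw = TFOptimizer(adamw)
adamw.lr = learning_rate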

model.compile(optimizer=adamw,
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=10, validation_split=0.1, callbacks=[ck_callback, tb_callback, lr_callback])
print(model.evaluate(x_test, y_test))

Wrapping the optimizer in TFOptimizer and giving the wrapper an lr attribute is enough; tf.keras.callbacks.ReduceLROnPlateau() then runs without the error.
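One detail worth checking is that the callback only changes the object that `lr` points to, so the wrapped optimizer should read its rate from that same object. The sketch below uses hypothetical stand-in classes (no TensorFlow involved) to show both the missing-attribute failure and why sharing one variable matters; it mimics the mechanism and is not the Keras source.

```python
class Variable:
    """Hypothetical stand-in for a backend variable (K.variable) holding one float."""
    def __init__(self, value):
        self.value = value

class NativeOptimizer:
    """Hypothetical stand-in for a tf.train optimizer that reads its rate on every step."""
    def __init__(self, learning_rate):
        self.learning_rate = learning_rate  # a Variable, not a copied float

    def current_rate(self):
        return self.learning_rate.value

class Wrapper:
    """Hypothetical stand-in for TFOptimizer: it exposes no `lr` attribute by default."""
    def __init__(self, optimizer):
        self.optimizer = optimizer

def reduce_on_plateau(optimizer, factor=0.1):
    """Mimics the callback: read optimizer.lr, then write the scaled value back."""
    optimizer.lr.value = optimizer.lr.value * factor

lr = Variable(0.001)
wrapped = Wrapper(NativeOptimizer(learning_rate=lr))

try:
    reduce_on_plateau(wrapped)
    failed = False
except AttributeError:
    failed = True  # the wrapper has no `lr` yet, as in the error above

wrapped.lr = lr             # expose the shared variable on the wrapper
reduce_on_plateau(wrapped)  # now succeeds and the inner optimizer sees the change
print(failed, wrapped.optimizer.current_rate())
```

If the inner optimizer had been built with a plain float instead of the shared variable, the second call would still succeed, but `current_rate()` would keep returning the original value.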

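When importing TFOptimizer, pay attention to where it lives. In tensorflow 1.10 there are two ways to import keras, tensorflow.keras and tensorflow.python.keras, which is somewhat confusing, and TFOptimizer can only be imported from the latter. (Oddly, tensorflow 1.14 seems to have dropped the first import path, while tensorflow 2.0 has it again.)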
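Since the location differs between versions, one option is to try several module paths in order. The helper below, `import_first`, is a hypothetical name introduced for illustration; it is demonstrated with the standard library so the snippet runs without tensorflow installed.

```python
import importlib

def import_first(name, module_paths):
    """Return attribute `name` from the first module in `module_paths` that provides it."""
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError:
            continue  # module missing in this installation, try the next path
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError("%s not found in any of: %s" % (name, ", ".join(module_paths)))

# Demonstration with the standard library: the first path does not exist,
# so the function falls back to `math`.
sqrt = import_first("sqrt", ["no_such_module_for_demo", "math"])
print(sqrt(9.0))  # prints 3.0
```

For this post the call would be `import_first("TFOptimizer", ["tensorflow.keras.optimizers", "tensorflow.python.keras.optimizers"])`, using the two paths named above for tensorflow 1.10; whether either path exists in other versions is not something this sketch guarantees.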

References

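當前訓練神經網路最快的方式：AdamW優化演算法+超級收斂 (The fastest way to train neural networks today: the AdamW optimizer plus super-convergence) -- 機器之心
L2正則=Weight Decay？並不是這樣 (L2 regularization = weight decay? Not quite) -- 楊鎰銘
ReduceLROnPlateau with native optimizer: 'TFOptimizer' object has no attribute 'lr' #20619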
