Optimizer: SGD, SGD with Momentum, Adagrad, RMSProp, Adam, AdamW

Published by ForHHeart on 2024-03-18

🥥 Table of Contents

Resource 1: Optimizers | SGD | Momentum | Adagrad | RMSProp | Adam - Bilibili
Resource 2: AdamW and Adam with weight decay
Resource 3: Can stochastic gradient descent converge on non-convex functions? Community discussion: yes, but under conditions, and it is harder than in the convex case - 機器之心

  • Gradient Descent
  • SGD
  • SGD with Momentum
  • Adagrad
  • RMSProp
  • Adam
  • AdamW

🥑 Get Started!

Gradient Descent

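Gradient descent updates the parameters in the direction of the negative gradient of the loss: theta <- theta - lr * dL/dtheta, where lr is the learning rate. Below is a minimal hand-rolled sketch on a toy quadratic loss; the loss function, learning rate, and step count are chosen purely for illustration.

import torch

# Toy example: minimize L(theta) = (theta - 3)^2, whose minimum is at theta = 3
theta = torch.tensor(0.0, requires_grad=True)
lr = 0.1  # learning rate, illustrative value

for step in range(100):
    loss = (theta - 3) ** 2
    loss.backward()                  # compute dL/dtheta into theta.grad
    with torch.no_grad():
        theta -= lr * theta.grad     # theta <- theta - lr * gradient
        theta.grad.zero_()           # clear the gradient for the next step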

SGD

import torch
from torch.optim import SGD

# `model` is assumed to be an existing torch.nn.Module
# Plain SGD updates each parameter p as: p <- p - lr * p.grad
optimizer = SGD(model.parameters(), lr=0.01)

SGD with Momentum

import torch
from torch.optim import SGD

# `model` is assumed to be an existing torch.nn.Module
# Momentum keeps a running velocity v <- momentum * v + grad and updates p <- p - lr * v
optimizer = SGD(model.parameters(), lr=0.01, momentum=0.9)
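
Whichever optimizer is chosen, it is used the same way inside the training loop: zero the gradients, backpropagate the loss, then take a step. A minimal sketch, assuming `model`, `loss_fn`, and a `dataloader` yielding `(x, y)` batches already exist:

for x, y in dataloader:
    optimizer.zero_grad()        # clear gradients accumulated from the previous step
    loss = loss_fn(model(x), y)  # forward pass
    loss.backward()              # fill p.grad for every parameter
    optimizer.step()             # apply the optimizer's update rule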

Adagrad

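Adagrad scales each parameter's learning rate by the inverse square root of the sum of its past squared gradients, so frequently updated parameters take smaller steps. A minimal sketch in the same style as the SGD examples above, assuming `model` already exists and with an illustrative learning rate:

import torch
from torch.optim import Adagrad

# Per-parameter step size shrinks as the accumulated squared gradient grows
optimizer = Adagrad(model.parameters(), lr=0.01)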


RMSProp

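RMSProp replaces Adagrad's ever-growing sum with an exponentially decaying average of squared gradients, so the effective learning rate does not shrink monotonically. A minimal sketch, again assuming `model` already exists and using illustrative hyperparameters:

import torch
from torch.optim import RMSprop

# alpha is the decay rate of the running average of squared gradients
optimizer = RMSprop(model.parameters(), lr=0.01, alpha=0.99)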


Adam

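Adam combines momentum (a first-moment estimate of the gradient) with RMSProp-style second-moment scaling, plus bias correction of both estimates. A minimal sketch with PyTorch's default hyperparameters, assuming `model` already exists:

import torch
from torch.optim import Adam

# betas = (beta1, beta2): decay rates for the first- and second-moment estimates
optimizer = Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999), eps=1e-8)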


AdamW

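AdamW keeps Adam's update but decouples weight decay from the gradient-based step: the decay is applied directly to the weights rather than being added to the gradient as L2 regularization (see Resource 2). A minimal sketch, assuming `model` already exists and with an illustrative decay value:

import torch
from torch.optim import AdamW

# Decoupled weight decay: the weights are shrunk by lr * weight_decay alongside the Adam step
optimizer = AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)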
