💡 Definition of a subgradient
We say a vector \(g\in \mathbb{R}^n\) is a subgradient of \(f:\mathbb{R}^n\to \mathbb{R}\) at \(x\in \operatorname{\textbf{dom}} f\) if for all \(y\in \operatorname{\textbf{dom}} f\),
\[ f(y)\ge f(x) + g^T(y-x).
\]
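The set \(\partial f(x) = \{g \mid f(y)\ge f(x) + g^T(y-x) \text{ for all } y\in \operatorname{\textbf{dom}} f\}\) of all subgradients of \(f\) at \(x\) is called the subdifferential of \(f\) at \(x\).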
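As a small illustration of the definition (the function here is chosen for this example and is not from the text above), take \(f(x) = |x|\) on \(\mathbb{R}\). At \(x = 0\) the subgradient inequality reads \(|y| \ge g\,y\) for all \(y\in\mathbb{R}\), which holds exactly when \(|g|\le 1\). At any \(x \ne 0\), \(f\) is differentiable and the only subgradient is the derivative. Hence
\[ \partial f(x) = \begin{cases} \{-1\}, & x < 0,\\ [-1,1], & x = 0,\\ \{1\}, & x > 0. \end{cases}
\]
At the kink the subdifferential is a whole interval, while at smooth points it reduces to the single gradient.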