Laplace Approximation:

jerry173985, posted 2020-12-03

Fit a Gaussian to p(w|D)

log p(w|D) = log p(w,D) + const (with respect to w)

If p(w|D) is Gaussian, log p(w|D) is quadratic in w

Find the mode and the 2nd derivative at the mode

“Energy” E(w) = -log( p(w,D) )
w* = argmin E(w) (the MAP fit; with a Gaussian prior this is an L2-regularized fit)

Hessian H_ij = d^2 E(w) / (dw_i dw_j), evaluated at w = w*

p(w|D) ~= N(w; w*, H^-1)
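
The steps above (write down the energy, find the mode, take the second derivative there) can be sketched for a one-parameter model. This is a minimal illustration, not part of the original notes: the coin-flip model with theta = sigmoid(w), the N(0, prior_var) prior on w, and the names energy and laplace_fit are all assumptions chosen for the example.

```python
import math

def sigmoid(w):
    return 1.0 / (1.0 + math.exp(-w))

def energy(w, heads, tails, prior_var=1.0):
    """E(w) = -log p(w, D) for an assumed coin model with theta = sigmoid(w)."""
    log_lik = heads * math.log(sigmoid(w)) + tails * math.log(1.0 - sigmoid(w))
    log_prior = -0.5 * w * w / prior_var - 0.5 * math.log(2.0 * math.pi * prior_var)
    return -(log_lik + log_prior)

def laplace_fit(heads, tails, prior_var=1.0, steps=50):
    """Return (w_star, H): the mode of E and its second derivative at the mode."""
    n = heads + tails
    w = 0.0
    for _ in range(steps):
        s = sigmoid(w)
        grad = n * s - heads + w / prior_var           # dE/dw
        hess = n * s * (1.0 - s) + 1.0 / prior_var     # d^2E/dw^2, always > 0
        w -= grad / hess                               # Newton step
    s = sigmoid(w)
    hess = n * s * (1.0 - s) + 1.0 / prior_var
    return w, hess

w_star, H = laplace_fit(heads=7, tails=3)
# Laplace approximation: p(w|D) ~= N(w; w_star, 1/H)
print("mean:", w_star, "variance:", 1.0 / H)
```

In more than one dimension H becomes a matrix and the variance 1/H becomes the inverse Hessian H^-1.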

Approximate p(D|M)

p(w|D) = p(w,D) / p(D) ~= N(w; w*, H^-1)

= |H|^(1/2) / (2pi)^(b/2) * exp(-1/2 (w-w*)^T H (w-w*))

Evaluate the approximation at w=w*

p(w*,D) / p(D) ~= |H|^(1/2) / (2pi)^(b/2)

D is the training data; p(D) is its marginal likelihood
b is the number of parameters, i.e. the dimension of w (usually written D, but D already denotes the data here)

p(D) = p(w*,D) / p(w*|D) ~= p(w*,D) * (2pi)^(b/2) / |H|^(1/2)
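
In log form the estimate is log p(D) ~= log p(w*,D) + (b/2) log(2pi) - (1/2) log|H|. The sketch below checks this on a model where the posterior really is Gaussian, so the approximation should agree with the exact answer. The model (observations x_i ~ N(w, noise_var) with prior w ~ N(0, prior_var)) and all function names are assumptions made for the example, and the code handles only the one-parameter case b = 1, where |H| = H.

```python
import math

def log_joint(w, xs, noise_var, prior_var):
    """log p(w, D) for x_i ~ N(w, noise_var), w ~ N(0, prior_var)."""
    log_lik = sum(-0.5 * (x - w) ** 2 / noise_var
                  - 0.5 * math.log(2.0 * math.pi * noise_var) for x in xs)
    log_prior = -0.5 * w * w / prior_var - 0.5 * math.log(2.0 * math.pi * prior_var)
    return log_lik + log_prior

def laplace_log_evidence(xs, noise_var, prior_var):
    """log p(D) ~= log p(w*, D) + (b/2) log(2 pi) - (1/2) log|H|, with b = 1."""
    H = len(xs) / noise_var + 1.0 / prior_var   # second derivative of E(w)
    w_star = (sum(xs) / noise_var) / H          # mode of the quadratic energy
    return (log_joint(w_star, xs, noise_var, prior_var)
            + 0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(H))

def exact_log_evidence(xs, noise_var, prior_var):
    """Exact log p(D): D ~ N(0, noise_var * I + prior_var * 1 1^T)."""
    n = len(xs)
    total = noise_var + n * prior_var
    log_det = (n - 1) * math.log(noise_var) + math.log(total)
    quad = (sum(x * x for x in xs) - prior_var * sum(xs) ** 2 / total) / noise_var
    return -0.5 * n * math.log(2.0 * math.pi) - 0.5 * log_det - 0.5 * quad

xs = [0.8, 1.3, 0.4, 1.1]
print(laplace_log_evidence(xs, 1.0, 1.0), exact_log_evidence(xs, 1.0, 1.0))

# Model comparison: two hypothetical models that differ only in prior variance.
for prior_var in (1.0, 100.0):
    print(prior_var, laplace_log_evidence(xs, 1.0, prior_var))
```

The two printed evidence values agree up to rounding because the energy is exactly quadratic in w here; for non-Gaussian posteriors the same formula gives only an estimate.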

We can approximate p(D) for different models and choose the model with the highest marginal likelihood

Could go wrong if the approximation with the Gaussian is a poor fit!
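
A small sketch of how the estimate degrades when the posterior is far from Gaussian. The model is an assumption made for illustration: a coin with bias theta and a uniform prior on [0, 1], for which the exact evidence of a particular sequence is h! t! / (h + t + 1)!. The Laplace approximation is taken directly in theta and needs at least one head and one tail so that the mode lies inside the interval.

```python
import math

def exact_log_evidence(heads, tails):
    """log of h! t! / (h + t + 1)!, the exact evidence under a uniform prior."""
    return (math.lgamma(heads + 1) + math.lgamma(tails + 1)
            - math.lgamma(heads + tails + 2))

def laplace_log_evidence(heads, tails):
    """Laplace estimate of log p(D) in theta; requires heads > 0 and tails > 0."""
    n = heads + tails
    theta = heads / n                                     # mode of the posterior
    log_joint = heads * math.log(theta) + tails * math.log(1.0 - theta)
    H = heads / theta ** 2 + tails / (1.0 - theta) ** 2   # second derivative of E
    return log_joint + 0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(H)

def relative_error(heads, tails):
    """Relative error of the Laplace estimate of p(D) (not of log p(D))."""
    return abs(math.exp(laplace_log_evidence(heads, tails)
                        - exact_log_evidence(heads, tails)) - 1.0)

print("2 flips:  ", relative_error(1, 1))
print("100 flips:", relative_error(50, 50))
```

With two flips the posterior is broad and cut off at 0 and 1, so the Gaussian fit overestimates p(D) by about a third; with 100 flips the posterior is sharply peaked and the error falls below one percent.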