Sherman-Morrison公式及其應用

jclian91發表於2018-05-09

Sherman-Morrison公式

  Sherman-Morrison公式以 Jack Sherman 和 Winifred J. Morrison命名,線上性代數中,是求解逆矩陣的一種方法。本篇部落格將介紹該公式及其應用,首先我們來看一下該公式的內容及其證明。
  (Sherman-Morrison公式)假設ARn×nA∈Rn×n為可逆矩陣,u,vRnu,v∈Rn為列向量,則A+uvTA+uvT可逆當且僅當1+vTA1u01+vTA−1u≠0, 且當A+uvTA+uvT可逆時,該逆矩陣由以下公式給出:

(A+uvT)1=A1A1uvTA11+vTA1u.(A+uvT)−1=A−1−A−1uvTA−11+vTA−1u.

證明:
()(⇐)1+vTA1u01+vTA−1u≠0時,令X=A+uvT,Y=A1A1uvTA11+vTA1uX=A+uvT,Y=A−1−A−1uvTA−11+vTA−1u,則只需證明XY=YX=IXY=YX=I即可,其中II為n階單位矩陣。

XY=(A+uvT)(A1A1uvTA11+vTA1u)=AA1+uvTA1AA1uvTA1+uvTA1uvTA11+vTA1u=I+uvTA1uvTA1+uvTA1uvTA11+vTA1u=I+uvTA1u(1+vTA1u)vTA11+vTA1u=I+uvTA1uvTA1=IXY=(A+uvT)(A−1−A−1uvTA−11+vTA−1u)=AA−1+uvTA−1−AA−1uvTA−1+uvTA−1uvTA−11+vTA−1u=I+uvTA−1−uvTA−1+uvTA−1uvTA−11+vTA−1u=I+uvTA−1−u(1+vTA−1u)vTA−11+vTA−1u=I+uvTA−1−uvTA−1=I

同理,有YX=IYX=I.因此,當1+vTA1u01+vTA−1u≠0時,(A+uvT)1=A1A1uvTA11+vTA1u.(A+uvT)−1=A−1−A−1uvTA−11+vTA−1u.
()(⇒)u=0u=0時,顯然有1+vTA1u=10.1+vTA−1u=1≠0.u0u≠0時,用反正法證明該命題成立。假設A+uvTA+uvT可逆,但1+vTA1u=01+vTA−1u=0,則有

(A+uvT)A1u=u+u(vTA1u)=(1+vTA1u)u=0.(A+uvT)A−1u=u+u(vTA−1u)=(1+vTA−1u)u=0.

因為A+uvTA+uvT可逆,故A1A−1u=0,又因為A1A−1可逆,故u=0u=0,此與假設u0u≠0矛盾。因此,當A+uvTA+uvT可逆時,有1+vTA1u0.1+vTA−1u≠0.

Sherman-Morrison公式的應用

應用1:A=IA=I時的Sherman-Morrison公式

  在Sherman-Morrison公式中,令A=IA=I,則有:I+uvTI+uvT可逆當且僅當1+vTu01+vTu≠0, 且當I+uvTI+uvT可逆時,該逆矩陣由以下公式給出:

(I+uvT)1=IuvT1+vTu.(I+uvT)−1=I−uvT1+vTu.

  再令v=uv=u,則1+uTu>01+uTu>0, 因此,I+uuTI+uuT可逆,且

(I+uuT)1=IuuT1+uTu.(I+uuT)−1=I−uuT1+uTu.

應用2:BFGS演算法

  Sherman-Morrison公式在BFGS演算法中的應用,可用來求解BFGS演算法中近似Hessian矩陣的逆。本篇部落格並不打算給出Sherman-Morrison公式在BFGS演算法中的應用,將會再寫篇部落格介紹BFGS演算法,到時再給出該公式的應用,並會在之後補上該部落格的連結(因為筆者還沒寫)。

應用3:迴圈三對角線性方程組的求解

  本篇部落格將詳細講述Sherman-Morrison公式在迴圈三對角線性方程組的求解中的應用。
  首先給給出理論知識介紹部分。
  對於ARn×nA∈Rn×n為可逆矩陣,u,vRnu,v∈Rn為列向量,1+vTA1u01+vTA−1u≠0,需要求解方程(A+uvT)x=b.(A+uvT)x=b.對此,我們可以先求解以下兩個方程:

Ay=b,Az=uAy=b,Az=u

.
然後令x=yvTy1+vTzzx=y−vTy1+vTzz,該解即為原方程的解,驗證如下:

(A+uvT)x=(A+uvT)(yvTy1+vTzz)=Ay+uvTyvTy1+vTzAzvTy1+vTzuvTz=b+uvTyvTyu+vTyuvTz1+vTz=b+uvTy(1+vTz)vTyu1+vTz=b+uvTyuvTy=b(A+uvT)x=(A+uvT)(y−vTy1+vTzz)=Ay+uvTy−vTy1+vTzAz−vTy1+vTzuvTz=b+uvTy−vTyu+vTyuvTz1+vTz=b+uvTy−(1+vTz)vTyu1+vTz=b+uvTy−uvTy=b

  這樣將原方程(A+uvT)x=b(A+uvT)x=b分成兩個方程,可以在一定程度上簡化原方程。接下來,我們將介紹迴圈三對角線性方程組的求解。
  所謂迴圈三對角線性方程組,指的是係數矩陣為如下形式:

A=b1a200cnc1b200c2an20bn2an1000cn2bn1ana100cn1bnA=[b1c10⋯0a1a2b2c20⋮00⋱⋱⋱0⋮⋮⋮an−2bn−2cn−200⋯⋯an−1bn−1cn−1cn0⋯0anbn]

迴圈三對角線性方程組可寫成Ax=dAx=d,其中d=(d1,d2,...,dn)T.d=(d1,d2,…,dn)T.
  對於此方程的求解,我們令u=(γ,0,0,...,cn)T,v=(1,0,0,...,a1γ)Tu=(γ,0,0,…,cn)T,v=(1,0,0,…,a1γ)T, 且A=A+uvTA=A′+uvT,其中AA′如下:

A=b1γa2000c1b200c2an20bn2an1000cn2bn1an000cn1bna1cnγA′=[b1−γc10⋯00a2b2c20⋮00⋱⋱⋱0⋮⋮⋮an−2bn−2cn−200⋯⋯an−1bn−1cn−100⋯0anbn−a1cnγ]

AA′為三對角矩陣。根據以上的理論知識,我們只需要求解以下兩個方程

Ay=d,Az=uA′y=d,A′z=u,

然後,就能根據y,zy,z求出xx.而以上兩個方程為三對角線性方程組,可以用追趕法(或Thomas法)求解,具體演算法可以參考部落格:三對角線性方程組(tridiagonal systems of equations)的求解 。
  綜上,我們利用Sherman-Morrison公式的思想,可以將迴圈三對角線性方程組轉化為三對角線性方程組求解。我們將會在下面給出該演算法的Python語言實現。

Python實現

  我們要解的迴圈三對角線性方程組如下:

4100314100014100014120014x1x2x3x4x5=76668[4100214100014100014130014][x1x2x3x4x5]=[76668]

  用Python實現解該方程的Python完整程式碼如下:

# use Sherman-Morrison Formula and Thomas Method to solve cyclic tridiagonal linear equation

import numpy as np

# Thomas Method for soling tridiagonal linear equation Ax=d
# parameter: a,b,c,d are list-like of same length
# b: main diagonal of matrix A
# a: main diagonal below of matrix A
# c: main diagonal upper of matrix A
# d: Ax=d
# return: x(type=list), the solution of Ax=d
def TDMA(a,b,c,d):

    try:
        n = len(d)    # order of tridiagonal square matrix

        # use a,b,c to create matrix A, which is not necessary in the algorithm
        A = np.array([[0]*n]*n, dtype=`float64`)

        for i in range(n):
            A[i,i] = b[i]
            if i > 0:
                A[i, i-1] = a[i]
            if i < n-1:
                A[i, i+1] = c[i]

        # new list of modified coefficients
        c_1 = [0]*n
        d_1 = [0]*n

        for i in range(n):
            if not i:
                c_1[i] = c[i]/b[i]
                d_1[i] = d[i] / b[i]
            else:
                c_1[i] = c[i]/(b[i]-c_1[i-1]*a[i])
                d_1[i] = (d[i]-d_1[i-1]*a[i])/(b[i]-c_1[i-1] * a[i])

        # x: solution of Ax=d
        x = [0]*n

        for i in range(n-1, -1, -1):
            if i == n-1:
                x[i] = d_1[i]
            else:
                x[i] = d_1[i]-c_1[i]*x[i+1]

        x = [round(_, 4) for _ in x]

        return x

    except Exception as e:
        return e

# Sherman-Morrison Fomula for soling cyclic tridiagonal linear equation Ax=d
# parameter: a,b,c,d are list-like of same length
# b: main diagonal of matrix A
# a: main diagonal below of matrix A
# c: main diagonal upper of matrix A
# d: Ax=d
# return: x(type=list), the solution of Ax=d
def Cyclic_Tridiagnoal_Linear_Equation(a,b,c,d):
    try:
        # use a,b,c to create cyclic tridiagonal matrix A
        n = len(d)
        A = np.array([[0] * n] * n, dtype=`float64`)

        for i in range(n):
            A[i, i] = b[i]
            if i > 0:
                A[i, i - 1] = a[i]
            if i < n - 1:
                A[i, i + 1] = c[i]
        A[0, n - 1] = a[0]
        A[n - 1, 0] = c[n - 1]

        gamma = 1 # gamma can be set freely
        u = [gamma] + [0] * (n - 2) + [c[n - 1]]
        v = [1] + [0] * (n - 2) + [a[0] / gamma]

        # modify the coefficient to form A`
        b[0] -= gamma
        b[n - 1] -= a[0] * c[n - 1] / gamma
        a[0] = 0
        c[n - 1] = 0

        # solve A`y=d, A`z=u by using Thomas Method
        y = np.array(TDMA(a, b, c, d))
        z = np.array(TDMA(a, b, c, u))

        # use y and z to calculate x
        # x = y-(v·y)/(1+v·z) *z
        # x is the solution of Ax=d
        x = y - (np.dot(np.array(v), y)) / (1 + np.dot(np.array(v), z)) * z

        x = [round(_, 3) for _ in x]

        return x

    except Exception as e:
        return e

def main():
    ```
       equation:
       A = [[4,1,0,0,2],
            [1,4,1,0,0],
            [0,1,4,1,0],
            [0,0,1,4,1],
            [3,0,0,1,4]]
       d = [7,6,6,6,8]
       solution x should be [1,1,1,1,1]
    ```

    a = [2, 1, 1, 1, 1]
    b = [4, 4, 4, 4, 4]
    c = [1, 1, 1, 1, 3]
    d = [7, 6, 6, 6, 8]

    x = Cyclic_Tridiagnoal_Linear_Equation(a,b,c,d)
    print(`The solution is %s`%x)

main()

輸出結果如下:

The solution is [1.0, 1.0, 1.0, 1.0, 1.0]

參考文獻

  1. https://en.wikipedia.org/wiki/Sherman%E2%80%93Morrison_formula
  2. http://wwwmayr.in.tum.de/konferenzen/Jass09/courses/2/Soldatenko_paper.pdf
  3. https://scicomp.stackexchange.com/questions/10137/solving-system-of-linear-equations-with-cyclic-tridiagonal-matrix
  4. https://blog.csdn.net/jclian91/article/details/80251244


相關文章