Spectral Matting的python實現

田神發表於2018-10-25

原文網址 : https://blog.csdn.net/streamrock/article/details/83216937

Spectral Matting是一個很好的矩陣分析、影像處理例子，是一個很好的學習例子。
在Github上，有Spectral Matting的MATLAB的實現，卻沒找到Python的實現，於是我參考了以下兩個程式碼源：
[1]https://github.com/yaksoy/SemanticSoftSegmentation
[2]https://github.com/MarcoForte/closed-form-matting
用python實現了一次，這次實踐過程收益良多。

[1]是語義軟分割（Yagiz Aksoy, Tae-Hyun Oh, Sylvain Paris, Marc Pollefeys and Wojciech Matusik, “Semantic Soft Segmentation”, ACM Transactions on Graphics (Proc. SIGGRAPH), 2018）的MATLAB實現，它的實現參考了Spectral Matting 原MATLAB的實現方式(http://www.vision.huji.ac.il/SpectralMatting/)，並在其上改動了一些部分，讓程式碼更為清晰，為此我選了這個版本作為python翻譯的參考。
[2]是A closed-form solution to natural image matting的python實現，Spectral Matting其實是closed-form matting的後續，它們在Laplacian矩陣生成的方式上，是完成相同的。
接下來，我將分以下幾部分來實現Spectral Matting：

Laplician矩陣的構建
特徵值與特徵向量選擇
優化——尋找合適的alpha
alpha的初始化
優化迭代的實現
Components的組合

1、Laplician矩陣的構建

根據[1]，影像的Laplicain矩陣中元素為：
$A_q(i,j)=\left\{ \begin{array} {cc} \delta_{ij}-\frac{1}{\vert w_q\vert}\left(1+(I_i-u_q)^T (\Sigma_q+\frac{\epsilon}{\vert w_q \vert}I_{3\times3}) (I_j-u_q) \right)&i,j\in w_q \\0&\text{otherwise} \end{array} \right.\qquad(1)$
其中， $w_q$ 是 $3\times3$ 窗體， $\vert w_q \vert$ 是窗體的pixels的數量， $\Sigma_q$ 是該窗體的 $3\times3$ 協方差， $I_i,I_j$ 是窗體中畫素i和畫素j的顏色值。
這部分的實現程式碼如下：

def _rolling_block(A, block=(3, 3)):
    """Applies sliding window to given matrix."""
    shape = (A.shape[0] - block[0] + 1, A.shape[1] - block[1] + 1) + block
    strides = (A.strides[0], A.strides[1]) + A.strides
    return as_strided(A, shape=shape, strides=strides)
    
def mattingAffinity(img, eps=10**(-7), win_rad=1):
    """Computes Matting Laplacian for a given image.
    Args:
        img: 3-dim numpy matrix with input image
        eps: regularization parameter controlling alpha smoothness
            from Eq. 12 of the original paper. Defaults to 1e-7.
        win_rad: radius of window used to build Matting Laplacian (i.e.
            radius of omega_k in Eq. 12).
    Returns: sparse matrix holding Matting Laplacian.
    """
    win_size = (win_rad * 2 + 1) ** 2
    h, w, d = img.shape
    # Number of window centre indices in h, w axes
    c_h, c_w = h - 2 * win_rad, w - 2 * win_rad
    win_diam = win_rad * 2 + 1

    indsM = np.arange(h * w).reshape((h, w))
    ravelImg = img.reshape(h * w, d)
    win_inds = _rolling_block(indsM, block=(win_diam, win_diam))

    #win_inds = win_inds.reshape(c_h, c_w, win_size)
    
    win_inds = win_inds.reshape(-1, win_size)
   
    winI = ravelImg[win_inds]

    win_mu = np.mean(winI, axis=1, keepdims=True)
    win_var = np.einsum('...ji,...jk ->...ik', winI, winI) / win_size - np.einsum('...ji,...jk ->...ik', win_mu, win_mu)

    inv = np.linalg.inv(win_var + (eps/win_size)*np.eye(3))

    X = np.einsum('...ij,...jk->...ik', winI - win_mu, inv)
    X = np.einsum('...ij,...kj->...ik', X, winI - win_mu)
    vals = (1.0/win_size)*(1 + X)

    nz_indsCol = np.tile(win_inds, win_size).ravel()
    nz_indsRow = np.repeat(win_inds, win_size).ravel()
    nz_indsVal = vals.ravel()
    A = scipy.sparse.coo_matrix((nz_indsVal, (nz_indsRow, nz_indsCol)), shape=(h*w, h*w))
    return A

def affinityMatrixToLaplacian(aff):
    degree = aff.sum(axis=1)
    degree = np.asarray(degree)
    degree = degree.ravel()
    Lap = scipy.sparse.diags(degree)-aff
    return Lap

其中_rolling_block可將影像按視窗要求（3*3）切成小塊，並返回窗體畫素所在矩陣的index，用於視窗的畫素提取。
公式(1)的具體實現時以下程式碼片段：

win_mu = np.mean(winI, axis=1, keepdims=True)
    win_var = np.einsum('...ji,...jk ->...ik', winI, winI) / win_size - np.einsum('...ji,...jk ->...ik', win_mu, win_mu)

    inv = np.linalg.inv(win_var + (eps/win_size)*np.eye(3))

    X = np.einsum('...ij,...jk->...ik', winI - win_mu, inv)
    X = np.einsum('...ij,...kj->...ik', X, winI - win_mu)
    vals = (1.0/win_size)*(1 + X)

此處用到np.einsum()函式，它可以完成複雜的張量計算。在得到各窗體計算結果後，由scipy.sparse.coo_matrix()構造稀疏矩陣L。
將前面幾個函式串起來，求影像的laplacian矩陣由以下程式碼片段實現，其中影像來自：
https://github.com/yaksoy/SemanticSoftSegmentation/blob/master/docia.png

image = Image.open('docia.png')
img_s = image.resize((160,80))
im_np = np.asarray(img_s)
w = im_np.shape[0]
img = im_np[:,:w,:]/255.0

A = mattingAffinity(img)
L = affinityMatrixToLaplacian(A)

在這裡插入圖片描述
圖1 原圖
影像讀出來後轉換為numpy的array型別，將它作為mattingAffinity()的輸入，計算親和矩陣A，和Laplician矩陣L。

2、特徵值與特徵向量選擇

[1]的一個關鍵思路是：在影像分割後，能得到若干個components（比如是10個），它們可由最小特徵向量（比如是50個）的線性組合得到，即：
$\mathbf \alpha^k=\mathbf E\mathbf y^k\qquad(2)$
其中， $\mathbf E$ 是特徵向量（比如是，最小的50個特徵向量）構成的矩陣，為對應我們的上述程式碼，其形狀為 $6400\times50$ ，6400是影像畫素的總數（ $80\times80$ ），50是最小特徵向量的數量。 $\mathbf y^k$ 是第k個component對應的線性組合，其形狀為 $50\times1$ ，而 $\alpha^k$ 是第k個component對應的alpha值，是 $6400\times 1$ 的向量。
以下是實現的程式碼：
1）求出L矩陣最小特徵值對應的特徵向量；
2）並從大到小進行排序。

eigVals, eigVecs = scipy.sparse.linalg.eigs(L, k=50, which='SM')

eigVals = np.abs(eigVals)
index_sorted = np.argsort(-eigVals)  #逆序排列
eigVals = eigVals[index_sorted]
eigVecs = eigVecs[:,index_sorted]

X = eigVecs
X = np.sign(np.real(X))*np.abs(X)
eigVecs = X

因為，L是稀疏矩陣（sparse matrix），因此需要用scipy.sparse.linalg.eigs()求解其特徵值和對應的特徵向量，但它求出的向量沒有進行排序，因此在此還需要對特徵值以小到大進行排序。
這裡用scipy.sparse.linalg.eigs代替MATLAB的eigs，可充分體現MATLAB的高效，它的計算速度明顯優於scipy。
這50個特徵向量顯示出來就是這樣的：
在這裡插入圖片描述
圖2 50個最小特徵向量
顯示程式碼如下：

fig, ax = plt.subplots(10,5, sharex=True, sharey=True)
fig.set_size_inches((15,15))

for i in range(50):
    row = int(i/5)
    col = int(i%5)
    eigVec = eigVecs[:,i]
    eigVec = np.reshape(eigVec, (w,w))
    ax[row, col].imshow(eigVec)
plt.show()

回顧公式（2）：
$\mathbf \alpha^k=\mathbf E\mathbf y^k\qquad(2)$
$\mathbf E$ 就是由這些特徵向量構成的。

3、優化——尋找合適的alpha

理想的 $\alpha$ 是由0-1組成的向量：若影像可以被分割成K個獨立component的話，則每個獨立component對應的 $\alpha^k,k\in\{1,\dots,K\}$ ，在理想情況下，各分量（ $\alpha_i^k$ ）應趨向0（完全不透明）或1（完全透明）。
為達到這個目標，需要找到合適的線性組合加權，即(2)中的 $\mathbf y$ 。為達到這個目的，[1]藉助了一個牽引函式來實現：
$\rho(\alpha)= \vert \alpha\vert^{\gamma}+\vert 1-\alpha\vert^{\gamma}$
它的影像如下，
在這裡插入圖片描述
圖3 牽引函式影像
從圖3可以看到該函式有將變數向0或1牽引的能力，於是[1]將優化的目標函式設計為：
$\sum_{i,k}\vert \alpha_i^k\vert^{\gamma}+\vert 1-\alpha_i^k\vert^{\gamma} \text{ ,where $\alpha^k=Ey^k$}\\ \text{Subject to, } \sum_k \alpha_i^k=1 \qquad(3)$
(3)式中， $i$ 是畫素index， $k$ 是component的index。 $E$ 是特徵向量構成的矩陣， $y^k$ 是 $\alpha^k$ 的加權向量，與(2)式定義相同。在某一個畫素（畫素 $i$ ）上，眾components在該點的 $\alpha_i^k$ 的和等於1，即: $\sum_k \alpha_i^k=1$ 。

為找到(3)式的最小值，[1]用Newton法迭代求解，並對(3)式進行二次型近似，有：
$\sum_{i,j}u_i^k\vert \alpha_i^k\vert^2+v_i^k\vert 1-\alpha_i^k\vert^2 \text{ ,where $\alpha^k=Ey^k$}\qquad(4) \\ u_i^k \sim \vert \alpha_i^k\vert^{\gamma-2},v_i^k \sim \vert 1-\alpha_i^k\vert^{\gamma-2} \\ . \\ \text{Subject to, } \sum_k \alpha_i^k=1$
其中， $\sim$ 表示正比關係，(3)式被轉化為(4)，即二次型最優化問題，當 $u_i^k$ 和 $v_i^k$ 一定時，（4）的最優解有閉式解，它相當於求解下列線性方程的解：
$\left[\begin{array}{ccccc} I_d&I_d&I_d&\cdots&I_d\\ 0&W^1+W^K&W^K&\cdots&W^K\\ 0&W^K&W^2+W^K&\cdots&W^K\\ \vdots&\vdots&\vdots&\ddots&W^K\\ 0&W^K&W^K&\cdots&W^{K-1}+W^K \end{array} \right]\mathbf y=\left[\begin{array}{c} E\mathbf 1\\ Ev^1+Eu^K\\Ev^2+Eu^K\\ \vdots \\ Ev^{K-1}+Eu^K \end{array} \right]\qquad(5)$
其中， $\mathbf u^k$ 是一個由 $u_i^k$ 組成的向量，其形狀是： $6400\times 1$ ，同理， $\mathbf v^k$ 是一個由 $v_i^k$ 組成的向量。
$U^k=diag(\mathbf u^k),V^k=diag(\mathbf v^k)$ ，
$W^k=E^T(U^k+V^k)E$ ， $U^k=diag(\mathbf u^k),V^k=diag(\mathbf v^k)$ ， $E$ 是特徵向量組成的矩陣，矩陣形狀計算： $50\times 6400 \cdot 6400\times 6400\cdot 6400\times 50=50\times 50$
$I_d$ 表示與 $W^k$ 形狀相同的單位矩陣。
優化迭代過程如下：

取 $\alpha$ 的一個初始值 $\alpha_0$ ，根據它計算 $U_0,V_0$
根據 $U_0,V_0$ 計算 $W_0$ ，從而得到公式（5）線性方程組；
求解方程組，得到 $y_0$ ，
由 $\mathbf \alpha_1=\mathbf E\mathbf y_0$ ，得到新的 $\alpha$ ，進入下一次新的迭代。

4、alpha的初始化

由於求解過程非凸，迭代初值對收斂結果有很大影響，必須從一個合法的alpha開始，以下是初始化的實現：

從特徵向量中挑選40個進行聚類，聚類數量是20。
將這些聚類中心投影到選定的component上，得到alpha的初值。

maxIter = 20
sparsityParam = 0.1
iterCnt = 20

initialSegmCnt = 40
compCnt = 20

initEigsCnt = 20
eigVals=np.diag(eigVals)

initEigsWeights = np.diag(1/np.diag(eigVals[1: 1+initEigsCnt , 1 : 1+initEigsCnt])**0.5)
initEigVecs = np.matmul(eigVecs[:, 1 : 1+initEigsCnt] , initEigsWeights)

def fastKmeans(X, K):
    # X: points in the N-by-P data matrix
    # K: number of clusters
    maxIters = 100
    X = np.sign(np.real(X))*np.abs(X)
    X = X.transpose()
    center, _ = kmeans(X, K, iter=maxIters)
    center = center.transpose()
    return center

# 求聚類
initialSegments = fastKmeans(initEigVecs, compCnt)
# 投影
softSegments = np.zeros([len(initialSegments), compCnt])
for i in range(compCnt):
    softSegments[:, i] = np.float32(initialSegments == i)

得到的初值如圖所示：
在這裡插入圖片描述
圖4 作為初值的alpha
alpha的初值每個分量都由0-1構成，可稱為硬分割。

5、優化的迭代實現

迭代的過程主要是構造公式（5），其具體程式碼如下：

sparsityParam = 0.8
spMat = sparsityParam
thr_e = 1e-10

w1 = 0.3;
w0 = 0.3;

e1 = (w1**sparsityParam) * (np.maximum(np.abs(softSegments-1), thr_e)**(spMat - 2))
e0 = (w0**sparsityParam) * (np.maximum(np.abs(softSegments), thr_e)**(spMat - 2))

scld = 1
eigValCnt = eigVecs.shape[1]
eigVals = np.diag(eigVals)

eig_vectors = eigVecs[:, : eigValCnt]
eig_values = eigVals[: eigValCnt, : eigValCnt]
#print(eig_vectors.shape, eig_values.shape)

# First iter no far removing zero components
removeIter = np.ceil(maxIter/4)
removeIterCycle = np.ceil(maxIter/4)

for it in range(10):
    '''
    tA = zeros((compCnt - 1) * eigValCnt);
    tb = zeros((compCnt - 1) * eigValCnt, 1);
    for k = 1 : compCnt - 1
        weighted_eigs = repmat(e1(:, k) + e0(:, k), 1, eigValCnt) .* eig_vectors;
        tA((k-1) * eigValCnt + 1 : k * eigValCnt, (k-1) * eigValCnt + 1 : k * eigValCnt) 
            = eig_vectors' * weighted_eigs + scld * eig_values;
        tb((k-1) * eigValCnt + 1 : k * eigValCnt) = eig_vectors' * e1(:,k);
    end 
    '''
    
    # Construct the matrices in Eq(9) in Spectral Matting
    tA = np.zeros([(compCnt-1) * eigValCnt,(compCnt-1) * eigValCnt])
    tb = np.zeros([(compCnt-1) * eigValCnt, 1])
    for k in range(compCnt-1):
        weighted_eigs = np.tile((e1[:, k] + e0[:, k]).reshape([-1,1]), 
                                [1, eigValCnt])*eig_vectors
        tA[k * eigValCnt: (k+1) * eigValCnt, k * eigValCnt: (k+1) * eigValCnt] = \
                np.matmul(eig_vectors.transpose(),weighted_eigs) + scld * eig_values
        #tA((k-1) * eigValCnt + 1 : k * eigValCnt, (k-1) * eigValCnt + 1 : k * eigValCnt) = eig_vectors' * weighted_eigs + scld * eig_values;
        tb[k * eigValCnt: (k+1) * eigValCnt] = \
                np.matmul(eig_vectors.transpose() ,e1[:,k].reshape([-1,1]))
    
    '''
    k = compCnt;
    weighted_eigs = repmat(e1(:, k) + e0(:, k), 1, eigValCnt) .* eig_vectors;
    ttA = eig_vectors' * weighted_eigs + scld * eig_values;
    ttb = eig_vectors' * e0(:, k) + scld * sum(eig_vectors' * Laplacian, 2);
    '''
    k = compCnt-1
    weighted_eigs = np.tile((e1[:, k] + e0[:, k]).reshape([-1,1]), 
                            [1, eigValCnt])*eig_vectors
    ttA = np.matmul(eig_vectors.transpose() ,weighted_eigs) + scld * eig_values
    ttb = np.matmul(eig_vectors.transpose() ,e0[:,k].reshape([-1,1]))
    ttb = ttb + np.sum(np.matmul(eig_vectors.transpose(), 
                                 L.toarray()),axis=1,keepdims=True)

    '''
    tA = tA + repmat(ttA, [compCnt - 1, compCnt - 1]);
    tb = tb + repmat(ttb, [compCnt - 1, 1]);
    '''
    tA = tA + np.tile(ttA, [compCnt-1, compCnt-1])
    tb = tb + np.tile(ttb, [compCnt-1, 1])

    # Solve for weights
    # y = reshape(tA \ tb, eigValCnt, compCnt - 1);
    #y = np.matmul(scipy.linalg.inv(tA),tb).reshape([eigValCnt, compCnt - 1])
    # these two methods are same
    #y = scipy.linalg.solve(tA, tb).reshape([eigValCnt, compCnt - 1])
    y = scipy.linalg.solve(tA, tb).reshape([eigValCnt,compCnt - 1], order='F')
    #print(y.shape)
    
    '''
    % Compute the matting comps from weights
    softSegments = eigVecs(:, 1 : eigValCnt) * y;
    softSegments(:, compCnt) = 1 - sum(softSegments(:, 1 : compCnt - 1), 2); 
    % Sets the last one as 1-sum(others), guaranteeing \sum(all) = 1
    '''
    # Compute the matting comps from weights
    softSegments[:,:compCnt-1] = np.matmul(eigVecs[:, : eigValCnt], y)
    # Sets the last one as 1-sum(others), guaranteeing \sum(all) = 1
    softSegments[:, compCnt-1] = 1 - np.sum(softSegments[:, :compCnt-1], axis=1)

    # Remove matting components which are close to zero every once in a while
    if it > removeIter:
        indx = np.argwhere(np.max(np.abs(softSegments),axis=0)>0.1)
        compCnt = len(indx)
        indx = np.squeeze(indx)
        softSegments = softSegments[:,indx]
        removeIter = removeIter + removeIterCycle
        print('After, has components: ' + str(softSegments.shape))
    
    '''
    % Recompute the derivatives of sparsity penalties
    e1 = w1 .^ sparsityParam * max(abs(softSegments-1), thr_e) .^ (spMat - 2);
    e0 = w0 .^ sparsityParam * max(abs(softSegments), thr_e) .^ (spMat - 2);   
    '''
    e1 = (w1**sparsityParam) * (np.maximum(np.abs(softSegments-1), thr_e)**(spMat - 2))
    e0 = (w0**sparsityParam) * (np.maximum(np.abs(softSegments), thr_e)**(spMat - 2))

此時的alpha是這樣的：
在這裡插入圖片描述
圖5 經迭代後的軟分割alpha
此時的alpha並非全0-1，還有其他的一些值，反映到圖5中，是顏色的漸變，不同於圖4的硬分割。

6、Components的組合

由於我們得到的components並不一定就是前景（foreground）或背景（background），而前景/背景則是由Components組合而成的。假設前景為：
$\alpha = \alpha^{k_1}+\alpha^{k_2}+\cdots+\alpha^{k_n}\qquad(6)$
它（ $\alpha$ ）由若干個component( $\{\alpha^{k_1},\alpha^{k_2},\cdots,\alpha^{k_n}\}$ )直接和而得到。現在問題轉變為找哪幾個Components來求和。[1]採用了求components之間相關性，然後列舉各種組合的score的方法來實現Unsupervised Matting，具體方法如下：
1）求相關性矩陣
$\Phi(k,l)=(\alpha^k)^TL\alpha^l\qquad(7)$
其中， $\Phi(k,l)$ 是相關性矩陣 $\Phi$ 的位於k，l位置的元素，該元素是由第k個component的 $\alpha^k$ 與第l個component的 $\alpha^l$ ，及L運算所得。
2）求組合score
有了相關性矩陣後，可定義組合score為：
$J(\alpha) = b^T\Phi b\qquad(8)$
b是示性向量，由0-1構成，反映component參與疊加與否。若有K個components則b的組合數為 $2^K$ ，挑出取得最大值的那個或那些，則是matting的結果。

[1]Spectral Matting（http://www.vision.huji.ac.il/SpectralMatting/）

Spectral subtrative 譜減法降噪 python 程式碼
2020-12-24
Python
switch的python實現
2019-03-04
Python
譜範數正則（Spectral Norm Regularization）的理解
2018-10-30
ORM
譜圖（Spectral Graph Theory）理解（2）
2018-09-19
Graph Theory
Go 實現的 python collections
2019-08-19
GoPython
python 介面實現類的
2024-08-22
Python
Python實現的快速排序
2018-03-28
Python排序
Python實現火柴人的設計與實現
2024-10-18
Python
Python之禪-import this的實現
2018-08-10
PythonImport
Python 3.6:多型的實現
2018-03-15
Python多型
譜聚類的python實現
2020-08-23
聚類Python
python OpenCV加法操作的實現
2021-09-11
PythonOpenCV
Python中迭代器的實現
2021-09-11
Python
大頂堆的python實現
2020-11-29
Python
LRU Cache的原理和python的實現
2021-09-09
Python
堆和索引堆的python實現
2019-02-16
索引Python
python的爬蟲功能如何實現
2019-02-28
Python爬蟲
python執行緒池的實現
2018-10-07
Python執行緒
python實現郵件的傳送
2018-04-09
Python
Python的類及單例實現
2018-08-30
Python單例
Python 雜湊表的實現——字典
2023-11-24
Python
基於Python的Xgboost模型實現
2020-04-03
Python模型
快速排序的四種python實現
2019-02-17
排序Python
Python爬蟲是如何實現的？
2022-07-15
Python爬蟲
python dict實現的魔法方法
2021-09-11
Python
論文翻譯：Real-Time High-Resolution Background Matting
2020-12-22
Python實現微博輿情分析的設計與實現
2024-10-25
Python
字母排列（python實現）
2018-05-16
Python
Python 字典實現原理
2020-11-22
Python
Python實現建立字典
2019-02-05
Python
C均值聚類 C實現 Python實現
2020-12-05
聚類Python
python是如何實現生成器的
2018-10-29
Python
Python 爬蟲IP代理池的實現
2018-12-17
Python爬蟲
python 程式池的兩種不同實現
2023-05-11
Python
用切片操作實現的Python篩法
2019-12-28
Python
【Python】Python實現解壓rar檔案
2024-06-15
Python
投擲骰子 -Python實現
2018-11-03
Python
Python實現氣泡排序
2018-11-18
Python排序