Computing Partial Correlation Coefficients
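References
Chen Yanguang (ed.). 地理數學方法及其應用 (Mathematical Methods for Geography and Their Applications). 2008, College of Urban and Environmental Sciences, Peking University.
Wikipedia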
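A simple correlation coefficient is meant to capture the pairwise linear relationship between two variables, but in practice no simple correlation coefficient is entirely free of correlation contributed by other factors. To remove this indirect correlation, another diagnostic was devised: the partial correlation coefficient. It excludes the influence of the other factors and reflects only how closely a given independent variable is related to the dependent variable.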
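To make the idea of "indirect correlation" concrete, here is a minimal sketch for the three-variable case. It uses the standard first-order formula \(r_{xy \cdot z}=\frac{r_{xy}-r_{xz} r_{yz}}{\sqrt{(1-r_{xz}^{2})(1-r_{yz}^{2})}}\). The synthetic data, the variable names and the helper `simple_corr` are assumptions made for illustration only: z drives both x and y, and x and y have no direct link.

```python
import numpy as np

# Synthetic data (an assumption for illustration): z drives both x and y,
# while x and y have no direct link to each other.
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)
x = z + rng.normal(size=n)
y = z + rng.normal(size=n)


def simple_corr(a, b):
    """Pearson (simple) correlation coefficient of two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1]


r_xy = simple_corr(x, y)
r_xz = simple_corr(x, z)
r_yz = simple_corr(y, z)

# First-order partial correlation of x and y, controlling for z.
r_xy_given_z = (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))

print(f"simple  r(x, y)     = {r_xy:.3f}")
print(f"partial r(x, y | z) = {r_xy_given_z:.3f}")
```

With data built this way, the simple correlation should come out clearly positive (about 0.5 in expectation), while the partial correlation should be close to zero. The apparent link between x and y is carried entirely by z.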
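When there are many independent variables, computing partial correlation coefficients directly from the formulas becomes tedious. A more convenient route works from the correlation matrix built from the simple correlation coefficients, using the following formula: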
\(R_{x_{j} y}=\frac{-c_{j y}}{\sqrt{c_{j j} c_{y y}}}\)
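Here \(R_{x_{j} y}\) is the partial correlation coefficient between the j-th independent variable and the dependent variable y, and each c is the corresponding element of the inverse of the correlation matrix, where the correlation matrix is computed over all independent variables together with y.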
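As a compact illustration of this formula, the sketch below computes \(R_{x_{j} y}\) for every independent variable at once. The function name `partial_corr_with_y`, the convention of appending y as the last column, and the synthetic data are all assumptions made for the example, not part of the original method description.

```python
import numpy as np


def partial_corr_with_y(X, y):
    """Partial correlation of each column of X with y, controlling for the other columns.

    X : array-like, shape (n, p); y : array-like, shape (n,).
    Returns an array of shape (p,).
    """
    data = np.column_stack([np.asarray(X, dtype=float), np.asarray(y, dtype=float)])
    corr = np.corrcoef(data, rowvar=False)  # (p+1, p+1); y is the last row/column
    c = np.linalg.inv(corr)                 # inverse of the correlation matrix
    c_jy = c[:-1, -1]                       # off-diagonal elements c_{jy}
    c_jj = np.diag(c)[:-1]                  # diagonal elements c_{jj}
    c_yy = c[-1, -1]                        # diagonal element c_{yy}
    return -c_jy / np.sqrt(c_jj * c_yy)


# Hypothetical data: y depends on the first two predictors but not on the third.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(size=200)

R = partial_corr_with_y(X, y)
print(R)
```

For data generated like this, the first entry should be strongly positive, the second clearly negative, and the third near zero, since the third predictor plays no role in y.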
Below is a Python implementation that computes the full matrix of partial correlations:
# -*- coding: utf-8 -*-
"""
Created on Mon Dec 20 16:53:39 2021
modified: https://gist.github.com/fabianp/9396204419c7b638d38f
@author: pan
"""
import numpy as np
from numpy.linalg import inv
from osgeo import gdal, gdal_array
import os, time,glob
from sklearn import linear_model
from sklearn import preprocessing
from matplotlib import pyplot as plt
def partial_corr(C):
    """
    Returns the sample linear partial correlation coefficients between pairs of variables in C, controlling
    for the remaining variables in C.

    Parameters
    ----------
    C : array-like, shape (n, p)
        Array with the different variables. Each column of C is taken as a variable

    Returns
    -------
    P_corr : array-like, shape (p, p)
        P_corr[i, j] contains the partial correlation of C[:, i] and C[:, j] controlling
        for the remaining variables in C.
    """
    C = np.asarray(C)
    p = C.shape[1]
    P_corr = np.zeros((p, p))  # sample linear partial correlation coefficients
    corr = np.corrcoef(C, rowvar=False)  # Pearson product-moment correlation coefficients
    corr_inv = inv(corr)  # the (multiplicative) inverse of the correlation matrix
    for i in range(p):
        P_corr[i, i] = 1
        for j in range(i + 1, p):
            pcorr_ij = -corr_inv[i, j] / np.sqrt(corr_inv[i, i] * corr_inv[j, j])
            P_corr[i, j] = pcorr_ij
            P_corr[j, i] = pcorr_ij
    return P_corr
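One way to sanity-check the inverse-matrix approach is to compare it with the definition in terms of regression residuals: regress each of the two variables on the remaining ones and correlate what is left over. The sketch below does this on made-up data. Both `pcorr_matrix` (a compact, vectorized stand-in for `partial_corr`, included so the block runs on its own) and `pcorr_by_residuals` are hypothetical helpers written for this check.

```python
import numpy as np


def pcorr_matrix(C):
    """Compact stand-in for the loop version: rescale the inverse correlation matrix."""
    prec = np.linalg.inv(np.corrcoef(np.asarray(C, dtype=float), rowvar=False))
    d = np.sqrt(np.diag(prec))
    P = -prec / np.outer(d, d)
    np.fill_diagonal(P, 1.0)
    return P


def pcorr_by_residuals(C, i, j):
    """Correlate the residuals of columns i and j after regressing each on the rest."""
    C = np.asarray(C, dtype=float)
    rest = [k for k in range(C.shape[1]) if k not in (i, j)]
    Z = np.column_stack([np.ones(C.shape[0]), C[:, rest]])  # intercept + controls
    res_i = C[:, i] - Z @ np.linalg.lstsq(Z, C[:, i], rcond=None)[0]
    res_j = C[:, j] - Z @ np.linalg.lstsq(Z, C[:, j], rcond=None)[0]
    return np.corrcoef(res_i, res_j)[0, 1]


# Made-up data: column 3 depends on columns 0 and 1.
rng = np.random.default_rng(42)
data = rng.normal(size=(300, 4))
data[:, 3] += 0.8 * data[:, 0] - 0.5 * data[:, 1]

P = pcorr_matrix(data)
print(np.round(P, 3))
print(np.isclose(P[0, 3], pcorr_by_residuals(data, 0, 3)))
```

The two routes should agree up to floating-point error. Note that both rely on the correlation matrix being invertible, so exactly collinear columns would need to be handled before calling either function.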