Machine Learning Algorithms (22): Complete-Linkage Agglomerative Hierarchical Clustering with scikit-learn in Python (AgglomerativeClustering)

Posted by @糯米君 on 2020-11-29

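The AgglomerativeClustering implementation in scikit-learn lets us choose the number of clusters to return. This is useful when we want to prune the hierarchical cluster tree at a given level.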

from sklearn.cluster import AgglomerativeClustering
import numpy as np
import pandas as pd

# Generate a random sample: 5 observations with 3 features each
np.random.seed(123)
variables = ['X', 'Y', 'Z']
labels = ['ID_0', 'ID_1', 'ID_2', 'ID_3', 'ID_4']

X = np.random.random_sample([5, 3])*10
df = pd.DataFrame(X, columns=variables, index=labels)
print(df)

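# Complete linkage with Euclidean distance, pruned to 3 clusters.
# Note: this parameter was named `affinity` before scikit-learn 1.2;
# `affinity` was removed in 1.4, so use `metric` on current versions.
ac = AgglomerativeClustering(n_clusters=3,
                             metric='euclidean',
                             linkage='complete')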
labels = ac.fit_predict(X)
print('Cluster labels: %s' % labels)

Output:
             X         Y         Z
ID_0  6.964692  2.861393  2.268515
ID_1  5.513148  7.194690  4.231065
ID_2  9.807642  6.848297  4.809319
ID_3  3.921175  3.431780  7.290497
ID_4  4.385722  0.596779  3.980443
Cluster labels: [1 0 0 2 1]
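
To illustrate what pruning the tree means, here is a minimal sketch that reruns the same sample with n_clusters=2 and cross-checks the grouping against SciPy's linkage and fcluster. The SciPy comparison and the names ac2, labels_2, Z and scipy_labels are not part of the original post; they are assumptions added for illustration. The distance argument is left at its default (Euclidean) so the snippet does not depend on the affinity/metric rename.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.cluster import AgglomerativeClustering

# Recreate the same 5 x 3 sample used above
np.random.seed(123)
X = np.random.random_sample([5, 3]) * 10

# Prune the tree one level higher: ask for 2 clusters instead of 3.
# Euclidean distance is the default, so only the linkage is specified.
ac2 = AgglomerativeClustering(n_clusters=2, linkage='complete')
labels_2 = ac2.fit_predict(X)
print('Cluster labels (2 clusters): %s' % labels_2)

# Cross-check (illustrative): build the full complete-linkage tree with
# SciPy and cut it so that at most 2 flat clusters remain.
Z = linkage(X, method='complete', metric='euclidean')
scipy_labels = fcluster(Z, t=2, criterion='maxclust')
print('SciPy labels (2 clusters):   %s' % scipy_labels)
```

The label numbers themselves are arbitrary and may differ between the two libraries; what should match is the grouping. With two clusters, ID_3 is merged into the cluster that holds ID_0 and ID_4, while ID_1 and ID_2 stay together.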
