1 Look up: background on metrics
Example:
print("Homogeneity: %0.3f" % metrics.homogeneity_score(labels_true, labels))
print("Completeness: %0.3f" % metrics.completeness_score(labels_true, labels))
print("V-measure: %0.3f" % metrics.v_measure_score(labels_true, labels))
print("Adjusted Rand Index: %0.3f"
      % metrics.adjusted_rand_score(labels_true, labels))
print("Adjusted Mutual Information: %0.3f"
      % metrics.adjusted_mutual_info_score(labels_true, labels))
print("Silhouette Coefficient: %0.3f"
      % metrics.silhouette_score(X, labels))
out:
Homogeneity: 0.975
Completeness: 0.935
V-measure: 0.955
Adjusted Rand Index: 0.976
Adjusted Mutual Information: 0.935
Silhouette Coefficient: 0.661
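To make the label-based scores above more concrete, here is a minimal sketch on a tiny hand-made labelling. The toy labels are made up for illustration and are not from these notes. These scores compare a clustering against the ground truth and ignore how the cluster ids are named, so a perfect clustering with swapped ids still gets the top score. The silhouette coefficient is different: it takes the data X and the predicted labels and needs no ground truth.

```python
from sklearn import metrics

# Hypothetical toy data: the clustering matches the ground truth exactly,
# except that the cluster ids 0 and 1 are swapped.
toy_true = [0, 0, 1, 1]
toy_pred = [1, 1, 0, 0]

homogeneity = metrics.homogeneity_score(toy_true, toy_pred)
completeness = metrics.completeness_score(toy_true, toy_pred)
v_measure = metrics.v_measure_score(toy_true, toy_pred)
ari = metrics.adjusted_rand_score(toy_true, toy_pred)

print("Homogeneity: %0.3f" % homogeneity)
print("Completeness: %0.3f" % completeness)
print("V-measure: %0.3f" % v_measure)
print("Adjusted Rand Index: %0.3f" % ari)
```

All four scores come out as 1.000 here because the two labellings describe the same partition.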
2 Look up the relevant documentation: learn cluster and metrics, and understand what datasets and preprocessing provide
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import DBSCAN
from sklearn import metrics
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler
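Generate 750 random sample points around the cluster centers listed in centers
centers = [[0, 0], [1, 2], [3, -1]]
X, labels_true = make_blobs(n_samples=750, centers=centers, cluster_std=0.4, random_state=0)
Standardize X (each feature is rescaled to zero mean and unit variance)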
X = StandardScaler().fit_transform(X)
X before the transform:
[[-0.15977961 0.14802236]
[ 0.84525166 1.7958829 ]
[-0.32136387 -0.27581991]
…,
[ 2.26798858 -1.27833405]
[ 1.11371187 2.69706751]
[ 2.60046048 -1.29605472]]
X after the transform:
[[-1.11638887 -0.13446227]
[-0.361879 1.13162887]
[-1.23769546 -0.46011054]
…,
[ 0.70621616 -1.23036641]
[-0.16033714 1.82403081]
[ 0.9558137 -1.24398163]]
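The effect of StandardScaler can be checked directly. The sketch below regenerates the blobs with the same parameters as above and compares the scaler's output with a by-hand computation (subtract each column's mean, divide by each column's standard deviation). The names X_raw, X_scaled and X_manual are introduced here for illustration only.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

centers = [[0, 0], [1, 2], [3, -1]]
X_raw, labels_true = make_blobs(n_samples=750, centers=centers,
                                cluster_std=0.4, random_state=0)

# Scale with scikit-learn
X_scaled = StandardScaler().fit_transform(X_raw)

# Same computation by hand: column-wise zero mean and unit standard deviation
X_manual = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)

print(X_scaled.mean(axis=0))            # each entry is close to 0
print(X_scaled.std(axis=0))             # each entry is close to 1
print(np.allclose(X_scaled, X_manual))  # the two results agree
```

Scaling matters for DBSCAN because its neighborhood radius is a plain distance, so features on very different scales would otherwise dominate it.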
np.zeros_like()
Returns an all-zero array with the same shape as the array passed in
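These notes do not show the DBSCAN fit itself, so the following is only a sketch of how np.zeros_like is typically used to build core_samples_mask. The values eps=0.3 and min_samples=10 are assumptions chosen for illustration, not parameters taken from these notes.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

centers = [[0, 0], [1, 2], [3, -1]]
X, labels_true = make_blobs(n_samples=750, centers=centers,
                            cluster_std=0.4, random_state=0)
X = StandardScaler().fit_transform(X)

# Assumed parameters: eps and min_samples are illustrative values only
db = DBSCAN(eps=0.3, min_samples=10).fit(X)
labels = db.labels_

# Start from an all-False array shaped like labels ...
core_samples_mask = np.zeros_like(labels, dtype=bool)
# ... then mark the positions of the core samples as True
core_samples_mask[db.core_sample_indices_] = True

print(core_samples_mask.shape, core_samples_mask.dtype)
print(core_samples_mask.sum(), "core samples out of", len(labels))
```

Passing dtype=bool overrides the integer dtype of labels, which is what makes the result usable as a boolean mask.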
n_clusters = len(set(labels))-(1 if -1 in labels else 0)
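set(labels) = {-1, 0, 1, 2}: its length is the number of distinct label values
(1 if -1 in labels else 0) is 1 here rather than 0, because labels contains -1 (the noise label), so n_clusters = 4 - 1 = 3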
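The counting rule can be tried on small label arrays. In this sketch the helper name count_clusters and both toy arrays are made up for illustration.

```python
import numpy as np

def count_clusters(labels):
    """Number of distinct labels, not counting the noise label -1."""
    return len(set(labels)) - (1 if -1 in labels else 0)

# Hypothetical label arrays
labels_with_noise = np.array([0, 0, 1, -1, 2, 2, -1])
labels_without_noise = np.array([0, 0, 1, 1])

n_with = count_clusters(labels_with_noise)        # {-1, 0, 1, 2} minus noise
n_without = count_clusters(labels_without_noise)  # {0, 1}, nothing to subtract

print(n_with)     # 3
print(n_without)  # 2
```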
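unique_labels = set(labels)
colors = plt.cm.Spectral(np.linspace(0, 1, len(unique_labels)))
np.linspace(0, 1, len(unique_labels))
outputs array([ 0. , 0.33333333, 0.66666667, 1. ])
np.linspace(0, 3, len(unique_labels))
outputs array([ 0., 1., 2., 3.])
A colormap called with floats expects positions in [0, 1], and positions above 1 all map to the last color, so the range 0 to 1 is the one to use. With it, colors is: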
array([[ 0.61960784, 0.00392157, 0.25882353, 1. ],
[ 0.99346405, 0.74771242, 0.43529412, 1. ],
[ 0.74771242, 0.89803922, 0.62745098, 1. ],
[ 0.36862745, 0.30980392, 0.63529412, 1. ]])
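plt.cm.Spectral is the Spectral colormap object. Calling it with positions between 0 and 1 returns one RGBA row per position; it does not change the default colormap or the current image.
See the help on colormaps for details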
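As a small sketch of how the colormap is sampled, the code below picks one color per label and also shows what happens to positions above 1. The label set is the one from these notes; the names positions and over_colors are introduced here for illustration.

```python
import numpy as np
from matplotlib import cm

unique_labels = {-1, 0, 1, 2}

# One evenly spaced position in [0, 1] per label
positions = np.linspace(0, 1, len(unique_labels))
colors = cm.Spectral(positions)

print(colors.shape)  # one RGBA row per label
print(colors)

# Positions above 1 fall outside the colormap and get its last color
over_colors = cm.Spectral(np.array([2.0, 3.0]))
print(np.allclose(over_colors, cm.Spectral(1.0)))
```

This is why sampling with np.linspace(0, 3, ...) would give three of the four labels the same color.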
Plot the points inside clusters and the outliers
for k, col in zip(unique_labels, colors):
    if k == -1:
        # Black is used for noise
        col = 'k'
    class_member_mask = (labels == k)
    # Core samples of cluster k: large markers
    xy = X[class_member_mask & core_samples_mask]
    plt.plot(xy[:, 0], xy[:, 1], 'o', markerfacecolor=col, markeredgecolor='k', markersize=12)
    # Non-core samples of cluster k: small markers
    xy = X[class_member_mask & ~core_samples_mask]
    plt.plot(xy[:, 0], xy[:, 1], 'o', markerfacecolor=col, markeredgecolor='k', markersize=6)
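1) zip accepts any number of sequences (including zero or one) as arguments and pairs them up element by element into tuples. In Python 2 it returns a list of tuples; in Python 3 it returns an iterator over them.
2) core_samples_mask is an array of True/False values. class_member_mask & core_samples_mask selects the core points of cluster k, which are drawn with the large markers.
3) class_member_mask & ~core_samples_mask selects the remaining points with label k: the non-core (border) points of a cluster and, for k == -1, the outliers. They are drawn with the small markers.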
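The three notes above can be checked on a toy example. All arrays below are made up for illustration; only the names labels, core_samples_mask and class_member_mask mirror the plotting loop.

```python
import numpy as np

# zip pairs up elements; list() materializes the iterator in Python 3
pairs = list(zip([-1, 0, 1], ["k", "r", "b"]))
print(pairs)  # [(-1, 'k'), (0, 'r'), (1, 'b')]

# Hypothetical data: six 2-D points, their labels, and a core-sample mask
X = np.arange(12).reshape(6, 2)
labels = np.array([0, 0, 1, -1, 1, 0])
core_samples_mask = np.array([True, False, True, False, True, True])

# Points that belong to cluster 0
class_member_mask = (labels == 0)

# Element-wise AND keeps rows that are in cluster 0 and are core samples
core_points = X[class_member_mask & core_samples_mask]
# ~ inverts the mask: rows in cluster 0 that are not core samples
non_core_points = X[class_member_mask & ~core_samples_mask]

print(core_points)      # rows 0 and 5 of X
print(non_core_points)  # row 1 of X
```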