How to calculate the Silhouette Score of scipy's fcluster results

Posted 2019-04-12 21:33

Question:

I am using scipy.cluster.hierarchy.linkage as a clustering algorithm and passing the resulting linkage matrix to scipy.cluster.hierarchy.fcluster to get the flattened clusters for various thresholds.

I would like to calculate the Silhouette score of each result and compare them to choose the best threshold. I would prefer not to implement it on my own but to use scikit-learn's sklearn.metrics.silhouette_score. How can I rearrange my clustering results as an input to sklearn.metrics.silhouette_score?

Answer 1:

You don't have to.

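The output of fcluster can be fed directly into silhouette_score: it is a flat array holding one cluster label per observation, which is exactly what the labels argument expects.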

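import scipy.cluster.hierarchy
import scipy.spatial.distance
from sklearn import metrics
distmatrix1 = scipy.spatial.distance.squareform(distmatrix + distmatrix.T)  # condensed form; assumes distmatrix is upper-triangular
ddgm = scipy.cluster.hierarchy.linkage(distmatrix1, method="average")
nodes = scipy.cluster.hierarchy.fcluster(ddgm, 4, criterion="maxclust")
# The first argument is a distance matrix, not feature vectors, so use metric='precomputed'
metrics.silhouette_score(distmatrix + distmatrix.T, nodes, metric='precomputed')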
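
To compare several thresholds, as the question asks, one option is to loop over candidate values and keep the one with the highest score. The following is only a minimal sketch under stated assumptions: the toy data, the list of thresholds, and names such as points, scores and best_threshold are made up for illustration. It uses criterion="distance" and passes the square distance matrix with metric="precomputed".

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist, squareform
from sklearn.metrics import silhouette_score

# Hypothetical toy data: three well-separated 2-D blobs of 20 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
points = np.vstack([c + rng.normal(scale=0.5, size=(20, 2)) for c in centers])

condensed = pdist(points)        # condensed distance vector, as linkage expects
square = squareform(condensed)   # full symmetric matrix for silhouette_score
Z = linkage(condensed, method="average")

scores = {}
for t in (1.0, 2.0, 5.0, 8.0):   # candidate distance thresholds (arbitrary choice)
    labels = fcluster(Z, t, criterion="distance")
    n_clusters = len(np.unique(labels))
    # The silhouette score is only defined for 2 <= n_clusters <= n_samples - 1,
    # so thresholds that fall outside that range are skipped.
    if 2 <= n_clusters <= len(points) - 1:
        scores[t] = silhouette_score(square, labels, metric="precomputed")

# Threshold with the highest silhouette score (first one wins on ties).
best_threshold = max(scores, key=scores.get)
print(best_threshold, scores[best_threshold])
```

Skipping thresholds that produce a single cluster (or one cluster per point) matters because silhouette_score raises an error in those cases.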