I have been studying partition-based clustering algorithms such as K-means and K-medoids, and I have learned that K-medoids is more robust to outliers than K-means. However, I am curious about what happens if, during the assignment step, two or more cluster representatives are at the same distance from a data point. Which cluster should the data point be assigned to? Will that choice greatly affect the clustering results?
Answer 1:
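To prevent bad things from happening (infinite loops, etc.), always keep a tied point in the cluster it is already assigned to. Otherwise a point can flip back and forth between equally distant clusters on every iteration, and the algorithm may never converge.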
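As a rough illustration of that rule, here is a minimal sketch of an assignment step in Python with NumPy. The function name `assign_with_sticky_ties`, the `previous` argument, and the `tol` tolerance are my own hypothetical choices, not part of any library; Euclidean distance is assumed, and for K-medoids `centers` would simply be the current medoids.

```python
import numpy as np

def assign_with_sticky_ties(points, centers, previous=None, tol=1e-12):
    """Assign each point to its nearest center; on a tie, keep the previous label.

    Hypothetical helper for illustration only.
    """
    points = np.asarray(points, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Pairwise Euclidean distances, shape (n_points, n_centers).
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    # argmin resolves ties by taking the lowest cluster index.
    labels = dists.argmin(axis=1)
    if previous is not None:
        previous = np.asarray(previous)
        rows = np.arange(len(points))
        # If the old cluster is as close as the best one (within tol), stay put.
        stay = dists[rows, previous] <= dists[rows, labels] + tol
        labels = np.where(stay, previous, labels)
    return labels

# The point (0, 0) is exactly 1.0 away from both centers.
points = [[0.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
centers = [[-1.0, 0.0], [1.0, 0.0]]

first = assign_with_sticky_ties(points, centers)                     # no history yet
again = assign_with_sticky_ties(points, centers, previous=[1, 1, 0])
```

On the first pass the tie falls back to the lowest cluster index, which is deterministic but arbitrary. Once a previous assignment exists, the tied point stays where it was, so the labels stop changing unless some center becomes strictly closer.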