Clustering from the cosine similarity values

2020-02-29 13:21发布

I have extracted words from a set of URLs and calculated cosine similarity between each URL's contents.And also I have normalized the values between 0-1(using Min-Max).Now i need to cluster the URLs based on cosine similarity values to find out similar URLs.which clustering algorithm will be most suitable?.Please suggest me a Dynamic clustering method because it will be useful since i could increase number of URL's on demand and also it will be more natural.Please correct me if you feel i'm making the progress in a wrong way.Thanks in anticipation.

标签： url nlp cluster-analysis information-retrieval

1条回答

傲

2楼-- · 2020-02-29 13:24

K-means clustering can be used for online learning, you just need to select the number of clusters a priori. Also, I think you shouldn't normalize your data, because cosine already provides values in the range [0:1]. Your Min-Max normalization could lead to information loss.

0人赞添加讨论(0) 举报

Clustering from the cosine similarity values

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间