How can I get cluster number correspond to data us

I clustered my data with the k-means method. How can I get the cluster number that corresponds to each data point using k-means in R? I want to know which cluster each record belongs to.

Example: 12 32 13 => cluster 1: 12, 13; cluster 2: 32

Tags: r cluster-analysis k-means

3 answers

我只想做你的唯一

#2 · 2019-02-07 13:44

@ Java questioner

You can access the cluster assignments as follows:

> data_clustered <- kmeans(data, centers = 3)  # centers is required; 3 is only an example value
> data_clustered$cluster

data_clustered$cluster is an integer vector with one entry per record in data. The i-th entry is the cluster number assigned to the i-th row.
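To make the row-by-row correspondence concrete, here is a minimal sketch using the three values from the question. The names df, km and groups, the seed, and nstart = 5 are my own choices for illustration, not anything from the original post.

```r
# The three values from the question, one record per row
df <- data.frame(value = c(12, 32, 13))

set.seed(1)  # k-means starts from random centers; fix the seed for repeatability
km <- kmeans(df$value, centers = 2, nstart = 5)

# km$cluster has one label per row, in the same order as the rows of df
df$cluster <- km$cluster
print(df)

# Group the original values by their cluster label
groups <- split(df$value, df$cluster)
print(groups)
```

The labels 1 and 2 are arbitrary and can swap between runs. What matters is that 12 and 13 share one label and 32 gets the other.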

To get all the records belonging to cluster 1:

> data$cluster <- data_clustered$cluster 
> data_clus_1 <- data[data$cluster == 1,]

Number of clusters:

> max(data$cluster)

Good luck with your clustering

何必那么认真

#3 · 2019-02-07 13:53

We like reproducible examples here on Stack Overflow. Otherwise we're just guessing.

I'll guess that you are using kmeans in the stats package.

I'll further guess you haven't read the documentation help(kmeans) which says:

Value:

  an object of class 'kmeans' which is a list with components:

   cluster: A vector of integers indicating the cluster to which each point is allocated.

There's an example in the help that shows you exactly how that works.
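If you want to check that documented behaviour yourself, something like the following should do. It is only a sketch: the built-in iris data, centers = 3, the seed, and the names features and km are assumptions picked for illustration.

```r
# Built-in dataset; keep only the four numeric columns
features <- iris[, 1:4]

set.seed(42)
km <- kmeans(features, centers = 3, nstart = 10)

# One label per record
print(length(km$cluster) == nrow(features))

# Labels are integers running from 1 to k
print(sort(unique(km$cluster)))

# Records per cluster, counted from the labels ...
print(table(km$cluster))

# ... and the same counts as stored in the kmeans object
print(km$size)
```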

手持菜刀，她持情操

#4 · 2019-02-07 13:55

It sounds like you are trying to access the cluster vector that is returned by kmeans(). From the help page for kmeans, the cluster component is:

A vector of integers (from 1:k) indicating the cluster to which each 
point is allocated.

Using the example on the help page:

x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(x) <- c("x", "y")
(cl <- kmeans(x, 2))

#Access the cluster vector
cl$cluster

> cl$cluster
  [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [45] 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 [89] 1 1 1 1 1 1 1 1 1 1 1 1

To address the question in the comments:

You can "map" the cluster number to the original data by doing something like this:

out <- cbind(x, clusterNum = cl$cluster)
head(out)

               x          y clusterNum
[1,] -0.42480483 -0.2168085          2
[2,] -0.06272004  0.3641157          2
[3,]  0.08207316  0.2215622          2
[4,] -0.19539844  0.1306106          2
[5,] -0.26429056 -0.3249288          2
[6,]  0.09096253 -0.2158603          2

cbind is the function for binding columns; there is also an rbind function for binding rows. See ?cbind and ?rbind for more details.
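Since cbind on a matrix gives you back a matrix, a data.frame may be handier if you want to keep working with the labels. A rough sketch along the same lines as the help-page example; the names pts, out_df and cluster_means and the seed are mine, not part of the help page.

```r
set.seed(123)
pts <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
             matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(pts) <- c("x", "y")
cl <- kmeans(pts, 2)

# A data.frame keeps the label as its own column (a factor here) next to x and y
out_df <- data.frame(pts, clusterNum = factor(cl$cluster))
print(head(out_df))

# Mean of x and y within each cluster; these should agree with cl$centers
cluster_means <- aggregate(cbind(x, y) ~ clusterNum, data = out_df, FUN = mean)
print(cluster_means)
print(cl$centers)
```

With the label stored as a column you can also subset directly, for example out_df[out_df$clusterNum == 1, ].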
