I am using the dtw package in R, and I have finished hierarchical clustering.
Now I want to plot the time series of each cluster separately, as in the picture below.
sc <- read.table("D:/handling data/confirm.csv", header=T, sep="," )
rownames(sc) <- sc$STDR_YM_CD
sc$STDR_YM_CD <- NULL
col_n <- colnames(sc)
hc <- hclust(dist(sc), method="average")
plot(hc, main="")
How can I do this?
My data is at http://blogattach.naver.com/e772fb415a6c6ddafd1370417f96e494346a9725/20170207_141_blogfile/khm2963_1486442387926_THgZRt_csv/confirm.csv?type=attachment
You can try this:
sc <- read.table("confirm.csv", header=T, sep="," )
rownames(sc) <- sc$STDR_YM_CD
sc$STDR_YM_CD <- NULL
col_n <- colnames(sc)
sc <- t(sc) # make sure your rows represent the time series data
id <- rownames(sc)
head(sc)
hc <- hclust(dist(sc), method="average")
plot(hc, main="")
n <- 20 # number of clusters to cut the tree into
sc <- cbind.data.frame(id=id, sc, cluster=cutree(hc, k = n))
library(dplyr)
library(tidyr)
library(ggplot2)
# reshape to long format, then draw one panel per cluster
sc %>% gather(variable, value, -id, -cluster) %>%
  ggplot(aes(variable, value, group = id, color = id)) + geom_line() +
  facet_wrap(~cluster, scales = "free") + guides(color = "none") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
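If it helps to see what the reshaping step is doing, here is a minimal sketch on made-up data, using base R only. The names toy, hc_toy, cl and long are hypothetical and not part of your data; the point is only that cutree returns one cluster label per series, and that the long table repeats that label for every time point so the plot can be faceted on it.

```r
# Toy data (made up): 6 series in rows, 5 time points in columns.
set.seed(1)
toy <- rbind(
  matrix(rnorm(15, mean = 0), nrow = 3),   # three series around 0
  matrix(rnorm(15, mean = 10), nrow = 3)   # three series around 10
)
rownames(toy) <- paste0("series", 1:6)
colnames(toy) <- paste0("t", 1:5)

# Cluster the rows and cut the tree into 2 groups.
hc_toy <- hclust(dist(toy), method = "average")
cl <- cutree(hc_toy, k = 2)   # named integer vector, one label per series

# Long format: one row per (series, time point).
# as.vector() walks the matrix column by column, so ids cycle fastest.
long <- data.frame(
  id      = rep(rownames(toy), times = ncol(toy)),
  time    = rep(colnames(toy), each = nrow(toy)),
  value   = as.vector(toy),
  cluster = rep(cl, times = ncol(toy))
)

table(cl)    # how many series ended up in each cluster
head(long)
```

A table like long is the shape that facet_wrap(~cluster) expects, which is what the gather call above produces from the wide data.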
You can use cutree to assign the data points to clusters and facet_wrap (from the ggplot2 package) to plot each cluster in its own panel. Since I couldn't get your data, here is an example using publicly available data.
library(ggplot2)
narrest <- USArrests
# Clustering
hc <- hclust(dist(narrest), "ave")
plot(hc)
# Cut the tree into the required number of clusters, here 3
narrest$clusters <- cutree(hc, k = 3)
# Row index for the x axis
narrest$index <- seq_len(nrow(narrest))
# Use facet_wrap from ggplot2 to plot one variable, Murder, per cluster
d <- ggplot(narrest, aes(x = index, y = Murder)) + geom_line()
d <- d + facet_wrap(~ clusters)
print(d)
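For actual time series (one series per row), the same idea also works without ggplot2. The following is only a rough sketch with base graphics, on synthetic data: plot_clusters is a hypothetical helper I made up for illustration, and series_mat and labels are invented names. It draws one panel per cluster with matplot, assuming the rows of the matrix are the series and the columns are the time points.

```r
# Hypothetical helper: one base-graphics panel per cluster.
# mat: numeric matrix, rows = series, columns = time points.
# clusters: integer vector of cluster labels, one per row of mat.
plot_clusters <- function(mat, clusters) {
  ks <- sort(unique(clusters))
  old <- par(mfrow = c(1, length(ks)))   # panels side by side
  on.exit(par(old))                      # restore the layout afterwards
  for (k in ks) {
    members <- mat[clusters == k, , drop = FALSE]
    # matplot draws one line per column, so transpose: time in rows
    matplot(t(members), type = "l", lty = 1,
            main = paste("Cluster", k),
            xlab = "time index", ylab = "value")
  }
  invisible(ks)
}

# Synthetic example (made-up data): 4 flat series and 4 rising series.
set.seed(42)
series_mat <- rbind(
  matrix(rnorm(40, sd = 0.5), nrow = 4),
  t(replicate(4, 1:10 * 2 + rnorm(10, sd = 0.5)))
)
labels <- cutree(hclust(dist(series_mat), method = "average"), k = 2)
shown <- plot_clusters(series_mat, labels)
```

With many clusters a single row of panels gets cramped, so you would probably want a grid layout instead, e.g. par(mfrow = c(4, 5)) for 20 clusters.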