Which clustering algorithm is suitable for one-dim

Posted 2019-05-27 04:14

I have a one-dimensional list like this:

public class Zeit_und_Eigenschaft
{
    [Feature]
    public double Sekunden { get; set; }
}

//...
List<Zeit_und_Eigenschaft> lzue = new List<Zeit_und_Eigenschaft>();
//fill lzue

lzue can contain, for example:

lzue.Sekunden
1
2
3
4
8
9
10
22
55
...

The goal is to find clusters in that list, i.e. elements that could form groups, for instance like this:

lzue.Sekunden
1
2
3
4

8
9
10

22

55

Which clustering algorithm is suitable, given that I don't know the number of clusters k? GMM? PCA? K-means? Something else?

2 answers
Bombasti
#2 · 2019-05-27 04:43

Don't look for clustering algorithms.

Clustering is a good term for multivariate data, but your data is one-dimensional, so you should look at the much older statistics literature, e.g. Jenks natural breaks optimization.
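As a rough illustration of the natural breaks idea, here is a minimal C# sketch. It sorts the values and uses dynamic programming to find the split into k contiguous groups with the smallest total within-group sum of squared deviations. The function name NaturalBreaks is made up for this example, and k has to be supplied by the caller (4 is assumed here), so this does not by itself answer the "unknown k" part of the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample values from the question; in the asker's code this would be
// something like lzue.Select(z => z.Sekunden).
double[] values = { 1, 2, 3, 4, 8, 9, 10, 22, 55 };

// Assumption: the number of groups (4) is chosen by hand.
List<List<double>> groups = NaturalBreaks(values, 4);
foreach (var g in groups)
    Console.WriteLine(string.Join(", ", g));

// Hypothetical helper: splits the sorted values into k contiguous groups so that
// the total within-group sum of squared deviations is minimal. O(k * n^2) time.
static List<List<double>> NaturalBreaks(IEnumerable<double> data, int k)
{
    double[] x = data.OrderBy(v => v).ToArray();
    int n = x.Length;
    if (k < 1 || k > n) throw new ArgumentOutOfRangeException(nameof(k));

    // Prefix sums make the cost of any segment an O(1) lookup.
    double[] sum = new double[n + 1];
    double[] sumSq = new double[n + 1];
    for (int i = 0; i < n; i++)
    {
        sum[i + 1] = sum[i] + x[i];
        sumSq[i + 1] = sumSq[i] + x[i] * x[i];
    }

    // Sum of squared deviations from the mean for x[lo] .. x[hi - 1].
    double Cost(int lo, int hi)
    {
        double s = sum[hi] - sum[lo];
        return (sumSq[hi] - sumSq[lo]) - s * s / (hi - lo);
    }

    // best[c, j]: minimal cost of splitting the first j values into c groups.
    // cut[c, j]: start index of the last group in that optimal split.
    var best = new double[k + 1, n + 1];
    var cut = new int[k + 1, n + 1];
    for (int c = 0; c <= k; c++)
        for (int j = 0; j <= n; j++)
            best[c, j] = double.PositiveInfinity;
    best[0, 0] = 0;

    for (int c = 1; c <= k; c++)
        for (int j = c; j <= n; j++)
            for (int i = c - 1; i < j; i++)
            {
                double candidate = best[c - 1, i] + Cost(i, j);
                if (candidate < best[c, j])
                {
                    best[c, j] = candidate;
                    cut[c, j] = i;
                }
            }

    // Walk the cut table backwards to recover the groups.
    var result = new List<List<double>>();
    for (int c = k, j = n; c >= 1; c--)
    {
        int i = cut[c, j];
        result.Insert(0, x.Skip(i).Take(j - i).ToList());
        j = i;
    }
    return result;
}
```

For the sample values and k = 4 this yields the groups 1, 2, 3, 4 / 8, 9, 10 / 22 / 55, which matches the grouping asked for in the question.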

Or just use kernel density estimation. In fact, you will find the very same question dozens of times here on Stack Overflow already:

1D Number Array Clustering

Cluster one-dimensional data optimally?

partitioning an float array into similar segments (clustering)

Efficiently grouping similar numbers together

Clustering values by their proximity in python (machine learning?)
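To make the kernel density estimation suggestion concrete, here is a minimal C# sketch. It evaluates a Gaussian kernel density on a regular grid and starts a new group wherever the density has a local minimum. The function name SplitAtDensityMinima and the bandwidth of 1.0 are assumptions made for this example; the bandwidth in particular has to be tuned to the scale of the data, and it takes over the role that k plays in k-means.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample values from the question; in the asker's code this would be
// something like lzue.Select(z => z.Sekunden).
double[] values = { 1, 2, 3, 4, 8, 9, 10, 22, 55 };
double bandwidth = 1.0; // assumption: chosen by hand for this sample

List<List<double>> clusters = SplitAtDensityMinima(values, bandwidth);
foreach (var c in clusters)
    Console.WriteLine(string.Join(", ", c));

// Hypothetical helper: groups the values by cutting at local minima of a
// Gaussian kernel density estimate with bandwidth h.
static List<List<double>> SplitAtDensityMinima(IEnumerable<double> data, double h)
{
    double[] x = data.OrderBy(v => v).ToArray();
    var result = new List<List<double>>();
    if (x.Length == 0) return result;

    // Evaluate the density on a regular grid. The normalising constant is
    // omitted because it does not move the minima.
    double lo = x[0] - 3 * h;
    double hi = x[x.Length - 1] + 3 * h;
    double step = h / 10;
    int m = (int)Math.Ceiling((hi - lo) / step) + 1;
    double[] density = new double[m];
    for (int i = 0; i < m; i++)
    {
        double t = lo + i * step;
        density[i] = x.Sum(v => Math.Exp(-0.5 * (t - v) * (t - v) / (h * h)));
    }

    // Grid positions where the density has a local minimum become cut points.
    var cuts = new List<double>();
    for (int i = 1; i < m - 1; i++)
        if (density[i] <= density[i - 1] && density[i] < density[i + 1])
            cuts.Add(lo + i * step);

    // Walk through the sorted values and start a new group after each cut.
    int next = 0;
    var current = new List<double>();
    foreach (double v in x)
    {
        bool crossed = false;
        while (next < cuts.Count && v > cuts[next])
        {
            next++;
            crossed = true;
        }
        if (crossed && current.Count > 0)
        {
            result.Add(current);
            current = new List<double>();
        }
        current.Add(v);
    }
    result.Add(current);
    return result;
}
```

With the sample values and a bandwidth of 1, the density dips between 4 and 8, between 10 and 22, and between 22 and 55, so the values fall into the four groups from the question. A larger bandwidth merges nearby groups, a smaller one splits them further.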

forever°为你锁心
#3 · 2019-05-27 04:58

There was a good article in MSDN magazine on this topic a few months ago. They used the k-means algorithm. Link:

http://msdn.microsoft.com/en-us/magazine/jj891054.aspx

Also, there are some videos on k-means clustering as part of Andrew Ng's online machine learning class. Link:

https://class.coursera.org/ml-003/lecture/preview

When you don't know k, there are heuristics for finding a good value. Do a web search for "k-means elbow method".
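A minimal C# sketch of the elbow idea for one-dimensional data follows. It runs plain k-means (Lloyd's algorithm) for k = 1 to 5 and prints the sum of squared errors (SSE) for each k, so you can look for the point where adding another cluster stops helping much. The function name KMeansSse, the deterministic starting centroids, and the upper limit of 5 are assumptions made for this example, not something taken from the linked article. Note that Lloyd's algorithm can stop in a local optimum, so the SSE for a given k is not guaranteed to be the best possible.

```csharp
using System;
using System.Linq;

// Sample values from the question; in the asker's code this would be
// something like lzue.Select(z => z.Sekunden).
double[] sorted = new double[] { 1, 2, 3, 4, 8, 9, 10, 22, 55 }
    .OrderBy(v => v).ToArray();

int maxK = 5; // assumption: largest k worth trying for this sample
double[] sse = new double[maxK];
for (int k = 1; k <= maxK; k++)
{
    sse[k - 1] = KMeansSse(sorted, k);
    Console.WriteLine($"k = {k}: SSE = {sse[k - 1]:F2}");
}

// Hypothetical helper: runs Lloyd's k-means on sorted 1-D data and returns
// the sum of squared distances of each value to its cluster centroid.
static double KMeansSse(double[] xs, int clusterCount, int maxIter = 100)
{
    int n = xs.Length;

    // Deterministic start: centroids taken at evenly spaced positions
    // in the sorted data (an assumption, to keep runs reproducible).
    double[] centroid = Enumerable.Range(0, clusterCount)
        .Select(i => xs[(2 * i + 1) * n / (2 * clusterCount)])
        .ToArray();
    int[] label = new int[n];

    for (int iter = 0; iter < maxIter; iter++)
    {
        bool changed = iter == 0; // always update centroids at least once

        // Assignment step: each value goes to its nearest centroid.
        for (int p = 0; p < n; p++)
        {
            int nearest = 0;
            for (int c = 1; c < clusterCount; c++)
                if (Math.Abs(xs[p] - centroid[c]) < Math.Abs(xs[p] - centroid[nearest]))
                    nearest = c;
            if (label[p] != nearest)
            {
                label[p] = nearest;
                changed = true;
            }
        }
        if (!changed) break;

        // Update step: each centroid moves to the mean of its members.
        // Empty clusters keep their previous centroid.
        for (int c = 0; c < clusterCount; c++)
        {
            double[] members = xs.Where((v, p) => label[p] == c).ToArray();
            if (members.Length > 0) centroid[c] = members.Average();
        }
    }

    return xs.Select((v, p) => (v - centroid[label[p]]) * (v - centroid[label[p]])).Sum();
}
```

For the sample values the SSE falls steeply up to k = 4 and only marginally after that, which is the "elbow" and suggests four clusters. Picking the elbow is a judgment call, so it is worth plotting or eyeballing the values rather than trusting a fixed threshold.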
