Question:
Closed 4 years ago.
I am looking for a lightweight clustering library in Java. I don't need hundreds of clustering algorithms in that library; 5 to 7 would be fine for me.
I am sure you are going to ask: "what kind of algorithm do you need, and for what purpose?" :) I just need to classify my data with the help of clustering, for example with K-means.
P.S.: I know about Weka, but I don't want to use it because it is not focused on clustering only.
Answer 1:
Take a look at org.apache.commons.math4.ml.clustering.KMeansPlusPlusClusterer in Apache's Commons Math library.
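If even one library dependency feels like too much, K-means itself is small enough to sketch by hand. The following is a minimal, hypothetical illustration of plain K-means (Lloyd's algorithm), not the Commons Math implementation: the class name `SimpleKMeans`, the method `cluster`, and the choice to seed the centroids with the first k points are all assumptions made for this example. A real library such as the one above typically uses smarter seeding (k-means++).

```java
import java.util.Arrays;

/** Minimal K-means (Lloyd's algorithm) sketch; every name here is illustrative. */
final class SimpleKMeans {

    /**
     * Returns, for each point, the index of the cluster it was assigned to.
     * Assumption: the first k points seed the centroids. This is deterministic
     * but naive; k-means++ seeding usually gives better results.
     */
    static int[] cluster(double[][] points, int k, int maxIterations) {
        if (k < 1 || k > points.length || maxIterations < 1) {
            throw new IllegalArgumentException("need 1 <= k <= points.length and maxIterations >= 1");
        }
        int dim = points[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[c].clone();
        }

        int[] labels = new int[points.length];
        Arrays.fill(labels, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: attach each point to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < points.length; i++) {
                int best = 0;
                double bestDist = Double.MAX_VALUE;
                for (int c = 0; c < k; c++) {
                    double d = squaredDistance(points[i], centroids[c]);
                    if (d < bestDist) {
                        bestDist = d;
                        best = c;
                    }
                }
                if (labels[i] != best) {
                    labels[i] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // assignments are stable, so the centroids would not move
            }

            // Update step: move each centroid to the mean of its assigned points.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int i = 0; i < points.length; i++) {
                counts[labels[i]]++;
                for (int d = 0; d < dim; d++) {
                    sums[labels[i]][d] += points[i][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // keep the old centroid if a cluster ends up empty
                }
                for (int d = 0; d < dim; d++) {
                    centroids[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return labels;
    }

    private static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Two well-separated groups, interleaved so the two seeds land in different groups.
        double[][] points = {
            {1.0, 1.0}, {8.0, 8.0}, {1.2, 0.8}, {8.2, 7.9}, {0.9, 1.1}, {7.9, 8.1}
        };
        System.out.println(Arrays.toString(cluster(points, 2, 10))); // prints [0, 1, 0, 1, 0, 1]
    }
}
```

The result depends on the seeding, so on less cleanly separated data this sketch can settle into a poor local optimum; that is one reason to prefer a maintained implementation for anything beyond experiments.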
Answer 2:
I would take a look at JUNG. It has a number of clustering algorithms implemented, although I'm not sure if K-means is one of them.
Another option might be KNIME, an Eclipse-based workflow editor. It includes a number of clustering primitives that you can use as part of a workflow, including K-means.
Answer 3:
There are some open-source clustering algorithms in Java here, released under the GPL. They require the Java Colt library (for matrices).
http://open.trickl.com/
Answer 4:
There is also ELKI, an open-source university project similar to WEKA, but with the focus on cluster analysis and outlier detection instead of machine learning algorithms.
It's pretty advanced, uses index structures for efficiency, and has at least a dozen clustering algorithms.
Answer 5:
If Scala also works for you, then you might want to check this version of KMeans in Scala:
https://github.com/wspringer/kmeans
A related blog post is here:
http://nxt.flotsam.nl/k-means-clustering.html
Answer 6:
If you want some basic clustering algorithms in Java, you can check my software:
http://www.philippe-fournier-viger.com/spmf/
It offers an implementation of KMeans and a hierarchical clustering algorithm.
The other algorithms it offers are for pattern mining. In total there are 47 algorithms, but only 2 of them are for clustering. There is also a simple GUI for launching the algorithms.
Answer 7:
Apache Mahout implements many clustering algorithms on top of Hadoop. It's a little heavy for what you want, but see: http://cwiki.apache.org/MAHOUT/syntheticcontroldata.html
You might also be able to dig out and adapt the user clustering code from Mahout's TreeClusteringRecommender class, which uses clustering for recommender engine purposes.
Answer 8:
Cytoscape software has several plugins that implement clustering algorithms for networks and numerical data (Nemo, MCODE, clusterMaker, and so on). All plugins are open-source.