What are some packages that implement semi-supervised clustering?

Posted 2019-02-07 04:51

Question:

I want to run some experiments on semi-supervised (constrained) clustering, in particular with background knowledge provided as instance-level pairwise constraints (Must-Link or Cannot-Link constraints). Are there any good open-source packages that implement semi-supervised clustering? I looked at PyBrain, mlpy, scikit and Orange, and couldn't find any constrained clustering algorithms. In particular, I'm interested in constrained K-Means or constrained density-based clustering algorithms (like C-DBSCAN). Packages in Matlab, Python, Java or C++ would be preferred, but other languages are fine too.
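To make the terminology concrete, here is a minimal sketch of how instance-level pairwise constraints might be represented and checked against a cluster assignment. The function name `violated_constraints` and the toy data are assumptions made for illustration; they do not come from any of the packages discussed here.

```python
def violated_constraints(labels, must_link, cannot_link):
    """Return the constraints that a cluster assignment breaks.

    labels[i] is the cluster id of instance i.
    must_link and cannot_link are iterables of (i, j) index pairs.
    """
    # A Must-Link pair is broken when the two instances end up apart.
    broken_ml = [(i, j) for i, j in must_link if labels[i] != labels[j]]
    # A Cannot-Link pair is broken when the two instances end up together.
    broken_cl = [(i, j) for i, j in cannot_link if labels[i] == labels[j]]
    return broken_ml, broken_cl


# Hypothetical toy assignment of five instances to three clusters.
labels = [0, 0, 1, 1, 2]
must_link = [(0, 1), (2, 4)]
cannot_link = [(0, 2), (2, 3)]

broken_ml, broken_cl = violated_constraints(labels, must_link, cannot_link)
print(broken_ml)  # [(2, 4)]
print(broken_cl)  # [(2, 3)]
```

A constrained clustering algorithm either refuses assignments that would produce such violations (hard constraints) or penalizes them in its objective (soft constraints).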

Answer 1:

The Python package scikit-learn now has algorithms for Ward hierarchical clustering (since 0.15) and agglomerative clustering (since 0.14) that support connectivity constraints.

I also have a real-world application for this: identifying tracks from cell positions, where each track can contain only one position from each time point.
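One way to read the tracking requirement is as a set of Cannot-Link constraints: two positions observed at the same time point may not share a track. The sketch below derives those pairs from per-position time labels. The function name and the input layout (one time index per position) are assumptions for illustration, not part of scikit-learn.

```python
from collections import defaultdict
from itertools import combinations


def cannot_links_from_time_points(time_points):
    """Build Cannot-Link pairs between positions sharing a time point.

    time_points[i] is the (hypothetical) frame index of position i.
    """
    # Group position indices by the time point they were observed at.
    by_time = defaultdict(list)
    for idx, t in enumerate(time_points):
        by_time[t].append(idx)

    # Every pair inside one group must end up in different tracks.
    pairs = []
    for t in sorted(by_time):
        pairs.extend(combinations(by_time[t], 2))
    return pairs


# Two positions at time 0 and three positions at time 1.
time_points = [0, 0, 1, 1, 1]
print(cannot_links_from_time_points(time_points))
# [(0, 1), (2, 3), (2, 4), (3, 4)]
```

The resulting pair list could then be passed to any algorithm that accepts Cannot-Link constraints.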



Answer 2:

Maybe it's a bit late, but have a look at the following.

  1. An extension of Weka (in Java) that implements PKM, MKM and PKMKM

    http://www.cs.ucdavis.edu/~davidson/constrained-clustering/

  2. Gaussian mixture model using EM and constraints in Matlab

    http://www.scharp.org/thertz/code.html

I hope that this helps.



Answer 3:

The R package conclust implements a number of algorithms:

There are 4 main functions in this package: ckmeans(), lcvqe(), mpckm() and ccls(). They take an unlabeled dataset and two lists of must-link and cannot-link constraints as input and produce a clustering as output.

There's also an implementation of COP-KMeans in Python.
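For a feel of what such an implementation does, here is a minimal pure-Python sketch of the COP-KMeans idea: a k-means loop in which each point goes to the nearest center that breaks no constraint, and the run fails if no such center exists. This is not the code of the linked implementation or of conclust; all names are hypothetical. As a simplification it only checks constraints against points already assigned in the current pass and does not compute the transitive closure of the Must-Link pairs, so it may report failure in cases a full implementation would solve.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def violates(i, cluster, labels, ml, cl):
    """True if putting point i in `cluster` breaks a constraint
    with a point already assigned in this pass."""
    for j in ml.get(i, ()):
        if labels[j] is not None and labels[j] != cluster:
            return True
    for j in cl.get(i, ()):
        if labels[j] == cluster:
            return True
    return False


def cop_kmeans(points, k, must_link=(), cannot_link=(), max_iter=100, seed=0):
    """Return a list of cluster ids, or None if no feasible assignment is found."""
    rng = random.Random(seed)
    # Turn the pair lists into symmetric adjacency maps.
    ml, cl = {}, {}
    for graph, pairs in ((ml, must_link), (cl, cannot_link)):
        for i, j in pairs:
            graph.setdefault(i, set()).add(j)
            graph.setdefault(j, set()).add(i)

    centers = rng.sample(points, k)  # random initial centers
    labels = None
    for _ in range(max_iter):
        new_labels = [None] * len(points)
        for i, p in enumerate(points):
            # Try centers from nearest to farthest; take the first feasible one.
            order = sorted(range(k), key=lambda c: sq_dist(p, centers[c]))
            for c in order:
                if not violates(i, c, new_labels, ml, cl):
                    new_labels[i] = c
                    break
            else:
                return None  # every cluster breaks some constraint
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each center to the mean of its members (empty clusters keep theirs).
        for c in range(k):
            members = [points[i] for i in range(len(points)) if labels[i] == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels


points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
labels = cop_kmeans(points, k=2, must_link=[(0, 1)], cannot_link=[(1, 2)])
# The first two points share one label and the last two share the other;
# which numeric id each group gets depends on the random initialization.
print(labels)
```

Returning None on an infeasible assignment mirrors the hard-constraint behaviour described for COP-KMeans; soft-constraint variants penalize violations instead of failing.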