Input matrix to opencv kmeans clustering

Posted 2019-01-17 18:46

This question is specific to OpenCV. The kmeans example in the OpenCV documentation uses a 2-channel matrix, with one channel for each dimension of the feature vector. But some of the other examples seem to say that it should be a single-channel matrix with the features along the columns and one row per sample. Which of these is right?

If I have a 5-dimensional feature vector, which input matrix should I use? This one:

cv::Mat inputSamples(numSamples, 1, CV_32FC(numFeatures))

or this one:

cv::Mat inputSamples(numSamples, numFeatures, CV_32F)

2 answers
SAY GOODBYE
#2 · 2019-01-17 19:05

The correct answer is cv::Mat inputSamples(numSamples, numFeatures, CV_32F). The OpenCV documentation for kmeans says:

samples – Floating-point matrix of input samples, one row per sample

So it is not a vector of n-channel floating-point elements, as in the other option. Which examples suggested that layout?
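
To make the "one row per sample" layout concrete, here is a minimal sketch in plain C++ (no OpenCV calls) that packs 5-dimensional feature vectors into a row-major float buffer. The helper name packSamples and the constant kNumFeatures are made up for illustration. A buffer laid out this way should match what a single-channel numSamples x numFeatures CV_32F matrix stores, assuming the matrix memory is continuous.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical feature dimension, matching the 5-dimensional case in the question.
constexpr std::size_t kNumFeatures = 5;

// Packs the feature vectors into one row-major buffer:
// sample i occupies elements [i * kNumFeatures, (i + 1) * kNumFeatures).
std::vector<float> packSamples(
    const std::vector<std::array<float, kNumFeatures>>& features)
{
    std::vector<float> buffer;
    buffer.reserve(features.size() * kNumFeatures);
    for (const auto& sample : features)   // one row per sample
        for (float value : sample)        // one column per feature
            buffer.push_back(value);
    return buffer;
}
```

Feature j of sample i then sits at buffer[i * kNumFeatures + j], which is the element that inputSamples.at<float>(i, j) would address in the single-channel matrix.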

Here is also a small example of mine that shows how kmeans can be used. It clusters the pixels of an image and displays the result:

#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"

using namespace cv;

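int main( int argc, char** argv )
{
  if( argc < 2 ) return 1;          // expects an image path as the first argument
  Mat src = imread( argv[1], 1 );   // 1 = load as a 3-channel color image
  if( src.empty() ) return 1;       // the image could not be read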
  Mat samples(src.rows * src.cols, 3, CV_32F);
  for( int y = 0; y < src.rows; y++ )
    for( int x = 0; x < src.cols; x++ )
      for( int z = 0; z < 3; z++)
        samples.at<float>(y + x*src.rows, z) = src.at<Vec3b>(y,x)[z];


  int clusterCount = 15;
  Mat labels;
  int attempts = 5;
  Mat centers;
  // TermCriteria::COUNT / TermCriteria::EPS replace the legacy CV_TERMCRIT_* macros
  kmeans(samples, clusterCount, labels, TermCriteria(TermCriteria::COUNT | TermCriteria::EPS, 10000, 0.0001), attempts, KMEANS_PP_CENTERS, centers );


  Mat new_image( src.size(), src.type() );
  for( int y = 0; y < src.rows; y++ )
    for( int x = 0; x < src.cols; x++ )
    { 
      int cluster_idx = labels.at<int>(y + x*src.rows,0);
      new_image.at<Vec3b>(y,x)[0] = centers.at<float>(cluster_idx, 0);
      new_image.at<Vec3b>(y,x)[1] = centers.at<float>(cluster_idx, 1);
      new_image.at<Vec3b>(y,x)[2] = centers.at<float>(cluster_idx, 2);
    }
  imshow( "clustered image", new_image );
  waitKey( 0 );
}
放荡不羁爱自由
#3 · 2019-01-17 19:08

As an alternative to reshaping the input matrix manually, you can use OpenCV's reshape function to achieve a similar result with less code. Here is my working implementation of color reduction with the K-Means method (in Java):

private final static int MAX_ITER = 10;
private final static int CLUSTERS = 16;

public static Mat colorMapKMeans(Mat img, int K, int maxIterations) {

    Mat m = img.reshape(1, img.rows() * img.cols());
    m.convertTo(m, CvType.CV_32F);

    // kmeans writes integer labels and K x (number of features) centroids
    Mat bestLabels = new Mat(m.rows(), 1, CvType.CV_32S);
    Mat centroids = new Mat(K, m.cols(), CvType.CV_32F);
    Core.kmeans(m, K, bestLabels, 
                new TermCriteria(TermCriteria.COUNT | TermCriteria.EPS, maxIterations, 1E-5),
                1, Core.KMEANS_RANDOM_CENTERS, centroids);
    List<Integer> idx = new ArrayList<>(m.rows());
    Converters.Mat_to_vector_int(bestLabels, idx);

    Mat imgMapped = new Mat(m.size(), m.type());
    for(int i = 0; i < idx.size(); i++) {
        Mat row = imgMapped.row(i);
        centroids.row(idx.get(i)).copyTo(row);
    }

    return imgMapped.reshape(3, img.rows());
}

public static void main(String[] args) {
    System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
    Highgui.imwrite("result.png", 
        colorMapKMeans(Highgui.imread(args[0], Highgui.CV_LOAD_IMAGE_COLOR),
            CLUSTERS, MAX_ITER));
}

OpenCV reads an image into a 2-dimensional, 3-channel matrix. The first call to reshape, img.reshape(1, img.rows() * img.cols()), essentially unrolls the 3 channels into columns. In the resulting matrix, one row corresponds to one pixel of the input image, and the 3 columns correspond to its color components (which imread stores in B, G, R order).
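
The index arithmetic behind this unrolling can be sketched in plain C++ without OpenCV, assuming a continuous, interleaved 3-channel buffer. The function names below are hypothetical. Note that this row-by-row ordering (y * cols + x) differs from the y + x * src.rows ordering used in the C++ answer above; either works as long as the same mapping is used when reading the labels back.

```cpp
#include <cstddef>

// Row index, in the unrolled (rows * cols) x 3 matrix, of the pixel at (y, x).
// Assumes row-major pixel order, as in a continuous image buffer.
std::size_t sampleRow(std::size_t y, std::size_t x, std::size_t cols)
{
    return y * cols + x;
}

// Offset of channel c of pixel (y, x) in the interleaved 3-channel image buffer.
std::size_t imageOffset(std::size_t y, std::size_t x, std::size_t c, std::size_t cols)
{
    return (y * cols + x) * 3 + c;
}

// Offset of element (row, col) in the unrolled single-channel matrix with 3 columns.
std::size_t sampleOffset(std::size_t row, std::size_t col)
{
    return row * 3 + col;
}
```

For every pixel and channel, imageOffset(y, x, c, cols) equals sampleOffset(sampleRow(y, x, cols), c). Both views address the same element, so the reshape only changes the matrix header and does not need to copy pixel data.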

After the K-Means algorithm has finished and the color mapping has been applied, we call reshape again, imgMapped.reshape(3, img.rows()), this time rolling the columns back into channels and reducing the row count to that of the original image. This restores the original matrix format, now with a reduced set of colors.
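
The color-mapping step itself (replacing each pixel with the centroid of its cluster) can be sketched in plain C++ as follows. The function name mapToCentroids and the flat row-major buffers are assumptions for illustration, not part of the OpenCV API.

```cpp
#include <cstddef>
#include <vector>

// Replaces every sample with the centroid of the cluster it was assigned to.
// labels[i] is the cluster index of sample i; centroids is a row-major
// K x dims buffer. Returns a row-major labels.size() x dims buffer.
std::vector<float> mapToCentroids(const std::vector<int>& labels,
                                  const std::vector<float>& centroids,
                                  std::size_t dims)
{
    std::vector<float> mapped(labels.size() * dims);
    for (std::size_t i = 0; i < labels.size(); ++i) {
        const std::size_t k = static_cast<std::size_t>(labels[i]);
        for (std::size_t d = 0; d < dims; ++d)
            mapped[i * dims + d] = centroids[k * dims + d];
    }
    return mapped;
}
```

With dims = 3 this mirrors the loop in the Java code that copies centroids.row(idx.get(i)) into row i of imgMapped.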
