openCV c++: Problems working with CvBoost (Adaboos

Posted 2019-02-20 00:18

Question:

I'm creating an application for classifying humans in images of urban settings.

I train a classifier in the following manner:

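#include <cstdio>
#include <iostream>
#include <vector>
#include <opencv2/core/core.hpp>
#include <opencv2/ml/ml.hpp>

using namespace std;

int main (int argc, char **argv)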
{

/* STEP 2. Opening the file */
//1. Declare a structure to keep the data
  CvMLData cvml;
//2. Read the file
  cvml.read_csv ("directory/train_rand.csv");
//3. Indicate which column is the response
  cvml.set_response_idx (0);

/* STEP 3. Splitting the samples */
//1. Select 4000 for the training
  CvTrainTestSplit cvtts (4000, true);
//2. Assign the division to the data
  cvml.set_train_test_split (&cvtts);

  printf ("Training ... ");
/* STEP 4. The training */
//1. Declare the classifier
  CvBoost boost;
//2. Train it with 100 weak classifiers (the second CvBoostParams argument is weak_count, not a feature count)
  boost.train (&cvml, CvBoostParams (CvBoost::REAL,100, 0, 1, false, 0),
           false);

/* STEP 5. Calculating the testing and training error */
// 1. Declare a couple of vectors to save the predictions of each sample
  std::vector<float> train_responses, test_responses;
// 2. Calculate the training error
  float fl1 = boost.calc_error (&cvml, CV_TRAIN_ERROR, &train_responses);
// 3. Calculate the test error
  float fl2 = boost.calc_error (&cvml, CV_TEST_ERROR, &test_responses);

  cout<<"Error train: "<<fl1<<endl;

  cout<<"Error test: "<<fl2<<endl;

/* STEP 6. Save your classifier */
// Save the trained classifier
  boost.save ("./trained_boost_4000samples-100ftrs.xml", "boost");

  return 0;
}

train_rand.csv is a file whose first column is the category; the remaining columns are the features of the problem. For example, I could have used three features, each representing the average red, blue, or green value per pixel in the image, so my CSV file would look like this. Note that the first column holds a character, so OpenCV treats it as a category.

B,124.34,45.4,12.4
B,64.14,45.23,3.23
B,42.32,125.41,23.8
R,224.4,35.34,163.87
R,14.55,12.423,89.67
...

For my actual problem, I'm using 100 features and 8000 samples. I train the classifier with half of the data and test with the rest.

After training, I get a test error of around 5% (which is pretty good for only 100 features).
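For reference, the figure quoted here is a misclassification percentage. As far as I understand it, that is what `calc_error` reports for classification, but the sketch below is only an illustration of the quantity, not OpenCV's code. `errorPercent` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Percentage of samples whose predicted label differs from the true label.
// Labels are compared exactly, which is fine for class ids stored as floats.
float errorPercent(const std::vector<float>& predicted,
                   const std::vector<float>& actual) {
    if (predicted.empty() || predicted.size() != actual.size()) {
        throw std::invalid_argument("label vectors must be non-empty and equal in size");
    }
    std::size_t wrong = 0;
    for (std::size_t i = 0; i < predicted.size(); ++i) {
        if (predicted[i] != actual[i]) {
            ++wrong;
        }
    }
    return 100.0f * static_cast<float>(wrong) / static_cast<float>(predicted.size());
}
```

With 4000 held-out samples, a 5% error corresponds to 200 misclassified samples.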

Now I want to use the classifier on new data:

CvBoost boost;

boost.load("directory/trained_boost_4000samples-100ftrs.xml");

float x = boost.predict(SampleData,Mat(),Range::all(),false,false);
cout<<x;

I'm running this code over thousands of samples and it always outputs the same value, 2. I really don't understand what I'm doing wrong here. Even if I had trained the classifier incorrectly, it shouldn't put 100% of the samples in the same class, and the test error I calculated earlier suggests the classifier should work fine.

One thing that bothers me is that SampleData has to have the same number of columns as the data I used to train. The training data has 100 feature columns + 1 response column, and if I try to run the classifier with only 100 features it throws an exception saying that the sizes don't match. If I run it with 101 values (the extra one being completely arbitrary) it works, but the results don't make any sense.
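One possible explanation, which is an assumption and not something confirmed in this thread, is that a model trained through CvMLData looks features up by their original CSV column index. If that is the case, the prediction row needs a placeholder at the response column (index 0 here) so that feature k sits at position k + 1, rather than an arbitrary extra value tacked on somewhere. A minimal sketch of building such a row, with `toTrainingLayout` as a hypothetical helper name:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Insert a dummy value at the response column so the feature positions
// match the column layout of the training CSV.
// ASSUMPTION: the classifier indexes samples by original column position.
std::vector<float> toTrainingLayout(const std::vector<float>& features,
                                    std::size_t responseIdx,
                                    float placeholder = 0.0f) {
    if (responseIdx > features.size()) {
        throw std::out_of_range("responseIdx is outside the padded row");
    }
    std::vector<float> row(features);
    row.insert(row.begin() + static_cast<std::ptrdiff_t>(responseIdx), placeholder);
    return row;
}
```

The resulting 101-element vector would then be copied into a 1 x 101 `CV_32FC1` matrix before being passed to `predict`. If the predictions still come out constant after that, the layout was not the cause.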

Can anyone help me with this? Thanks in advance!

Regards

Answer 1:

I managed to get AdaBoost working by adapting the code from the SVM documentation. The only trick was ensuring there was enough sample data (>= 11).

From the blog your code was copied from:

NOTE: For a very strange reason the OpenCV implementation does not work with less than 11 samples.

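#include <iostream>
#include <opencv2/core/core.hpp>
#include <opencv2/ml/ml.hpp>

using namespace cv;

// Training data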
float labels[11] = { 1.0, 1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0};
Mat labelsMat(11, 1, CV_32FC1, labels);

float trainingData[11][2] = {
    {501, 10}, {508, 15},
    {255, 10}, {501, 255}, {10, 501}, {10, 501}, {11, 501}, {9, 501}, {10, 502}, {10, 511}, {10, 495} };
Mat trainingDataMat(11, 2, CV_32FC1, trainingData);

// Set up SVM's parameters
CvSVMParams params;
params.svm_type    = CvSVM::C_SVC;
params.kernel_type = CvSVM::LINEAR;
params.term_crit   = cvTermCriteria(CV_TERMCRIT_ITER, 100, 1e-6);

// Train a SVM classifier
CvSVM SVM;
SVM.train(trainingDataMat, labelsMat, Mat(), Mat(), params);

// Train a boost classifier
CvBoost boost;
boost.train(trainingDataMat,
            CV_ROW_SAMPLE,
            labelsMat);

// Test the classifiers
Mat testSample1 = (Mat_<float>(1,2) << 251, 5);
Mat testSample2 = (Mat_<float>(1,2) << 502, 11);

float svmResponse1 = SVM.predict(testSample1);
float svmResponse2 = SVM.predict(testSample2);

float boostResponse1 = boost.predict(testSample1);
float boostResponse2 = boost.predict(testSample2);

std::cout << "SVM:   " << svmResponse1 << " " << svmResponse2 << std::endl;
std::cout << "BOOST: " << boostResponse1 << " " << boostResponse2 << std::endl;

// Output:
//  > SVM:   -1 1
//  > BOOST: -1 1
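To give a feel for what boosting does on a toy set like this, here is a rough from-scratch sketch of discrete AdaBoost over decision stumps. It is not OpenCV's implementation (CvBoost defaults to a different boosting variant and uses its own tree code), and `Stump`, `bestStump`, `trainAdaBoost`, and `predictAdaBoost` are names invented for this example. Labels are assumed to be +1 / -1.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <initializer_list>
#include <limits>
#include <vector>

// Decision stump: votes `polarity` when x[feature] > threshold, else -polarity.
struct Stump {
    int feature;
    float threshold;
    int polarity;   // +1 or -1
    double alpha;   // vote weight assigned by AdaBoost

    int predict(const std::vector<float>& x) const {
        return x[feature] > threshold ? polarity : -polarity;
    }
};

// Exhaustive search for the stump with the lowest weighted error.
Stump bestStump(const std::vector<std::vector<float>>& X,
                const std::vector<int>& y,
                const std::vector<double>& w, double& bestErr) {
    Stump best{0, 0.0f, 1, 0.0};
    bestErr = std::numeric_limits<double>::max();
    for (int f = 0; f < static_cast<int>(X[0].size()); ++f) {
        // Candidate thresholds: midpoints between consecutive distinct values.
        std::vector<float> vals;
        for (const auto& row : X) vals.push_back(row[f]);
        std::sort(vals.begin(), vals.end());
        vals.erase(std::unique(vals.begin(), vals.end()), vals.end());
        for (std::size_t k = 0; k + 1 < vals.size(); ++k) {
            for (int pol : {1, -1}) {
                Stump s{f, 0.5f * (vals[k] + vals[k + 1]), pol, 0.0};
                double err = 0.0;
                for (std::size_t i = 0; i < X.size(); ++i)
                    if (s.predict(X[i]) != y[i]) err += w[i];
                if (err < bestErr) { bestErr = err; best = s; }
            }
        }
    }
    return best;
}

// Discrete AdaBoost: reweight samples so later stumps focus on earlier mistakes.
std::vector<Stump> trainAdaBoost(const std::vector<std::vector<float>>& X,
                                 const std::vector<int>& y, int rounds) {
    std::vector<double> w(X.size(), 1.0 / static_cast<double>(X.size()));
    std::vector<Stump> model;
    for (int r = 0; r < rounds; ++r) {
        double err = 0.0;
        Stump s = bestStump(X, y, w, err);
        err = std::min(std::max(err, 1e-10), 1.0 - 1e-10);  // avoid log(0)
        s.alpha = 0.5 * std::log((1.0 - err) / err);
        double sum = 0.0;
        for (std::size_t i = 0; i < X.size(); ++i) {
            w[i] *= std::exp(-s.alpha * y[i] * s.predict(X[i]));
            sum += w[i];
        }
        for (double& wi : w) wi /= sum;  // renormalise to a distribution
        model.push_back(s);
    }
    return model;
}

// Sign of the alpha-weighted vote over all stumps.
int predictAdaBoost(const std::vector<Stump>& model, const std::vector<float>& x) {
    double score = 0.0;
    for (const Stump& s : model) score += s.alpha * s.predict(x);
    return score >= 0.0 ? 1 : -1;
}
```

No single stump separates the 11 training points above, since (501, 255) and (255, 10) each defeat one of the obvious axis-aligned splits. In a hand trace with three rounds on that data, the weighted vote classifies every training point correctly and labels (251, 5) as -1 and (502, 11) as 1, the same answers the CvBoost run printed.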