Estimating an Affine Transform between Two Images

Posted 2020-05-29 18:09

Question:

I have a sample image:

I apply the affine transform with the following warp matrix:

[[ 1.25  0.    -128  ]
 [ 0.    2.    -192  ]]

and crop a 128x128 part from the result to get an output image:

Now, I want to estimate the warp matrix and crop size/location from just comparing the sample and output image. I detect feature points using SURF, and match them by brute force:

There are many matches, of which I'm keeping the best three (by distance), since that is the number required to estimate the affine transform. I then use those 3 keypoints to estimate the affine transform using getAffineTransform. However, the transform it returns is completely wrong:

-0.00 1.87 -6959230028596648489132997794229911552.00 
0.00 -1.76 -0.00

What am I doing wrong? Source code is below.

Perform affine transform (Python):

"""Apply an affine transform to an image."""
import cv
import sys
import numpy as np
if len(sys.argv) != 10:
    print "usage: %s in.png out.png x1 y1 width height sx sy flip" % __file__
    sys.exit(-1)
source = cv.LoadImage(sys.argv[1])
x1, y1, width, height, sx, sy, flip = map(float, sys.argv[3:])
X, Y = cv.GetSize(source)
Xn, Yn = int(sx*(X-1)), int(sy*(Y-1))
if flip:
    arr = np.array([[-sx, 0, sx*(X-1)-x1], [0, sy, -y1]])
else:
    arr = np.array([[sx, 0, -x1], [0, sy, -y1]])
print arr
warp = cv.fromarray(arr)
cv.ShowImage("source", source)
dest = cv.CreateImage((Xn, Yn), source.depth, source.nChannels)
cv.WarpAffine(source, dest, warp)
cv.SetImageROI(dest, (0, 0, int(width), int(height)))
cv.ShowImage("dest", dest)
cv.SaveImage(sys.argv[2], dest)
cv.WaitKey(0)

Estimate affine transform from two images (C++):

#include <stdio.h>
#include <iostream>
#include <opencv2/core/core.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/nonfree/nonfree.hpp>
#include <opencv2/imgproc/imgproc.hpp>

#include <algorithm>

using namespace cv;

void readme();

bool cmpfun(DMatch a, DMatch b) { return a.distance < b.distance; }

/** @function main */
int main( int argc, char** argv )
{
    if( argc != 3 )
    {
        readme();  // print usage before bailing out
        return -1;
    }

    Mat img_1 = imread( argv[1], CV_LOAD_IMAGE_GRAYSCALE );
    Mat img_2 = imread( argv[2], CV_LOAD_IMAGE_GRAYSCALE );

    if( !img_1.data || !img_2.data )
    {
        return -1;
    }

    //-- Step 1: Detect the keypoints using SURF Detector
    int minHessian = 400;

    SurfFeatureDetector detector( minHessian );

    std::vector<KeyPoint> keypoints_1, keypoints_2;

    detector.detect( img_1, keypoints_1 );
    detector.detect( img_2, keypoints_2 );

    //-- Step 2: Calculate descriptors (feature vectors)
    SurfDescriptorExtractor extractor;

    Mat descriptors_1, descriptors_2;

    extractor.compute( img_1, keypoints_1, descriptors_1 );
    extractor.compute( img_2, keypoints_2, descriptors_2 );

    //-- Step 3: Matching descriptor vectors with a brute force matcher
    BFMatcher matcher(NORM_L2, false);
    std::vector< DMatch > matches;
    matcher.match( descriptors_1, descriptors_2, matches );

    double max_dist = 0;
    double min_dist = 100;

    //-- Quick calculation of max and min distances between keypoints
    for( int i = 0; i < descriptors_1.rows; i++ )
    {   double dist = matches[i].distance;
        if( dist < min_dist ) min_dist = dist;
        if( dist > max_dist ) max_dist = dist;
    }
    printf("-- Max dist : %f \n", max_dist );
    printf("-- Min dist : %f \n", min_dist );

    //-- Keep only the three best matches (smallest descriptor distance),
    //-- since getAffineTransform needs exactly three point pairs.
    sort(matches.begin(), matches.end(), cmpfun);
    std::vector< DMatch > good_matches;
    vector<Point2f> match1, match2;
    for (int i = 0; i < 3; ++i)
    {
        good_matches.push_back( matches[i]);
        Point2f pt1 = keypoints_1[matches[i].queryIdx].pt;
        Point2f pt2 = keypoints_2[matches[i].trainIdx].pt;
        match1.push_back(pt1);
        match2.push_back(pt2);
        printf("%3d pt1: (%.2f, %.2f) pt2: (%.2f, %.2f)\n", i, pt1.x, pt1.y, pt2.x, pt2.y);
    }

    //-- Draw matches
    Mat img_matches;
    drawMatches( img_1, keypoints_1, img_2, keypoints_2, good_matches, img_matches,
                 Scalar::all(-1), Scalar::all(-1), vector<char>(), DrawMatchesFlags::NOT_DRAW_SINGLE_POINTS);

    //-- Show detected matches
    imshow("Matches", img_matches );
    imwrite("matches.png", img_matches);

    waitKey(0);

    Mat fun = getAffineTransform(match1, match2);
    for (int i = 0; i < fun.rows; ++i)
    {
        for (int j = 0; j < fun.cols; j++)
        {
            printf("%.2f ", fun.at<float>(i,j));
        }
        printf("\n");
    }

    return 0;
}

/** @function readme */
void readme()
{
    std::cout << " Usage: ./SURF_descriptor <img1> <img2>" << std::endl;
}
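
Three correspondences determine an affine transform exactly, so one way to sanity-check the pipeline is to solve the small linear system by hand and compare the result with what OpenCV prints. The following is a minimal sketch in plain Python 3 (no OpenCV involved). The names solve3 and affine_from_three are illustrative, and the sample points are made up: they are generated from the warp matrix above rather than taken from real SURF matches.

```python
def solve3(a, b):
    """Solve the 3x3 linear system a * x = b using Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det(a)
    if abs(d) < 1e-12:
        raise ValueError("source points are collinear; transform is not determined")

    result = []
    for col in range(3):
        # Replace one column of a with b and take the determinant ratio.
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        result.append(det(m) / d)
    return result


def affine_from_three(src, dst):
    """Return the 2x3 matrix M such that M * [x, y, 1] = [x', y'] for each pair."""
    a = [[x, y, 1.0] for x, y in src]
    row_x = solve3(a, [p[0] for p in dst])  # coefficients producing x'
    row_y = solve3(a, [p[1] for p in dst])  # coefficients producing y'
    return [row_x, row_y]


# Hypothetical correspondences, generated from the warp matrix in the question.
src = [(120.0, 110.0), (200.0, 110.0), (120.0, 150.0)]
dst = [(1.25 * x - 128.0, 2.0 * y - 192.0) for x, y in src]

M = affine_from_three(src, dst)
print(M)
```

With exact correspondences like these, the recovered matrix matches the original warp. With real matches, a single wrong pairing among the three is enough to throw the estimate off completely, so it is worth printing the three point pairs and checking them by eye.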

Answer 1:

getAffineTransform returns a cv::Mat of doubles (CV_64F), not floats, so reading it with at<float> reinterprets the raw bytes and prints garbage. The matrix itself is probably fine; you just have to change the printf call in your loops to

printf("%.2f ", fun.at<double>(i,j));

or, even easier, replace the manual output with

std::cout << fun << std::endl;

It's shorter, and you don't have to keep track of the element type yourself.
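
To see why the printed values looked so wrong, here is a minimal sketch of what happens when a row of doubles is read as floats. It uses only Python 3's standard struct module, not OpenCV. It assumes little-endian IEEE 754 storage, and it assumes that at<float>(i, j) simply indexes the row buffer in 4-byte steps, which is my understanding of how cv::Mat::at behaves in release builds. The helper name read_row_as_floats is made up for illustration.

```python
import struct


def read_row_as_floats(row):
    """Pack a row of doubles and reinterpret the same bytes as 32-bit floats."""
    raw = struct.pack("<%dd" % len(row), *row)            # 8 bytes per value
    return struct.unpack("<%df" % (len(raw) // 4), raw)   # 4 bytes per slot


row = [1.25, 0.0, -128.0]  # first row of the warp matrix in the question
as_floats = read_row_as_floats(row)

# Indexing j = 0..2 as float touches only the first three 4-byte slots,
# i.e. the two halves of 1.25 and the low half of 0.0.
print("first three float slots:", as_floats[:3])
print("actual doubles:         ", row)
```

Here the first three slots come out as 0.0, 1.90625 and 0.0: the value 1.90625 is just the upper half of the bit pattern of 1.25, and the translation term -128 is never reached at all. With less round numbers, the low halves of the doubles hold arbitrary mantissa bits, which is where absurdly large or tiny values come from.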