I am trying to estimate the pose of a camera by scanning two images taken from it, detecting features in the images, matching them, creating the fundamental matrix, using the camera intrinsics to calculate the essential matrix and then decompose it to find the Rotation and Translation.
Here is the matlab code:
I1 = rgb2gray(imread('1.png'));
I2 = rgb2gray(imread('2.png'));
points1 = detectSURFFeatures(I1);
points2 = detectSURFFeatures(I2);
points1 = points1.selectStrongest(40);
points2 = points2.selectStrongest(40);
[features1, valid_points1] = extractFeatures(I1, points1);
[features2, valid_points2] = extractFeatures(I2, points2);
indexPairs = matchFeatures(features1, features2);
matchedPoints1 = valid_points1(indexPairs(:, 1), :);
matchedPoints2 = valid_points2(indexPairs(:, 2), :);
F = estimateFundamentalMatrix(matchedPoints1,matchedPoints2);
K = [2755.30930612600,0,0;0,2757.82356074384,0;1652.43432833339,1234.09417974414,1];
%figure; showMatchedFeatures(I1, I2, matchedPoints1, matchedPoints2);
E = transpose(K)*F*K;
W = [0,-1,0;1,0,0;0,0,1];
Z = [0,1,0;-1,0,0;0,0,0];
[U,S,V] = svd(E);
R = U*inv(W)*transpose(V);
T = U(:,3);
thetaX = radtodeg(atan2(R(3,2),R(3,3)));
thetaY = radtodeg(atan2(-R(3,1),sqrt(R(3,2)^2 +R(3,3)^2)));
thetaZ = radtodeg(atan2(R(2,1),R(1,1)));
The problem I am facing is that R and T are always incorrect. ThetaZ is most of the times equal to ~90, If I repeat the calculation a lot of times I sometimes get the expected angles. (Only in some cases though)
I dont seem to understand why. It might be because the Fundamental Matrix I calculated is wrong. Or is there a different spot where I am going wrong?
Also what scale/units is T in? (Translation Vector) Or is it inferred differently.
P.S. New to computer vision...
Please note that from decomposing E, 4 solutions are possible (2 possible rotations X 2 possible translations). Specifically regarding R, it can also be: R = UWtranspose(V); Similarly, T can also be: T = -U(:,3);
To check if this is your bug, please post here all the 4 possible solutions for a given case where you get ThetaZ~90.
Another thing I would check (since you have K), is estimating the essential matrix directly (without going through fundamental matrix): http://www.mathworks.com/matlabcentral/fileexchange/47032-camera-geometry-algorithms/content//CV/CameraGeometry/EssentialMatrixFrom2DPoints.m
U and V need to be enforcedto be SO(3). http://mathworld.wolfram.com/SpecialOrthogonalMatrix.html In other words, if U and/or V has a negative determinat, the last column in U and/or V need to be negated. (det(U) < 0) => U(:,3) = -U(:,3)
Best Regards.
Try transposing K. The K that you get from
estimateCameraParameters
assumes row-vectors post-multiplied by a matrix, while the K in most textbooks assumes column-vectors pre-multipied by a matrix.Edit: In the R2015b release of the Computer Vision System Toolbox there is a cameraPose function, which computes relative orientation and location from the fundamental matrix.