How to determine world coordinates of a camera?

Posted 2019-03-09 10:02

Question:

I have a rectangular target of known dimensions and location on a wall, and a mobile camera on a robot. As the robot is driving around the room, I need to locate the target and compute the location of the camera and its pose. As a further twist, the camera's elevation and azimuth can be changed using servos. I am able to locate the target using OpenCV, but I am still fuzzy on calculating the camera's position (actually, I've gotten a flat spot on my forehead from banging my head against a wall for the last week). Here is what I am doing:

  1. Read in previously computed camera intrinsics file
  2. Get the pixel coordinates of the 4 points of the target rectangle from the contour
  3. Call solvePnP with the world coordinates of the rectangle, the pixel coordinates, the camera matrix and the distortion matrix
  4. Call projectPoints with the rotation and translation vectors
  5. ???

I have read the OpenCV book, but I guess I'm just missing something on how to use the projected points and the rotation and translation vectors to compute the world coordinates of the camera and its pose (I'm not a math whiz) :-(

2013-04-02 Following the advice from "morynicz", I have written this simple standalone program.

    #include <Windows.h>
    #include "opencv\cv.h"

    using namespace cv;

    int main (int argc, char** argv)
    {
        const char          *calibration_filename = argc >= 2 ? argv [1] : "M1011_camera.xml";
        FileStorage         camera_data (calibration_filename, FileStorage::READ);
        Mat                 camera_intrinsics, distortion;
        vector<Point3d>     world_coords;
        vector<Point2d>     pixel_coords;
        Mat                 rotation_vector, translation_vector, rotation_matrix, inverted_rotation_matrix, cw_translate;
        Mat                 cw_transform = cv::Mat::eye (4, 4, CV_64FC1);

        // Read camera data
        camera_data ["camera_matrix"] >> camera_intrinsics;
        camera_data ["distortion_coefficients"] >> distortion;
        camera_data.release ();

        // Target rectangle coordinates in feet
        world_coords.push_back (Point3d (10.91666666666667, 10.01041666666667, 0));
        world_coords.push_back (Point3d (10.91666666666667, 8.34375, 0));
        world_coords.push_back (Point3d (16.08333333333334, 8.34375, 0));
        world_coords.push_back (Point3d (16.08333333333334, 10.01041666666667, 0));

        // Coordinates of rectangle in camera
        pixel_coords.push_back (Point2d (284, 204));
        pixel_coords.push_back (Point2d (286, 249));
        pixel_coords.push_back (Point2d (421, 259));
        pixel_coords.push_back (Point2d (416, 216));

        // Get vectors for world->camera transform
        solvePnP (world_coords, pixel_coords, camera_intrinsics, distortion, rotation_vector, translation_vector, false, 0);
        dump_matrix (rotation_vector, String ("Rotation vector"));
        dump_matrix (translation_vector, String ("Translation vector"));

        // We need inverse of the world->camera transform (camera->world) to calculate
        // the camera's location
        Rodrigues (rotation_vector, rotation_matrix);
        Rodrigues (rotation_matrix.t (), camera_rotation_vector);
        Mat t = translation_vector.t ();
        camera_translation_vector = -camera_rotation_vector * t;

        printf ("Camera position %f, %f, %f\n", camera_translation_vector.at<double>(0), camera_translation_vector.at<double>(1), camera_translation_vector.at<double>(2));
        printf ("Camera pose %f, %f, %f\n", camera_rotation_vector.at<double>(0), camera_rotation_vector.at<double>(1), camera_rotation_vector.at<double>(2));
    }

The pixel coordinates I used in my test are from a real image that was taken about 27 feet left of the target rectangle (which is 62 inches wide and 20 inches high), at about a 45 degree angle. The output is not what I'm expecting. What am I doing wrong?

Rotation vector
2.7005
0.0328
0.4590

Translation vector
-10.4774
8.1194
13.9423

Camera position -28.293855, 21.926176, 37.650714
Camera pose -2.700470, -0.032770, -0.459009

Will it be a problem if my world coordinates have the Y axis inverted from that of OpenCV's screen Y axis? (The origin of my coordinate system is on the floor to the left of the target, while OpenCV's origin is the top left of the screen.)

What units is the pose in?

Answer 1:

solvePnP gives you translation and rotation vectors that tell you where the object is in the camera's coordinates. You need the inverse transform.

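The transform returned by solvePnP takes points from object coordinates to camera coordinates, and can be written as the matrix [R T; 0 1] in homogeneous coordinates. Because R is a rotation matrix, the inverse of this matrix is simply [R^t -R^t*T; 0 1], where R^t is R transposed. You can get the R matrix from the rotation vector with the Rodrigues transform. This way you get the rotation matrix and translation vector of the opposite transformation, camera coordinates to object coordinates, which is the camera's pose as seen from the object.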
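Written out, the inversion looks like this. It is only a sketch of the algebra, with X_c and X_o as my own names for the same point in camera and object coordinates:

```latex
\begin{aligned}
X_c &= R\,X_o + T \\
X_o &= R^{t}\,(X_c - T) = R^{t}X_c - R^{t}T
  \qquad\text{since } R^{t}R = I \\[4pt]
\begin{bmatrix} R & T \\ 0 & 1 \end{bmatrix}^{-1}
  &= \begin{bmatrix} R^{t} & -R^{t}T \\ 0 & 1 \end{bmatrix} \\[4pt]
X_c = 0 \;&\Rightarrow\; X_o = -R^{t}T
\end{aligned}
```

The last line is the useful one: the camera's optical centre is the origin of camera coordinates, so its position in object coordinates is -R^t*T.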

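If you know where the object lies in world coordinates, you can combine the object's pose in the world with the camera's pose relative to the object (multiply the two transform matrices) to extract the camera's translation and pose in the world. If the points you pass to solvePnP are already in world coordinates, as in your code, the object frame is the world frame and this extra step is not needed.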
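If the target did have its own frame, chaining the transforms could look roughly like the sketch below. It is plain C++ without OpenCV so the arithmetic is visible; `Rigid`, `inverse` and `compose` are names I made up for illustration, not OpenCV types.

```cpp
#include <cassert>
#include <cmath>

// Rigid transform taking points from frame b to frame a: X_a = R * X_b + t.
// (Hypothetical helper type, not part of OpenCV.)
struct Rigid {
    double R[3][3];
    double t[3];
};

// Inverse of a rigid transform: rotation R^t, translation -R^t * t.
Rigid inverse(const Rigid& a_from_b) {
    Rigid b_from_a{};  // zero-initialised
    for (int i = 0; i < 3; ++i) {
        // A rotation matrix has unit-length rows; catch obviously wrong input.
        const double* row = a_from_b.R[i];
        assert(std::fabs(row[0] * row[0] + row[1] * row[1] + row[2] * row[2] - 1.0) < 1e-6);
        for (int j = 0; j < 3; ++j) {
            b_from_a.R[i][j] = a_from_b.R[j][i];
            b_from_a.t[i] -= a_from_b.R[j][i] * a_from_b.t[j];
        }
    }
    return b_from_a;
}

// Chain (a <- b) with (b <- c) to get (a <- c).
Rigid compose(const Rigid& a_from_b, const Rigid& b_from_c) {
    Rigid a_from_c{};  // zero-initialised
    for (int i = 0; i < 3; ++i) {
        a_from_c.t[i] = a_from_b.t[i];
        for (int k = 0; k < 3; ++k) {
            a_from_c.t[i] += a_from_b.R[i][k] * b_from_c.t[k];
            for (int j = 0; j < 3; ++j) {
                a_from_c.R[i][j] += a_from_b.R[i][k] * b_from_c.R[k][j];
            }
        }
    }
    return a_from_c;
}
```

With `cam_from_obj` built from the solvePnP result and `world_from_obj` describing where the target sits in the room, `compose(world_from_obj, inverse(cam_from_obj))` gives the camera's pose in the world; its `t` is the camera position.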

The pose is described either by a single vector or by the R matrix; you will surely find it in your book. If it's "Learning OpenCV", you will find it on pages 401-402 :)

Looking at your code, you need to do something like this:

    cv::Mat R;
    cv::Rodrigues(rotation_vector, R);

    cv::Mat cameraRotationVector;

    cv::Rodrigues(R.t(),cameraRotationVector);

    cv::Mat cameraTranslationVector = -R.t()*translation_vector;

cameraTranslationVector contains the camera's coordinates, in the same coordinate system and units as the object points you passed to solvePnP. cameraRotationVector contains the camera's pose.
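If you want to check the arithmetic without OpenCV, here is a minimal dependency-free sketch of the same computation. `Vec3`, `Mat3`, `rodrigues` and `cameraPosition` are hypothetical names of mine; `rodrigues` applies the standard Rodrigues rotation formula, which is what cv::Rodrigues does when converting a vector to a matrix.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };
struct Mat3 { double m[3][3]; };

// Rodrigues' formula: rotation vector (unit axis scaled by the angle in
// radians) -> 3x3 rotation matrix.
Mat3 rodrigues(const Vec3& r) {
    const double theta = std::sqrt(r.x * r.x + r.y * r.y + r.z * r.z);
    assert(std::isfinite(theta));
    if (theta < 1e-12) {
        return Mat3{{{1, 0, 0}, {0, 1, 0}, {0, 0, 1}}};  // no rotation
    }
    const double x = r.x / theta, y = r.y / theta, z = r.z / theta;
    const double c = std::cos(theta), s = std::sin(theta), v = 1.0 - c;
    return Mat3{{{c + x * x * v,     x * y * v - z * s, x * z * v + y * s},
                 {y * x * v + z * s, c + y * y * v,     y * z * v - x * s},
                 {z * x * v - y * s, z * y * v + x * s, c + z * z * v}}};
}

// Camera centre in object coordinates: C = -R^t * T.
Vec3 cameraPosition(const Mat3& R, const Vec3& t) {
    return Vec3{-(R.m[0][0] * t.x + R.m[1][0] * t.y + R.m[2][0] * t.z),
                -(R.m[0][1] * t.x + R.m[1][1] * t.y + R.m[2][1] * t.z),
                -(R.m[0][2] * t.x + R.m[1][2] * t.y + R.m[2][2] * t.z)};
}
```

Feeding in the rotation and translation vectors printed in the question, `cameraPosition(rodrigues(rvec), tvec)` returns the camera position in feet, because the world coordinates were given in feet. A quick sanity check is that R * C + T comes out as zero, since the camera centre is the origin of camera coordinates.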



Answer 2:

It took me forever to understand it, but the pose is the rotation about each axis: x, y, z. It is in radians. The values are between minus pi and pi (-3.14 to 3.14).

Edit: I might have been mistaken. I read that the pose is a vector whose direction is the axis the camera is rotated around, and whose length is how much (in radians) to rotate around that axis.
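To make that reading concrete, here is a small sketch that splits a rotation vector into its angle and axis. `AxisAngle`, `toAxisAngle` and `toDegrees` are hypothetical helper names, not OpenCV functions.

```cpp
#include <cassert>
#include <cmath>

struct AxisAngle {
    double angle_rad;  // rotation angle in radians
    double axis[3];    // unit vector along the rotation axis
};

// Split a rotation vector into its length (the angle) and its direction (the axis).
AxisAngle toAxisAngle(const double rvec[3]) {
    AxisAngle out{0.0, {0.0, 0.0, 1.0}};  // axis is arbitrary when the angle is 0
    out.angle_rad = std::sqrt(rvec[0] * rvec[0] + rvec[1] * rvec[1] + rvec[2] * rvec[2]);
    assert(std::isfinite(out.angle_rad));
    if (out.angle_rad > 1e-12) {
        for (int i = 0; i < 3; ++i) {
            out.axis[i] = rvec[i] / out.angle_rad;
        }
    }
    return out;
}

double toDegrees(double radians) {
    return radians * 180.0 / 3.14159265358979323846;
}
```

For the rotation vector printed in the question (2.7005, 0.0328, 0.4590), the length works out to about 2.74 radians, roughly 157 degrees, around an axis that points mostly along x.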