How to understand the KITTI camera calibration files

Posted 2020-02-09 03:17

Question:

I am working with the KITTI dataset.
I have downloaded the object dataset (left and right images) and the camera calibration matrices of the object set.

I want to use the stereo information, but I don't know how to obtain the intrinsic matrix and the [R|T] matrix of the two cameras, and I don't understand what the entries in the calibration files mean.

The contents of a calibration file:

P0: 
7.070493000000e+02 0.000000000000e+00 6.040814000000e+02 0.000000000000e+00 
0.000000000000e+00 7.070493000000e+02 1.805066000000e+02 0.000000000000e+00 
0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 0.000000000000e+00
P1: 
7.070493000000e+02 0.000000000000e+00 6.040814000000e+02 -3.797842000000e+02 
0.000000000000e+00 7.070493000000e+02 1.805066000000e+02 0.000000000000e+00 
0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 0.000000000000e+00
P2: 
7.070493000000e+02 0.000000000000e+00 6.040814000000e+02 4.575831000000e+01 
0.000000000000e+00 7.070493000000e+02 1.805066000000e+02 -3.454157000000e-01 
0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 4.981016000000e-03
P3: 
7.070493000000e+02 0.000000000000e+00 6.040814000000e+02 -3.341081000000e+02 
0.000000000000e+00 7.070493000000e+02 1.805066000000e+02 2.330660000000e+00 
0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 3.201153000000e-03
R0_rect: 
9.999128000000e-01 1.009263000000e-02 -8.511932000000e-03 
-1.012729000000e-02 9.999406000000e-01 -4.037671000000e-03 
8.470675000000e-03 4.123522000000e-03 9.999556000000e-01
Tr_velo_to_cam: 
6.927964000000e-03 -9.999722000000e-01 -2.757829000000e-03 -2.457729000000e-02 
-1.162982000000e-03 2.749836000000e-03 -9.999955000000e-01 -6.127237000000e-02 
9.999753000000e-01 6.931141000000e-03 -1.143899000000e-03 -3.321029000000e-01
Tr_imu_to_velo: 
9.999976000000e-01 7.553071000000e-04 -2.035826000000e-03 -8.086759000000e-01 
-7.854027000000e-04 9.998898000000e-01 -1.482298000000e-02 3.195559000000e-01 
2.024406000000e-03 1.482454000000e-02 9.998881000000e-01 -7.997231000000e-01

Answer 1:

From the README,

The sensor calibration zip archive contains files, storing matrices in row-aligned order, meaning that the first values correspond to the first row:

calib_cam_to_cam.txt: Camera-to-camera calibration


  • S_xx: 1x2 size of image xx before rectification
  • K_xx: 3x3 calibration matrix of camera xx before rectification
  • D_xx: 1x5 distortion vector of camera xx before rectification
  • R_xx: 3x3 rotation matrix of camera xx (extrinsic)
  • T_xx: 3x1 translation vector of camera xx (extrinsic)
  • S_rect_xx: 1x2 size of image xx after rectification
  • R_rect_xx: 3x3 rectifying rotation to make image planes co-planar
  • P_rect_xx: 3x4 projection matrix after rectification

Note: When using this dataset you will most likely need to access only P_rect_xx, as this matrix is valid for the rectified image sequences.
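To get at the intrinsics the question asks about, one common reading is that each rectified projection matrix factors as P = K [I | t], so K is the left 3x3 block and t = K^-1 * P[:, 3]. The following is only a sketch under that assumption, using NumPy and the P2 and P3 values from the question (I am assuming the object-set keys P0..P3 play the role of P_rect_00..P_rect_03 in the README). The helper names `parse_calib` and `decompose_projection` are made up for illustration and are not part of any KITTI devkit.

```python
import numpy as np

# P2 and P3 copied from the calibration file in the question.
CALIB_TEXT = """
P2: 7.070493000000e+02 0.000000000000e+00 6.040814000000e+02 4.575831000000e+01
0.000000000000e+00 7.070493000000e+02 1.805066000000e+02 -3.454157000000e-01
0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 4.981016000000e-03
P3: 7.070493000000e+02 0.000000000000e+00 6.040814000000e+02 -3.341081000000e+02
0.000000000000e+00 7.070493000000e+02 1.805066000000e+02 2.330660000000e+00
0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 3.201153000000e-03
"""


def parse_calib(text):
    """Hypothetical helper: read 'name: v1 v2 ...' entries into flat arrays.

    Values are stored row by row, so reshape(3, 4) restores a 3x4 matrix.
    Works whether an entry sits on one line or is wrapped over several.
    """
    calib, key = {}, None
    for token in text.split():
        if token.endswith(":"):
            key = token[:-1]
            calib[key] = []
        elif key is not None:
            calib[key].append(float(token))
    return {name: np.array(values) for name, values in calib.items()}


def decompose_projection(P):
    """Split P = K [I | t] into the 3x3 intrinsics K and the 3-vector t."""
    K = P[:, :3]
    t = np.linalg.solve(K, P[:, 3])  # t = K^-1 * fourth column
    return K, t


calib = parse_calib(CALIB_TEXT)
P2 = calib["P2"].reshape(3, 4)
P3 = calib["P3"].reshape(3, 4)

K2, t2 = decompose_projection(P2)
K3, t3 = decompose_projection(P3)

# Offset between the two colour cameras along the x axis, in metres.
baseline = t2[0] - t3[0]

print("K2 =\n", K2)
print("t2 =", t2)
print("t3 =", t3)
print("baseline (m) =", baseline)
```

With these numbers the baseline between camera 2 and camera 3 comes out at roughly 0.54 m. Under this factorisation the rotation between the rectified cameras is the identity, so [R|T] for each camera reduces to [I | t].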



Answer 2:

See https://medium.com/test-ttile/kitti-3d-object-detection-dataset-d78a762b5a4:

"The Px matrices project a point in the rectified reference camera coordinate to the camera_x image. camera_0 is the reference camera coordinate. R0_rect is the rectifying rotation for the reference coordinate (rectification makes the images of multiple cameras lie on the same plane). Tr_velo_to_cam maps a point in point cloud coordinate to reference coordinate.

Will do 2 tests here. The first test is to project 3D bounding boxes from the label file onto the image. The second test is to project a point in point cloud coordinate to the image. The algebra is simple, as follows. The first equation is for projecting the 3D bounding boxes in reference camera coordinate to the camera_2 image. The second equation projects a velodyne coordinate point into the camera_2 image.

y_image = P2 * R0_rect * R0_rot * x_ref_coord
y_image = P2 * R0_rect * Tr_velo_to_cam * x_velo_coord

In the above, R0_rot is the rotation matrix to map from object coordinate to reference coordinate."
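The second equation in the quote can be tried directly with the matrices from the question. This is a rough sketch with NumPy, not the blog author's code: R0_rect (3x3) and Tr_velo_to_cam (3x4) are padded to 4x4 so the product is defined, and the result is divided by its third component to get pixel coordinates. The function names and the test point 10 m in front of the laser scanner are my own illustrative choices.

```python
import numpy as np

# Matrices copied from the calibration file in the question.
P2 = np.array([
    [7.070493e+02, 0.0, 6.040814e+02, 4.575831e+01],
    [0.0, 7.070493e+02, 1.805066e+02, -3.454157e-01],
    [0.0, 0.0, 1.0, 4.981016e-03],
])
R0_rect = np.array([
    [9.999128e-01, 1.009263e-02, -8.511932e-03],
    [-1.012729e-02, 9.999406e-01, -4.037671e-03],
    [8.470675e-03, 4.123522e-03, 9.999556e-01],
])
Tr_velo_to_cam = np.array([
    [6.927964e-03, -9.999722e-01, -2.757829e-03, -2.457729e-02],
    [-1.162982e-03, 2.749836e-03, -9.999955e-01, -6.127237e-02],
    [9.999753e-01, 6.931141e-03, -1.143899e-03, -3.321029e-01],
])


def to_4x4(M):
    """Embed a 3x3 or 3x4 matrix in a 4x4 identity so transforms can be chained."""
    H = np.eye(4)
    H[:M.shape[0], :M.shape[1]] = M
    return H


def project_velo_to_image(points_velo, P, R_rect, Tr):
    """Apply y = P * R_rect * Tr * x to an (n, 3) array of velodyne points.

    Returns (n, 2) pixel coordinates and the (n,) homogeneous depths.
    Points with non-positive depth lie behind the camera and should be dropped.
    """
    n = points_velo.shape[0]
    pts_h = np.hstack([points_velo, np.ones((n, 1))])   # (n, 4) homogeneous
    img_h = P @ to_4x4(R_rect) @ to_4x4(Tr) @ pts_h.T   # (3, n)
    depth = img_h[2]
    uv = (img_h[:2] / depth).T                          # perspective divide
    return uv, depth


# Illustrative point: 10 m straight ahead of the laser scanner (velodyne x axis).
points = np.array([[10.0, 0.0, 0.0]])
uv, depth = project_velo_to_image(points, P2, R0_rect, Tr_velo_to_cam)
print("pixel (u, v):", uv[0])
print("depth:", depth[0])
```

The point lands near the middle of the camera_2 image, which is what you would expect for something directly in front of the car. The first equation works the same way once R0_rot and the box corners have been built from the label file.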