I am currently trying to generate 3D points from a stereo image pair in OpenCV. From what I can find, this has been done many times before.
I know the extrinsic parameters of the stereo setup, which I'm going to assume is in a fronto-parallel configuration (really, it isn't that bad!). I know the focal length and the baseline, and I'm going to assume the principal point is the center of the image (I know, I know...).
I compute a pseudo-decent disparity map using StereoSGBM and hand-coded the Q matrix following O'Reilly's Learning OpenCV book, which specifies:
Q = [ 1   0    0       -c_x
      0   1    0       -c_y
      0   0    0        f
      0   0   -1/T_x   (c_x - c_x')/T_x ]
I take it that (c_x, c_y) is the principal point (which I specified in image coordinates), f is the focal length (which I specified in mm), and T_x is the translation between the two cameras, i.e. the baseline (which I also specified in mm).
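To make the role of each entry concrete, here is how this Q acts on a pixel (x, y) with disparity d, following the homogeneous-coordinate convention that reprojectImageTo3D uses. This is only the algebra of the matrix as written above, not a measured result:

```latex
\[
Q \begin{bmatrix} x \\ y \\ d \\ 1 \end{bmatrix}
=
\begin{bmatrix} x - c_x \\ y - c_y \\ f \\ \dfrac{-d + c_x - c_x'}{T_x} \end{bmatrix}
=
\begin{bmatrix} X \\ Y \\ Z \\ W \end{bmatrix},
\qquad
\left( \frac{X}{W},\ \frac{Y}{W},\ \frac{Z}{W} \right) \text{ is the 3D point.}
\]
\[
c_x = c_x' \;\Rightarrow\; \frac{Z}{W} = -\frac{f\,T_x}{d}
\]
```

With equal principal points the depth reduces to -f*T_x/d, so the sign of T_x decides the sign of the depth, and f has to be expressed in the same units as d (pixels) for the ratio f/d to be meaningful.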
int type = CV_STEREO_BM_BASIC;
double rescx = 0.25, rescy = 0.25;
Mat disparity, vdisparity, depthMap;
Mat frame1 = imread( "C:\\Users\\Administrator\\Desktop\\Flow\\IMG137.jpg", CV_LOAD_IMAGE_GRAYSCALE );
Mat frame1L = frame1( Range( 0, frame1.rows ), Range( 0, frame1.cols/2 ));
Mat frame1R = frame1( Range( 0, frame1.rows ), Range( frame1.cols/2, frame1.cols ));
resize( frame1L, frame1L, Size(), rescx, rescy );
resize( frame1R, frame1R, Size(), rescx, rescy );
int preFilterSize = 9, preFilterCap = 32, disparityRange = 4;
int minDisparity = 2, textureThreshold = 12, uniquenessRatio = 3;
int windowSize = 21, smoothP1 = 0, smoothP2 = 0, dispMaxDiff = 32;
int speckleRange = 0, speckleWindowSize = 0;
bool dynamicP = false;
StereoSGBM stereo( minDisparity*-16, disparityRange*16, windowSize,
                   smoothP1, smoothP2, dispMaxDiff,
                   preFilterCap, uniquenessRatio,
                   speckleRange*16, speckleWindowSize, dynamicP );
stereo( frame1L, frame1R, disparity );
double m1[3][3] = { { 46, 0, frame1L.cols/2 }, { 0, 46, frame1L.rows/2 }, { 0, 0, 1 } };
double t1[3] = { 65, 0, 0 };
double q[4][4] = {{ 1, 0, 0, -frame1L.cols/2.0 }, { 0, 1, 0, -frame1L.rows/2.0 }, { 0, 0, 0, 46 }, { 0, 0, -1.0/65, 0 }};
Mat cm1( 3, 3, CV_64F, m1), cm2( 3, 3, CV_64F, m1), T( 3, 1, CV_64F, t1 );
Mat R1, R2, P1, P2;
Mat Q( 4, 4, CV_64F, q );
//stereoRectify( cm1, Mat::zeros( 5, 1, CV_64F ), cm2, Mat::zeros( 5, 1, CV_64F ), frame1L.size(), Mat::eye( 3, 3, CV_64F ), T, R1, R2, P1, P2, Q );
normalize( disparity, vdisparity, 0, 256, NORM_MINMAX );
//convertScaleAbs( disparity, disparity, 1/16.0 );
reprojectImageTo3D( disparity, depthMap, Q, true );
imshow( "Disparity", vdisparity );
imshow( "3D", depthMap );
So I feed the disparity map from StereoSGBM and that Q matrix to reprojectImageTo3D to get 3D points, which I write out to a PLY file.
But the result is this: http://i.stack.imgur.com/7eH9V.png
Fun to look at, but not what I need :(. I read online that it gets better results after dividing the disparity map by 16 and indeed it looked marginally better (it actually looks like there was a camera that took the shot!).
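The divide-by-16 step comes from StereoSGBM returning 16-bit fixed-point disparities with 4 fractional bits, so each stored value is 16 times the disparity in pixels. A minimal sketch of that conversion in plain C++ follows; the function name `toPixelDisparity` and the flat `std::vector` layout are illustrative assumptions, not part of OpenCV:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: converts fixed-point disparities (stored value = 16 * pixels)
// into floating-point disparities measured in pixels.
std::vector<float> toPixelDisparity(const std::vector<std::int16_t>& raw) {
    std::vector<float> out;
    out.reserve(raw.size());
    for (std::int16_t v : raw) {
        // Divide in floating point so the fractional part survives.
        out.push_back(static_cast<float>(v) / 16.0f);
    }
    return out;
}
```

On a cv::Mat the same scaling can probably be done in one call with `disparity.convertTo(dispF, CV_32F, 1.0/16.0)`, where `dispF` is a name chosen here for the float output.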
This is my disparity map if you're interested: http://i.stack.imgur.com/lNPkO.png
I understand that without calibration, it's hardly going to look like the best 3D projection, but I was expecting something a bit... better.
Any suggestions?
Under the fronto-parallel assumption, the relation between disparity and 3D depth is:
d = f*T/Z
where d is the disparity, f is the focal length, T is the baseline and Z is the 3D depth. If you treat the image center as the principal point, the 3D coordinate system is settled. Then for a pixel (px, py), its 3D coordinate (X, Y, Z) is:
X = (px - cx)*Z/f, Y = (py - cy)*Z/f, Z = f*T/d
where cx, cy are the pixel coordinates of the image center.
Your disparity image seems pretty good and it can generate reasonable 3D point clouds.
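The formulas above can be sketched directly in plain C++, without going through a Q matrix. The names `Point3`, `reprojectPixel` and `reprojectDisparity` are hypothetical, and the sketch assumes the disparity is already in pixels (not scaled by 16), that f is given in pixels so that it matches px - cx, and that the principal point is the image center:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One reconstructed point in the left camera's frame; units follow the baseline T.
struct Point3 {
    double x, y, z;
    bool valid;  // false when the disparity was not positive
};

// Applies Z = f*T/d, X = (px-cx)*Z/f, Y = (py-cy)*Z/f to a single pixel.
// f, cx, cy and d are in pixels; T sets the unit of the output (e.g. mm).
Point3 reprojectPixel(double px, double py, double d,
                      double f, double T, double cx, double cy) {
    assert(f > 0.0);
    if (d <= 0.0) {
        // No match or infinite depth: mark the point so it can be skipped.
        return Point3{0.0, 0.0, 0.0, false};
    }
    const double Z = f * T / d;
    return Point3{(px - cx) * Z / f, (py - cy) * Z / f, Z, true};
}

// Reprojects a row-major disparity image of size width x height.
std::vector<Point3> reprojectDisparity(const std::vector<float>& disparity,
                                       std::size_t width, std::size_t height,
                                       double f, double T) {
    assert(disparity.size() == width * height);
    const double cx = width / 2.0;   // assumption: principal point at image center
    const double cy = height / 2.0;
    std::vector<Point3> cloud;
    cloud.reserve(disparity.size());
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            cloud.push_back(reprojectPixel(static_cast<double>(x),
                                           static_cast<double>(y),
                                           disparity[y * width + x],
                                           f, T, cx, cy));
        }
    }
    return cloud;
}
```

Points flagged as invalid would be skipped when writing the cloud out. If the images were downscaled before matching, f in pixels has to be scaled by the same factor as the image.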
There is also a simple disparity browser on GitHub.