I am using a stereo system and so I am trying to get world coordinates of some points by triangulation.
My cameras present an angle, the Z axis direction (direction of the depth) is not normal to my surface. That is why when I observe flat surface, I get no constant depth but a "linear" variation, correct? And I want the depth from the baseline direction... How I can re-project?
A piece of my code with my projective arrays and triangulate function :
#C1 and C2 are the cameras matrix (left and rig)
#R_0 and T_0 are the transformation between cameras
#Coord1 and Coord2 are the correspondant coordinates of left and right respectively
P1 = np.dot(C1,np.hstack((np.identity(3),np.zeros((3,1)))))
P2 =np.dot(C2,np.hstack(((R_0),T_0)))
for i in range(Coord1.shape[0])
z = cv2.triangulatePoints(P1, P2, Coord1[i,],Coord2[i,])
-------- EDIT LATER -----------
Thanks scribbleink, so i tried to apply your proposal. But i think i have a mistake because it doesnt work well as you can see below. And the point clouds seems to be warped and curved towards the edges of the image.
U, S, Vt = linalg.svd(F)
V = Vt.T
#Right epipol
U[:,2]/U[2,2]
# The expected X-direction with C1 camera matri and C1[0,0] the focal length
vecteurX = np.array([(U[:,2]/U[2,2])[0],(U[:,2]/U[2,2])[1],C1[0,0]])
vecteurX_unit = vecteurX/np.sqrt(vecteurX[0]**2 + vecteurX[1]**2 + vecteurX[2]**2)
# The expected Y axis :
height = 2048
vecteurY = np.array([0, height -1, 0])
vecteurY_unit = vecteurY/np.sqrt(vecteurY[0]**2 + vecteurY[1]**2 + vecteurY[2]**2)
# The expected Z direction :
vecteurZ = np.cross(vecteurX,vecteurY)
vecteurZ_unit = vecteurZ/np.sqrt(vecteurZ[0]**2 + vecteurZ[1]**2 + vecteurZ[2]**2)
#Normal of the Z optical (the current Z direction)
Zopitcal = np.array([0,0,1])
cos_theta = np.arccos(np.dot(vecteurZ_unit, Zopitcal)/np.sqrt(vecteurZ_unit[0]**2 + vecteurZ_unit[1]**2 + vecteurZ_unit[2]**2)*np.sqrt(Zopitcal[0]**2 + Zopitcal[1]**2 + Zopitcal[2]**2))
sin_theta = (np.cross(vecteurZ_unit, Zopitcal))[1]
#Definition of the Rodrigues vector and use of cv2.Rodrigues to get rotation matrix
v1 = Zopitcal
v2 = vecteurZ_unit
v_rodrigues = v1*cos_theta + (np.cross(v2,v1))*sin_theta + v2*(np.cross(v2,v1))*(1. - cos_theta)
R = cv2.Rodrigues(v_rodrigues)[0]
Your expected z direction is arbitrary to the reconstruction method. In general, you have a rotation matrix that rotates the left camera from your desired direction. You can easily build that matrix, R. Then all you need to do is to multiply your reconstructed points by the transpose of R.
To add to fireant's response, here is one candidate solution, assuming that the expected X-direction coincides with the line joining the centers of projection of the two cameras.
- Compute the focal lengths f_1 and f_2 (via pinhole model calibration).
- Solve for the location of camera 2's epipole in camera 1's frame. For this, you can use either the Fundamental matrix (F) or the Essential matrix (E) of the stereo camera pair. Specifically, the left and right epipoles lie in the nullspace of F, so you can use Singular Value Decomposition. For a solid theoretical reference, see Hartley and Zisserman, Second edition, Table 9.1 "Summary of fundamental matrix properties" on Page 246 (freely available PDF of the chapter).
- The center of projection of camera 1, i.e. (0, 0, 0) and the location of the right epipole, i.e. (e_x, e_y, f_1) together define a ray that aligns with the line joining the camera centers. This can be used as the expected X-direction. Call this vector v_x.
- Assuming that the expected Y axis faces downward in the image plane, i.e, from (0, 0, f_1) to (0, height-1, f_1), where f is the focal length. Call this vector as v_y.
- The expected Z direction is now the cross-product of vectors v_x and v_y.
- Using the expected Z direction along with the optical axis (Z-axis) of camera 1, you can then compute a rotation matrix from two 3D vectors using, say the method listed in this other stackoverflow post.
Practical note:
Expecting the planar object to exactly align with the stereo baseline is unlikely without considerable effort, in my practical experience. Some amount of plane-fitting and additional rotation would be required.
One-time effort:
It depends on whether you need to do this once, e.g. for one-time calibration, in which case simply make this estimation process real-time, then rotate your stereo camera pair until the depth map variance is minimized. Then lock your camera positions and pray someone doesn't bump into it later.
Repeatability:
If you need to keep aligning your estimated depth maps to truly arbitrary Z-axes that change for every new frame captured, then you should consider investing time in the plane-estimation method and making it more robust.