Google Tango: Aligning Depth and Color Frames

2020-02-29 12:00发布

问题:

I would like to align a (synchronous) depth/color frame pair, using the Google Tango tablet, such that, assuming that both frames have the same resolution, each pixel in the depth frame corresponds to the same pixel in the color frame, i.e., I would like to achieve a retinotopic mapping. How can this be achieved using the latest C API (Hilbert Release Version 1.6)? Any help on this will be greatly appreciated.

回答1:

Generating simple crude UV coordinates to map tango point cloud points back onto source image (texture coordinates) - see comments above for more details, we've messed this thread up but good :-( (Language is C#, classes are .Net) Field of view calculate FOV horizontal (true) or vertical (false)

public PointF PictureUV(Vector3D imagePlaneLocation)
        {
            // u is a function of x where y is 0
            double u = Math.Atan2(imagePlaneLocation.X, imagePlaneLocation.Z);
            u += (FieldOfView(true) / 2.0);
            u = u/FieldOfView(true);
            double v = Math.Atan2(imagePlaneLocation.Y, imagePlaneLocation.Z);
            v += (FieldOfView() / 2.0);
            v = v / FieldOfView();

            return new PointF((float)u, (float)(1.0 - v));
        }


回答2:

Mark, thanks for your quick response. Probably my question was a bit inprecise. You are of course damn right in saying that a retinotopic mapping between a 2D and a 3D image cannot be established. Shame on me. Nonetheless, what I need is a mapping in which all depth samples (x_n,y_n,d_n), 1<=n<=N, N being the number of depth values, correspond to the same pixels (x_n,y_n) in the (synchronized) color frame. It is well taken that the depth sensor cannot provide depth information for troublesome areas in the visual field.



回答3:

One of your conditions is not possible - there is no guarantee that tango will hand you a point cloud measurement of something in the visual field if it has trouble seeing it - also there isn't a 1:1 correspondence between pixels and depth frame as the depth info is 3D



回答4:

I have not tried this but we can probably do: for each (X,Y,Z) from point cloud:

u_pixel = -(X/Z)* Fx, v_pixel = -(Y/Z)* Fy.
x = (u-cx)/Fx, y = (v-cy)/Fy.

for distortion correction (k1,k2,k2 can from distortion[] part of TangoInstrinsics, r = Math.sqrt(x^2 + y^2)))

x_corrected = x * (1 + k1 * r2 + k2 * r4 + k3 * r6) 
y_corrected = y * (1 + k1 * r2 + k2 * r4 + k3 * r6) 

Then we can convert normalized x_corrected, y_corrected to x_raster, y_raster by using reverse of the above formula (x_raster = x_correct*Fx+ cx)