How do I reverse-project 2D points into 3D?

Posted 2019-01-07 03:23

I have 4 2D points in screen-space, and I need to reverse-project them back into 3D space. I know that each of the 4 points is a corner of a 3D-rotated rigid rectangle, and I know the size of the rectangle. How can I get 3D coordinates from this?

I am not using any particular API, and I do not have an existing projection matrix. I'm just looking for basic math to do this. Of course there isn't enough data to convert a single 2D point to 3D with no other reference, but I imagine that if you have 4 points, you know that they're all at right-angles to each other on the same plane, and you know the distance between them, you should be able to figure it out from there. Unfortunately I can't quite work out how though.

This might fall under the umbrella of photogrammetry, but Google searches for that term haven't led me to any helpful information.

14 answers
神经病院院长
#2 · 2019-01-07 03:52

If you know the shape is a rectangle in a plane, you can constrain the problem much further. You cannot determine "which" plane from the image alone, so you are free to choose the plane z=0, with one corner at x=y=0 and the edges parallel to the x and y axes.

The points in 3D are therefore {0,0,0}, {w,0,0}, {w,h,0}, and {0,h,0}. I'm pretty certain the absolute size cannot be recovered, so only the ratio w/h is relevant, which makes it one unknown.

Relative to this plane, the camera must be at some point cx,cy,cz in space, must be pointing in a direction nx,ny,nz (a vector of length one, so one of these is redundant), and must have a focal_length/image_width factor, called f here so that it is not confused with the rectangle width w. These numbers turn into a 3x3 projection matrix.

That gives a total of 7 unknowns: w/h, cx, cy, cz, nx, ny, and f.

You have a total of 8 knowns: the 4 x+y pairs.

So this can be solved.

The next step is to solve it with Matlab or Mathematica.
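If you would rather not reach for a CAS, here is a minimal sketch of one way to set up such a solve. It is not the 7-unknown camera parameterization described above but a swapped-in, closely related formulation: the map from the z=0 plane to the image is a 3x3 homography with 8 free entries (the ninth fixed to 1), and the 4 corner correspondences give exactly 8 linear equations. The unit-square size, the function names, and the small Gaussian-elimination helper are assumptions for illustration; the pixel coordinates are the ones used in a later answer in this thread. Getting the camera position out of the homography would additionally need the focal length.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations act on both sides at once.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography(plane_pts, image_pts):
    """Return 3x3 H (with H[2][2] = 1) mapping plane (X, Y) to image (u, v)."""
    a, b = [], []
    for (X, Y), (u, v) in zip(plane_pts, image_pts):
        # From u = (h0*X + h1*Y + h2) / (h6*X + h7*Y + 1), and likewise for v.
        a.append([X, Y, 1, 0, 0, 0, -u * X, -u * Y])
        b.append(u)
        a.append([0, 0, 0, X, Y, 1, -v * X, -v * Y])
        b.append(v)
    h = solve_linear(a, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def apply_h(H, X, Y):
    """Map a point on the rectangle's plane to pixel coordinates."""
    w = H[2][0] * X + H[2][1] * Y + H[2][2]
    return ((H[0][0] * X + H[0][1] * Y + H[0][2]) / w,
            (H[1][0] * X + H[1][1] * Y + H[1][2]) / w)


# Assumed unit square on the plane and its observed image corners.
plane = [(0, 0), (0, 1), (1, 0), (1, 1)]
image = [(318, 247), (326, 312), (418, 241), (452, 303)]
H = homography(plane, image)
```

With H in hand, `apply_h(H, 0.5, 0.5)` gives the pixel where the center of the rectangle should appear, which is a quick way to eyeball whether the fit makes sense.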

Ridiculous、
#3 · 2019-01-07 03:53

This is the classic problem for marker-based augmented reality.

You have a square marker (2D barcode), and you want to find its pose (translation and rotation relative to the camera) after finding the four edges of the marker.

I'm not aware of the latest contributions to the field, but at least up to a point (2009) RPP was supposed to outperform POSIT, which is mentioned above (and is indeed a classic approach for this). Please see the links; they also provide source.

(PS: I know this is a somewhat old topic, but the post might still be helpful to somebody.)

我命由我不由天
#5 · 2019-01-07 03:53

I'll get my linear algebra book out when I get home if nobody has answered by then. But @D G, not all matrices are invertible. Singular matrices (determinant = 0) aren't invertible. This will actually happen all the time, since a projection matrix must be square and idempotent (p^2 = p), so its only possible eigenvalues are 0 and 1; any projection other than the identity has 0 as an eigenvalue and is therefore singular.

An easy example is [[0 1][0 1]]: its determinant is 0, and it is a projection onto the line x = y!
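A quick way to convince yourself of that example, as a small sketch (the helper names are made up for illustration): multiply the matrix by itself, take the determinant, and push a sample point through it. Note this is "projection" in the linear-algebra sense of an idempotent map.

```python
def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


P = [[0, 1], [0, 1]]

# Idempotent: applying the projection twice is the same as applying it once.
P_squared = matmul2(P, P)

# The point (3, 5) lands on the line x = y.
image_of_point = [P[0][0] * 3 + P[0][1] * 5,
                  P[1][0] * 3 + P[1][1] * 5]
```

Here `P_squared` equals `P`, `det2(P)` is 0, and `image_of_point` is `[5, 5]`, so the matrix is idempotent, singular, and sends points onto x = y.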

一夜七次
#6 · 2019-01-07 03:56

Thanks to @Vegard for an excellent answer. I cleaned up the code a little bit:

from copy import deepcopy
from random import randrange, uniform

class Point2:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Point3:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

# Known 2D coordinates of our rectangle
i0 = Point2(318, 247)
i1 = Point2(326, 312)
i2 = Point2(418, 241)
i3 = Point2(452, 303)

# 3D coordinates corresponding to i0, i1, i2, i3
r0 = Point3(0, 0, 0)
r1 = Point3(0, 0, 1)
r2 = Point3(1, 0, 0)
r3 = Point3(1, 0, 1)

# Start the search from the identity matrix
mat = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

# Project a 3D point to pixel coordinates in a 720x576 image.
# Row 2 of mat (the depth row) is not needed here and is ignored.
def project(p, mat):
    x = mat[0][0] * p.x + mat[0][1] * p.y + mat[0][2] * p.z + mat[0][3] * 1
    y = mat[1][0] * p.x + mat[1][1] * p.y + mat[1][2] * p.z + mat[1][3] * 1
    w = mat[3][0] * p.x + mat[3][1] * p.y + mat[3][2] * p.z + mat[3][3] * 1
    return Point2(720 * (x / w + 1) / 2., 576 - 576 * (y / w + 1) / 2.)

# The squared distance between two points a and b
def norm2(a, b):
    dx = b.x - a.x
    dy = b.y - a.y
    return dx * dx + dy * dy

# Total squared reprojection error over the four corners
def evaluate(mat):
    c0 = project(r0, mat)
    c1 = project(r1, mat)
    c2 = project(r2, mat)
    c3 = project(r3, mat)
    return norm2(i0, c0) + norm2(i1, c1) + norm2(i2, c2) + norm2(i3, c3)

# Return a copy of mat with one random entry nudged by up to +/- amount
def perturb(mat, amount):
    mat2 = deepcopy(mat)
    mat2[randrange(4)][randrange(4)] += uniform(-amount, amount)
    return mat2

# Random hill climbing: keep a perturbation only if it lowers the error
def approximate(mat, amount, n=1000):
    est = evaluate(mat)
    for i in range(n):
        mat2 = perturb(mat, amount)
        est2 = evaluate(mat2)
        if est2 < est:
            mat = mat2
            est = est2

    return mat, est

for i in range(1000):
    mat, est = approximate(mat, 1)
    print(mat)
    print(est)

The approximate call with .1 did not work for me, so I took it out. I ran it for a while too, and last I checked it was at

[[0.7576315397559887, 0, 0.11439449272592839, -0.314856490473439], 
[0.06440497208710227, 1, -0.5607502645413118, 0.38338196981556827], 
[0, 0, 1, 0], 
[0.05421620936883742, 0, -0.5673977598434641, 2.693116299312736]]

with an error around 0.02.
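If you want to sanity-check a matrix like the one above without rerunning the search, a small sketch like the following reprojects the four 3D corners and sums the squared pixel errors. It reuses the 720x576 viewport mapping and the point correspondences from the code above; the tuple-based layout and the names `MAT`, `WORLD`, `IMAGE`, `reprojected` and `error` are just choices for this example.

```python
# Matrix reported above, copied verbatim.
MAT = [
    [0.7576315397559887, 0, 0.11439449272592839, -0.314856490473439],
    [0.06440497208710227, 1, -0.5607502645413118, 0.38338196981556827],
    [0, 0, 1, 0],
    [0.05421620936883742, 0, -0.5673977598434641, 2.693116299312736],
]

# Rectangle corners in 3D and their observed pixel positions (i0..i3, r0..r3).
WORLD = [(0, 0, 0), (0, 0, 1), (1, 0, 0), (1, 0, 1)]
IMAGE = [(318, 247), (326, 312), (418, 241), (452, 303)]


def project(p, mat, width=720, height=576):
    """Project a 3D point to pixels; the depth row of mat is not used."""
    x, y, z = p
    px = mat[0][0] * x + mat[0][1] * y + mat[0][2] * z + mat[0][3]
    py = mat[1][0] * x + mat[1][1] * y + mat[1][2] * z + mat[1][3]
    w = mat[3][0] * x + mat[3][1] * y + mat[3][2] * z + mat[3][3]
    return (width * (px / w + 1) / 2.0,
            height - height * (py / w + 1) / 2.0)


reprojected = [project(p, MAT) for p in WORLD]

# Sum of squared pixel distances, the same quantity evaluate() minimizes.
error = sum((u - a) ** 2 + (v - b) ** 2
            for (u, v), (a, b) in zip(reprojected, IMAGE))
```

Each reprojected corner should land well within a pixel of its observed position, and `error` should come out in the same ballpark as the figure quoted above.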

乱世女痞
#7 · 2019-01-07 04:03

For my OpenGL engine, the following snippet converts mouse/screen coordinates into 3D world coordinates. Read the comments for a description of what is going on.

/*   FUNCTION:        YCamera :: CalculateWorldCoordinates
     ARGUMENTS:       x         mouse x coordinate
                      y         mouse y coordinate
                      vec       where to store coordinates
     RETURN:          n/a
     DESCRIPTION:     Convert mouse coordinates into world coordinates
*/

void YCamera :: CalculateWorldCoordinates(float x, float y, YVector3 *vec)
{
// START
GLint viewport[4];
GLdouble mvmatrix[16], projmatrix[16];

GLint real_y;
GLdouble mx, my, mz;

glGetIntegerv(GL_VIEWPORT, viewport);
glGetDoublev(GL_MODELVIEW_MATRIX, mvmatrix);
glGetDoublev(GL_PROJECTION_MATRIX, projmatrix);

real_y = viewport[3] - (GLint) y - 1;   // viewport[3] is height of window in pixels
gluUnProject((GLdouble) x, (GLdouble) real_y, 1.0, mvmatrix, projmatrix, viewport, &mx, &my, &mz);

/*  'mouse' is the point where mouse projection reaches FAR_PLANE.
    World coordinates is intersection of line(camera->mouse) with plane(z=0) (see LaMothe 306)

    Equation of line in 3D:
        (x-x0)/a = (y-y0)/b = (z-z0)/c      

    Intersection of line with plane:
        z = 0
        x-x0 = a(z-z0)/c  <=> x = x0+a(0-z0)/c  <=> x = x0 -a*z0/c
        y = y0 - b*z0/c

*/
double lx = fPosition.x - mx;
double ly = fPosition.y - my;
double lz = fPosition.z - mz;
double sum = lx*lx + ly*ly + lz*lz;
double normal = sqrt(sum);
double z0_c = fPosition.z / (lz/normal);

vec->x = (float) (fPosition.x - (lx/normal)*z0_c);
vec->y = (float) (fPosition.y - (ly/normal)*z0_c);
vec->z = 0.0f;

}
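The line/plane step at the end is independent of OpenGL, so here is the same idea as a standalone sketch. It assumes you already have the camera position and the unprojected far-plane point (what `gluUnProject` returns above) as plain tuples; the function name and the two guard checks are additions for this example and are not in the original snippet.

```python
def intersect_z0(camera, far_point):
    """Intersect the ray from camera through far_point with the plane z = 0.

    Returns (x, y, 0.0), or None when the ray is parallel to the plane
    or the plane lies behind the camera.
    """
    cx, cy, cz = camera
    dx = far_point[0] - cx
    dy = far_point[1] - cy
    dz = far_point[2] - cz
    if abs(dz) < 1e-12:
        return None  # ray never reaches z = 0
    # Solve cz + t * dz = 0 for the ray parameter t.
    t = -cz / dz
    if t < 0:
        return None  # intersection is behind the camera
    return (cx + t * dx, cy + t * dy, 0.0)
```

For example, a camera at (0, 0, 10) looking through the far point (5, 5, -10) hits the plane at (2.5, 2.5, 0.0). Skipping the normalization used in the C++ version is a deliberate simplification: the parameter t absorbs the length of the direction vector.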
