How can I move the camera correctly in 3D space?

Posted 2020-03-02 07:45

Question:

What I want to do:

I am trying to figure out how to make the camera work like this:

  • Mouse movement: camera rotates
  • Up/Down key: camera moves forward/backwards; forward means the direction the camera is facing
  • Left/Right key: camera moves sideways
  • Q/E key: camera moves up and down

Since I have a lot of code, I will do my best to try to explain how I did it, without too much code. The project I'm working on is very large, and has a pretty big library with many classes and types that would make it hard to understand.

The problem

I have almost got this working. However, after moving around a little, things start failing at some angles: pressing Up moves the camera sideways, and so on.

The algorithm I thought of is explained in detail below.

The question is, am I doing things wrong? What could make it fail? I tried debugging this camera the entire day, and haven't figured out what makes it fail.

Clarifications

  • This is how I understand rotation: a 3D vector (perhaps improperly called a vector) in which each component is the angle of rotation around the corresponding axis. For example, the X value is how much the object rotates around the X axis. Because I am working in OpenGL, rotation values are in degrees (not radians).

  • When rendering, I simply translate by the camera position, but with the opposite sign.

Same applies for rotation:

glRotatef(-currentCamera->Rotation().X, 1.0f, 0, 0);
glRotatef(-currentCamera->Rotation().Y, 0, 1.0f, 0);
glRotatef(-currentCamera->Rotation().Z, 0, 0, 1.0f);
glTranslatef(-currentCamera->Position().X, -currentCamera->Position().Y, -currentCamera->Position().Z);

What I tried (and didn't work):

I tried simple geometry: the Pythagorean theorem and basic trigonometry. It failed miserably (for example, I got NaN results whenever any of the rotation components was 0), so I gave up on that approach.
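For reference, a purely trigonometric version does not have to produce NaN. The following is a minimal sketch, not the code from this project: it assumes the camera looks down -Z at zero rotation, that the camera orientation is Ry(yaw) * Rx(pitch), and that there is no roll. Vec3, ForwardFromYawPitch and RightFromYaw are hypothetical names introduced only for this illustration. Because it uses only sin and cos (no division, square root or inverse trigonometric function), a zero angle is harmless.

```cpp
#include <cmath>

// Hypothetical minimal vector type, standing in for Vector3D<float>.
struct Vec3 { float x, y, z; };

const float kDegToRad = 3.14159265358979f / 180.0f;

// Direction the camera faces, for a camera that looks down -Z at zero rotation.
// Assumes orientation = Ry(yaw) * Rx(pitch), angles in degrees, no roll.
Vec3 ForwardFromYawPitch(float yawDeg, float pitchDeg)
{
    float yaw = yawDeg * kDegToRad;
    float pitch = pitchDeg * kDegToRad;
    return Vec3{ -std::sin(yaw) * std::cos(pitch),
                  std::sin(pitch),
                 -std::cos(yaw) * std::cos(pitch) };
}

// Sideways ("strafe right") direction; it depends on yaw only.
Vec3 RightFromYaw(float yawDeg)
{
    float yaw = yawDeg * kDegToRad;
    return Vec3{ std::cos(yaw), 0.0f, -std::sin(yaw) };
}
```

Moving forward would then amount to adding the forward vector scaled by delta to the camera position, and moving sideways to adding the scaled right vector.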

What I tried (and did work... almost):

Using transformation matrices.

When the user presses any of those keys, a 3d vector is generated:

+X = right; -X = left
+Y = top; -Y = bottom
+Z = backward (towards camera); -Z = forward (away from camera)

Next, I generate a transformation matrix: the identity (a 4x4 matrix) is multiplied by three rotation matrices, one for each of the 3 coordinates (X, then Y, then Z). I then apply the matrix to the vector I created and add the result to the old position of the camera.

However, there seems to be a problem with this approach. At first it works just fine, but after a while, when I press Up it goes sideways instead of the way it should.

Actual code

As I stated above, I tried to use as little code as possible. However if this is not helpful enough, here is some actual code. I did my best to select only the most relevant code.

// ... Many headers

// 'Camera' is a class which, among other things, has (only the parts relevant here):
// * Position() getter, SetPosition() setter
// * Rotation() getter, SetRotation() setter

// The position and rotation are stored in another class (template), 'Vector3D <typename T>',
// which has X, Y and Z values. It also implements a '+' operator.

float angle; // this is for animating our little cubes
Camera* currentCamera;

// 'Matrix' is a template, which contains a 4x4 array of a generic type, which is public and
// called M. It also implements addition/subtraction operators, and multiplication. The 
// constructor memset's the array to 0.

// Generates a matrix with 1.0 on the main diagonal
Matrix<float> IdentityMatrix()
{
    Matrix<float> res;

    for (int i = 0; i < 4; i++)
        res.M[i][i] = 1.0f;

    return res;
}

// I used the OpenGL documentation about glRotate() to write this
Matrix<float> RotationMatrix (float angle, float x, float y, float z)
{
    Matrix<float> res;

    // Normalize; x, y and z must be smaller than 1
    if (abs(x) > 1 || abs(y) > 1 || abs(z) > 1)
    {
        // My own implementation of max which allows 3 parameters
        float M = Math::Max(abs(x), abs(y), abs(z)); 
        x /= M; y /= M; z /= M;
    }

    // Vars
    float s = Math::SinD(angle); // SinD and CosD convert the angle from degrees to radians
    float c = Math::CosD(angle); // before calling the standard library sin and cos

    // Upper-left 3x3 rotation block
    res.M[0][0] = x * x * (1 - c) + c;
    res.M[0][1] = x * y * (1 - c) - z * s;
    res.M[0][2] = x * z * (1 - c) + y * s;
    res.M[1][0] = y * x * (1 - c) + z * s;
    res.M[1][1] = y * y * (1 - c) + c;
    res.M[1][2] = y * z * (1 - c) - x * s;
    res.M[2][0] = x * z * (1 - c) - y * s;
    res.M[2][1] = y * z * (1 - c) + x * s;
    res.M[2][2] = z * z * (1 - c) + c;
    res.M[3][3] = 1.0f;

    return res;
}

// Used wikipedia for this one :)
Matrix<float> TranslationMatrix (float x, float y, float z)
{
    Matrix<float> res = IdentityMatrix();

    res.M[0][3] = x;
    res.M[1][3] = y;
    res.M[2][3] = z;

    return res;
}

Vector3D<float> ApplyMatrix (Vector3D<float> v, const Matrix<float>& m)
{
    Vector3D<float> res;

    res.X = m.M[0][0] * v.X + m.M[0][1] * v.Y + m.M[0][2] * v.Z + m.M[0][3];
    res.Y = m.M[1][0] * v.X + m.M[1][1] * v.Y + m.M[1][2] * v.Z + m.M[1][3];
    res.Z = m.M[2][0] * v.X + m.M[2][1] * v.Y + m.M[2][2] * v.Z + m.M[2][3];

    return res;
}

// Vector3D instead of x, y and z 
inline Matrix<float> RotationMatrix (float angle, Vector3D<float> v)
{
    return RotationMatrix (angle, v.X, v.Y, v.Z);
}

inline Matrix<float> TranslationMatrix (Vector3D<float> v)
{
    return TranslationMatrix (v.X, v.Y, v.Z);
}

inline Matrix<float> ScaleMatrix (Vector3D<float> v)
{
    return ScaleMatrix (v.X, v.Y, v.Z);
}


// This gets called after everything is initialized (SDL, OpenGL etc)
void OnStart()
{
    currentCamera = new Camera("camera0");
    angle = 0;
    SDL_ShowCursor(0); // Hide cursor
}

// This gets called periodically
void OnLogicUpdate()
{
    float delta = .02; // How much we move
    Vector3D<float> rot = currentCamera->Rotation();
    Vector3D<float> tr (0, 0, 0);

    Uint8* keys = SDL_GetKeyState(0);

    // Cube animation
    angle += 0.05;

    // Handle keyboard stuff
    if (keys[SDLK_LSHIFT] || keys[SDLK_RSHIFT]) delta = 0.1;
    if (keys[SDLK_LCTRL] || keys[SDLK_RCTRL]) delta = 0.008;

    if (keys[SDLK_UP] || keys[SDLK_w]) tr.Z += -delta;
    if (keys[SDLK_DOWN] || keys[SDLK_s]) tr.Z += delta;
    if (keys[SDLK_LEFT] || keys[SDLK_a]) tr.X += -delta;
    if (keys[SDLK_RIGHT] || keys[SDLK_d]) tr.X += delta;

    if (keys[SDLK_e]) tr.Y += -delta;
    if (keys[SDLK_q]) tr.Y += delta;

    if (tr != Vector3D<float>(0.0f, 0.0f, 0.0f))
    {
        Math::Matrix<float> r = Math::IdentityMatrix();
        r *= Math::RotationMatrix(rot.X, 1.0f, 0, 0);
        r *= Math::RotationMatrix(rot.Y, 0, 1.0f, 0);
        r *= Math::RotationMatrix(rot.Z, 0, 0, 1.0f);

        Vector3D<float> new_pos = Math::ApplyMatrix(tr, r);
        currentCamera->SetPosition(currentCamera->Position() + new_pos);
    }
}

// Event handler, handles mouse movement and ESCAPE exit
void OnEvent(SDL_Event* e)
{
    const float factor = -.1f;

    if (e->type == SDL_MOUSEMOTION)
    {
        // Is mouse in the center? If it is, we just moved it there, ignore
        if (e->motion.x == surface->w / 2 && e->motion.y == surface->h / 2)
            return;

        // Get delta
        float dx = e->motion.xrel;
        float dy = e->motion.yrel;

        // Make change
        currentCamera->SetRotation(currentCamera->Rotation() + World::Vector3D<float>(dy * factor, dx * factor, 0));

        // Move back to center
        SDL_WarpMouse(surface->w / 2, surface->h / 2);

    }

    else if (e->type == SDL_KEYUP)
    {
        switch (e->key.keysym.sym)
        {
            case SDLK_ESCAPE:
                Debug::Log("Escape key pressed, will exit.");
                StopMainLoop(); // This tells the main loop to stop
                break;

            default: break;
        }
    }
}

// Draws a cube at 'origin', rotated by 'angl' degrees
void DrawCube (World::Vector3D<float> origin, float angl)
{
    glPushMatrix();
    glTranslatef(origin.X, origin.Y, origin.Z);
    glRotatef(angl, 0.5f, 0.2f, 0.1f);

    glBegin(GL_QUADS);
        glColor3f(0.0f,1.0f,0.0f);          // green
        glVertex3f( 1.0f, 1.0f,-1.0f);          // Top Right Of The Quad (Top)
        glVertex3f(-1.0f, 1.0f,-1.0f);          // Top Left Of The Quad (Top)
        glVertex3f(-1.0f, 1.0f, 1.0f);          // Bottom Left Of The Quad (Top)
        glVertex3f( 1.0f, 1.0f, 1.0f);          // Bottom Right Of The Quad (Top)

        glColor3f(1.0f,0.5f,0.0f);          // orange
        glVertex3f( 1.0f,-1.0f, 1.0f);          // Top Right Of The Quad (Bottom)
        glVertex3f(-1.0f,-1.0f, 1.0f);          // Top Left Of The Quad (Bottom)
        glVertex3f(-1.0f,-1.0f,-1.0f);          // Bottom Left Of The Quad (Bottom)
        glVertex3f( 1.0f,-1.0f,-1.0f);          // Bottom Right Of The Quad (Bottom)

        glColor3f(1.0f,0.0f,0.0f);          // red
        glVertex3f( 1.0f, 1.0f, 1.0f);          // Top Right Of The Quad (Front)
        glVertex3f(-1.0f, 1.0f, 1.0f);          // Top Left Of The Quad (Front)
        glVertex3f(-1.0f,-1.0f, 1.0f);          // Bottom Left Of The Quad (Front)
        glVertex3f( 1.0f,-1.0f, 1.0f);          // Bottom Right Of The Quad (Front)

        glColor3f(1.0f,1.0f,0.0f);          // yellow
        glVertex3f( 1.0f,-1.0f,-1.0f);          // Bottom Left Of The Quad (Back)
        glVertex3f(-1.0f,-1.0f,-1.0f);          // Bottom Right Of The Quad (Back)
        glVertex3f(-1.0f, 1.0f,-1.0f);          // Top Right Of The Quad (Back)
        glVertex3f( 1.0f, 1.0f,-1.0f);          // Top Left Of The Quad (Back)

        glColor3f(0.0f,0.0f,1.0f);          // blue
        glVertex3f(-1.0f, 1.0f, 1.0f);          // Top Right Of The Quad (Left)
        glVertex3f(-1.0f, 1.0f,-1.0f);          // Top Left Of The Quad (Left)
        glVertex3f(-1.0f,-1.0f,-1.0f);          // Bottom Left Of The Quad (Left)
        glVertex3f(-1.0f,-1.0f, 1.0f);          // Bottom Right Of The Quad (Left)

        glColor3f(1.0f,0.0f,1.0f);          // violet
        glVertex3f( 1.0f, 1.0f,-1.0f);          // Top Right Of The Quad (Right)
        glVertex3f( 1.0f, 1.0f, 1.0f);          // Top Left Of The Quad (Right)
        glVertex3f( 1.0f,-1.0f, 1.0f);          // Bottom Left Of The Quad (Right)
        glVertex3f( 1.0f,-1.0f,-1.0f);          // Bottom Right Of The Quad (Right)

    glEnd();

    glPopMatrix();
}

// Gets called periodically
void OnRender()
{
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glLoadIdentity();

    // Camera movement
    glRotatef(-currentCamera->Rotation().X, 1.0f, 0, 0);
    glRotatef(-currentCamera->Rotation().Y, 0, 1.0f, 0);
    glRotatef(-currentCamera->Rotation().Z, 0, 0, 1.0f);
    glTranslatef(-currentCamera->Position().X, -currentCamera->Position().Y, -currentCamera->Position().Z);

    // Draw some cubes
    for (float i = -5; i <= 5; i++)
        for (float j = -5; j <= 5; j++)
        {
            DrawCube(World::Vector3D<float>(i*3, j * 3, -5), angle + 5 * i + 5 * j);
        }

    SDL_GL_SwapBuffers();
}

As you can probably see, it is very difficult for me to create a simple example, because there are so many things happening behind the scenes, and so many classes and data types.

Other bonus stuff

I also uploaded an executable (hopefully it works), so that you can see what problem I am talking about:

https://dl.dropbox.com/u/24832466/Downloads/debug.zip

Answer 1:

I believe this has to do with a bit of a mix-up between the "camera matrix" (world space position of the camera) and its inverse, the "view matrix" (the matrix which converts from world space to view space).

First, a little background.

You're starting with a world space position of the camera, and its X, Y, and Z rotation. If this camera were just a typical object we were placing in the scene, we would set it up like this:

glTranslatef(camX, camY, camZ);
glRotatef(x, 1.0f, 0, 0);
glRotatef(y, 0, 1.0f, 0);
glRotatef(z, 0, 0, 1.0f);

All together these operations create the matrix I will define as "CameraToWorldMatrix", or "the matrix that transforms from camera space to world space".

However, when we're dealing with view matrices, we don't want to transform from camera space to world space. For the view matrix we want to transform coordinates from world space into camera space (the inverse operation). So our view matrix is really a "WorldToCameraMatrix".

To invert the "CameraToWorldMatrix", you undo each operation (negate the angle or the offset) and perform the operations in the reverse order (which you came close to doing, but got the order slightly mixed up).

The inverse of the above matrix would be:

glRotatef(-z, 0, 0, 1.0f);
glRotatef(-y, 0, 1.0f, 0);
glRotatef(-x, 1.0f, 0, 0);
glTranslatef(-camX, -camY, -camZ);

This is almost what you had, but with the rotations in a different order.
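If you want to convince yourself of this numerically, here is a minimal sketch. It does not use the Matrix template from the question; Mat4, Identity, Mul, Translation, Rotation, CameraToWorld, WorldToCamera and DistanceFromIdentity are hypothetical helpers written only for this check. It assumes the OpenGL convention of column vectors, where each new call post-multiplies the current matrix.

```cpp
#include <algorithm>
#include <array>
#include <cmath>

using Mat4 = std::array<std::array<float, 4>, 4>;

Mat4 Identity()
{
    Mat4 m{}; // value-initialized: all zeros
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0f;
    return m;
}

Mat4 Mul(const Mat4& a, const Mat4& b)
{
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

Mat4 Translation(float x, float y, float z)
{
    Mat4 m = Identity();
    m[0][3] = x; m[1][3] = y; m[2][3] = z;
    return m;
}

// Rotation by 'deg' degrees about a principal axis: 0 = X, 1 = Y, 2 = Z.
Mat4 Rotation(float deg, int axis)
{
    float rad = deg * 3.14159265358979f / 180.0f;
    float c = std::cos(rad), s = std::sin(rad);
    int a = (axis + 1) % 3, b = (axis + 2) % 3;
    Mat4 m = Identity();
    m[a][a] = c; m[a][b] = -s;
    m[b][a] = s; m[b][b] = c;
    return m;
}

// Same order as: translate, rotate X, rotate Y, rotate Z.
Mat4 CameraToWorld(float px, float py, float pz, float rx, float ry, float rz)
{
    Mat4 m = Translation(px, py, pz);
    m = Mul(m, Rotation(rx, 0));
    m = Mul(m, Rotation(ry, 1));
    m = Mul(m, Rotation(rz, 2));
    return m;
}

// Every step negated, issued in the reverse order.
Mat4 WorldToCamera(float px, float py, float pz, float rx, float ry, float rz)
{
    Mat4 m = Rotation(-rz, 2);
    m = Mul(m, Rotation(-ry, 1));
    m = Mul(m, Rotation(-rx, 0));
    m = Mul(m, Translation(-px, -py, -pz));
    return m;
}

// Largest absolute difference between 'm' and the identity matrix.
float DistanceFromIdentity(const Mat4& m)
{
    float worst = 0.0f;
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            worst = std::max(worst, std::fabs(m[i][j] - (i == j ? 1.0f : 0.0f)));
    return worst;
}
```

For any position and angles, DistanceFromIdentity(Mul(WorldToCamera(...), CameraToWorld(...))) should come out at floating-point noise level, which is what it means for one matrix to be the inverse of the other.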

In your code here:

Math::Matrix<float> r = Math::IdentityMatrix();
r *= Math::RotationMatrix(rot.X, 1.0f, 0, 0);
r *= Math::RotationMatrix(rot.Y, 0, 1.0f, 0);
r *= Math::RotationMatrix(rot.Z, 0, 0, 1.0f);

Vector3D<float> new_pos = Math::ApplyMatrix(tr, r);
currentCamera->SetPosition(currentCamera->Position() + new_pos);

You were defining the "CameraToWorldMatrix" as "first rotate around X, then Y, then Z, then translate".

However, when you invert this, you get something different from what you were using as your "WorldToCameraMatrix", which was (translate, then rotate around z, then rotate around y, then rotate around x).

Because your view matrix and camera matrix were not actually describing the same transformation, they get out of sync and you get weird behavior. Making the two agree, for example by issuing the rotations in OnRender() in the reverse order (Z, then Y, then X) as shown above, should bring them back in sync.
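To make the mismatch concrete, here is a small sketch of the rotation parts only. It is an illustration under assumptions, not your actual classes: Vec3, Mat3 and all the functions are hypothetical, and QuestionMoveRotation assumes that your 'r *= m' means 'r = r * m', which the question does not show. RoundTrip takes a camera-relative step, turns it into a world-space step with the given movement matrix, and then runs it through the view rotation. If the two matrices describe the same orientation, the step comes back unchanged.

```cpp
#include <array>
#include <cmath>

struct Vec3 { float x, y, z; };
using Mat3 = std::array<std::array<float, 3>, 3>;

Mat3 Mul(const Mat3& a, const Mat3& b)
{
    Mat3 r{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

Vec3 Apply(const Mat3& m, Vec3 v)
{
    return Vec3{ m[0][0] * v.x + m[0][1] * v.y + m[0][2] * v.z,
                 m[1][0] * v.x + m[1][1] * v.y + m[1][2] * v.z,
                 m[2][0] * v.x + m[2][1] * v.y + m[2][2] * v.z };
}

// Rotation by 'deg' degrees about a principal axis: 0 = X, 1 = Y, 2 = Z.
Mat3 Rotation(float deg, int axis)
{
    float rad = deg * 3.14159265358979f / 180.0f;
    float c = std::cos(rad), s = std::sin(rad);
    int a = (axis + 1) % 3, b = (axis + 2) % 3;
    Mat3 m{};
    m[axis][axis] = 1.0f;
    m[a][a] = c; m[a][b] = -s;
    m[b][a] = s; m[b][b] = c;
    return m;
}

// Rotation part of the view transform, in the order OnRender() issues it:
// -X, then -Y, then -Z.
Mat3 ViewRotation(Vec3 rot)
{
    return Mul(Mul(Rotation(-rot.x, 0), Rotation(-rot.y, 1)), Rotation(-rot.z, 2));
}

// Exact inverse of ViewRotation: opposite signs, reversed order (Z, Y, X).
Mat3 MatchingMoveRotation(Vec3 rot)
{
    return Mul(Mul(Rotation(rot.z, 2), Rotation(rot.y, 1)), Rotation(rot.x, 0));
}

// Order used in OnLogicUpdate() (X, Y, Z), assuming 'r *= m' post-multiplies.
Mat3 QuestionMoveRotation(Vec3 rot)
{
    return Mul(Mul(Rotation(rot.x, 0), Rotation(rot.y, 1)), Rotation(rot.z, 2));
}

// Where a camera-relative step ends up in view space after moving with 'move'.
Vec3 RoundTrip(const Mat3& move, Vec3 rot, Vec3 localStep)
{
    return Apply(ViewRotation(rot), Apply(move, localStep));
}
```

With rot = {30, 40, 0} and a forward step of {0, 0, -1}, the matching order returns the step unchanged, while the X, Y, Z order drifts to roughly (-0.07, -0.13, -0.99), so "forward" is no longer straight ahead on screen. With a rotation about a single axis the two orders coincide, which may be why everything looks fine at first and only breaks at some angles.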