What's the purpose of magic 4 of last row in m

While reading a book about WebGL, I came across the following matrix description:

The book (WebGL Beginner's Guide, by Diego Cantor and Brandon Jones) says this about the last row:

The mysterious fourth row: The fourth row does not bear any special meaning. Elements m4, m8, m12 are always zero. Element m16 (the homogeneous coordinate) will always be 1.

So, if the last row is always [ 0, 0, 0, 1 ], here is what I don't understand:

Why does it have to be strictly [ 0, 0, 0, 1 ]? Why can't all the values be 0, or some other value?

However, if you look at the source code of the glMatrix JavaScript library, specifically the translate() method of mat4 (https://github.com/toji/gl-matrix/blob/master/src/gl-matrix/mat4.js), you can see the following:

/**
 * Translate a mat4 by the given vector not using SIMD
 *
 * @param {mat4} out the receiving matrix
 * @param {mat4} a the matrix to translate
 * @param {vec3} v vector to translate by
 * @returns {mat4} out
 */
mat4.scalar.translate = function (out, a, v) {
    var x = v[0], y = v[1], z = v[2],
        a00, a01, a02, a03,
        a10, a11, a12, a13,
        a20, a21, a22, a23;

    if (a === out) {
        out[12] = a[0] * x + a[4] * y + a[8] * z + a[12];
        out[13] = a[1] * x + a[5] * y + a[9] * z + a[13];
        out[14] = a[2] * x + a[6] * y + a[10] * z + a[14];
        out[15] = a[3] * x + a[7] * y + a[11] * z + a[15];
    } else {
        a00 = a[0]; a01 = a[1]; a02 = a[2]; a03 = a[3];
        a10 = a[4]; a11 = a[5]; a12 = a[6]; a13 = a[7];
        a20 = a[8]; a21 = a[9]; a22 = a[10]; a23 = a[11];

        out[0] = a00; out[1] = a01; out[2] = a02; out[3] = a03;
        out[4] = a10; out[5] = a11; out[6] = a12; out[7] = a13;
        out[8] = a20; out[9] = a21; out[10] = a22; out[11] = a23;

        out[12] = a00 * x + a10 * y + a20 * z + a[12];
        out[13] = a01 * x + a11 * y + a21 * z + a[13];
        out[14] = a02 * x + a12 * y + a22 * z + a[14];
        out[15] = a03 * x + a13 * y + a23 * z + a[15];
    }

    return out;
};

I want to highlight this line:

out[15] = a03 * x + a13 * y + a23 * z + a[15];

The last element (the homogeneous coordinate) is being modified, so could it end up not equal to 1.0?

So I don't really understand...

I see that the inner 3x3 matrix represents rotations and that [ m13, m14, m15 ] is a translation vector that changes the origin position of the camera, but what about the last row, and why do I sometimes see calculations on it in libraries?

Also, I suppose there is a similar kind of "magic 3" in the 3x3 matrix used for 2D transformations, am I right?

Tags: opengl matrix camera webgl linear-algebra

1 answer

ゆ、 Hurt°

#2 · 2019-04-30 06:15

Let's start with a bit of theory:

In general, all transformations in OpenGL are mappings between vector spaces. This means that a transformation t takes an element of a space V and maps it to its corresponding element in a space W, which can be written as

t: V ---> W

One of the simplest mappings is a linear map, which can (under some assumptions**) always be represented by a matrix. The dimensions of the matrix are given by the dimensions of the vector spaces involved, so a mapping from R^N to R^M always looks like this:

t: R^N ---> R^M
t(x) = A * x, A ∈ R^(M×N)

where A is a matrix with M rows and N columns.

In OpenGL, we normally need mappings from R^3 to R^3, which means that linear mappings are always represented by a 3x3 matrix. With such a matrix one can express at least rotations and scalings (and combinations of these***). Translations, however, cannot be represented by a 3x3 matrix: a linear map always sends the origin to the origin, while a translation moves it. So we have to extend our transformations to support this operation as well.

This can be achieved by using affine mappings instead of linear ones, which are defined as

t: R^N ---> R^M
t(x) = A * x + b,  where A ∈ R^(M×N) is a linear transformation and b ∈ R^M

With this we can express rotations, scalings and translations from R^3 to R^3 by specifying a 3x3 matrix plus a 3D vector. Since this formulation is not very handy (it requires both a matrix and a vector, and combining multiple transformations is awkward), one normally stores the whole operation in a single matrix with one extra row and one extra column (4x4 for R^3), which is called an augmented matrix; the vectors get an extra component that is set to 1:

t: R^N ---> R^M

         -A-  b       x
t(x) = [        ] * [   ]
         -0-  1       1

As you can see, the last row of the matrix is always zero, except for the rightmost element, which is one. This also guarantees that the last component of the result t(x) is always 1.
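To make the augmented form concrete, here is a minimal sketch in JavaScript. It assumes the same column-major 16-element layout as the glMatrix code quoted in the question; the helper `transformVec4` and the matrix `affine` are hypothetical names made up for this illustration, not part of any library.

```javascript
// Multiply a column-major 4x4 matrix by a 4-component vector.
// Element (row, col) of the matrix is stored at index col * 4 + row.
function transformVec4(m, v) {
  const out = [0, 0, 0, 0];
  for (let row = 0; row < 4; row++) {
    for (let col = 0; col < 4; col++) {
      out[row] += m[col * 4 + row] * v[col];
    }
  }
  return out;
}

// A = uniform scaling by 2, b = (10, 20, 30), last row = [0, 0, 0, 1].
const affine = [
  2, 0, 0, 0,    // first column of A, then 0
  0, 2, 0, 0,    // second column of A, then 0
  0, 0, 2, 0,    // third column of A, then 0
  10, 20, 30, 1  // b, then 1
];

// The point (1, 1, 1) is extended with a trailing 1 before multiplying.
const p = transformVec4(affine, [1, 1, 1, 1]);
console.log(p); // [12, 22, 32, 1], i.e. A * x + b with the trailing 1 preserved
```

The first three components are exactly A * x + b, and the last component stays 1 because the last row is [0, 0, 0, 1].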

Why does it have to be strictly [ 0, 0, 0, 1 ]? Why can't all the values be 0, or some other value?

If we didn't restrict the last row to be exactly [0,0,0,1], we would no longer have an augmented affine mapping in R^3, but a general linear mapping in R^4. Since R^4 is not really relevant in OpenGL and we want to keep translations included, the last row is fixed. Another point is that combining affine mappings by matrix multiplication relies on this row: only then does the last component of every intermediate result stay 1, so that the next matrix applies its translation b unchanged.

One remaining problem is that we are still not able to express (perspective) projections using affine mappings. When looking at a perspective projection matrix in OpenGL, one will notice that the last row is not [0,0,0,1] there, but the theory behind this is a totally different story (if you are interested, have a look here or here).
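The point about combining mappings can be checked with a small sketch. It again assumes column-major 16-element arrays; `multiply`, `lastRow` and the example matrices are hypothetical names chosen for this illustration.

```javascript
// Product a * b of two column-major 4x4 matrices.
function multiply(a, b) {
  const out = new Array(16).fill(0);
  for (let col = 0; col < 4; col++) {
    for (let row = 0; row < 4; row++) {
      for (let k = 0; k < 4; k++) {
        out[col * 4 + row] += a[k * 4 + row] * b[col * 4 + k];
      }
    }
  }
  return out;
}

// In column-major storage the last row sits at indices 3, 7, 11, 15.
function lastRow(m) {
  return [m[3], m[7], m[11], m[15]];
}

const translate = [1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  1, 2, 3, 1];
const scale     = [2, 0, 0, 0,  0, 2, 0, 0,  0, 0, 2, 0,  0, 0, 0, 1];

// Two affine matrices: the product is affine again.
const product = multiply(translate, scale);
console.log(lastRow(product)); // [0, 0, 0, 1]

// Same scaling, but with the last row changed to [0, 0, 0, 2].
const notAffine = scale.slice();
notAffine[15] = 2;
const brokenProduct = multiply(translate, notAffine);
console.log(lastRow(brokenProduct)); // [0, 0, 0, 2]
```

In the second case the product no longer has the augmented form, so it cannot be read as "a 3x3 matrix A plus a translation b" without first dividing by the last component.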

What about the last row, and why do I sometimes see calculations on it in libraries? The last element (the homogeneous coordinate) is being modified, so could it end up not equal to 1.0?

As already said, the last row is only [0,0,0,1] for affine mappings, not for projective ones. But sometimes it makes sense to apply transformations after a projection (for example moving the projected image on screen), then the last row of the matrix has to be respected. That's why most matrix libraries implement all operations in a way that allows for general matrices. The line

out[15] = a03 * x + a13 * y + a23 * z + a[15];

will result in 1 as long as the last row (a03, a13, a23, a[15]) equals [0,0,0,1].

Since this post has already become a lot longer than I expected, I'd better stop here, but if you have any further questions, just ask and I will try to add something to the answer.
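As a quick check of that expression, the sketch below evaluates it for an affine matrix and for a matrix with a projective last row. The function name and both matrices are made up for this illustration; in particular, `projective` is a hand-written stand-in whose last row is [0, 0, -1, 0], as is typical for OpenGL-style perspective matrices, and it is not produced by glMatrix.

```javascript
// The same expression as the highlighted line, written with plain indices.
// In a column-major 16-element array, a[3], a[7], a[11], a[15] form the last row.
function lastElementAfterTranslate(a, v) {
  const [x, y, z] = v;
  return a[3] * x + a[7] * y + a[11] * z + a[15];
}

// Affine case: last row is [0, 0, 0, 1].
const identity = [1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1];

// Hypothetical projective case: last row is [0, 0, -1, 0].
const projective = [1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, -1,  0, 0, 0, 0];

const wAffine = lastElementAfterTranslate(identity, [4, 5, 6]);
const wProjective = lastElementAfterTranslate(projective, [0, 0, -5]);

console.log(wAffine);     // 1: the translation vector has no influence
console.log(wProjective); // 5: the element depends on the translation
```

So the general formula costs nothing for affine matrices and keeps the result correct when the matrix already contains a projection.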

Footnotes:

** Works when both spaces are finite-dimensional vector spaces and a basis is defined for both of them.

*** Combinations, since the composition of linear transformations between finite-dimensional spaces is also linear, e.g., t: R^N -> R^M, u: R^M -> R^K, both linear => u(t(x)) is linear
