Chapter 4: How the Various Transformations Work (OpenGL Version)

Latest recommended article published on 2024-09-05 10:03:06

seamanj

Categories: Math in 3D, The Cg Tutorial, GL

Article link: https://blog.youkuaiyun.com/seamanj/article/details/8606345

The perspective projection maps the view frustum to the cube representing homogeneous clip space.

Homogeneous clip space is so named because it is in this space that graphics primitives are clipped to the boundaries of the visible region of the scene, ensuring that no attempt is made to render any part of a primitive that falls outside the viewport.

In homogeneous clip space, vertices have normalized device coordinates. The term normalized pertains to the fact that the x, y, and z coordinates of each vertex fall in the range [-1,1], but reflect the final positions in which they will appear in the viewport. The vertices must undergo one more transformation, called the viewport transformation, that maps the normalized coordinates to the actual range of pixel coordinates covered by the viewport. The z coordinate is usually mapped to the floating-point range [0,1], but this is subsequently scaled to the integer range corresponding to the number of bits per pixel utilized by the depth buffer. After the viewport transformation, vertex positions are said to lie in window space.

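OpenGL provides glFrustum and gluPerspective for perspective projection, and glOrtho for orthographic projection.

Any plane parallel to the near plane could serve as the projection plane; OpenGL fixes the projection plane to be the near plane itself.

First, the declaration of glFrustum:

void glFrustum(GLdouble left, GLdouble right, GLdouble bottom, GLdouble top, GLdouble near, GLdouble far);

Note: for this function the projection plane lies at z = -n.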
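The matrix itself is not shown at this point, so here is a minimal sketch that assumes the standard form documented for glFrustum. The names `buildFrustumMatrix` and `projectPoint` are made up for this example; the row-major layout follows the other builders in this article, except that the sketch stores doubles rather than floats.

```c
#include <assert.h>
#include <math.h>

/* Row-major 4x4 frustum matrix (assumed standard glFrustum form).
   l, r, b, t are the extents of the near-plane rectangle; n and f are
   the positive distances to the near and far planes. */
static void buildFrustumMatrix(double l, double r, double b, double t,
                               double n, double f, double m[16])
{
  int i;
  /* Each difference is used as a divisor below. */
  assert(r != l && t != b && f != n);
  for (i = 0; i < 16; ++i) m[i] = 0.0;
  m[0*4+0] = 2.0 * n / (r - l);
  m[0*4+2] = (r + l) / (r - l);
  m[1*4+1] = 2.0 * n / (t - b);
  m[1*4+2] = (t + b) / (t - b);
  m[2*4+2] = -(f + n) / (f - n);
  m[2*4+3] = -2.0 * f * n / (f - n);
  m[3*4+2] = -1.0;
}

/* Multiply the matrix by (x, y, z, 1), then divide by w to obtain
   normalized device coordinates. */
static void projectPoint(const double m[16], double x, double y, double z,
                         double ndc[3])
{
  double w = m[12]*x + m[13]*y + m[14]*z + m[15];
  ndc[0] = (m[0]*x + m[1]*y + m[2]*z  + m[3])  / w;
  ndc[1] = (m[4]*x + m[5]*y + m[6]*z  + m[7])  / w;
  ndc[2] = (m[8]*x + m[9]*y + m[10]*z + m[11]) / w;
}
```

With this form, the corner (right, top, -n) of the near plane should land on (1, 1, -1), and the matching corner of the far plane on z'' = 1.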

Next we show why 1/z satisfies linear interpolation.
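The derivation that belongs here is not present on the page, so the following is a reconstruction under stated assumptions rather than the author's original. It assumes the projection plane z = -n, the projected coordinate x' = -nx/z, and a line ax + bz = c (with c ≠ 0) in the xz-plane of eye space. The symbols a, b, c, t and the subscripts 1, 2 are introduced only for this sketch.

```latex
\begin{aligned}
x' &= -\frac{n\,x}{z} \;\Longrightarrow\; x = -\frac{x' z}{n}, \\
a x + b z = c \;&\Longrightarrow\; -\frac{a\,x' z}{n} + b z = c
  \;\Longrightarrow\; \frac{1}{z} = \frac{b}{c} - \frac{a}{c\,n}\,x', \\
x'(t) = (1-t)\,x'_1 + t\,x'_2 \;&\Longrightarrow\;
  \frac{1}{z(t)} = (1-t)\,\frac{1}{z_1} + t\,\frac{1}{z_2}.
\end{aligned}
```

So along the projected segment it is 1/z, not z, that varies linearly with the screen-space parameter t. The same argument applies with y' in place of x'.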

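Now we show why x' and y' must satisfy linear interpolation.

Three-dimensional objects are projected onto the two-dimensional projection plane and finally displayed in a window. The GPU interpolates in terms of the two-dimensional window coordinates, using what I will call linear-related interpolation (in essence it is linear interpolation; the name is explained later).

A linear transformation does not disturb linear interpolation (proved below), and the mappings from the projected coordinates (x', y', z') to normalized device coordinates (x'', y'', z'') and on to window coordinates (x''', y''', z''') are all linear (strictly speaking, affine). Rasterization interpolates (x''', y''', z''') linearly, so (x'', y'', z'') must satisfy linear interpolation, and therefore so must (x', y'). Note that z'' is a function of 1/z and does not depend on z'. That is why we must require x' and y' to satisfy linear interpolation.

Here we prove that a linear transformation does not affect linear interpolation.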
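The proof itself is not reproduced on the page. A short version, written as a sketch with symbols (M, b, p_0, p_1, t) chosen here for illustration, covers the affine case T(p) = Mp + b, which includes every linear map as the special case b = 0:

```latex
\begin{aligned}
T(\mathbf{p}) &= M\mathbf{p} + \mathbf{b}, \qquad
\mathbf{p}(t) = (1-t)\,\mathbf{p}_0 + t\,\mathbf{p}_1, \\
T(\mathbf{p}(t)) &= (1-t)\,M\mathbf{p}_0 + t\,M\mathbf{p}_1 + \mathbf{b} \\
 &= (1-t)\,(M\mathbf{p}_0 + \mathbf{b}) + t\,(M\mathbf{p}_1 + \mathbf{b})
  = (1-t)\,T(\mathbf{p}_0) + t\,T(\mathbf{p}_1).
\end{aligned}
```

The step from the second line to the third uses (1-t)b + tb = b. The transformed point is therefore the same blend, with the same parameter t, of the transformed endpoints.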

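Next we explain the name linear-related interpolation.

Some interpolation schemes are not written directly as linear interpolation, yet are linear interpolation in essence.

During rasterization a triangle has to be shaded by iterating over every pixel in its bounding rectangle. In the pseudocode below, (xi, yi) and ci are the window coordinates and colors of the three vertices, and fij(x, y) = 0 is the implicit equation of the line through vertices i and j. The weights alpha, beta, and gamma are each linear in x and y, which is why the scheme is linear interpolation in disguise: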

xmin = floor(min(x0, x1, x2))
xmax = ceiling(max(x0, x1, x2))
ymin = floor(min(y0, y1, y2))
ymax = ceiling(max(y0, y1, y2))
for all y = ymin to ymax do
    for all x = xmin to xmax do
        alpha = f12(x, y) / f12(x0, y0)
        beta  = f20(x, y) / f20(x1, y1)
        gamma = f01(x, y) / f01(x2, y2)
        if (alpha > 0 and beta > 0 and gamma > 0) then
            c = alpha*c0 + beta*c1 + gamma*c2
            drawpixel(x, y) with color c

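Now for the gluPerspective function:

void gluPerspective(GLdouble fovy, GLdouble aspect, GLdouble near, GLdouble far)

This function again uses z = -n as the projection plane.

Note: fovy is the vertical field-of-view angle, measured in the yz-plane.

The following function builds this matrix: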

#include <assert.h>
#include <math.h>

static const double myPi = 3.14159265358979323846;
static void buildPerspectiveMatrix(double fieldOfView,
                                   double aspectRatio,
                                   double zNear, double zFar,
                                   float m[16])
{
  double sine, cotangent, deltaZ;
  double radians = fieldOfView / 2.0 * myPi / 180.0;
  
  deltaZ = zFar - zNear;
  sine = sin(radians);
  /* Should be non-zero to avoid division by zero. */
  assert(deltaZ);
  assert(sine);
  assert(aspectRatio);
  cotangent = cos(radians) / sine;

  m[0*4+0] = cotangent / aspectRatio;
  m[0*4+1] = 0.0;
  m[0*4+2] = 0.0;
  m[0*4+3] = 0.0;
  
  m[1*4+0] = 0.0;
  m[1*4+1] = cotangent;
  m[1*4+2] = 0.0;
  m[1*4+3] = 0.0;
  
  m[2*4+0] = 0.0;
  m[2*4+1] = 0.0;
  m[2*4+2] = -(zFar + zNear) / deltaZ;
  m[2*4+3] = -2 * zNear * zFar / deltaZ;
  
  m[3*4+0] = 0.0;
  m[3*4+1] = 0.0;
  m[3*4+2] = -1;
  m[3*4+3] = 0;
}

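Next, the orthographic projection function glOrtho:

void glOrtho(GLdouble left, GLdouble right, GLdouble bottom, GLdouble top, GLdouble near, GLdouble far)

Next we explain why, in this case, z itself satisfies linear interpolation.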
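The explanation is not reproduced on the page. As a hedged sketch, assume the standard glOrtho matrix: its last row is (0, 0, 0, 1), so there is no perspective divide and z'' = -2z/(f - n) - (f + n)/(f - n) is an affine function of the eye-space z. An affine function of z interpolates linearly exactly when z does. The names `buildOrthoMatrix` and `orthoDepth` are invented for this example.

```c
#include <assert.h>
#include <math.h>

/* Row-major 4x4 orthographic matrix (assumed standard glOrtho form). */
static void buildOrthoMatrix(double l, double r, double b, double t,
                             double n, double f, double m[16])
{
  int i;
  /* Each difference is used as a divisor below. */
  assert(r != l && t != b && f != n);
  for (i = 0; i < 16; ++i) m[i] = 0.0;
  m[0*4+0] = 2.0 / (r - l);   m[0*4+3] = -(r + l) / (r - l);
  m[1*4+1] = 2.0 / (t - b);   m[1*4+3] = -(t + b) / (t - b);
  m[2*4+2] = -2.0 / (f - n);  m[2*4+3] = -(f + n) / (f - n);
  m[3*4+3] = 1.0;
}

/* Normalized depth of an eye-space z; w stays 1, so no divide is needed. */
static double orthoDepth(const double m[16], double z)
{
  return m[2*4+2] * z + m[2*4+3];
}
```

With n = 1 and f = 5, the eye-space depths -1, -3, and -5 map to -1, 0, and 1: the midpoint in z is also the midpoint in z'', which is not the case for the perspective projection.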

Now for gluLookAt, the function that supplies the view transformation:

void gluLookAt(GLdouble eyex, GLdouble eyey, GLdouble eyez, GLdouble centerx, GLdouble centery, GLdouble centerz, GLdouble upx, GLdouble upy, GLdouble upz)

Here is an implementation of gluLookAt:

/* Build a row-major (C-style) 4x4 matrix transform based on the
   parameters for gluLookAt. */
static void buildLookAtMatrix(double eyex, double eyey, double eyez,
                              double centerx, double centery, double centerz,
                              double upx, double upy, double upz,
                              float m[16])
{
  double x[3], y[3], z[3], mag;

  /* Difference eye and center vectors to make Z vector. */
  z[0] = eyex - centerx;
  z[1] = eyey - centery;
  z[2] = eyez - centerz;
  /* Normalize Z. */
  mag = sqrt(z[0]*z[0] + z[1]*z[1] + z[2]*z[2]);
  if (mag) {
    z[0] /= mag;
    z[1] /= mag;
    z[2] /= mag;
  }

  /* Up vector makes Y vector. */
  y[0] = upx;
  y[1] = upy;
  y[2] = upz;

  /* X vector = Y cross Z. */
  x[0] =  y[1]*z[2] - y[2]*z[1];
  x[1] = -y[0]*z[2] + y[2]*z[0];
  x[2] =  y[0]*z[1] - y[1]*z[0];

  /* Recompute Y = Z cross X. */
  y[0] =  z[1]*x[2] - z[2]*x[1];
  y[1] = -z[0]*x[2] + z[2]*x[0];
  y[2] =  z[0]*x[1] - z[1]*x[0];

  /* Normalize X. */
  mag = sqrt(x[0]*x[0] + x[1]*x[1] + x[2]*x[2]);
  if (mag) {
    x[0] /= mag;
    x[1] /= mag;
    x[2] /= mag;
  }

  /* Normalize Y. */
  mag = sqrt(y[0]*y[0] + y[1]*y[1] + y[2]*y[2]);
  if (mag) {
    y[0] /= mag;
    y[1] /= mag;
    y[2] /= mag;
  }

  /* Build resulting view matrix. */
  m[0*4+0] = x[0];  m[0*4+1] = x[1];
  m[0*4+2] = x[2];  m[0*4+3] = -x[0]*eyex + -x[1]*eyey + -x[2]*eyez;

  m[1*4+0] = y[0];  m[1*4+1] = y[1];
  m[1*4+2] = y[2];  m[1*4+3] = -y[0]*eyex + -y[1]*eyey + -y[2]*eyez;

  m[2*4+0] = z[0];  m[2*4+1] = z[1];
  m[2*4+2] = z[2];  m[2*4+3] = -z[0]*eyex + -z[1]*eyey + -z[2]*eyez;

  m[3*4+0] = 0.0;   m[3*4+1] = 0.0;  m[3*4+2] = 0.0;  m[3*4+3] = 1.0;
}

Next we explain why a rotation matrix is an orthogonal matrix.

Every rigid, orientation-preserving, linear transformation is a rotation.

Every rotation is obviously a rigid, orientation-preserving, linear transformation.

The concepts involved are defined as follows:

A transformation on R^2 is any mapping A : R^2 → R^2. That is, each point x ∈ R^2 is mapped
to a unique point, A(x), also in R^2.
Definition Let A be a transformation. A is a linear transformation provided the following two
conditions hold:
1. For all α ∈ R and all x ∈ R^2, A(αx) = αA(x).
2. For all x, y ∈ R^2, A(x + y) = A(x) + A(y).
Note that A(0) = 0 for any linear transformation A. This follows from condition 1 with α = 0.

One simple, but important, kind of transformation is a “translation,” which changes the
position of objects by a fixed amount but does not change the orientation or shape of geometric
objects.

Definition A transformation A is a translation provided that there is a fixed u ∈ R^2 such that
A(x) = x + u for all x ∈ R^2.
The notation T_u is used to denote this translation, thus T_u(x) = x + u.

The composition of two transformations A and B is the transformation computed by first
applying B and then applying A. This transformation is denoted A ◦ B, or just AB, and satisfies
(A ◦ B)(x) = A(B(x)).
The identity transformation maps every point to itself. The inverse of a transformation A is
the transformation A^(−1) such that A ◦ A^(−1) and A^(−1) ◦ A are both the identity transformation.
Not every transformation has an inverse, but when A is one-to-one and onto, the inverse
transformation A^(−1) always exists.
Note that the inverse of T_u is T_−u.

Definition A transformation A is affine provided it can be written as the composition of a
translation and a linear transformation. That is, provided it can be written in the form A = T_u B
for some u ∈ R^2 and some linear transformation B.

In other words, a transformation A is affine if it equals
A(x) = B(x) + u,    (II.1)
with B a linear transformation and u a point.

Because it is permitted that u = 0, every linear transformation is affine. However, not every
affine transformation is linear. In particular, if u ≠ 0, then transformation II.1 is not linear
since it does not map 0 to 0.

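From this we see that linear transformations can represent scaling, shearing, rotation, reflection about an axis, and reflection through a point (which is really the generalized rotation discussed later).

An affine transformation can represent translation as well: it is the composition of a linear transformation and a translation.

(Any affine transformation is the composition of a linear transformation and a translation.)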

Proposition II.1 Let A be an affine transformation. The translation vector u and the linear
transformation B are uniquely determined by A.
Proof First, we see how to determine u from A. We claim that in fact u = A(0). This is proved
by the following equalities:
A(0) = T_u(B(0)) = T_u(0) = 0 + u = u.
Then B = T_u^(−1) A = T_(−u) A, and so B is also uniquely determined.

II.1.2 Matrix Representation of Linear Transformations
The preceding mathematical definition of linear transformations is stated rather abstractly.
However, there is a very concrete way to represent a linear transformation A – namely, as a
2 × 2 matrix.

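Note that the matrix here is exactly the inverse of the one built by gluLookAt. Here the coordinates both before and after the transformation are expressed in the original coordinate system, whereas after gluLookAt the coordinates are expressed in the new coordinate system. For example, if (1, 0) in the original system is moved to (0, 1) in the original system, then (1, 0) in the new system is what (0, 1) in the original system has become.

So here (1, 0) becomes (0, 1), whereas gluLookAt turns (0, 1) into (1, 0); the two matrices are therefore inverses of each other.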

A rigid transformation is a transformation that only repositions objects, leaving their shape and
size unchanged. If the rigid transformation also preserves the notions of “clockwise” versus
“counterclockwise,” then it is orientation-preserving.

Definition: A transformation is called rigid if and only if it preserves both
1. Distances between points, and
2. Angles between lines.

A rigid transformation is one that preserves the size and shape of an object and changes
only its position and orientation.

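So rigid transformations include translation, rotation, and reflection (about an axis; reflection through a point is counted as a rotation, and every reflection mentioned from here on is about an axis).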

The transformation is said to be orientation-preserving if it preserves the direction of angles,
that is, if a counterclockwise direction of movement stays counterclockwise after being
transformed by A.

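We define an orientation-preserving transformation to be one that preserves “right-handedness.”
Formally, we say that A is orientation-preserving provided that (A(u) × A(v)) · A(u × v) > 0 for all noncollinear u, v ∈ R^3.

This is the three-dimensional definition. Two dimensions have no cross product, but orientation preservation still makes sense there and can be checked with the following criterion.

Let M = (u, v). Let u' be u rotated counterclockwise 90°. Then M is orientation-preserving if and only if u' · v > 0.

Orientation preservation means that if the vectors obey some rule (for example, the right-hand rule) before the transformation, they still obey it afterwards; that is, a right-handed coordinate system remains right-handed after the transformation.

Rotation preserves orientation; reflection does not.

A transformation that is both linear and rigid is therefore a rotation or a reflection: linear transformations include scaling, shearing, rotation, and reflection, while rigid transformations include rotation, translation, and reflection. Of the two that remain, only rotation preserves orientation.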

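The linear transformation represented by the matrix M is rigid if and only if ||u|| = ||v|| = 1 and u · v = 0. (This is proved by showing that inner products are preserved.)

If M represents a rigid transformation, then det(M) = ±1. (Translation is not considered here, since its linear part is always the identity matrix E.)

The proof is as follows.

By the previous result, the transpose M^T times M equals E, so M is an orthogonal matrix. (Why M^T M rather than M M^T? Because it is the column vectors that are unit vectors and mutually orthogonal. Once M^T M = E is established, it follows that the row vectors are also unit vectors and mutually orthogonal.)

Since M^T M = E and det(M^T) = det(M), we get |M|^2 = 1 and hence |M| = ±1, which proves the claim.

The matrix M above is an orthogonal matrix.

This shows that the matrices of rotations and of reflections (about an axis only) are orthogonal matrices. To tell whether a given matrix is a rotation or a reflection, see the following proposition.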

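The linear transformation represented by the matrix M is orientation-preserving if and only if det(M) > 0.

Proof:

Earlier we stated:

Let M = (u, v). Let u' be u rotated counterclockwise 90°. Then M is orientation-preserving if and only if u' · v > 0.

This criterion can be used to check orientation preservation.

u' = <-u2, u1>, so u' · v = u1v2 - u2v1

|M| = u1v2 - u2v1 = u' · v, which proves the claim.

In other words, if the transformation matrix is orthogonal and its determinant is 1, it is a rotation (rotation preserves orientation); if the determinant is -1, it is a reflection (reflection does not preserve orientation).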
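The two tests (orthonormal columns, then the sign of the determinant) can be sketched in C for the 2x2 case. The enum, the function name `classify2x2`, and the tolerance parameter `eps` are assumptions made for this illustration.

```c
#include <assert.h>
#include <math.h>

typedef enum { NOT_RIGID, ROTATION, REFLECTION } RigidKind;

/* Classify the matrix M = (u, v) = [a b; c d], whose columns are
   u = (a, c) and v = (b, d). eps is the tolerance for the unit-length
   and orthogonality tests. */
static RigidKind classify2x2(double a, double b, double c, double d,
                             double eps)
{
  double uu = a * a + c * c;   /* ||u||^2 */
  double vv = b * b + d * d;   /* ||v||^2 */
  double uv = a * b + c * d;   /* u . v   */
  double det = a * d - b * c;  /* equals u' . v */
  assert(eps >= 0.0);
  if (fabs(uu - 1.0) > eps || fabs(vv - 1.0) > eps || fabs(uv) > eps)
    return NOT_RIGID;
  return det > 0.0 ? ROTATION : REFLECTION;
}
```

A 90° rotation [0 -1; 1 0] has determinant 1 and is reported as a rotation; the reflection [1 0; 0 -1] has determinant -1; a uniform scale by 2 fails the unit-length test and is not rigid at all.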

Every rigid, orientation-preserving, linear transformation is a rotation.

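Every rotation is obviously a rigid, orientation-preserving, linear transformation.

So we have reached our goal: a rotation matrix is an orthogonal matrix, and moreover its determinant equals 1.

For completeness, let us finish the affine case as well.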

Every rigid, orientation-preserving, affine transformation can be (uniquely) expressed as the composition of a translation and a rotation.

(Note that the statement above is for two dimensions.)

Every rigid, orientation-preserving, affine transformation is either a translation or a generalized rotation (that is, a rotation combined with a translation).

Obviously, the converse of this theorem holds too.

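That finally completes the explanation of why a rotation matrix is an orthogonal matrix.

Next we look at the viewport transformation (not to be confused with the view transformation).

OpenGL provides the glViewport function for the viewport transformation:

void glViewport(GLint Ox,GLint Oy,GLsizei width,GLsizei height)

(Ox, Oy) specifies the lower-left corner of the viewport.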

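So the viewport transformation maps

x'' -> x''': (-1, 1) -> (Ox, Ox + width)

y'' -> y''': (-1, 1) -> (Oy, Oy + height)

z'' -> z''': (-1, 1) -> (MinZ, MaxZ), where MinZ and MaxZ are the arguments passed to glDepthRange, usually 0 and 1 (explained in detail shortly).

Since linear interpolation of (x'', y'', z'') has to be preserved, each coordinate is mapped linearly; let

x''' = A'x'' + B'

y''' = A''y'' + B''

z''' = A'''z'' + B'''

Substituting the endpoint conditions gives:

A' = width/2, B' = width/2 + Ox

A'' = height/2, B'' = height/2 + Oy

A''' = (MaxZ - MinZ)/2, B''' = (MaxZ + MinZ)/2

So the viewport transformation matrix is

| A'   0    0     B'   |
| 0    A''  0     B''  |
| 0    0    A'''  B''' |
| 0    0    0     1    |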
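As a sketch, the coefficients derived above can be assembled into a row-major 4x4 matrix and applied to a point in normalized device coordinates. The names `buildViewportMatrix` and `ndcToWindow` are hypothetical, and the layout follows the other matrix builders in this article.

```c
#include <assert.h>
#include <math.h>

/* Row-major viewport matrix: diagonal (A', A'', A''', 1) with the
   offsets (B', B'', B''') in the last column. */
static void buildViewportMatrix(double ox, double oy,
                                double width, double height,
                                double minZ, double maxZ, double m[16])
{
  int i;
  assert(width > 0.0 && height > 0.0);
  for (i = 0; i < 16; ++i) m[i] = 0.0;
  m[0*4+0] = width / 2.0;          m[0*4+3] = width / 2.0 + ox;
  m[1*4+1] = height / 2.0;         m[1*4+3] = height / 2.0 + oy;
  m[2*4+2] = (maxZ - minZ) / 2.0;  m[2*4+3] = (maxZ + minZ) / 2.0;
  m[3*4+3] = 1.0;
}

/* Apply the matrix to (ndc, 1); w stays 1, so no divide is needed. */
static void ndcToWindow(const double m[16], const double ndc[3],
                        double win[3])
{
  int r;
  for (r = 0; r < 3; ++r)
    win[r] = m[r*4+0] * ndc[0] + m[r*4+1] * ndc[1]
           + m[r*4+2] * ndc[2] + m[r*4+3];
}
```

For a 640 x 480 viewport at (10, 20) with depth range [0, 1], the point (-1, -1, -1) maps to (10, 20, 0) and the center (0, 0, 0) maps to (330, 260, 0.5).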

Next, the glDepthRange function.

First, recall a passage quoted earlier:

In homogeneous clip space, vertices have normalized device coordinates. The term normalized pertains to the fact that the x, y, and z coordinates of each vertex fall in the range [-1,1], but reflect the final positions in which they will appear in the viewport. The vertices must undergo one more transformation, called the viewport transformation, that maps the normalized coordinates to the actual range of pixel coordinates covered by the viewport. The z coordinate is usually mapped to the floating-point range [0,1], but this is subsequently scaled to the integer range corresponding to the number of bits per pixel utilized by the depth buffer. After the viewport transformation, vertex positions are said to lie in window space.

Also, consider this important question:

Is the depth coordinate the same as the Z-coordinate you originally specify in glVertex?

No, it isn't. Object Coordinates are transformed by the ModelView matrix to produce Eye Coordinates. Object Coordinates are the raw coordinates you submit to OpenGL with a call to glVertex*() or glVertexPointer(). They represent the coordinates of your object or other geometry you want to render. So the Z-coordinate there is only the third dimension of your Object Coordinates. The "depth" value (the Z-coordinate in Window Space) is the distance from your camera position to the rasterized pixel (that is, on the projection plane). Only it is not the actual distance (in Eye Coordinate units); it is typically a value between 0 and 1, where 0 is exactly on your near plane and 1 is exactly on your far plane.

The depth values that fall into the clip region are typically from 0.0 to 1.0.

The passages above say that the window z coordinate is usually mapped to [0, 1]. If a different range is wanted, the glDepthRange function mentioned earlier is needed:

void glDepthRange(GLclampd near, GLclampd far);
Defines an encoding for z-coordinates that’s performed during the
viewport transformation. The near and far values represent adjustments
to the minimum and maximum values that can be stored in the depth
buffer. By default, they’re 0.0 and 1.0, respectively, which work for most
applications. These parameters are clamped to lie within [0, 1].

Finally, the GPU converts the window z coordinate to an integer and stores it in the depth buffer.

The depth value is converted to an integer and finally written to the depth buffer by the GPU. The depth buffer is typically stored as a 16-bit, 24-bit, or 32-bit integer.

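Why does this matter? Because depth precision is not spread evenly over the view volume.

First look at how z is transformed by the perspective projection.

Note that z here is the coordinate in camera (eye) space.

z -> z'': (-n, -f) -> (-1, 1)

The mapping is z'' = (f+n)/(f-n) + 2fn/((f-n)z). Differentiating with respect to z gives dz''/dz = 2fn/(f-n) * (-1/z^2).

So as z goes from -n to -f, the magnitude of dz''/dz decreases steadily: the same step in eye-space z produces a smaller and smaller change in z'', which means that depth buffer precision gets worse with distance.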
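The effect is easy to check numerically. The sketch below evaluates the mapping and its derivative for the perspective matrix used in this article; the function names `ndcDepth` and `ndcDepthSlope` are invented for the example.

```c
#include <assert.h>
#include <math.h>

/* z'' for an eye-space z in [-f, -n] under the perspective projection. */
static double ndcDepth(double z, double n, double f)
{
  assert(n > 0.0 && f > n && z < 0.0);
  return (f + n) / (f - n) + 2.0 * f * n / ((f - n) * z);
}

/* dz''/dz at the same point; it is negative because z decreases from
   -n to -f while z'' increases from -1 to 1. */
static double ndcDepthSlope(double z, double n, double f)
{
  assert(n > 0.0 && f > n && z < 0.0);
  return -2.0 * f * n / ((f - n) * z * z);
}
```

With n = 1 and f = 100, a point at z = -2 already has z'' = 1/99, so more than half of the [-1, 1] range is spent on roughly the first 1% of the distance between the planes. The slope at the near plane is (f/n)^2 = 10000 times larger in magnitude than at the far plane, which is why pushing the near plane out helps depth precision so much.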

Next, rotation in 3D.

OpenGL provides glRotate for rotation:

void glRotate{fd}(TYPE angle,TYPE x,TYPE y,TYPE z)

where angle is given in degrees.

Figure 4.4. Rotation about an arbitrary axis.

The code that builds the rotation matrix is:

static void makeRotateMatrix(float angle,
                             float ax, float ay, float az,
                             float m[16])
{
  float radians, sine, cosine, ab, bc, ca, tx, ty, tz;
  float axis[3];
  float mag;

  axis[0] = ax;
  axis[1] = ay;
  axis[2] = az;
  mag = sqrt(axis[0]*axis[0] + axis[1]*axis[1] + axis[2]*axis[2]);
  if (mag) {
    axis[0] /= mag;
    axis[1] /= mag;
    axis[2] /= mag;
  }

  radians = angle * myPi / 180.0;
  sine = sin(radians);
  cosine = cos(radians);
  ab = axis[0] * axis[1] * (1 - cosine);
  bc = axis[1] * axis[2] * (1 - cosine);
  ca = axis[2] * axis[0] * (1 - cosine);
  tx = axis[0] * axis[0];
  ty = axis[1] * axis[1];
  tz = axis[2] * axis[2];

  m[0]  = cosine + (1 - cosine) * tx;
  m[1]  = ab - axis[2] * sine;
  m[2]  = ca + axis[1] * sine;
  m[3]  = 0.0f;
  m[4]  = ab + axis[2] * sine;
  m[5]  = cosine + (1 - cosine) * ty;
  m[6]  = bc - axis[0] * sine;
  m[7]  = 0.0f;
  m[8]  = ca - axis[1] * sine;
  m[9]  = bc + axis[0] * sine;
  m[10] = cosine + (1 - cosine) * tz;
  m[11] = 0;
  m[12] = 0;
  m[13] = 0;
  m[14] = 0;
  m[15] = 1;
}
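For reference, the entries assigned above agree with the usual axis-angle (Rodrigues) form of a rotation by θ about a unit axis a = (a_x, a_y, a_z). The formula below is written in row-major order to match m[0..15]; it is included as a cross-check, not as the derivation the original figure illustrated.

```latex
R = \cos\theta\, I + (1-\cos\theta)\,\mathbf{a}\mathbf{a}^{T} + \sin\theta\,[\mathbf{a}]_{\times},
\qquad
[\mathbf{a}]_{\times} =
\begin{pmatrix}
0 & -a_z & a_y \\
a_z & 0 & -a_x \\
-a_y & a_x & 0
\end{pmatrix}
```

For example, the entry in row 0, column 1 is (1 - cos θ) a_x a_y - a_z sin θ, which is exactly `ab - axis[2] * sine` stored in m[1].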

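OK, that is all for now. Matrix inversion was meant to be covered here too; I will open a separate topic devoted to the various ways of inverting a matrix and their implementations. Stay tuned!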