Transformations in Computer Graphics



Transformations are central in computer graphics. They are used to map from one space to another along the graphics pipeline.

Affine Transforms

A very good source for affine maps is Gerald Farin's book [Far93], ``Curves and Surfaces for CAGD: A Practical Approach.'' Some of this introductory material comes from Farin's text.

Let $a,\,b\in \mathbf{E}^3$ be two points in three dimensional Euclidean space $\mathbf{E}^3$. Their difference

\begin{displaymath}\vec{v}=b-a\in \mathbf{R}^3\end{displaymath}

is the vector from $a$ to $b$ in the three dimensional linear space $\mathbf{R}^3$. Vectors can be added, subtracted, and multiplied by constants. (Note that the restriction to $\mathbf{E}^3$ is not essential: we could generalize to any dimension $\mathbf{E}^m$, but three dimensions is most useful for computer graphics.)

Points can be subtracted, but addition and scalar multiplication of points are not defined. A point can have a vector added to it to form another point:

\begin{displaymath}b = a + \vec{v}.\end{displaymath}

This is a translation of point $a$ along vector $\vec{v}$ to point $b$.

Points determine position; vectors determine direction and magnitude. For any two points $a$ and $b$ there is exactly one vector $\vec{v}=b-a$ from $a$ to $b$. However, given $\vec{v}$ there are infinitely many pairs of points that determine $\vec{v}$. Indeed, if $\vec{v}=b-a$ and $\vec{w}$ is any vector, then

\begin{displaymath}\vec{v} = (b+\vec{w}) - (a+\vec{w}) = b-a.\end{displaymath}

Now let $a_0,\,a_1,\ldots,\,a_n$ be $n+1$ points in $\mathbf{E}^3$, and let $\alpha_0,\,\alpha_1,\ldots,\,\alpha_n$ be $n+1$ real numbers (weights) that sum to 1. We define the barycentric (or affine) combination of these points to be

\begin{displaymath}a=\sum_{j=0}^{n}\alpha_ja_j,\,a_j\in \mathbf{E}^3,\,\sum_{j=0}^{n}\alpha_j=1.\end{displaymath}

This looks as if we've invalidated our statement that points cannot be added, but the fact that the weights add to one allows us to write the barycentric combination as a point plus a sum of vectors. That is,

\begin{displaymath}a=a_0 + \sum_{j=1}^{n}\alpha_j(a_j-a_0).\end{displaymath}

An important special case of barycentric combinations is the convex combination. Here we require that the weights be non-negative ($\geq 0$) as well as sum to 1.

Note that a weighted sum of points is a vector when the weights add up to zero.
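The rewriting of a barycentric combination as a point plus a sum of vectors can be sketched in a few lines of Python. This is only an illustration: the function name `barycentric` and the triangle used as test data are our own choices, not part of Farin's text.

```python
def barycentric(points, weights):
    """Affine combination of points, computed as a_0 plus weighted vectors (a_j - a_0)."""
    if abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must sum to 1")
    a0 = points[0]
    result = list(a0)
    # Only point-minus-point (a vector) and point-plus-vector operations are used.
    for a_j, alpha_j in zip(points[1:], weights[1:]):
        for k in range(len(a0)):
            result[k] += alpha_j * (a_j[k] - a0[k])
    return tuple(result)

# Centroid of a triangle: equal non-negative weights, so this is a convex combination.
triangle = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
centroid = barycentric(triangle, [1 / 3, 1 / 3, 1 / 3])
```

Note that the weight $\alpha_0$ never appears explicitly; it is implied by the requirement that the weights sum to 1.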

Affine Maps

A map $A$ that maps $\mathbf{E}^3$ into itself is called affine if it leaves barycentric combinations invariant. That is, suppose

\begin{displaymath}p=\sum_{j=0}^{n}\alpha_j a_j,\, p,\,a_j\in \mathbf{E}^3,\,\sum_{j=0}^{n}\alpha_j=1\end{displaymath}

is a barycentric combination of the points $a_0,\,a_1,\ldots,\,a_n$ and $A$ is an affine map. Then

\begin{displaymath}pA =\sum_{j=0}^{n} \alpha_j a_jA,\, pA,\,a_jA\in \mathbf{E}^3\end{displaymath}

is a barycentric combination of the mapped points, with the same weights.

To be more specific, let's think of point p with coordinates $(x\:y\:z)$. An affine map can be represented in the familiar form

\begin{displaymath}pA = pM + \vec{v},\end{displaymath}

where $M$ is a $3\times 3$ matrix and $\vec{v}$ is a (translation) vector in $\mathbf{R}^3$.
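A small numerical check can make the invariance property concrete. The sketch below applies a map of the form $pM+\vec{v}$ in two ways: once to a barycentric combination, and once to each point before combining. The matrix, translation, points, and weights are arbitrary values chosen for illustration.

```python
def affine_map(p, M, v):
    """Return pA = pM + v for a row point p, a 3x3 matrix M, and a translation v."""
    return tuple(
        sum(p[i] * M[i][j] for i in range(3)) + v[j]
        for j in range(3)
    )

def combine(points, weights):
    """Weighted sum of points; a barycentric combination when the weights sum to 1."""
    return tuple(sum(w * q[k] for q, w in zip(points, weights)) for k in range(3))

M = [[2.0, 0.0, 0.0],
     [1.0, 1.0, 0.0],
     [0.0, 0.0, 3.0]]      # arbitrary example matrix
v = (5.0, -1.0, 2.0)       # arbitrary example translation
pts = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 4.0)]
w = [0.5, 0.25, 0.25]      # weights sum to 1

lhs = affine_map(combine(pts, w), M, v)                # map the combination
rhs = combine([affine_map(q, M, v) for q in pts], w)   # combine the mapped points
```

The two results agree only because the weights sum to 1; otherwise the translation $\vec{v}$ would be counted the wrong number of times on the right-hand side.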

Note that we write point-matrix multiplication with the point on the left of the matrix; this seems to be common practice in the computer graphics literature. Placing the point on the right is more common in mathematical writing. It is easy to change from one form to the other via the transpose operation. We will write

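\begin{displaymath}pM = \left(\begin{array}{ccc} x & y & z \end{array}\right)
\left[\begin{array}{ccc} m_{11} & m_{12} & m_{13} \\
m_{21} & m_{22} & m_{23} \\
m_{31} & m_{32} & m_{33} \\
\end{array}\right].\end{displaymath}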

This is equivalent to

\begin{displaymath}M^{T}p^{T} =
\left[\begin{array}{ccc} m_{11} & m_{21} & m_{31} \\
m_{12} & m_{22} & m_{32} \\
m_{13} & m_{23} & m_{33} \\
\end{array}\right]
\left(\begin{array}{c} x \\ y \\ z \end{array}\right).\end{displaymath}

Thus the major difference is: we write points (and vectors) as rows, others write them as columns.
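To see the two conventions side by side, the following sketch computes $pM$ with $p$ as a row and $M^{T}p^{T}$ with $p$ as a column. The helper names and the sample matrix are illustrative assumptions; both products should produce the same three numbers.

```python
def row_times_matrix(p, M):
    """p M, with p treated as a row vector."""
    return [sum(p[i] * M[i][j] for i in range(len(p))) for j in range(len(M[0]))]

def matrix_times_column(M, p):
    """M p, with p treated as a column vector (stored as a flat list)."""
    return [sum(M[i][j] * p[j] for j in range(len(p))) for i in range(len(M))]

def transpose(M):
    """Swap rows and columns."""
    return [list(col) for col in zip(*M)]

M = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
p = [1, 0, 2]
row_result = row_times_matrix(p, M)                  # our convention
col_result = matrix_times_column(transpose(M), p)    # the column convention
```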

We will see that the useful transformations (translation, scale, rotation, shear, and parallel projection) are all affine maps.

Linear Interpolation

A particularly useful barycentric combination is linear interpolation. Let $a,\,b\in \mathbf{E}^3$ be two points and let the weights be $1-t$ and $t$ for some real number (parameter) $t$. Then the set of points

\begin{displaymath}L=L(t) = (1-t)a+tb,\, t\in \mathbf{R}\end{displaymath}

is called the straight line through a and b. The line L(t) is a barycentric combination. If we restrict the parameter t to lie between zero and one ( $0\leq t \leq 1$), L(t) is a convex combination: It is the line segment from a to b. Note that there is a direction of travel implied along the line.
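Linear interpolation is short enough to write directly. In this sketch the function name `lerp` and the sample points are our own choices.

```python
def lerp(a, b, t):
    """Point L(t) = (1 - t) a + t b on the line through a and b."""
    return tuple((1.0 - t) * ai + t * bi for ai, bi in zip(a, b))

a = (0.0, 0.0, 0.0)
b = (2.0, 4.0, 6.0)
midpoint = lerp(a, b, 0.5)   # 0 <= t <= 1: a convex combination, on the segment
beyond = lerp(a, b, 1.5)     # t > 1: still on the line, but past b
```

Increasing $t$ moves the point from $a$ toward $b$, which is the direction of travel mentioned above.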


Matrices

Matrices are the basic tool for transforming (mapping) points from $\mathbf{E}^3$ into $\mathbf{E}^3$. A matrix is an $n\times m$ array with $n$ rows and $m$ columns. You need to know how to perform matrix multiplication. Most of our matrices will be $4\times 4$, but they'll start out as $3\times 3$.

Thinking of matrix multiplication as a collection of inner products is useful since it provides a geometric interpretation. That is, the $(i,\,j)$ element in the product $AB$ is the inner product of the $i$-th row of $A$ with the $j$-th column of $B$. If you are uncertain about inner products, you'll want to read about them.
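The inner-product view of matrix multiplication translates almost word for word into code. The sketch below is a plain-Python illustration (the names `inner` and `matmul` are ours), not an efficient implementation.

```python
def inner(u, v):
    """Inner (dot) product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def matmul(A, B):
    """Product AB: entry (i, j) is row i of A dotted with column j of B."""
    columns_of_B = list(zip(*B))
    return [[inner(row, col) for col in columns_of_B] for row in A]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]   # swaps the two columns of A
AB = matmul(A, B)
```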

Rows and Columns

A row

\begin{displaymath}P=[x\:y\:z\:w]\end{displaymath}

should be thought of as a point (using our notational conventions). A column

\begin{displaymath}E=\left[\begin{array}{c}a \\ b \\ c \\ d\end{array}\right]\end{displaymath}

should be thought of as a plane. Their inner (or dot, or scalar) product

\begin{displaymath}P\cdot E = [x\:y\:z\:w]\left[\begin{array}{c}a \\ b \\ c \\ d\end{array}\right]=ax+by+cz+dw\end{displaymath}

is a scalar (real number). If the value is zero, the point lies in the plane.
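As a quick illustration of the point-in-plane test, the sketch below evaluates the inner product for the plane $z=2$, written as the column $[0\:0\:1\:-2]$. The function name and the sample points are assumptions made for the example.

```python
def plane_value(P, E):
    """Inner product of a homogeneous point P = [x, y, z, w] with a plane E = [a, b, c, d]."""
    return sum(p * e for p, e in zip(P, E))

E = [0.0, 0.0, 1.0, -2.0]                          # the plane z = 2
on_plane = plane_value([5.0, 7.0, 2.0, 1.0], E)    # z = 2, so the value is zero
off_plane = plane_value([5.0, 7.0, 3.0, 1.0], E)   # z = 3, so the value is non-zero
```

When the value is not zero, its sign tells you on which side of the plane the point lies.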


Scaling

Scaling alters the size of an object. Suppose you are given a point $p=(x\:y\:z)$ that is an object vertex, and let $(s_x\:s_y\:s_z)$ be scale factors in $x$, $y$, and $z$, respectively. Then the point can be scaled to a new point by the matrix

\begin{displaymath}S=\left[\begin{array}{ccc}
s_x & 0 & 0 \\
0 & s_y & 0 \\
0 & 0 & s_z \end{array}\right].\end{displaymath}

In particular,

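\begin{displaymath}pS=\left(\begin{array}{ccc}x & y & z\end{array}\right)
\left[\begin{array}{ccc}
s_x & 0 & 0 \\
0 & s_y & 0 \\
0 & 0 & s_z \end{array}\right]=
\left(\begin{array}{ccc}
s_x x & s_y y & s_z z\end{array}\right).\end{displaymath}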

To scale (enlarge or shrink) an object, each object vertex is multiplied by the scale matrix $S$ as shown above.

The Fixed Point of a Scale

Note that the origin $O=[0\:0\:0]$ is unchanged by a scale (it is still the origin). Every scaling operation has at least one fixed point. By default the fixed point is the origin $O=[0\:0\:0]$, but we can select an arbitrary fixed point $F=[x_f\: y_f\:z_f]$ by the following three step process, which will be defined more completely below.

1. Translate $F=[x_f\: y_f\:z_f]$ to $O=[0\:0\:0]$
2. Scale by $[s_x\:s_y\:s_z]$
3. Translate $O=[0\:0\:0]$ to $F=[x_f\: y_f\:z_f]$

The Inverse of a Scale

As long as we do not scale by zero, a scale can always be inverted (undone) by the matrix

\begin{displaymath}S^{-1}=\left[\begin{array}{ccc}
\frac{1}{s_x} & 0 & 0 \\
0 & \frac{1}{s_y} & 0 \\
0 & 0 & \frac{1}{s_z} \end{array}\right].\end{displaymath}

The product $SS^{-1}=S^{-1}S=I$, the $3\times 3$ identity matrix.
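A scale followed by its inverse should return every vertex to where it started. The sketch below checks this for one point; the function names and the scale factors are chosen only for illustration.

```python
def scale_matrix(sx, sy, sz):
    """3x3 scale matrix S."""
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, sz]]

def inverse_scale_matrix(sx, sy, sz):
    """3x3 inverse of S; it does not exist when any factor is zero."""
    if sx == 0 or sy == 0 or sz == 0:
        raise ValueError("cannot invert a scale by zero")
    return scale_matrix(1.0 / sx, 1.0 / sy, 1.0 / sz)

def transform(p, M):
    """Row point p times the 3x3 matrix M."""
    return [sum(p[i] * M[i][j] for i in range(3)) for j in range(3)]

p = [1.0, 2.0, 3.0]
scaled = transform(p, scale_matrix(2.0, 4.0, 0.5))
restored = transform(scaled, inverse_scale_matrix(2.0, 4.0, 0.5))
```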


Rotations

Rotations alter the orientation of an object. They are a little more complex than scales, so it is easiest to start with rotations in two dimensions.

Rotations in Two Dimensions

A rotation moves a point along a circular path centered at the origin (the pivot). It is a simple trigonometry problem to show that rotating $P=[x\:y]$ counter-clockwise by $\theta$ radians produces a new point $P'=[x'\:y']$ given by

\begin{eqnarray*}x' & = & x\cos\theta - y\sin\theta \\
y' & = & y\cos\theta + x\sin\theta
\end{eqnarray*}

For example, suppose $P=[1\:1]$ and $\theta = \pi/2$. Then $P'=[-1\:1]$, which matches a quarter turn counter-clockwise about the origin.

Of course, we can express the rotation in matrix form

\begin{displaymath}[x'\quad y'\quad 1]= [x \quad y \quad 1]\left[\begin{array}{rrr}
\cos\theta & \sin\theta & 0 \\
-\sin\theta & \cos\theta & 0 \\
0 & 0 & 1 \end{array}\right]\end{displaymath}
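The two dimensional rotation formulas can be checked numerically. This sketch (the name `rotate2d` is ours) reproduces the example of rotating $[1\:1]$ by $\pi/2$, up to floating point rounding.

```python
import math

def rotate2d(p, theta):
    """Rotate the point p = (x, y) counter-clockwise about the origin by theta radians."""
    x, y = p
    c, s = math.cos(theta), math.sin(theta)
    return (x * c - y * s, y * c + x * s)

rotated = rotate2d((1.0, 1.0), math.pi / 2)   # approximately (-1, 1)
```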

The Pivot of a Rotation

By default the pivot point is the origin $O=[0\:0]$, but we can arrange for an arbitrary pivot $P=[x_p\:y_p]$ by using a three step process similar to the one described above for scaling about an arbitrary fixed point.

1. Translate $P=[x_p\:y_p]$ to $O=[0\:0]$
2. Rotate by $\theta$
3. Translate $O=[0\:0]$ to $P=[x_p\:y_p]$

Rotations in Three Dimensions

In three dimensions points are rotated about an axis, which is a line in three dimensional space. There are three principal axes: the x, y, and z axes. We assume a right-handed coordinate system, with the convention that positive rotation is counter-clockwise.

Rotating $P=[x\:y\:z]$ about the z-axis by $\theta$ radians produces a new point $P'=[x'\:y'\:z']$ where

\begin{eqnarray*}x' & = & x\cos\theta - y\sin\theta \\
y' & = & x\sin\theta + y\cos\theta \\
z' & = & z
\end{eqnarray*}

or in matrix notation

\begin{displaymath}[x'\: y'\: z']= [x \: y \: z]\left[\begin{array}{rrr}
\cos\theta & \sin\theta & 0 \\
-\sin\theta & \cos\theta & 0 \\
0 & 0 & 1 \end{array}\right]=[x\:y\:z]R_z = [x\:y\:z]R_{xy}.\end{displaymath}

The notations $R_z$ and $R_{xy}$ are meant to be mnemonics for ``rotate about z'' and ``rotate from x toward y.''

Rotating $P=[x\:y\:z]$ about the x-axis by $\theta$ radians produces a new point $P'=[x'\:y'\:z']$ where

\begin{eqnarray*}x' & = & x \\
y' & = & y\cos\theta - z\sin\theta \\
z' & = & y\sin\theta + z\cos\theta
\end{eqnarray*}

or in matrix notation

\begin{displaymath}[x'\: y'\: z']= [x \: y \: z]\left[\begin{array}{rrr}
1 & 0 & 0 \\
0 & \cos\theta & \sin\theta \\
0 & -\sin\theta & \cos\theta \\ \end{array}\right]
=[x\:y\:z]R_x = [x\:y\:z]R_{yz}.\end{displaymath}

Rotating $P=[x\:y\:z]$ about the y-axis by $\theta$ radians produces a new point $P'=[x'\:y'\:z']$ where

\begin{eqnarray*}x' & = & x\cos\theta + z\sin\theta \\
y' & = & y \\
z' & = & - x\sin\theta + z\cos\theta
\end{eqnarray*}

or in matrix notation

\begin{displaymath}[x'\: y'\: z']= [x \: y \: z]\left[\begin{array}{rrr}
\cos\theta & 0 & -\sin\theta \\
0 & 1 & 0 \\
\sin\theta & 0 & \cos\theta \\ \end{array}\right]
=[x\:y\:z]R_y = [x\:y\:z]R_{zx}.\end{displaymath}

Angles of rotation about the principal axes are called Euler angles.
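The three principal rotations can be written out as row-convention matrices and tested with quarter turns. In the sketch below the function names `rot_z`, `rot_x`, and `rot_y` are our own; each matrix is taken term by term from the coordinate equations above, so that a row point multiplies the matrix from the left.

```python
import math

def rot_z(theta):
    """Rotation about the z-axis (from x toward y), row-vector convention."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(theta):
    """Rotation about the x-axis (from y toward z), row-vector convention."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def rot_y(theta):
    """Rotation about the y-axis (from z toward x), row-vector convention."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]

def transform(p, M):
    """Row point p times the 3x3 matrix M."""
    return [sum(p[i] * M[i][j] for i in range(3)) for j in range(3)]

quarter = math.pi / 2
x_to_y = transform([1.0, 0.0, 0.0], rot_z(quarter))   # approximately [0, 1, 0]
y_to_z = transform([0.0, 1.0, 0.0], rot_x(quarter))   # approximately [0, 0, 1]
z_to_x = transform([0.0, 0.0, 1.0], rot_y(quarter))   # approximately [1, 0, 0]
```

Each quarter turn carries one principal axis onto the next, which is what the mnemonics ``from x toward y,'' ``from y toward z,'' and ``from z toward x'' describe.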

Rotation About an Arbitrary Axis

Consider an axis through the origin determined by a unit length direction vector $\vec{D}=\langle d_x\:d_y\:d_z\rangle$ (the completely arbitrary case will be easily handled after translations are introduced). We can arrange to rotate by $\theta$ radians about this axis using a five step process:

1. Rotate $\vec{D}=\langle d_x\:d_y\:d_z\rangle$ into the xz plane; call the result $\vec{D'}=\langle d'_x\:0\:d'_z\rangle$
2. Rotate $\vec{D'}=\langle d'_x\:0\:d'_z\rangle$ into the z axis
3. Rotate about the z axis by $\theta$ radians
4. Invert the rotation of $\vec{D'}=\langle d'_x\:0\:d'_z\rangle$ into the z axis
5. Invert the rotation of $\vec{D}=\langle d_x\:d_y\:d_z\rangle$ into $\vec{D'}=\langle d'_x\:0\:d'_z\rangle$

This is messy and error-prone when done by hand; however, if you carry it out, the result is

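\begin{displaymath}
R= \left[\begin{array}{ccc}
d_x^2+\cos\theta(1-d_x^2) & d_xd_y(1-\cos\theta)+d_z\sin\theta &
d_xd_z(1-\cos\theta)-d_y\sin\theta \\
d_xd_y(1-\cos\theta)-d_z\sin\theta & d_y^2+\cos\theta(1-d_y^2) &
d_yd_z(1-\cos\theta)+d_x\sin\theta \\
d_xd_z(1-\cos\theta)+d_y\sin\theta & d_yd_z(1-\cos\theta)-d_x\sin\theta &
d_z^2+\cos\theta(1-d_z^2) \\ \end{array}\right].
\end{displaymath} (1)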

You should verify that this matrix reduces to rotations about z, x, and y for appropriate choices of the direction vector $\vec{D}$. Below, we'll see that there are better ways to derive this matrix.
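The suggested verification can also be done numerically. The sketch below assumes the standard axis-angle form of the rotation matrix, transposed to suit our row-vector convention; the function name `axis_rotation` and the test axes are our own choices. It checks that the axis $\langle 0\:0\:1\rangle$ gives the rotation about z, and that a third of a turn about the diagonal $\langle 1\:1\:1\rangle/\sqrt{3}$ carries the x-axis onto the y-axis.

```python
import math

def axis_rotation(d, theta):
    """Row-convention rotation by theta about the unit axis d through the origin.

    Assumes d has unit length; the entries follow the standard axis-angle form.
    """
    dx, dy, dz = d
    c, s = math.cos(theta), math.sin(theta)
    k = 1.0 - c
    return [
        [dx * dx + c * (1 - dx * dx), dx * dy * k + dz * s, dx * dz * k - dy * s],
        [dx * dy * k - dz * s, dy * dy + c * (1 - dy * dy), dy * dz * k + dx * s],
        [dx * dz * k + dy * s, dy * dz * k - dx * s, dz * dz + c * (1 - dz * dz)],
    ]

def transform(p, M):
    """Row point p times the 3x3 matrix M."""
    return [sum(p[i] * M[i][j] for i in range(3)) for j in range(3)]

theta = 0.7
about_z = axis_rotation((0.0, 0.0, 1.0), theta)
expected_z = [[math.cos(theta), math.sin(theta), 0.0],
              [-math.sin(theta), math.cos(theta), 0.0],
              [0.0, 0.0, 1.0]]

r = 1.0 / math.sqrt(3.0)
cycled = transform([1.0, 0.0, 0.0], axis_rotation((r, r, r), 2 * math.pi / 3))
```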

The Inverse of a Rotation

The inverse of a rotation by $\theta$ radians can be created by rotating by $-\theta$ radians, but this is not the best way to view it. Consider the trigonometric identities

\begin{eqnarray*}\cos(-\theta) & = & \cos\theta \\
\sin(-\theta) & = & -\sin\theta
\end{eqnarray*}

If you plug these into the arbitrary rotation from equation (1), you'll see that the inverse of R is the transpose of R. This is an important observation.
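The observation that the inverse is the transpose is easy to test: multiplying a rotation by its transpose should give the identity. The sketch below uses a composite of two principal rotations with arbitrary angles; all helper names are assumptions made for the example.

```python
import math

def matmul(A, B):
    """Matrix product AB as rows of A dotted with columns of B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(M):
    """Swap rows and columns."""
    return [list(col) for col in zip(*M)]

def rot_z(theta):
    """Rotation about z, row-vector convention."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(theta):
    """Rotation about x, row-vector convention."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

R = matmul(rot_x(0.3), rot_z(1.1))             # a composite rotation
should_be_identity = matmul(R, transpose(R))   # equals I up to rounding
```

In practice this means a rotation can be undone without computing any new sines or cosines.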


Translations

Translations change the position of an object. A pure (three dimensional) translation cannot be implemented using a $3\times 3$ matrix: it is an affine map, not a linear one. We must alter our notion of a point to accommodate translations. A three dimensional point $P=[x\:y\:z]$ will be embedded in three dimensional homogeneous space and represented as a 4-tuple $P_h=[x\:y\:z\:w]$. For now, the homogeneous coordinate $w$ will have the fixed value 1. This allows us to implement translations using $4\times 4$ matrices. In particular, the matrix

\begin{displaymath}T= \left[\begin{array}{cccc}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
t_x & t_y & t_z & 1 \\
\end{array}\right]\end{displaymath}

translates the point $P_h = [x\:y\:z\:1]$ into the point $P'_h = [x+t_x\:y+t_y\:z+t_z\:1]$.
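A translation in homogeneous coordinates is a single row-times-matrix product. The sketch below (function names and numbers are illustrative) translates a point with $w=1$. As an aside that goes slightly beyond the text, it also shows that a 4-tuple with $w=0$ is left unchanged, which is consistent with the earlier remark that vectors carry direction and magnitude but not position.

```python
def translation(tx, ty, tz):
    """4x4 translation matrix T, row-vector convention."""
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [tx, ty, tz, 1.0]]

def transform(p, M):
    """Homogeneous row point p times the 4x4 matrix M."""
    return [sum(p[i] * M[i][j] for i in range(4)) for j in range(4)]

T = translation(10.0, 20.0, 30.0)
moved = transform([1.0, 2.0, 3.0, 1.0], T)       # a point (w = 1) is translated
direction = transform([1.0, 2.0, 3.0, 0.0], T)   # a 4-tuple with w = 0 is not
```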

The Inverse of a Translation

To undo a translation by $t_x,\,t_y,\,t_z$ use the matrix

\begin{displaymath}T^{-1}= \left[\begin{array}{cccc}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
-t_x & -t_y & -t_z & 1 \\
\end{array}\right]\end{displaymath}

We can now complete scaling about an arbitrary fixed point and rotation about an arbitrary pivot. To scale about $F=[x_f\: y_f\:z_f]$ use the composition of matrices

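\begin{displaymath}\left[\begin{array}{cccc}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
-x_f & -y_f & -z_f & 1 \\
\end{array}\right]
\left[\begin{array}{cccc}
s_x & 0 & 0 & 0 \\
0 & s_y & 0 & 0 \\
0 & 0 & s_z & 0 \\
0 & 0 & 0 & 1 \\
\end{array}\right]
\left[\begin{array}{cccc}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
x_f & y_f & z_f & 1 \\
\end{array}\right]\end{displaymath}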

which when multiplied out yields

\begin{displaymath}\left[\begin{array}{cccc}
s_x & 0 & 0 & 0 \\
0 & s_y & 0 & 0 \\
0 & 0 & s_z & 0 \\
x_f(1-s_x) & y_f(1-s_y) & z_f(1-s_z) & 1 \\
\end{array}\right]\end{displaymath}

So a point $[x\:y\:z\:1]$ scaled about $F$ becomes

\begin{displaymath}[x'\:y'\:z'\:1] = [x\:y\:z\:1]\left[\begin{array}{cccc}
s_x & 0 & 0 & 0 \\
0 & s_y & 0 & 0 \\
0 & 0 & s_z & 0 \\
x_f(1-s_x) & y_f(1-s_y) & z_f(1-s_z) & 1 \\
\end{array}\right]\end{displaymath} (2)

\begin{displaymath}[x'\:y'\:z'\:1] = [xs_x + x_f(1-s_x) \quad ys_y+y_f(1-s_y) \quad zs_z+z_f(1-s_z) \quad 1]\end{displaymath} (3)
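The translate, scale, translate composition can be multiplied out by machine rather than by hand. In this sketch the helper names, the fixed point $F=[1\:2\:3]$, and the scale factors are assumptions chosen for illustration; the bottom row of the composite should contain the terms $x_f(1-s_x)$, $y_f(1-s_y)$, $z_f(1-s_z)$, and $F$ itself should not move.

```python
def matmul(A, B):
    """Matrix product AB as rows of A dotted with columns of B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def translation(tx, ty, tz):
    """4x4 translation matrix, row-vector convention."""
    return [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [tx, ty, tz, 1.0]]

def scale(sx, sy, sz):
    """4x4 scale matrix about the origin."""
    return [[sx, 0.0, 0.0, 0.0], [0.0, sy, 0.0, 0.0],
            [0.0, 0.0, sz, 0.0], [0.0, 0.0, 0.0, 1.0]]

def scale_about(F, s):
    """Composite: translate F to the origin, scale, translate back to F."""
    xf, yf, zf = F
    to_origin = translation(-xf, -yf, -zf)
    back = translation(xf, yf, zf)
    return matmul(matmul(to_origin, scale(*s)), back)

def transform(p, M):
    """Homogeneous row point p times the 4x4 matrix M."""
    return [sum(p[i] * M[i][j] for i in range(4)) for j in range(4)]

F = (1.0, 2.0, 3.0)
M = scale_about(F, (2.0, 3.0, 4.0))
fixed = transform([1.0, 2.0, 3.0, 1.0], M)   # F maps to itself
```

Because we multiply row points on the left, the matrices are composed left to right in the order the steps are applied.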

In a similar manner you can determine that rotation about a pivot $R=[x_r\; y_r]$ results in

\begin{eqnarray*}x' & = & x_r + (x-x_r)\cos\theta - (y-y_r)\sin\theta \\
y' & = & y_r + (y-y_r)\cos\theta + (x-x_r)\sin\theta
\end{eqnarray*}
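The pivot formulas can be coded directly, without building any matrices. The sketch below (the name `rotate_about` and the sample values are ours) rotates a point a quarter turn about the pivot $[1\:1]$.

```python
import math

def rotate_about(p, pivot, theta):
    """Rotate p = (x, y) counter-clockwise by theta radians about pivot = (x_r, y_r)."""
    x, y = p
    xr, yr = pivot
    c, s = math.cos(theta), math.sin(theta)
    return (xr + (x - xr) * c - (y - yr) * s,
            yr + (y - yr) * c + (x - xr) * s)

turned = rotate_about((2.0, 1.0), (1.0, 1.0), math.pi / 2)   # approximately (1, 2)
pivot_image = rotate_about((1.0, 1.0), (1.0, 1.0), 0.9)      # the pivot does not move
```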

Efficiency of Matrix Multiplication

Now is a good time to mention the fact that it is more efficient, in general, to form one composite transform than to pass a sequence of points through one transform, then another, and another, and so on.

Let's see why this is. Multiplying one point (a 4-tuple) by a transformation ($4\times 4$ matrix) costs 16 multiplies and 12 additions. Therefore, transforming an object with n vertices by one transform costs 16n multiplies and 12n additions.

On the other hand, multiplying two $4\times 4$ matrices costs 64 multiplies and 48 additions. So compositing m $4\times 4$ matrices together costs 64(m-1) multiplies and 48(m-1) additions.

So consider the alternatives:

Multiply n vertices through a sequence of m transformations at a cost of 16n multiplies and 12n adds per transform. The total cost will be

\begin{displaymath}16nm \quad \mbox{multiplies},\quad 12nm \quad \mbox{additions}.\end{displaymath}

Form one composite matrix and pass n vertices through it. The total cost will be

\begin{displaymath}64(m-1) + 16n \quad \mbox{multiplies},\quad 48(m-1) + 12n \quad \mbox{additions}.\end{displaymath}
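Comparing the two totals, $16nm > 64(m-1)+16n$ exactly when $16n(m-1) > 64(m-1)$, that is, when $n>4$ (for $m\geq 2$). So the composite matrix wins for any object with more than four vertices. The sketch below tabulates the counts using the formulas above; the function names and the sample sizes are our own.

```python
def sequential_cost(n, m):
    """(multiplies, additions) to pass n vertices through m transforms one at a time."""
    return 16 * n * m, 12 * n * m

def composite_cost(n, m):
    """(multiplies, additions) to compose m matrices, then transform n vertices once."""
    return 64 * (m - 1) + 16 * n, 48 * (m - 1) + 12 * n

# An object with 1000 vertices passed through 5 transforms.
one_at_a_time = sequential_cost(1000, 5)
composed = composite_cost(1000, 5)
```

For 1000 vertices and 5 transforms the composite approach needs 16256 multiplies instead of 80000.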

Where Are Transformations Used?

Objects defined in model space can be scaled, translated, and rotated into world space and then viewed from any position. The map from model to world to view is most often concatenated into a single transform, so we map from model space directly into view space without ever stopping in world space.

Next we map objects from view space into perspective space. This involves projections: either parallel or perspective. Note that we must stop in view space to compute the illumination that lights bring to our view.

From perspective to clip space and from clip space through normalized space to device space are fairly straightforward scales and translations; we just need to be careful not to introduce distortions by our scaling of the objects.

Non-linear transforms

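Here we want to describe perspective transforms. They are non-linear in the sense that they are not affine maps: recovering a three dimensional point requires dividing by the homogeneous coordinate $w$, which no longer stays fixed at 1. (To be completed)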



Bibliography

[Far93]
Gerald Farin.
Curves and Surfaces for Computer Aided Geometric Design.
Academic Press, third edition, 1993.

William D. Shoaff