The Perspective Space to Clip Space Map

William D. Shoaff with lots of help



The view-to-perspective map transforms the region of view space bounded by a field-of-view angle $\alpha$ and the near and far planes into the truncated pyramid (frustum)

\begin{displaymath}-w\leq x \leq w,\quad -w\leq y \leq w,\quad 0\leq z \leq w\end{displaymath}

in perspective space. Specifically, a point $(x_v,\,y_v,\,z_v,\,1)$ in view space is transformed into the point $(x,\,y,\,z,\,w)$ in perspective space, where $w=z_{v}\sin(\alpha)$.

After a divide by the homogeneous coordinate $w$, points inside the viewing frustum satisfy

\begin{displaymath}-1\leq x \leq 1,\quad -1\leq y \leq 1,\quad 0\leq z \leq 1.\end{displaymath}

Here we are making the tacit assumption that our display window (your grandpa would call it a viewport) is square. That is, its aspect ratio is $a=1$, where the aspect ratio of a rectangle is its height divided by its width. To account for non-square viewports, we'll use a viewing frustum given by

\begin{displaymath}-w\leq x \leq w,\quad -aw\leq y \leq aw,\quad 0\leq z \leq w.\end{displaymath}

Note that we (usually) do not want the aspect ratio to change between what we see and what is displayed. A change distorts the image; for example, a circle may be displayed as an ellipse.

Now clip space is chosen to make clipping algorithms as simple and efficient as possible. Jim Blinn [1] recommends using the region

\begin{displaymath}0\leq x \leq w,\quad 0\leq y \leq w,\quad 0\leq z \leq w\end{displaymath}

(or $0\leq x \leq 1,\quad 0\leq y \leq 1,\quad 0\leq z \leq 1$ after the homogeneous divide). We can map the viewing frustum to this region by a simple scale and translation. To wit, in $x$

\begin{displaymath}-1 \rightarrow 0, \quad \mbox{and}\quad 1 \rightarrow 1,\end{displaymath}

in $y$

\begin{displaymath}-a \rightarrow 0, \quad \mbox{and}\quad a \rightarrow 1,\end{displaymath}

and in $z$

\begin{displaymath}0 \rightarrow 0, \quad \mbox{and}\quad 1 \rightarrow 1.\end{displaymath}

The $4\times 4$ homogeneous matrix that performs this map, acting on points written as row vectors, is

\begin{displaymath}M=\left[\begin{array}{cccc}
\frac{1}{2} & 0 & 0 & 0 \\
0 & \frac{1}{2a} & 0 & 0 \\
0 & 0 & 1 & 0 \\
\frac{1}{2} & \frac{1}{2} & 0 & 1 \\
\end{array}\right]\end{displaymath}
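As a quick check of the scale and translation, here is a minimal Python sketch that builds $M$ and pushes two frustum corners (already divided by $w$) through it. The helper names `clip_matrix` and `transform` are made up for this illustration, and the middle rows of the matrix are inferred from the stated $y$ and $z$ mappings, so treat them as an assumption.

```python
def clip_matrix(a):
    """Perspective-to-clip matrix M (row-vector convention: p' = p M).

    The 1/(2a) and 1 entries in the middle rows are inferred from the
    stated y and z mappings; they are an assumption of this sketch.
    """
    return [
        [0.5, 0.0, 0.0, 0.0],
        [0.0, 1.0 / (2.0 * a), 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.5, 0.5, 0.0, 1.0],
    ]


def transform(point, matrix):
    """Multiply a 4-component row vector by a 4x4 matrix."""
    return [sum(point[i] * matrix[i][j] for i in range(4)) for j in range(4)]


a = 0.75  # aspect ratio, height / width
M = clip_matrix(a)

# Two opposite corners of the frustum after the homogeneous divide (w = 1).
lower_left_near = transform([-1.0, -a, 0.0, 1.0], M)
upper_right_far = transform([1.0, a, 1.0, 1.0], M)

print(lower_left_near)  # approximately the clip-cube corner (0, 0, 0), w = 1
print(upper_right_far)  # approximately the clip-cube corner (1, 1, 1), w = 1
```

The translation sits in the bottom row because points are row vectors multiplied on the left of the matrix.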

Composing this matrix with the view-to-perspective map takes us directly from view space into clip space, with no stop in perspective space:

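\begin{displaymath}\left[\begin{array}{cccc}
\cos(\alpha) & 0 & 0 & 0 \\
0 & \cos(\alpha) & 0 & 0 \\
0 & 0 & Q & \sin(\alpha) \\
0 & 0 & -Qz_n & 0 \\
\end{array}\right]
\left[\begin{array}{cccc}
\frac{1}{2} & 0 & 0 & 0 \\
0 & \frac{1}{2a} & 0 & 0 \\
0 & 0 & 1 & 0 \\
\frac{1}{2} & \frac{1}{2} & 0 & 1 \\
\end{array}\right]
=
\left[\begin{array}{cccc}
\frac{\cos(\alpha)}{2} & 0 & 0 & 0 \\
0 & \frac{\cos(\alpha)}{2a} & 0 & 0 \\
\frac{\sin(\alpha)}{2} & \frac{\sin(\alpha)}{2} & Q & \sin(\alpha) \\
0 & 0 & -Qz_n & 0 \\
\end{array}\right],\end{displaymath}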

where $Q = \frac{z_f\sin(\alpha)}{z_f - z_n}$.
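The composition can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's code: the function names `matmul`, `view_to_clip`, and `to_clip` are hypothetical, the view-to-perspective matrix is taken to scale both $x$ and $y$ by $\cos(\alpha)$ (consistent with the symmetric frustum $-w\leq x,y\leq w$), and points are row vectors. The parameter values at the bottom are arbitrary.

```python
import math


def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def view_to_clip(alpha, z_near, z_far, a):
    """Compose the view-to-perspective map P with the clip map M."""
    c, s = math.cos(alpha), math.sin(alpha)
    Q = z_far * s / (z_far - z_near)
    # Assumed form of the view-to-perspective matrix (row-vector convention).
    P = [
        [c, 0.0, 0.0, 0.0],
        [0.0, c, 0.0, 0.0],
        [0.0, 0.0, Q, s],
        [0.0, 0.0, -Q * z_near, 0.0],
    ]
    M = [
        [0.5, 0.0, 0.0, 0.0],
        [0.0, 1.0 / (2.0 * a), 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.5, 0.5, 0.0, 1.0],
    ]
    return matmul(P, M)


def to_clip(point, C):
    """Apply the composed matrix, then perform the homogeneous divide."""
    x, y, z, w = (sum(point[i] * C[i][j] for i in range(4)) for j in range(4))
    return (x / w, y / w, z / w)


alpha, z_near, z_far, a = math.pi / 6, 1.0, 10.0, 0.75
C = view_to_clip(alpha, z_near, z_far, a)

# The center of the near plane should land at (1/2, 1/2, 0).
near_center = to_clip([0.0, 0.0, z_near, 1.0], C)
# The upper-right corner of the far plane should land at (1, 1, 1).
edge = z_far * math.tan(alpha)
far_corner = to_clip([edge, a * edge, z_far, 1.0], C)

print(near_center)
print(far_corner)
```

Checking the near and far planes like this is a cheap way to catch a sign or index slip in $Q$ before any clipping code depends on it.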

A View to Clip Example

How does this work with actual values for the variables? Some find answers to this type of question useful; I find it an opportunity for arithmetic errors, but here goes anyway.

Let's pretend that our field of view is defined by $\alpha=\pi/4=45^{\circ}$, the near plane is given as $z_n=4$, the far plane by $z_f=12$, and the viewport we ultimately want to display upon has aspect ratio $a=0.75$ (the old TV standard).

We can plug these values into the matrix

\begin{displaymath}\left[\begin{array}{cccc}
\frac{\cos(\alpha)}{2} & 0 & 0 & 0 \\
0 & \frac{\cos(\alpha)}{2a} & 0 & 0 \\
\frac{\sin(\alpha)}{2} & \frac{\sin(\alpha)}{2} & Q & \sin(\alpha) \\
0 & 0 & -Qz_n & 0 \\
\end{array}\right],\end{displaymath}

noting $\cos(\pi/4)=\sin(\pi/4)=\sqrt{2}/2$ and $Q=3\sqrt{2}/4$. We get

\begin{displaymath}\left[\begin{array}{cccc}
\frac{\sqrt{2}}{4} & 0 & 0 & 0 \\
0 & \frac{\sqrt{2}}{3} & 0 & 0 \\
\frac{\sqrt{2}}{4} & \frac{\sqrt{2}}{4} & \frac{3\sqrt{2}}{4} & \frac{\sqrt{2}}{2} \\
0 & 0 & -3\sqrt{2} & 0 \\
\end{array}\right].\end{displaymath}
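Since arithmetic slips are easy here, a short Python sketch can confirm the entries numerically. It compares each symbolic entry, evaluated at the example's values, with its closed form in $\sqrt{2}$. The entry $\cos(\alpha)/(2a)$ in the second row is inferred from the $y$ mapping $-a\rightarrow 0$, $a\rightarrow 1$, so it is an assumption of this sketch.

```python
import math

# Example values from the text.
alpha, z_n, z_f, a = math.pi / 4, 4.0, 12.0, 0.75

cos_a, sin_a = math.cos(alpha), math.sin(alpha)
Q = z_f * sin_a / (z_f - z_n)
root2 = math.sqrt(2.0)

# Each entry: (value computed from the symbolic matrix, closed form).
entries = {
    "cos(alpha)/2": (cos_a / 2, root2 / 4),
    "cos(alpha)/(2a)": (cos_a / (2 * a), root2 / 3),  # inferred entry
    "sin(alpha)/2": (sin_a / 2, root2 / 4),
    "Q": (Q, 3 * root2 / 4),
    "sin(alpha)": (sin_a, root2 / 2),
    "-Q z_n": (-Q * z_n, -3 * root2),
}

for name, (computed, closed_form) in entries.items():
    agrees = math.isclose(computed, closed_form, rel_tol=1e-12)
    print(f"{name:16s} {computed: .6f} {closed_form: .6f} {agrees}")
```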

Now we could run some view points through this transform and use our intuition to conclude it is correct (or incorrect), but it might be more insightful to run lines through the transformation to see how they are mapped.

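\begin{displaymath}\begin{array}{ll}
\mbox{Line} & \mbox{Description} \\
(0,\,0,\,t) & \mbox{A line from eye through view frustum center} \\
(t,\,t,\,t) & \mbox{A line from eye through upper-right frustum corner} \\
(-3t/4,\,-3t/4,\,t) & \mbox{A line from eye through lower-left of frustum} \\
\end{array}\end{displaymath}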

The map is two steps: (1) run the points through the matrix; (2) perform a homogeneous divide.

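Step 1:

\begin{displaymath}\left[\begin{array}{cccc}
0 & 0 & t & 1 \\
t & t & t & 1 \\
-\frac{3t}{4} & -\frac{3t}{4} & t & 1 \\
\end{array}\right]
\left[\begin{array}{cccc}
\frac{\sqrt{2}}{4} & 0 & 0 & 0 \\
0 & \frac{\sqrt{2}}{3} & 0 & 0 \\
\frac{\sqrt{2}}{4} & \frac{\sqrt{2}}{4} & \frac{3\sqrt{2}}{4} & \frac{\sqrt{2}}{2} \\
0 & 0 & -3\sqrt{2} & 0 \\
\end{array}\right]
=
\left[\begin{array}{cccc}
\frac{\sqrt{2}t}{4} & \frac{\sqrt{2}t}{4} & \frac{(3t-12)\sqrt{2}}{4} & \frac{\sqrt{2}t}{2} \\
\frac{\sqrt{2}t}{2} & \frac{7\sqrt{2}t}{12} & \frac{(3t-12)\sqrt{2}}{4} & \frac{\sqrt{2}t}{2} \\
\frac{\sqrt{2}t}{16} & 0 & \frac{(3t-12)\sqrt{2}}{4} & \frac{\sqrt{2}t}{2} \\
\end{array}\right].
\end{displaymath}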

Step 2:
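The homogeneous divide can be sketched numerically. The Python below is an illustration, not part of the original notes: it assumes the numeric view-to-clip matrix for $\alpha=\pi/4$, $z_n=4$, $z_f=12$, $a=0.75$ (with $\sqrt{2}/3$ as the inferred $y$ scale), and the names `to_clip` and `lines` are made up for the example.

```python
import math

root2 = math.sqrt(2.0)

# Numeric view-to-clip matrix for the example (row-vector convention).
# The sqrt(2)/3 entry is inferred from the y mapping, so it is an assumption.
C = [
    [root2 / 4, 0.0, 0.0, 0.0],
    [0.0, root2 / 3, 0.0, 0.0],
    [root2 / 4, root2 / 4, 3 * root2 / 4, root2 / 2],
    [0.0, 0.0, -3 * root2, 0.0],
]


def to_clip(point):
    """Step 1 (matrix product) followed by Step 2 (divide by w)."""
    x, y, z, w = (sum(point[i] * C[i][j] for i in range(4)) for j in range(4))
    return (x / w, y / w, z / w)


# The three lines from the table, parameterized by t = z_v.
lines = {
    "center": lambda t: (0.0, 0.0, t, 1.0),
    "upper right": lambda t: (t, t, t, 1.0),
    "lower left": lambda t: (-3 * t / 4, -3 * t / 4, t, 1.0),
}

for name, line in lines.items():
    for t in (4.0, 8.0, 12.0):  # near plane, midway, far plane
        x, y, z = to_clip(line(t))
        print(f"{name:12s} t={t:4.1f} -> ({x:.4f}, {y:.4f}, {z:.4f})")
```

Carried out by hand, the divide gives $(1/2,\,1/2,\,3(t-4)/(2t))$ for the center line, $(1,\,7/6,\,3(t-4)/(2t))$ for the second line, and $(1/8,\,0,\,3(t-4)/(2t))$ for the third. In each case $z$ runs from 0 at $t=z_n=4$ to 1 at $t=z_f=12$. The value $y=7/6$ falls outside the clip cube, which suggests that $(t,\,t,\,t)$ passes through the corner of the square ($a=1$) frustum rather than the $a=0.75$ one.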

Conclusions

Let's summarize where we are. First we map objects from model space to view space by composing the modeling transforms that place objects in the world with the view transform that presents the world as seen from a virtual camera (our cyclops eye point). In view space we perform illumination and maybe some hidden surface analysis. From view space we map into clip space using the transformation constructed above. Clip space is chosen to make whichever clipping algorithm we use (there are many of them) as efficient as possible. Following Blinn [1], we have chosen $0\leq x \leq 1$, $0\leq y \leq 1$, $0\leq z \leq 1$ as the clipping cube. To construct the map correctly we need to know in advance the shape of the viewport where the scene will be displayed. That is, we need to know the aspect ratio $a=\mbox{height}:\mbox{width}$ of the display window.

Another fact we noted was that points in the eye-plane cannot be projected: there $z_v=0$, so $w=z_v\sin(\alpha)=0$ and the homogeneous divide is a divide by zero. A corollary is that objects behind the eye-plane, where $w<0$, will be inverted upon projection.
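Both effects are easy to see numerically. The sketch below is a minimal illustration, assuming the perspective-space relations $x=x_v\cos(\alpha)$ and $w=z_v\sin(\alpha)$; the function name `projected_x` is hypothetical.

```python
import math


def projected_x(x_v, z_v, alpha=math.pi / 4):
    """Perspective-space x after the homogeneous divide.

    Assumes x = x_v cos(alpha) and w = z_v sin(alpha).
    """
    x = x_v * math.cos(alpha)
    w = z_v * math.sin(alpha)
    return x / w


# The same x_v in front of the eye and behind it: the sign flips.
in_front = projected_x(1.0, 2.0)
behind = projected_x(1.0, -2.0)
print(in_front, behind)

# A point in the eye-plane (z_v = 0) has w = 0 and cannot be projected.
try:
    projected_x(1.0, 0.0)
    eye_plane_failed = False
except ZeroDivisionError:
    eye_plane_failed = True
print("eye-plane point rejected:", eye_plane_failed)
```

This is one reason clipping is done against $w$ before the divide rather than after it.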

Bibliography

[1] J. Blinn, A trip down the graphics pipeline: Line clipping, IEEE Computer Graphics and Applications, 11 (1991), pp. 98-105.



William Shoaff
2000-09-13