Written by Luka Kerr on August 12, 2019
Introduction
CPU & GPU
- CPU consists of a few cores optimized for sequential serial processing
- GPU has a massively parallel architecture consisting of smaller special purpose cores designed for parallel work
Single Instruction, Multiple Thread (SIMT)
Each processor in the GPU has multiple threads, executing in lock step for a piece of code.
It combines single instruction, multiple data (SIMD) with multithreading.
OpenGL
OpenGL is a low-level 2D/3D graphics API. It is:
- Free and open source
- Cross platform
- Highly optimised
- Designed to use GPUs
OpenGL is used to:
- Transfer data to the graphics memory
- Draw primitive shapes using data
UNSWgraph
Coordinate System
By default the viewport is centred at (0, 0). The left boundary is at x = -1, the right at x = 1, the bottom at y = -1, and the top at y = 1.
Geometry
Tessellation
We can draw polygons by splitting them up into simpler shapes (typically triangles). One simple method is to use a triangle fan.
The steps to draw a polygon using a triangle fan are as follows:
- Start with any vertex of the polygon and move clockwise or counter-clockwise around it
- The first three points form a triangle. Any new points after that form a triangle with the last point and the starting point
Tessellation works for all simple convex polygons, and some concave ones. A tessellated polygon can be drawn with GL_TRIANGLE_FAN.
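As a sketch of the fan ordering (assuming vertices are stored in order as { x, y } arrays rather than any particular UNSWgraph class), the triangles of a fan can be generated like this:
static java.util.List<double[][]> triangleFan(double[][] polygon) {
    java.util.List<double[][]> triangles = new java.util.ArrayList<>();
    for (int i = 1; i < polygon.length - 1; i++) {
        // Every triangle shares the starting vertex, as GL_TRIANGLE_FAN does
        triangles.add(new double[][] { polygon[0], polygon[i], polygon[i + 1] });
    }
    return triangles;
}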
2D Transformations
Model Transformation
A model transformation describes how a local coordinate system maps to the world coordinate system. Each object in a scene has its own local coordinate system.
Coordinate Frames
Each coordinate system has a coordinate frame. A coordinate frame has an origin, and the direction and scale of the x and y axes. Different transformations can be applied to coordinate frames including:
- translate(float x, float y): move the origin of the coordinate frame
- rotate(float degrees): rotate the coordinate frame
- scale(float x, float y): scale the coordinate frame by x and y amounts
Vectors & Matrices
The sum of a point and a vector is a point: $P + \vec{v} = Q$.
The difference between two points is a vector: $\vec{v} = Q - P$.
The sum/difference of two vectors is a vector: $(u_1, u_2)^T \pm (v_1, v_2)^T = (u_1 \pm v_1, u_2 \pm v_2)^T$.
Magnitude: $|\vec{v}| = \sqrt{v^2_1 + v^2_2 + \cdots + v^2_n}$.
Normalisation (direction): $\hat{\vec{v}} = \dfrac{\vec{v}}{|\vec{v}|}$, and $|\hat{\vec{v}}| = 1$.
Dot product: $\vec{u} \cdot \vec{v} = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n$.
Cross product: $\vec{a} \times \vec{b} = (a_2 b_3 - a_3 b_2, a_3 b_1 - a_1 b_3, a_1 b_2 - a_2 b_1)^T$.
Cross product magnitude is the area of the parallelogram formed by: $|\vec{a} \times \vec{b}| = |a| |b| \sin \theta$.
Matrix multiplication: $A_{m \times n} \cdot B_{n \times p} = C_{m \times p}$ where $C_{ij} = (a_{i1} \cdot b_{1j}) + (a_{i2} \cdot b_{2j}) + \cdots + (a_{in} \cdot b_{nj})$.
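These operations are simple to implement directly; a minimal sketch for 3D vectors stored as double[] arrays (illustrative, not the UNSWgraph classes):
static double dot(double[] u, double[] v) {
    return u[0]*v[0] + u[1]*v[1] + u[2]*v[2];
}
static double magnitude(double[] v) {
    return Math.sqrt(dot(v, v));
}
// Normalisation: scale the vector to unit length
static double[] normalize(double[] v) {
    double len = magnitude(v);
    return new double[] { v[0]/len, v[1]/len, v[2]/len };
}
static double[] cross(double[] a, double[] b) {
    return new double[] {
        a[1]*b[2] - a[2]*b[1],
        a[2]*b[0] - a[0]*b[2],
        a[0]*b[1] - a[1]*b[0]
    };
}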
Scene Trees
Scene Tree
A scene tree is used to represent a complex scene. Each node in the scene tree represents an object, and each edge represents the transformation to get from the parent object’s coordinate system to the child’s.
Each node in a scene tree computes its own coordinate frame, draws itself within the frame, and passes the frame down to its children. As such, when a node is transformed, all of its children move with it.
def drawTree(frame):
    # Compute this node's frame from the parent's frame
    frame = frame.translate(...).rotate(...).scale(...)
    # Draw this object in its own frame
    draw(frame)
    # Pass the frame down to the children
    for child in children:
        child.drawTree(frame)
Scene Graph
A scene graph is a more general scene tree. It is a directed acyclic multi-graph.
- Each node can have multiple parents
- Multiple edges can go from parent to child
- Shared nodes are drawn multiple times
Camera
Transformation Pipeline
We transform in 2 stages
\[P_{camera} \overset{view}{\longleftarrow} P_{world} \overset{model}{\longleftarrow} P_{local}\]
The model transform transforms points in the local coordinate system to the world coordinate system.
The view transform transforms points in the world coordinate system to the camera’s coordinate system.
If \(P_{world} = Trans(Rot(Scale(P_{camera})))\) then the view transform is \(P_{camera} = Scale^{-1}(Rot^{-1}(Trans^{-1}(P_{world}))).\)
Coordinate Frames
A 2D coordinate frame is defined by
- An origin $\phi$
- 2 axis vectors $i$ ($x$-axis) and $j$ ($y$-axis)
A point on the coordinate frame can be defined as a displacement from the origin, i.e. $P = p_1 i + p_2 j + \phi$.
Affine Transformations
Transformations between coordinate frames can be represented as matrices
\[\begin{pmatrix} q_1 \\ q_2 \\ 1 \end{pmatrix} = \begin{pmatrix} i_1 & j_1 & \phi_1 \\ i_2 & j_2 & \phi_2 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \\ 1 \end{pmatrix}\]
Similarly for vectors
\[\begin{pmatrix} v_1 \\ v_2 \\ 0 \end{pmatrix} = \begin{pmatrix} i_1 & j_1 & \phi_1 \\ i_2 & j_2 & \phi_2 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ 0 \end{pmatrix}\]
All affine transformations can be expressed as combinations of four basic types:
- Translation
- Rotation
- Scale
- Shear
2D Translation
To translate the origin to a new point $\phi$
\[\begin{pmatrix} q_1 \\ q_2 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & \phi_1 \\ 0 & 1 & \phi_2 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \\ 1 \end{pmatrix}\]
2D Rotation
To rotate a point about the origin by $\theta$
\[\begin{pmatrix} q_1 \\ q_2 \\ 1 \end{pmatrix} = \begin{pmatrix} cos(\theta) & -sin(\theta) & 0 \\ sin(\theta) & cos(\theta) & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \\ 1 \end{pmatrix}\]
Likewise to rotate a vector by $\theta$
\[\begin{pmatrix} v_1 \\ v_2 \\ 0 \end{pmatrix} = \begin{pmatrix} cos(\theta) & -sin(\theta) & 0 \\ sin(\theta) & cos(\theta) & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ 0 \end{pmatrix}\]
2D Scale
To scale a point by factors $(s_x, s_y)$ about the origin
\[\begin{pmatrix} s_x p_1 \\ s_y p_2 \\ 1 \end{pmatrix} = \begin{pmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \\ 1 \end{pmatrix}\]
Likewise to scale a vector by $(s_x, s_y)$
\[\begin{pmatrix} s_x v_1 \\ s_y v_2 \\ 0 \end{pmatrix} = \begin{pmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ 0 \end{pmatrix}\]
2D Shear
To horizontally shear a point by a factor $h$
\[\begin{pmatrix} q_1 \\ q_2 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & h & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \\ 1 \end{pmatrix}\]
To vertically shear a point by a factor $v$
\[\begin{pmatrix} q_1 \\ q_2 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ v & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \\ 1 \end{pmatrix}\]
Decomposing Transformations
Every 2D affine transformation can be decomposed as $M = M_{translate} M_{rotate} M_{scale} M_{shear}$.
To decompose the transform, consider the matrix form
\[\begin{pmatrix} i_1 & j_1 & \phi_1 \\ i_2 & j_2 & \phi_2 \\ 0 & 0 & 1 \end{pmatrix}\]
Then, assuming uniform scaling and no shear, the components of $M$ give
\[\begin{array}{lcl} \text{translation} & = & (\phi_1, \phi_2) \\ \text{rotation angle} & = & atan2(i_2, i_1) \\ \text{scale} & = & |i| \end{array}\]
where $atan2(i_2, i_1)$ is $atan(i_2 / i_1)$ or $tan^{-1}(i_2 / i_1)$, adjusted for the case $i_1 = 0$: if $i_1 = 0$ and $i_2 \ne 0$ we get $90^{\circ}$ if $i_2 > 0$, or $-90^{\circ}$ if $i_2 < 0$.
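A minimal sketch of this decomposition, assuming the matrix is given by the column entries $i_1, i_2$ and $\phi_1, \phi_2$ used above (uniform scale, no shear):
static double[] decompose(double i1, double i2, double phi1, double phi2) {
    double angle = Math.toDegrees(Math.atan2(i2, i1)); // rotation; atan2 handles i1 == 0
    double scale = Math.hypot(i1, i2);                 // |i|
    return new double[] { phi1, phi2, angle, scale };  // translation, rotation, scale
}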
Inverse Transformations
Inverses can be calculated as follows
\[\begin{array}{lcl} M_{translate}^{-1}(d_x, d_y) & = & M_{translate}(-d_x, -d_y) \\ M_{rotate}^{-1}(\theta) & = & M_{rotate}(-\theta) \\ M_{scale}^{-1}(s_x, s_y) & = & M_{scale}(1 / s_x, 1 / s_y) \\ M_{shear}^{-1}(h) & = & M_{shear}(-h) \\ \end{array}\]
Linear Interpolation
\[\begin{array}{lcl} lerp(P, Q, t) & = & P + t(Q - P) \\ lerp(P, Q, t) & = & P(1 - t) + tQ \\ \end{array}\]
where
- $P$ and $Q$ are points
- $t \in [0, 1]$
Lines
Parametric Form
\[L(t) = P + t(Q - P)\]
Point Normal Form
\[\mathbf{n} \cdot (P - L) = 0\]
Line Intersection
Two lines
- $L_{AB} = A + (B - A)t$
- $L_{CD} = C + (D - C)u$
Solve the simultaneous equations $(B - A)t = (C - A) + (D - C)u$.
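A small sketch of solving that system using 2D cross products, assuming points are { x, y } arrays (illustrative representation); it returns null if the lines are parallel:
static double[] intersect(double[] a, double[] b, double[] c, double[] d) {
    double rx = b[0] - a[0], ry = b[1] - a[1];   // B - A
    double sx = d[0] - c[0], sy = d[1] - c[1];   // D - C
    double denom = rx * sy - ry * sx;            // 2D cross product (B - A) x (D - C)
    if (denom == 0) return null;                 // parallel (or coincident) lines
    double qx = c[0] - a[0], qy = c[1] - a[1];   // C - A
    double t = (qx * sy - qy * sx) / denom;
    double u = (qx * ry - qy * rx) / denom;
    return new double[] { t, u };
}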
Polygon Intersection
To check if a point $P$ is in a polygon, we can fire a horizontal ray from $P$ and check how many times it crosses the edges of the polygon. If $n$ is the number of crosses and $n$ is even then $P$ is not in the polygon, if $n$ is odd then $P$ is in the polygon.
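A sketch of this crossing (even-odd) test, again with { x, y } arrays for points:
static boolean inPolygon(double[] p, double[][] poly) {
    boolean inside = false;
    for (int i = 0, j = poly.length - 1; i < poly.length; j = i++) {
        double[] a = poly[i], b = poly[j];
        // Does edge a-b straddle the ray's y value?
        if ((a[1] > p[1]) != (b[1] > p[1])) {
            // x coordinate where the edge crosses the horizontal ray
            double xCross = a[0] + (p[1] - a[1]) / (b[1] - a[1]) * (b[0] - a[0]);
            if (xCross > p[0]) inside = !inside;  // odd number of crossings => inside
        }
    }
    return inside;
}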
Shaders
Shaders are programs executed on the GPU for the purpose of rendering graphics, written in glsl
.
Vertex Shaders
The GPU will execute the vertex shader for every vertex we supply it.
Variable types include
- in: inputs to the shader, different for each vertex
- uniform: inputs to the shader that are the same for every vertex
- out: outputs of the shader
- vec2, vec3, vec4: vectors of different sizes

Variables starting with gl_ have special meaning.
// Incoming vertex position
in vec2 position;
// Takes the input position and returns it as is
void main() {
gl_Position = vec4(position, 0, 1);
}
Fragment Shaders
The GPU will execute the fragment shader for every pixel it draws into the frame buffer.
out vec4 outputColor;
// Output black
void main() {
outputColor = vec4(0, 0, 0, 1);
}
3D
Moving to 3D is simply a matter of adding an extra dimension to our points and vectors
\[P = \begin{pmatrix} p_x \\ p_y \\ p_z \\ 1 \end{pmatrix}, \quad v = \begin{pmatrix} v_x \\ v_y \\ v_z \\ 0 \end{pmatrix}\]
3D Affine Transformations
3D affine transformations have the same structure as 2D but have an extra axis $k$
\[M = \begin{pmatrix} i_1 & j_1 & k_1 & \phi_1 \\ i_2 & j_2 & k_2 & \phi_2 \\ i_3 & j_3 & k_3 & \phi_3 \\ 0 & 0 & 0 & 1 \end{pmatrix}\]
3D Translation
To translate the origin to a new point $\phi$
\[\begin{pmatrix} q_1 \\ q_2 \\ q_3 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & \phi_1 \\ 0 & 1 & 0 & \phi_2 \\ 0 & 0 & 1 & \phi_3 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \\ p_3 \\ 1 \end{pmatrix}\]
3D Rotation
The rotation matrix depends on the axis of rotation. We can decompose any rotation into a sequence of rotations about the $x$, $y$ and $z$ axes.
\[\begin{pmatrix} q_{x_1} \\ q_{x_2} \\ q_{x_3} \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & cos(\theta) & -sin(\theta) & 0 \\ 0 & sin(\theta) & cos(\theta) & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \\ p_3 \\ 1 \end{pmatrix}\]
\[\begin{pmatrix} q_{y_1} \\ q_{y_2} \\ q_{y_3} \\ 1 \end{pmatrix} = \begin{pmatrix} cos(\theta) & 0 & sin(\theta) & 0 \\ 0 & 1 & 0 & 0 \\ -sin(\theta) & 0 & cos(\theta) & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \\ p_3 \\ 1 \end{pmatrix}\]
\[\begin{pmatrix} q_{z_1} \\ q_{z_2} \\ q_{z_3} \\ 1 \end{pmatrix} = \begin{pmatrix} cos(\theta) & -sin(\theta) & 0 & 0 \\ sin(\theta) & cos(\theta) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \\ p_3 \\ 1 \end{pmatrix}\]
3D Scale
To scale a point by factors $(s_x, s_y, s_z)$ about the origin
\[\begin{pmatrix} s_x p_1 \\ s_y p_2 \\ s_z p_3 \\ 1 \end{pmatrix} = \begin{pmatrix} s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \\ p_3 \\ 1 \end{pmatrix}\]
3D Shear
To horizontally shear a point by a factor $h$
\[\begin{pmatrix} q_1 \\ q_2 \\ q_3 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & h & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \\ p_3 \\ 1 \end{pmatrix}\]
Depth
Winding Order
By default, triangles whose vertices are given in counter-clockwise order are treated as front-facing. Triangles with clockwise winding are treated as back-facing.
Back Face Culling
An optimisation called face culling allows non-visible triangles of closed surfaces (back faces) to be culled before rasterisation. This is based on the winding order of the triangle - front faces are CCW by default.
View Volume
A 3D camera shows a 3D view volume. This is the area of space that will be displayed in the viewport. Objects outside the view volume are clipped. The view volume is in camera coordinates.
Orthographic View Volume
The camera is located at the origin in camera co-ordinates and oriented down the negative z-axis.
Canonical View Volume
It is convenient for clipping if we scale all coordinates so that visible points lie within the range $(-1, 1)$. The origin is in the middle of the view volume and the z-axis is flipped.
Perspective Camera
We can define a different kind of projection that implements a perspective camera. The view volume is a frustum. A frustum consists of:
- A left, right, bottom and top
- A near plane, at the distance from the camera where the viewport sits (nothing closer is drawn)
- A far plane, at the maximum distance from the camera (nothing further away is drawn)
public Matrix4 frustum(float left, float right, float bottom,
float top, float near, float far) {
return new Matrix4(new float[] {
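// Entries are listed in column-major order (the usual OpenGL convention)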
2*near/(right-left), 0, 0, 0,
0, 2*near/(top-bottom), 0, 0,
(right+left)/(right-left), (top+bottom)/(top-bottom), -(far+near)/(far-near), -1,
0, 0, -2*far*near/(far-near), 0
});
}
It is sometimes easier to define a frustum with a field of view, aspect ratio and near and far planes:
public Matrix4 perspective(float fovy, float aspectRatio, float near, float far) {
float halfHeight = (float) (near * Math.tan(Math.toRadians(fovy) / 2));
return Matrix4.frustum(
-aspectRatio * halfHeight, aspectRatio * halfHeight,
-halfHeight, halfHeight, near, far
);
}
Assuming a symmetric view volume:
- $\tan(FOV / 2) = \dfrac{height}{2 \times near}$
- $width = height \times aspectRatio$
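As a quick worked check of these relations (hypothetical values): with $fovy = 60^{\circ}$, $aspectRatio \approx 1.78$ and $near = 1$, the perspective method above would call frustum with
\[\begin{array}{rcl} halfHeight & = & near \times \tan(FOV / 2) = \tan(30^{\circ}) \approx 0.577 \\ halfWidth & = & aspectRatio \times halfHeight \approx 1.78 \times 0.577 \approx 1.03 \end{array}\]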
The Transformation Pipeline
To transform a point $P = (p_x, p_y, p_z)^T$:
- Extend to homogeneous coordinates \(P = (p_x, p_y, p_z, 1)^T\)
- Multiply by model matrix to get world coordinates \(P_w = M_{model} P\)
- Multiply by view matrix to get camera coordinates \(P_c = M_{view} P_w\)
- Multiply by projection matrix to get CVV coordinates \(P_{cvv} = M_p P_c\)
- Clip to remove points outside CVV
- Perspective division to eliminate fourth component \(P_n = \frac{1}{p_w} P_{cvv}\)
- Viewport transformation to window coordinates \(P_v = M_{viewport} P_n\)
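Putting the steps together, a minimal sketch of the pipeline using plain column-major float[16] matrices (illustrative helpers, not the UNSWgraph Matrix4 API):
// Multiply a column-major 4x4 matrix by a homogeneous point
static float[] multiply(float[] m, float[] p) {
    float[] q = new float[4];
    for (int row = 0; row < 4; row++)
        for (int k = 0; k < 4; k++)
            q[row] += m[k * 4 + row] * p[k];
    return q;
}

// Transform a local point all the way to window coordinates
static float[] toWindow(float[] model, float[] view, float[] proj,
                        float[] p, int width, int height) {
    float[] pWorld = multiply(model, p);     // local -> world
    float[] pCam   = multiply(view, pWorld); // world -> camera
    float[] pCvv   = multiply(proj, pCam);   // camera -> CVV (clip space)
    float x = pCvv[0] / pCvv[3];             // perspective division
    float y = pCvv[1] / pCvv[3];
    // Viewport transform: map [-1, 1] to pixel coordinates
    return new float[] { (x + 1) / 2 * width, (y + 1) / 2 * height };
}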
Pseudodepth
Bilinear Interpolation
Given
- A point $P$ with $(x, y)$ values
- A point $Q_1$ with $(x_1, y_1, z_1)$, with $y_1 < y$
- A point $Q_2$ with $(x_2, y_2, z_2)$, with $y_2 > y$
- A point $Q_3$ with $(x_3, y_3, z_3)$, with $x_3 > x$
- A point $Q_4$ with $(x_4, y_4, z_4)$, with $x_4 < x$
then
\[\begin{array}{lcl} f(R_1) = \dfrac{y - y_1}{y_2 - y_1} f(Q_2) + \dfrac{y_2 - y}{y_2 - y_1} f(Q_1) \\ \\ f(R_2) = \dfrac{y - y_3}{y_4 - y_3} f(Q_4) + \dfrac{y_4 - y}{y_4 - y_3} f(Q_3) \\ \\ f(P) = \dfrac{x - x_1}{x_2 - x_1} f(R_2) + \dfrac{x_2 - x}{x_2 - x_1} f(R_1) \end{array}\]where $f(Q_n)$ gives the value $z_n$, or the depth, when computing pseudodepth.
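Independent of the exact notation above, a generic bilinear interpolation over an axis-aligned rectangle can be sketched as follows, where f00, f10, f01, f11 are the samples at the four corners (illustrative helper):
static double bilerp(double x, double y,
                     double x0, double x1, double y0, double y1,
                     double f00, double f10, double f01, double f11) {
    double tx = (x - x0) / (x1 - x0);
    double ty = (y - y0) / (y1 - y0);
    double bottom = (1 - tx) * f00 + tx * f10; // interpolate along the row y = y0
    double top    = (1 - tx) * f01 + tx * f11; // interpolate along the row y = y1
    return (1 - ty) * bottom + ty * top;       // interpolate between the two rows
}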
Clipping
Cohen-Sutherland Algorithm
- Used for line clipping on a rectangular clipping window
- Divides a 2D space into 9 regions and then determines the lines and portions of lines that are visible in the viewport
Cyrus Beck Algorithm
- Used with a convex polygon clipping window
- More efficient than Cohen-Sutherland
Meshes
3D objects can be represented as polygonal meshes. Usually polygonal meshes are made up of a collection of triangles as they are easier to work with, although they require more memory.
Mesh Data Structure
It is common to represent a polygon mesh in terms of lists:
- Vertex list: all the vertices used in the mesh
- Face list: each face’s vertices as indices into the vertex list
For example, given a cube we can construct a triangle mesh as follows
vertex_list = {
{ -1, -1, 1 }, // Vertex 0, with (x, y, z) coordinates
{ 1, -1, 1 }, // Vertex 1, with (x, y, z) coordinates
{ 1, 1, 1 }, // ...
{ -1 , 1, 1 },
{ -1, -1, -1 },
{ 1, -1, -1 },
{ 1, 1, -1 },
{ -1, 1, -1 }
};
face_list = {
{ 0, 1, 2 }, // Triangle face 0, made up of vertices 0, 1, 2
{ 2, 3, 0 }, // Triangle face 1, made up of vertices 2, 3, 0
{ 1, 5, 6 }, // ...
{ 6, 2, 1 },
{ 5, 4, 7 },
{ 7, 6, 5 },
{ 4, 0, 3 },
{ 3, 7, 4 },
{ 3, 2, 6 },
{ 6, 7, 3 },
{ 4, 5, 1 },
{ 1, 0, 4 }
}
.ply Format
The PLY format is a simple file format for representing polygon data.
The header specifies how many vertices and how many faces there are and what format they’re in. The body lists all the vertices and faces.
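A minimal hypothetical ASCII .ply file describing a single triangle might look like:
ply
format ascii 1.0
element vertex 3
property float x
property float y
property float z
element face 1
property list uchar int vertex_index
end_header
0 0 0
1 0 0
0 1 0
3 0 1 2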
Lighting
Illumination
The colour of an object in a scene depends on
- The colour and amount of light that falls on it
- The colour and reflectivity of the object
There are two kinds of reflection we need to deal with, diffuse and specular.
Dull or matte surfaces exhibit diffuse reflection. Light falling on the surface is reflected uniformly in all directions. It does not depend on the viewpoint.
Polished surfaces exhibit specular reflection. Light falling on the surface is reflected at the same angle. Reflections will look different from different view points.
Components
Most objects will have both a diffuse and a specular component to their illumination. We will also include an ambient component to cover lighting from indirect sources.
We will build a lighting equation $I(P) = I_{ambient}(P) + I_{diffuse}(P) + I_{specular}(P)$, where $I(P)$ is the amount of light coming from $P$ to the camera.
To calculate the lighting equation we need to know three important vectors
- The normal vector $\mathbf{m}$ to the surface $P$
- The view vector $\mathbf{v}$ from $P$ to the camera
- The source vector $\mathbf{s}$ from $P$ to the light source
All three of these vectors need to be normalised.
If we have multiple lights, we add the ambient, diffuse and specular components from all of them:
\[I(P) = \sum_{l \in lights} I_{ambient}^l(P) + I_{diffuse}^l(P) + I_{specular}^l(P)\]
Diffuse Illumination
Diffuse scattering is equal in all directions so does not depend on the viewing angle. The amount of reflected light depends on the angle of the source of the light.
This can be formalised as:
\[I_{diffuse} = max(0, I_{si} \rho_d(\mathbf{s} \cdot \mathbf{m}))\]
where
- $I_{si}$ is the source intensity
- $\rho_d$ is the diffuse reflection coefficient in $[0, 1]$
Specular Reflection
Phong Model
The Phong model is an approximate model that computes the reflection vector $\mathbf{r}$ and uses it to calculate the specular component.
\[\begin{array}{rrcl} & \mathbf{r} & = & normalise(-\mathbf{s} + 2(\mathbf{s} \cdot \mathbf{m}) \mathbf{m}) \\ \therefore & I_{specular} & = & max(0, I_{si} \rho_{sp} (\mathbf{r} \cdot \mathbf{v})^f) \end{array}\]where
- $I_{si}$ is the source intensity
- $\rho_{sp}$ is the specular reflection coefficient in the range $[0, 1]$
- $f$ is the phong exponent, typically in the range $[1, 128]$
Blinn Phong Model
The Blinn Phong model uses a vector halfway between the source and the viewer instead of calculating the reflection vector.
\[\begin{array}{rrcl} & \mathbf{h} & = & normalise(\dfrac{\mathbf{s} + \mathbf{v}}{2}) \\ \therefore & I_{specular} & = & max(0, I_{si} \rho_{sp} (\mathbf{h} \cdot \mathbf{m})^f) \end{array}\]where
- $I_{si}$ is the source intensity
- $\rho_{sp}$ is the specular reflection coefficient in the range $[0, 1]$
- $f$ is the phong exponent, typically in the range $[1, 128]$
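A minimal sketch of evaluating the lighting equation at a single point with the Blinn Phong specular term (illustrative helper names, vectors as double[] arrays, s, v and m assumed normalised):
static double dot3(double[] a, double[] b) {
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2];
}
static double[] normalize3(double[] v) {
    double len = Math.sqrt(dot3(v, v));
    return new double[] { v[0]/len, v[1]/len, v[2]/len };
}
// s = source vector, v = view vector, m = surface normal at P
static double illumination(double[] s, double[] v, double[] m,
                           double srcI, double ambI,
                           double rhoA, double rhoD, double rhoSp, double f) {
    double[] h = normalize3(new double[] { s[0]+v[0], s[1]+v[1], s[2]+v[2] }); // halfway vector
    double ambient  = ambI * rhoA;
    double diffuse  = Math.max(0, srcI * rhoD * dot3(s, m));
    double specular = Math.max(0, srcI * rhoSp * Math.pow(dot3(h, m), f));
    return ambient + diffuse + specular;
}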
Ambient Light
Lighting with just diffuse and specular lights gives very stark shadows, whereas in reality shadows are not completely black as light is coming from all directions.
\[I_{ambient} = I_a \rho_a\]where
- $I_a$ is the ambient light intensity
- $\rho_a$ is the ambient reflection coefficient in the range $(0, 1)$
Lights
Point Lights
Point lights have a point in the world from which we compute the source vector.
Directional Lights
Directional lights are so far away that the source vector is effectively the same everywhere.
Spotlights
A spotlight has a direction, a cutoff angle and an attenuation factor. The brightness can be calculated as:
\[I = I_{si} (\cos(\theta))^\epsilon\]where
- $I_{si}$ is the source intensity
- $\theta$ is the angle
- $\epsilon$ is the attenuation factor
Face Normals
Every vertex for a given face will be given the same normal.
To calculate the normal we can use either the cross product method or Newell’s method.
Cross Product Method
Works for any convex polygon, but vertices must be in counter-clockwise order.
Pick two non-parallel adjacent edges with vertices $P_{i - 1}, P_{i + 1}$ and calculate for point $P_i$:
\[\mathbf{n} = (P_{i + 1} - P_i) \times (P_{i - 1} - P_i).\]
private Vector3 normal(Point3D p1, Point3D p2, Point3D p3) {
Vector3 a = p2.minus(p1);
Vector3 b = p3.minus(p1);
return a.cross(b);
}
Newell’s Method
A more robust approach to computing face normals for arbitrary polygons:
\[\begin{array}{lcl} n_x & = & \sum_{i = 0}^{N - 1} (y_i - y_{i + 1})(z_i + z_{i + 1}) \\ n_y & = & \sum_{i = 0}^{N - 1} (z_i - z_{i + 1})(x_i + x_{i + 1}) \\ n_z & = & \sum_{i = 0}^{N - 1} (x_i - x_{i + 1})(y_i + y_{i + 1}) \\ \end{array}\]where $(x_N, y_N, z_N) = (x_0, y_0, z_0)$.
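A sketch of Newell's method, assuming a Point3D class with getX/getY/getZ accessors (the accessor names are an assumption, not necessarily the UNSWgraph API):
static double[] newellNormal(java.util.List<Point3D> verts) {
    double nx = 0, ny = 0, nz = 0;
    int n = verts.size();
    for (int i = 0; i < n; i++) {
        Point3D cur = verts.get(i);
        Point3D next = verts.get((i + 1) % n); // wraps so that vertex N is vertex 0
        nx += (cur.getY() - next.getY()) * (cur.getZ() + next.getZ());
        ny += (cur.getZ() - next.getZ()) * (cur.getX() + next.getX());
        nz += (cur.getX() - next.getX()) * (cur.getY() + next.getY());
    }
    return new double[] { nx, ny, nz };
}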
Vertex Normals
We keep a separate buffer of normals for a mesh, and calculate it by a weighted average of the face normals.
private void vertexNormals() {
// Initialise the normals to the zero vector
for (int i = 0; i < normals.capacity(); i++) {
normals.put(i, 0, 0, 0);
}
// Add the face normals of all surrounding faces.
for (int i = 0; i < indices.capacity() / 3; i++) {
int index1 = indices.get(i*3);
int index2 = indices.get(i*3 + 1);
int index3 = indices.get(i*3 + 2);
Point3D p1 = vertices.get(index1);
Point3D p2 = vertices.get(index2);
Point3D p3 = vertices.get(index3);
Vector3 normal = normal(p1, p2, p3);
normals.put(index1, normals.get(index1).translate(normal));
normals.put(index2, normals.get(index2).translate(normal));
normals.put(index3, normals.get(index3).translate(normal));
}
}
Shading
Flat Shading
Flat shading simply shades the entire face the same colour. It does this by computing $I(P)$ for some $P$ on the face (usually the first vertex) and sets every pixel on the face that colour.
Flat shading is good for
- Diffuse illumination
- Flat surfaces with distant light sources
- Non-realistic/retro rendering
Flat shading is bad for
- Close light sources
- Specular shading
- Curved surfaces
Gouraud Shading
The lighting equation $I(P)$ is calculated for each vertex $P$ using an associated vertex normal.
We calculate fragment colours by bilinear interpolation on neighbouring vertices:
\[\begin{array}{lcl} colour(R_1) & = & lerp(V_1, V_2, \frac{y - y_1}{y_2 - y_1}) \\ colour(R_2) & = & lerp(V_3, V_4, \frac{y - y_3}{y_4 - y_3}) \\ colour(P) & = & lerp(R_1, R_2, \frac{x - x_1}{x_2 - x_1}) \\ \end{array}\]where
- $V_1, V_2, V_3, V_4$ are four vertices surrounding $P$
- $R_1, R_2$ are the interpolated points on either side of $P$
Gouraud shading is good for
- Curved surfaces
- Close light sources
- Diffuse shading
Gouraud shading is bad for
- Specular shading
Phong Shading
The lighting equation is calculated per fragment rather than per vertex.
void main() {
vec3 m_unit = normalize(m);
// Compute the s, v and r vectors
vec3 s = normalize(view_matrix * vec4(lightPos, 1) - viewPosition).xyz;
vec3 v = normalize(-viewPosition.xyz);
vec3 r = normalize(reflect(-s, m_unit));
vec3 ambient = ambientIntensity * ambientCoeff;
vec3 diffuse = max(lightIntensity * diffuseCoeff * dot(m_unit, s), 0.0);
vec3 specular = vec3(0);
// Only show specular reflections for the front face
if (dot(m_unit, s) > 0) {
specular = max(lightIntensity * specularCoeff * pow(dot(r, v), phongExp), 0.0);
}
vec3 intensity = ambient + diffuse + specular;
outputColor = vec4(intensity, 1) * input_color;
}
Phong shading is good for
- Specular shading
- Diffuse shading
Phong shading is bad for
- Slow hardware
Colour
We implement colour by having separate red, green and blue components for light intensities $I(P), I_{ambient}(P), I_{specular}(P)$ and reflection coefficients $\rho_a, \rho_d, \rho_{sp}$. The lighting equation is then applied three times, once for each colour.
Transparency
Alpha Blending
Given an $\alpha$ value and two colours, we can linearly interpolate and compute new $r^\prime, g^\prime, b^\prime, a^\prime$ values based on $\alpha$:
\[\begin{pmatrix} r^\prime \\ g^\prime \\ b^\prime \\ a^\prime \end{pmatrix} = \alpha \begin{pmatrix} r_{image} \\ g_{image} \\ b_{image} \\ a_{image} \end{pmatrix} + (1 - \alpha) \begin{pmatrix} r \\ g \\ b \\ a \end{pmatrix}\]
Alpha blending has some problems, including
- Depends on the order that pixels are drawn
- Need to draw transparent polygons after the polygons behind them
- If using the depth buffer and you try to draw a transparent polygon before the objects behind it, the later objects will not be drawn
In summary, transparent polygons should be drawn in back-to-front order after opaque ones.
Curves
It is generally useful to express curves in parametric form:
\[\begin{pmatrix} p_1 \\ p_2 \\ p_3 \end{pmatrix} = P(t), t \in [0, 1]\]
Linear Interpolation
Linear interpolation is good for straight lines. It is of degree 1 and involves 2 control points $P_0, P_1$.
\[P(t) = (1 - t)P_0 + t P_1\]
Quadratic Interpolation
Quadratic interpolation is good for parabolas. It is of degree 2 and involves 3 control points $P_0, P_1, P_2$.
\[P(t) = (1 - t)^2 P_0 + 2t(1 - t) P_1 + t^2 P_2\]
Cubic Interpolation
Cubic interpolation is good for a variety of curves. It is of degree 3 and involves 4 control points $P_0, P_1, P_2, P_3$.
\[P(t) = (1 - t)^3 P_0 + 3t(1 - t)^2 P_1 + 3t^2(1 - t) P_2 + t^3 P_3\]
Bezier Curves
The above family of curves are known as Bezier curves. They have the general form:
\[P(t) = \sum_{k = 0}^m B_k^m (t) P_k\]where
- $m$ is the degree of the curve
- $P_0 \dots P_m$ are the control points
- $B_k^m$ is a Bernstein polynomial
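For example, a cubic Bezier curve ($m = 3$) can be evaluated directly from the degree-3 formula above; a small sketch with 2D control points stored as { x, y } arrays (illustrative representation):
static double[] cubicBezier(double[] p0, double[] p1, double[] p2, double[] p3, double t) {
    double u = 1 - t;
    // Degree-3 Bernstein weights
    double b0 = u * u * u, b1 = 3 * t * u * u, b2 = 3 * t * t * u, b3 = t * t * t;
    return new double[] {
        b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0],
        b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    };
}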
Bernstein Polynomials
\[B_k^m(t) = \begin{pmatrix} m \\ k \end{pmatrix} t^k (1 - t)^{m - k}\]where
\[\begin{pmatrix} m \\ k \end{pmatrix} = \frac{m!}{k! (m - k)!}\]
Tangents
The tangent vector to the curve at parameter $t$ is given by
\[\begin{array}{rcl} \frac{d P(t)}{dt} & = & \sum_{k = 0}^m \frac{d B_k^m (t)}{dt} P_k \\ & = & m \sum_{k = 0}^{m - 1} B_k^{m - 1} (t) (P_{k + 1} - P_k) \end{array}\]
Modelling
Extrusion
Extruded shapes are created by sweeping a 2D polygon along a line or curve.
We can extrude a polygon along a path by specifying it as a series of transformations:
\[\begin{array}{rcl} poly & = & P_0, P_1, \dots, P_k \\ path & = & M_0, M_1, \dots, M_n \end{array}\]and at each point in the path, we calculate a cross section $poly_i = M_i P_0, M_i P_1, \dots, M_i P_k$.
Frenet Frame
We can get the curve values at various points $t_i$ and then build a polygon perpendicular to the curve at $C(t_i)$ using a Frenet frame.
We align the $k$ axis with the (normalised) tangent, and choose values of $i$ and $j$ to be perpendicular:
\[\begin{array}{rcl} \phi & = & C(t) \\ k & = & \hat{C^\prime}(t) \\ i & = & \begin{pmatrix}-k_2 \\ k_1 \\ 0\end{pmatrix} \\ j & = & k \times i \end{array}\]
To find the tangent ($k$) we can:
- Use maths \(\begin{array}{rcl} C(t) & = & (\cos(t), \sin(t), bt) \\ T(t) & = & normalise(-\sin(t), \cos(t), b) \end{array}\)
- Approximate \(T(t) = normalise(C(t + 1) - C(t - 1))\)
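A sketch of building such a frame from a curve function using the finite-difference tangent (illustrative types, not the UNSWgraph API):
interface Curve3 { double[] value(double t); }   // C(t) -> { x, y, z }

static double[] normalizeV(double[] v) {
    double len = Math.sqrt(v[0]*v[0] + v[1]*v[1] + v[2]*v[2]);
    return new double[] { v[0]/len, v[1]/len, v[2]/len };
}
static double[] crossV(double[] a, double[] b) {
    return new double[] { a[1]*b[2] - a[2]*b[1],
                          a[2]*b[0] - a[0]*b[2],
                          a[0]*b[1] - a[1]*b[0] };
}

// Returns { i, j, k, phi }: the frame axes and origin at C(t)
static double[][] frenetFrame(Curve3 c, double t, double dt) {
    double[] phi = c.value(t);
    double[] a = c.value(t + dt), b = c.value(t - dt);
    double[] k = normalizeV(new double[] { a[0]-b[0], a[1]-b[1], a[2]-b[2] }); // tangent
    double[] i = normalizeV(new double[] { -k[1], k[0], 0 });                  // perpendicular to k
    double[] j = crossV(k, i);                                                 // completes the frame
    return new double[][] { i, j, k, phi };
}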
Revolution
A surface with radial symmetry can be made by sweeping a half cross-section around an axis.
Textures
Textures are a way to add detail to models without requiring too many polygons. A texture is basically a function that maps texture coordinates to pixel values:
\[T(s, t) = \begin{pmatrix} r \\ g \\ b \\ a \end{pmatrix}\]where $(s, t)$ is a texture coordinate.
Textures are most commonly represented by bitmaps. Pixels on a texture are called texels.
Textures & Shading
The simplest way to approach textures is to replace illumination calculations with a texture look up: $I(P) = T(s(P), t(P))$.
A more common solution is to use the texture to modulate the ambient and diffuse reflection coefficients: $I(P) = T(s, t)[I_{ambient} + I_{diffuse}] + I_{specular}$.
Magnification
Normal bitmap textures have finite detail, and if we zoom in close we can see individual texels. To fix this we can:
- Choose the nearest texel
- Bilinear filter between the nearest four texels
Minification
A similar problem occurs when we zoom out, that is that multiple texels map to a single pixel. This is known as aliasing and occurs when samples are taken from an image at a lower resolution than repeating detail in the image.
MIP mapping
Mipmaps are precomputed low resolution versions of a texture. To use mipmaps, we simply choose the next smallest mipmap for the required resolution.
Filtering
Trilinear filtering
Use bilinear filtering to compute pixel values based on the next highest and the next lowest mipmap resolutions, and interpolate between these values depending on the desired resolution.
Anisotropic Filtering
If a polygon is on an oblique angle away from the camera, then minification may occur much more strongly in one dimension than the other. Anisotropic filtering treats the two axes independently.
Reflection Mapping
Generating reflections is expensive, but we can make a reasonable approximation with textures:
- Generate a cube that encloses the reflective object
- Place a camera at the centre of the cube and render the outside world onto the faces of the cube
- Use this image to texture the object
Shadow Buffer
To take into account whether a light source is occluded by another object, we can keep a shadow buffer for each light source. It is similar to the depth buffer: it records the distance from the light source to the closest object in each direction.
Shadow rendering is usually done in multiple passes:
- Render the scene from each light’s viewpoint capturing only $z$-info in shadow buffer
- Render the scene from camera’s point of view, using the previously captured shadow buffers to modulate the fragments
Pros:
- Provides realistic shadows
- No knowledge or processing of the scene geometry is required
Cons:
- More computation
- Shadow quality is limited by precision of shadow buffer
- Shadow edges are hard to compute
Light Mapping
If our light sources and large portions of the geometry are static then we can precompute the lighting equations and store the results in textures called light maps. This process is known as baked lighting.
Pros:
- Sophisticated lighting effects can be computed at compile time, where speed is less of an issue
Cons:
- Memory and loading times for many individual light maps
- Not suitable for dynamic lights or moving objects
- Potential aliasing effects
Rasterisation
Rasterisation is the process of converting lines and polygons represented by their vertices into fragments.
Bresenham’s Algorithm
Given two points $(x_0, y_0), (x_1, y_1)$ with $x_0 \le x_1$ and a gradient between 0 and 1 (other octants follow by symmetry), we can draw a line using only integer calculations:
int y = y_0;
int w = x_1 - x_0;
int h = y_1 - y_0;
int F = 2 * h - w;
for (int x = x_0; x <= x_1; x++) {
drawPixel(x, y);
if (F < 0) {
F += 2 * h;
} else {
F += 2 * (h - w);
y++;
}
}
Polygon Filling
Shared Edges
Pixels on shared edges between polygons need to be drawn consistently regardless of the order the polygons are drawn, with no gaps. We adopt a rule that we do not draw rightmost or uppermost edge pixels.
Scanline Algorithm
// For every scanline
for (y = minY; y <= maxY; y++) {
remove all edges that end at y
for (Edge e : active) {
e.x += e.inc;
}
add all edges that start at y - keep list sorted by x
for (int i = 0; i < active.size; i += 2) {
fillPixels(active[i].x, active[i + 1].x, y);
}
}
Edge Table
The edge table is a lookup table indexed on the y-value of the lower vertex of the edge. Horizontal edges are not added. We store the x-value of the lower vertex, the increment (inverse gradient, i.e. $run / rise$) of the edge and the y-value of the upper vertex.
Active Edge List
We keep a list of active edges that overlap the current scanline. Edges are added to the list as we pass the bottom vertex. Edges are removed from the list as we pass the top vertex.
For each edge in the active edge list we store the x-value of its crossing with the current row (initially the bottom x-value), the increment (inverse gradient) and the y-value of the top vertex.
Antialiasing
There are two basic approaches to eliminating aliasing:
- Prefiltering: compute the amount occupied by a pixel and set its value to that percentage
- Postfiltering: taking samples at a higher resolution and then averaging
Particle Systems
Particles are usually represented as small textured quads or point sprites (single vertices with an image attached). They are billboarded, i.e. transformed so that they always face towards the camera.
Particles are created by an emitter object and evolve over time, usually changing their properties.
GPU
Particle systems are well suited to implementation as vertex shaders, where particles are represented as individual vertices.
Ray Tracing
In ray tracing, objects are modelled as implicit forms and each pixel is computed by casting a ray and seeing which models it intersects.
Global Lighting
In reality, the light falling on a surface comes from everywhere. Methods that take this lighting into account are called global lighting methods.
There are two main methods for global lighting:
- Raytracing models specular reflection and refraction
- Radiosity models diffuse reflection
Both methods are computationally expensive and are rarely suitable for real-time rendering.
Pseudocode
for each pixel (x, y):
v = P(x, y) - E
hits = {}
for each object obj in the scene:
E' = M_inverse * E
v' = M_inverse * v
hits.add(obj.hit(E', v'))
hit = h in hits with min time > 1
if (hit is null):
set (x, y) to background
else:
set (x, y) to hit.obj.color(R(hit.time))
Shadows
At each hit point we cast a new ray towards each light source. These rays are called shadow feelers.
Reflections
We can implement realistic reflections by casting further reflected rays.
Reflected rays can be reflected off many objects one after the other, and so we usually stop after a fixed number of reflections to avoid infinite recursion.
Transparency
We can model transparent objects by casting a second ray that continues through the object.
Illumination
The illumination equation is extended to include reflected and transmitted components, which are computed recursively:
\[I(P) = I_{ambient} + I_{diffuse} + I_{specular} + I(P_{reflection}) + I(P_{transparent}).\]
Refraction
To handle transparency appropriately we need to take into account the refraction of light. Light bends as it moves from one medium to another. The change is described by Snell’s Law:
\[\frac{\sin \theta_1}{c_1} = \frac{\sin \theta_2}{c_2}\]where $c_1, c_2$ are the speeds of light in each medium.
For maximum realism, we should calculate different paths for different colours, since different wavelengths of light move at different speeds.
Extents
Extents are bounding boxes or spheres which enclose an object. Testing for a hit against a box or sphere is fast, and if the test succeeds, then we can proceed to test against the actual object.
Binary Space Partitioning (BSP)
A BSP tree divides the world into cells, where each cell contains a small number of objects. When casting a ray, we only want to visit the leaves of the tree that the ray passes through.
Traversal Algorithm Pseudocode
visit(E, v, node):   # E is the eye point, v the ray direction
if (node is leaf):
intersect ray with objs in leaf
else:
if (E on left):
visit(E, v, left)
other = right;
else:
visit(E, v, right)
other = left
endif
if (ray crosses boundary):
E' = intersect(E, v, boundary)
visit(E', v, other)
endif
endif
Volumetric Ray Tracing
We can also apply ray tracing to volumetric objects like smoke or fog or fire. Such objects are transparent, but have different intensity and transparency throughout the volume.
We represent the volume as two functions:
\[\begin{array}{rcl} C(P) & = & \text{colour at point} \ P \\ \alpha(P) & = & \text{transparency at point} \ P \end{array}\]
Sampling
We cast a ray from the camera through the volume and take samples at fixed intervals along the ray. We end up with $N + 1$ samples:
\[\begin{array}{rcl} P_i & = & R(t_{hit} + i \Delta t) \\ C_i & = & C(P_i) \\ \alpha_i & = & \alpha(P_i) \\ C_N & = & (r, g, b)_{background} \\ \alpha_N & = & 1 \end{array}\]
Alpha Compositing
We now combine these values into a single colour by applying the alpha-blending equation.
\[\begin{array}{ccl} C_N^N & = & C_N \\ \underbrace{C_N^i}_\text{Total colour at $i$} & = & \alpha_i \underbrace{C_i}_\text{Local colour at $i$} + (1 - \alpha_i) \underbrace{C_N^{i + 1}}_\text{Total colour at $i + 1$} \end{array}\]We can write a closed formula for the colour from $a$ to $b$ as
\[C_b^a = \sum_{i = a}^b \alpha_i C_i \prod_{j = a}^{i - 1} (1 - \alpha_j)\]
Colour
Colour Blending
We write $C = rR + gG + bB$ to indicate that colour $C$ is equivalent to $r$ units of red, $g$ units of green and $b$ units of blue.
Complementary Colours
Colours that add to give white are called complementary colours.
RGB
Colour is expressed as the addition of red, green and blue components:
\[C(r, g, b) = rR + gG + bB.\]
CMY
CMY is a subtractive colour model, typically used in describing printed media. It uses cyan, magenta and yellow, which are the complements of red, green and blue.
CMYK
CMYK is similar to CMY, but has an additional component K, which stands for ‘key’ and refers to black ink.
HSV
Colour is expressed as three values:
- $H$: the hue as an angle from $0^\circ$ (red) to $360^\circ$ (red)
- $S$: the saturation from 0 (grey) to 1 (full colour)
- $V$: the brightness from 0 (black) to 1 (bright colour)
HLS
Colour is expressed in terms of its:
- Hue: the colour of the dominant wavelength
- Luminance: the total power of the light
- Saturation: the purity of the light
CIE
The CIE standard, also known as the XYZ model, is a way of describing colours as a three dimensional vector $(x, y, z)$ with
\[\begin{array}{c} 0 \le x, y, z \le 1 \\ x + y + z = 1 \end{array}\]
Gamut
Any output device has a certain range of colours it can represent, which we call its gamut. We can depict this as an area on the CIE chart.
Rendering Intents
There are four standard rendering intents which describe approaches to gamut mapping.
Absolute Colorimetric
Map $C$ to the nearest point within the gamut. This approach distorts hues and does not preserve relative saturation.
Relative Colorimetric
Desaturate $C$ until it lies in the gamut. This approach maintains hues more closely but does not preserve relative saturation.
Perceptual
Desaturate all colours until they all lie in the gamut. This approach maintains hues and preserves relative saturation but removes a lot of saturation.
Saturation
Attempt to maintain saturated colours. This approach is not standardised.
Postprocessing
Postprocessing is the process of adding visual effects after the scene has already been rendered.
Frame Buffer Objects
The frame buffer OpenGL gives us by default is very limited in most implementations.
We can create our own frame buffer objects with:
int[] fbos = new int[1];
gl.glGenFramebuffers(1, fbos, 0);
In order to use the frame buffer we have to attach at least a colour buffer to it, and optionally a depth or other buffers.
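A rough sketch of attaching a colour texture to the new frame buffer, using the same JOGL-style gl object as above (width, height and the filtering parameters are illustrative):
int[] tex = new int[1];
gl.glGenTextures(1, tex, 0);
gl.glBindTexture(GL.GL_TEXTURE_2D, tex[0]);
// Allocate an empty RGBA image to render into
gl.glTexImage2D(GL.GL_TEXTURE_2D, 0, GL.GL_RGBA, width, height, 0,
                GL.GL_RGBA, GL.GL_UNSIGNED_BYTE, null);
gl.glTexParameteri(GL.GL_TEXTURE_2D, GL.GL_TEXTURE_MIN_FILTER, GL.GL_LINEAR);

gl.glBindFramebuffer(GL.GL_FRAMEBUFFER, fbos[0]);
gl.glFramebufferTexture2D(GL.GL_FRAMEBUFFER, GL.GL_COLOR_ATTACHMENT0,
                          GL.GL_TEXTURE_2D, tex[0], 0);
// Rendering now goes into the texture until frame buffer 0 is bound again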
Gaussian Blur
Uses a Gaussian distribution to combine each pixel with its neighbours. The pixel itself is weighted most, and nearby pixels weighted depending on their distance from it.
Bloom Filter
Combines multiple post-processing steps:
- Extract bright regions
- Blur bright regions
- Draw original scene to screen
- Draw blurred brightness on top
High Dynamic Range
By using frame buffer objects we can render into buffers with more color detail. To render to the screen, this colour information has to be mapped to the conventional 8 bits per channel. This process is referred to as tone mapping.
Different tone mappings can do things like reduce overall brightness or enhance certain colours.
Ambient Occlusion
Approximates indirect lighting by identifying small holes, crevices and gaps and making them darker. For each pixel that has already been rendered, calculate an occlusion factor based on the depth of it and the depth of surrounding pixels.
Deferred Shading
In deferred shading we render the scene into multiple buffers containing basic information (e.g. normals, colour, depth). Actual lighting calculations can then be deferred to a post processing step.
Splines
A spline is a smooth piecewise-polynomial function. The places where the polynomials join are called knots. A joined sequence of Bezier curves is an example of a spline.
A spline provides local control, in that a control point only affects the curve within a limited neighbourhood.
B-splines
We can generalise Bezier splines into a larger class called basis splines or B-splines. A B-spline of degree $M$ has equation:
\[P(t) = \sum_{k = 0}^L N_k^m (t) P_k\]where $L$ is the number of control points, with $L > m$.
Uniform / Non-uniform
- Uniform B-splines have equally spaced knots
- Non-uniform B-splines allow knots to be positioned arbitrarily and even repeat
- A multiple knot is a knot value that is repeated several times
Rational Bezier Curves
We can create a greater variety of curve shapes if we weight the control points:
\[P(t) = \frac{\sum_{k = 0}^m w_k B_k^m (t) P_k}{\sum_{k = 0}^m w_k B_k^m (t)}\]where a higher weight draws the curve closer to that point.
Rational Bezier curves can exactly represent all conic sections, which is not possible with normal Bezier curves.
Rational B-splines
We can also weight control points in B-splines to get rational B-splines:
\[P(t) = \frac{\sum_{k = 0}^L w_k N_k^m (t) P_k}{\sum_{k = 0}^L w_k N_k^m (t)}.\]