# GJK: Common Implementation Mistake

Thanks to some cool discussions with my friend Nathan Carlson I’ve learned a bit about GJK. There are two versions floating around in the computer science community, and one of them is much easier to implement than the other. I will assume readers have a basic understanding of the theory (general idea) of how GJK works before reading.

The easier style of GJK is colloquially referred to as “bastardized”, in that it isn’t truly the original GJK, but instead a modified version for boolean collision detection. The idea is to evolve through the volume of two convex polyhedra until an axis of separation is found. Since any axis of separation can work a few assumptions can be made during the implementation which can lead to interesting optimizations. Such optimizations, however, can lead to a common pitfall in the tetrahedron case.

Most implementations of the tetrahedron case will conduct 3 dot products and execute a series of branches, or a switch statement. When arriving at the tetrahedron case it is known the origin of Minkowski space is above (or below) a hyperplane defined by the previous triangular simplex. The three dot products are used to see which of the newly created face normals on the tetrahedron point towards the origin. The fatal assumption is that these three dot products alone can be used to deduce which Voronoi region contains the origin. There are 15 different voronoi regions for the tetrahedron case, and the three dot products alone can be used only to test the case of the tetrahedron containing the origin.

When imagining two 3D faces adjacent to one another, where the angle between each face normal is acute (or in other words the two faces are nearly coplanar), it is easy to notice that the dot product of a point above both faces can be positive. While positive a point can still be entirely within the Voronoi region of only one of the two faces, instead of within the Voronoi region of the shared edge. This means that the tetrahedron’s barycentric coordinates are not enough information to deduce which of the 15 voronoi regions the origin lays within.

If this mistake is made the simplex can become a triangle when one of the edges of this triangle is truly the closest feature to the origin. Similarly a face can be chosen when instead a vertex region should have been chosen. An entire iteration upon a triangle will be performed! What’s even worse is if the triangle iteration assumes that certain voronoi regions were checked the previous iteration and skips them, then this assumption may be invalid after a tetrahedron iteration.

So although incorrect, this mistake on its own will not cause GJK to fail but converge with more iterations. This mistake can also give rise infinite (and sometimes complex) cycling without ever converging (see comments below).

Share

# Transformations: Change of Basis Matrix

When dealing with tensors and having to transform them from one basis to another the formula to do so isn’t all that intuitive, especially when 3D programmers are used to transformation concatenations.

I’m sure any reader familiar with 3D would understand that two transformation matrices can be combined to make a new one with the same effect as the first two. This can look like so:

A * B = C
\label{eq1}

In the above equation the transformations present (A B and C) would be used to transform a single vector. It should noted that this article refers to vectors as points (or more clearly vectors translating the origin). In column major format this would mean that the a given transformation would transform the column of a single vector. Given a matrix A and a vector V, to transform V into V’ the following equation would be used:

A * V = V’
\label{eq2}

Similarly, if V needs to be transformed into the frame of A (which is different than being transformed by the frame of A), the following equation can be used:

A^{-1} * V = V’
\label{eq3}

Say a programmer needs to take a transformation matrix and see what this would look like within another coordinate frame. An example of this would be the need to rotate a rotation matrix, or perhaps transform an intertia tensor from local to world frame. In order to apply a change of basis upon a matrix the following equation must be used, where B is called the change of basis matrix:

A’ = B * A * B^{-1}
\label{eq4}

A Euclidean vector can be thought of as a linear combination of the Euclidean basis. Given a vector v:

V = \begin{bmatrix} x \\ y \\ z \end{bmatrix}
\label{eq5}

And Euclidean basis as a formation of three principal vector axes scaled by scaling factors i, j and k:

E^{3} = i\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, j\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, k\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
\label{eq6}

The vector V can be written as a linear combination of E^3 by replacing the i, j and k components with V’s x, y and z components:

V = x\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, y\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, z\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
\label{eq7}

Following this line of thinking, a 3×3 matrix for some arbitrary rotational transformation actually contains an orthonormal basis of three principal axes. The first column represents the x axis, the second column represents the y axis, and the third column represents the z axis of a given reference frame. Any vector in E^3 can be described by a different combination of i, j and k values in any new basis. Wikipedia actually has a beautiful picture to visualize a vector within two different bases simultaneously:

Here is a simple example of showing V as a combination of the columns of some arbitrary basis matrix:

V = x\begin{bmatrix} a \\ b \\ c \end{bmatrix}, y\begin{bmatrix} d \\ e \\ f \end{bmatrix}, z\begin{bmatrix} g \\ h \\ i \end{bmatrix}
\label{eq8}

Now back to equation \eqref{eq4}, in order to rotate a rotation it might be good to clearly define what this means. This means: given a rotation A it is desired to apply this rotation within the reference frame of B.

Say we have a rotation A that rotates about the x axis by θ. We would like to tilt A slightly, which would tilt the x axis slightly, and apply a rotation of θ about the new slightly tilted axis. The tilt rotation itself is given by B.

Just multiplying A by B, or B by A does not solve this problem. Multiplying A by B would translate to “rotate around the x axis, and then tilt a little bit”. Multiplying B by A would translate to: “Tilt a little bit, then spin around the x axis”. Neither of these are properly translating to “Rotating around the tilted x axis”. Here’s the translation of equation \eqref{eq4}: “Tilt the tilted axis so that it lays upon the x axis, then rotate about the x axis, and tilt once again”. This translation can also be stated in a slightly more formal manner: “Tilt A by inverse B (which aligns A’s x axis into the frame of B), rotate around B’s x axis, and tilt A back into it’s original frame”.

All that equation \eqref{eq4} is doing is transforming basis vectors of A into B’s frame, applying A, and transforming A back into A’s original reference frame. This matches the original clarified problem of “Given a rotation A it is desired to apply this rotation within the reference frame of B”. In more loose terms, this also means to rotate the rotation A by B.