Deriving OBB to OBB Intersection and Manifold Generation

The Oriented Bounding Box (OBB) is commonly used all over the place in terms of simulation and game logic. Although often times OBBs aren’t actually bounding anything, and really just represent an implicitly defined box.

OBBs commonly come in 2D and 3D form, though there aren’t all that many resources on the internet describing how to derive the 3D case. The ability to write robust collision detection, and more importantly manifold generation (contact data needed to resolve the collision), relies on a good understanding of linear algebra and geometry.

Readers will need to know about affine and linear transformations, and have intuition about the dot and cross products. To learn these preliminaries I highly recommend the book “Essential Mathematics” by Jim Van Verth. I will also be assuming readers are familiar with the Separating Axis Theorem (SAT). If you aren’t familiar with SAT please do a short preliminary research on this topic.

The fastest test I’m aware of for detection collision between two OBBs, in either 2 or 3 dimensions, makes use of the Separating Axis Theorem. In both cases it is beneficial to transform one OBB into the space of the other in order to simplify particular calculations.

Usually an OBB can be defined something like this in C++:

6 Face Axes

As with collision detection against two AABBs, OBBs have x y and z axes that correspond to three different potential axes of separation. Since an OBB has local oriented axes they will most of the time not line up with the standard Euclidean basis. To simplify the problem of projecting OBB \(A\) onto the the axes of OBB \(B\), all computations ought to be done within the space of \(A\). This is just a convention, and the entire derivation could have been done in the space of \(B\), or even in world space (world space would have more complex, and slow, computations).

The translation vector \(t\) is defined as \(t = R_{a}^{T} * (C_b – C_a)\), where \(C_i\) denotes the center of an OBB, and \(R\) denotes the rotation of an OBB. \(t\) points from \(A\)’s center to \(B\)’s center, within the space of \(A\). The relative rotation matrix to transform from B’s frame to A’s frame is defined as \(C = R_{a}^{T} * R_b\). \(C\) can be used to transform \(B\)’s local axes into the frame of \(A\). In \(A\)’s own frame it’s local axes line up with the standard Euclidean basis, and can be directly computed with the half width extents \(e_a\).

To calculate the separation \(s\) along the local axis \(l\) from an OBB examine the following diagram and matching equation:

OBB_Projection

\begin{equation}
s = |t \cdot l| – (|a \cdot l| + |(C * b) \cdot l|)
\label{eq1}
\end{equation}

Due to symmetry the absolute value of a projection can be used in order to gather a value that can be used to represent a projection along a given direction. In our case we aren’t really interested in the sign of the projection and always want the projection towards the other OBB. I understand that newer programmers may have some trouble translating math into code, so here’s an example of eq.\eqref{eq1} as C++:

s is the separation along an axis. t is the translation vector from center of one box to another. l is the axis we are projecting onto (in this case A’s x axis).

9 Edge Axes

In 2D eq.\eqref{eq1} is all that is needed to detect overlap upon any given possible axis of separation. In 3D a new topological feature arises on the surface of geometry: edges. Because edges form entirely separate Voronoi regions from face surfaces they can also provide possible axes of separation. With OBBs it is fast to test all possible unique edge axes of separation.

Given two edges upon two OBBs in 3D space the vector perpendicular to both would be the possible axis of separation, and can be computed with a cross product. A possible 9 axes of separation arise from crossing each local axis of \(A\) with the each local axis of \(B\).

Since \(C\) is composed of a concatenation between \(R_{a}^{T}\) and \(R_b\), the rows of this matrix can be thought of as the local axes of \(B\) in the space of \(A\). When a cross product between two axes is required to test separation, the axis can be pulled directly out of \(C\). This is much faster than computing a cross product in world space between two arbitrary axis vectors.

Lets derive a cross product between \(A\)’s x axis and \(B\)’s z axis, all within the space of \(A\). In \(A\)’s space \(A\)’s x axis is \(\begin{Bmatrix}1 & 0 & 0\end{Bmatrix}\). In \(A\)’s space \(B\)’s z axis is \(\begin{Bmatrix}C_{20} & C_{21} & C_{22}\end{Bmatrix}\). The axis \(l\) is just the cross product of these two axes. Since \(A\)’s x axis is composed largely of the number \(0\) elements of \(C\) can be used to directly create the result of the cross product, without actually running a cross product function. We arrive to the conclusion that for our current case \(l = \begin{Bmatrix}0 & -C_{22} & C_{21}\end{Bmatrix}\). This particular \(l\) axis can be used in eq.\eqref{eq1} to solve for overlap.

Since the matrix \(C\) is used to derive cross product results the sign of the elements will be dependent on the orientation of the two OBBs. Additionally, the cross product will negate on of the terms in the final axis. Since eq.\eqref{eq1} requires only absolute values it may be a good idea to precompute abs(\(C\)), which takes the absolute value element by element. This computation can be interleaved between the different axis checks for the first 6 checks, only computed as needed. This may let an early out prevent needless computation.

The same idea can be used to derive this axis in the space of \(B\) by crossing \(B\)’s local z axis with \(A\)’s local x axis transformed to the space of \(B\). To do this it may be easy to use \(C^T\), which represents \(R_{b}^{T} * A\). More information about this can be found in my previous article about the properties of transposing a matrix concatenation.

Eventually all axes can be derived on paper and translated into code.

Manifold Generation

A collision manifold would be a small bit of information about the nature of a collision. This information is usually used to resolve the collision during physical simulation. Once an axis of least penetration is found the manifold must be generated.

For SAT manifold generation is quite straight forward. The entirety of the details are expressed very well by Dirk Gregorius’ great online tutorial from GDC 2013. The slides are openly available for free, and currently hosted here generously by Erin Catto. Readers are encouraged to familiarize themselves with the idea of reference and incident faces, as well as the idea of reference face side planes. For a 2D analogue I have provided my own slides on this topic.

There are only two different cases of collision that should be handled: edge to edge and something to face. Vertex to edge, vertex to face, and vertex to vertex resolutions are ill-defined. On top of this numerical imprecision makes detecting such cases difficult. It makes more sense in every way to skip these three cases and stick to the original two.

Something to Face

Something has hit the face of one of the OBBs. It can be anywhere from 1 to 4 vertices from the face of the opposing OBB. The entire intersecting volume between two OBBs does not need to be considered due to the nature of SAT and the axis of minimum penetration. Only two faces from each OBB need be considered for this case, and are by convention named the reference and incident faces, one from each OBB.

Finding the incident face given an axis of separation can be found in constant time with implicit geometry like an OBB; the problem reduces to that of finding a vector that most faces a given vector, or in other words the most negative dot product.

Once the incident face is identified it is clipped to the side planes of the reference face. There may be some tricks involving basis transformations here to simplify the clipping routine. In order to perform clipping the incident face vertices must be calculated. Given a face normal from an OBB all four face vertices can be computed with the normal and the half width extents of the OBB. In similar fashion the side planes of the reference face can also be directly computed.

Edge to Edge

The edge to edge case requires a clever constant-time computation of a supporting edge. A supporting edge is the edge most pointed to by a given axis of separation. The computation of this edge may be tricky to derive yourself, though there do exist online references. I suggest checking out Open Dynamics Engine (ODE) by Russell Smith, or the Cyclone Engine by Ian Millington for information on how to compute a supporting edge.

Once both supporting edges are calculated the closest two points between each edge can be calculated and used as a single contact point of which to resolve upon.

Numeric Stability

In all uses of OBB to OBB intersection cross products with nearly parallel vectors pose a problem. The best solution is to detect when any two axes from each OBB are parallel and skip all nine cross product axes. It can be visually shown that no false positives will come from skipping the cross product axes in this case.

A simpler solution, as proposed by Gottschalk ’00 in his paper “Collision Queries using Oriented Bounding Boxes”, which made its way into Ericson’s book “Real-Time Collision Detection”, is to bias all elements of \(C\) with a small epsilon to drive near-zero axes away from a degenerate case. For other axes the epsilon value should be tuned such that it doesn’t have much an impact upon results.

If using this collision routine to resolve collision during physical simulation certain tuning my be appropriate. If an iterative resolver is used it may be preferred to have slightly more consistent manifold generation, as opposed to exact manifold generation. For details I suggest researching works by Erin Catto and comments made by others around the Bullet Forums.

AABB to OBB

Hopefully by now readers realize that this whole time we’ve been simplifying the problem of OBB to OBB to the problem of AABB to OBB. When an actual AABB needs to be collided against an OBB simplifications can occur. Many simplifications revolve around the non-necessity to transform things into the basis of the AABB, since world coordinates should suffice.

One Axis Aligned (or Two?)

Some games have OBBs that only orient on some of their axes instead of all of them. One can immediately realize that no cross product checks need to be performed in any case, if two OBBs have orientation locked on the same axis. Other optimizations will occur as well, making some axis tests reduced to that of AABB overlap testing.

Conclusion

Collision detection is difficult. Writing robust collision detection code requires a good mathematical foundation as well as the ability to write efficient code. This slight mixture of fields of study may be difficult to learn all at once, but it all gets easier with time and practice and the rewards can be very fulfilling.

I’d like to thank my friend Nathan Carlson for teaching much of this information to me, and for his insightful discussions.

Here is a finished implementation: link to github file.

TwitterRedditFacebookShare

12 thoughts on “Deriving OBB to OBB Intersection and Manifold Generation

  1. Evan

    I do not understand all the rotation matrices and my libraries do it for me, so how would I use this for just a set of 8 points that have already been rotated?

    Reply
    1. Randy Gaul Post author

      As far as I know you’d have to use an algorithm that works with only an array of points. GJK would work, but could not give you enough information to resolve collision (it only tells you if there was a collision at all). Maybe you can find a GJK implementation to use. Otherwise you definitely need to learn 3D math!

      Reply
  2. Evan

    Yes thats exactly my problem, I have working gjk code already but I dont understand EPA and how to implement it, which is why Im here trying to learn SAT. I understand some of what you say but I have not taken things like linear algebra yet.

    Can this class work in SAT?
    public OBB(float x, float y, float z, float angleX, float angleY,
    float angleZ, float sizeX, float sizeY, float sizeZ) {
    this.setX(x);
    this.setY(y);
    this.setZ(z);

    this.angleX = angleX;
    this.angleY = angleY;
    this.angleZ = angleZ;

    this.sizeX = sizeX;
    this.sizeY = sizeY;
    this.sizeZ = sizeZ;

    this.minX = new Vector3f(this.x – this.sizeX, y, z);
    this.minY = new Vector3f(x, this.y – this.sizeY, z);
    this.minZ = new Vector3f(x, y, this.z – this.sizeZ);

    this.maxX = new Vector3f(this.x + this.sizeX, y, z);
    this.maxY = new Vector3f(x, this.y + this.sizeY, z);
    this.maxZ = new Vector3f(x, y, this.z + this.sizeZ);

    calcRotate();//These are just functions that rotate the cube
    calcBRotate();//
    }

    Reply
    1. Randy Gaul Post author

      Something like that would probably work. Unless you find some code to use that someone else wrote you really won’t be able to write much yourself without understanding the math. That’s why I stressed this point so much in the beginning of the post. You can try referring to this for help (collision detection + resolution of OBBs): https://github.com/RandyGaul/qu3e . Hope this helps!

      Reply
  3. Akshay

    I am trying to write a SAT test for OBB vs OBB.

    I do not understand how

    s=|t⋅l|–(|a⋅l|+|(C∗b)⋅l|)

    translates to

    float s = dot( t, l ) – (e.x + dot( C.Column0( ), l ));

    Its not clear what a and b are. They seem to be vectors pointing to corners but which corners? Also unless I’m missing something, you are not using the extents of b at all in the computation which is strange.

    I think it would help if in the illustration, the A OBB was aligned to the axis being checked (in this case ‘x’) instead of the arbitrary axis ‘l’ .

    Reply
    1. Randy Gaul Post author

      a and b are the appropriate vectors to project along an axis. In general, they can be thought of as a vector from OBB center, pointing to a vertex that is closest to the other OBB along a given axis. The picture is just an illustration of one project along one axis. When we do dot( t, l ) – (e.x + dot( C.Column0( ), l )), l is our axis, e.x would be a, and dot( C.Column0( ), l ) would be b.

      Reply
      1. Akshay

        But how do we find a and b?

        Also if I’m not wrong, when you take e.x as a, that means l is the x-axis in the frame of a. Only then will e.x will be a.  Is this right?

        My second confusion is that  dot( C.Column0( ), l ) does not take into account the extents of b which seems counter-intuitive. C is just a multiplication of rotation matrices.

         

        Reply
        1. Randy Gaul Post author

          a and b are derived offline on paper for each axis. Then the code is written for each axis. And yes, e.x means that calculations are done in the frame of A, while the origin is the center of A. Your second comment is right, that needs to be scaled by the extent of B! I seemed to have forgotten that term. I’ll update the post.

          Reply
  4. Akshay

    I noticed you updated the equation.

    float s = dot( t, l ) – (A.e.x + dot( C.Column0( ), B.e ));

    When taking the dot product how do you make sure that B.e points in the right direction?

    Ideally B.e should be the b vector in the illustration but in reality B.e could be pointing to an adjacent vertex. In this case the dot product would be incorrect?

    Reply
    1. Randy Gaul Post author

      We are in A’s frame. To project B onto an A’s x axis we want B’s orientation in A’s frame, called C. C’s rows represent B’s orientation axes in the frame of A. This means that the first column of C are the x components of all three of B’s orientation axes. If we scale this first column by B.e, then we have all three of B’s orientation’s x components scaled by B.e. If we sum these three scaled x components, we are left with a signed projection. The sign determines if we are pointing towards or away from the origin, and the sign can be different for each x component.

      To simplify things we instead use absC = Abs( C ). To project B onto A’s x axis we now do: Dot( B.e, absC.Column0( ) ). Since the result is always positive it can always represent our rB vector (in the diagram) that is pointing in the right direction. So your question is a good one! I intentionally left out the Abs trick as an exercise to the reader, but since you asked such a good question I’ll just add it in.

      Note: Usually it’s best to write code in *row major* format, since in memory you darn better have row major format, as all accelerated hardware expects this. This means the basis vectors of a rotation matrix are the *rows*, in memory, of a rotation matrix. So when we do absC.Column*( ), we are taking all of the x, y or z components of all three basis vectors depending on what * is (x, y, or z).

      Reply
  5. Michael

    Thanks a lot for these great explanations. I have a question concerning the non-penetration case, which is often of interest when dealing with collision avoidance in control.

    Do you know some efficient way to make use of the Separating Axes in finding out the closest points of disjoint OBBs?

    Reply
    1. Randy Gaul Post author

      Yeah I’ve implemented SAT to find closest points between two OBBs. However I do recommend just using GJK for this operation. If you cannot implement GJK or do not have access to GJK source code, implementing a SAT style routine may be a good option. The general idea is to compute the axis of minimum penetration (like in the colliding case), and then compute closest points between these two axes. The OBB features can be two faces, or two edges, that define the axis of minimum penetration. The SAT implementation will probably be slower than a good GJK implementation.

      Reply

Leave a Reply

Your email address will not be published. Required fields are marked *