Here’s the link to Ericson’s blog post on tolerances: Tolerances Revisited. Please note there’s a little bit of ambiguity about whether the test should care if the vectors point in the same direction or not. In general it doesn’t really matter since the point of the slides is numeric robustness.

The gist of the slides is that the scale of the vectors in question matters for certain applications. Computing a good relative epsilon seems difficult, and maybe different epsilon calculations would be good for different applications. I’m not sure!

Here’s a demo you can try out yourself to test two of the solutions from the slides (tests returning true should be considered false positives):

```cpp
#include <cstdio>
#include <cmath>

struct Vec3
{
    float x;
    float y;
    float z;

    Vec3 operator*( float a )
    {
        Vec3 v;
        v.x = x * a;
        v.y = y * a;
        v.z = z * a;
        return v;
    }
};

float Dot( Vec3 a, Vec3 b )
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Vec3 Normalize( Vec3 a )
{
    return a * (1.0f / std::sqrt( Dot( a, a ) ));
}

bool Parallel0( Vec3 a, Vec3 b, float tol )
{
    a = Normalize( a );
    b = Normalize( b );
    float dot = 1.0f - std::abs( Dot( a, b ) );
    return dot < tol;
}

bool Parallel1( Vec3 a, Vec3 b, float tol )
{
    if ( Dot( a, b ) < 0 )
    {
        b.x = -b.x;
        b.y = -b.y;
        b.z = -b.z;
    }

    float k = a.y / b.y;
    b = b * k;

    float x = std::abs( a.x - b.x );
    float y = std::abs( a.y - b.y );
    float z = std::abs( a.z - b.z );

    if ( x < tol && y < tol && z < tol )
        return true;

    return false;
}

void Tolerance( float tol )
{
    Vec3 u;
    u.x = 0.0f;
    u.y = 0.9812398512f;
    u.z = 0.001f;
    u = Normalize( u );

    Vec3 v;
    v.x = 0.0f;
    v.y = 10.0f;
    v.z = 1.0f;

    printf( "Tolerance: %f\n", tol );
    bool result = Parallel0( u, v, tol );
    printf( "Parallel0: %s\n", result ? "true" : "false" );
    result = Parallel1( u, v, tol );
    printf( "Parallel1: %s\n", result ? "true" : "false" );
    printf( "\n" );
}

int main( )
{
    Tolerance( 1.0e-1f );
    Tolerance( 1.0e-2f );
    Tolerance( 1.0e-3f );
}
```

Output of the program is:

```
Tolerance: 0.100000
Parallel0: true
Parallel1: true

Tolerance: 0.010000
Parallel0: true
Parallel1: false

Tolerance: 0.001000
Parallel0: false
Parallel1: false
```


The idea is to take an octahedron, or icosahedron, and subdivide the triangles on the surface of the mesh such that each triangle creates 4 new triangles (each triangle becomes a Triforce symbol).

One thing the Stack Overflow page didn’t describe is normalization between each subdivision. If you imagine subdividing an octahedron over and over without any normalization, the final resulting sphere will have triangles that vary in size quite a bit. However, if every single vertex is normalized after each subdivision, the vertices snap back onto the unit sphere at each step. This results in a final mesh whose triangles are closer to uniform geodesic area.

The final mesh isn’t purely geodesic and there will be variation in the size of the triangles, but it will be hardly noticeable. Sphere meshes will look super nice and also behave well when simulating soft bodies with Matyka’s pressure volume.

Here’s an example program you can use to perform some subdivisions upon an octahedron (click here to view the program’s output):

```cpp
#include <cstdio>
#include <cmath>
#include <cassert>

struct Vec3
{
    Vec3( ) { }
    Vec3( float x0, float y0, float z0 )
    {
        x = x0;
        y = y0;
        z = z0;
    }

    float x;
    float y;
    float z;

    const Vec3 operator+( const Vec3& a )
    {
        Vec3 v;
        v.x = x + a.x;
        v.y = y + a.y;
        v.z = z + a.z;
        return v;
    }

    const Vec3 operator-( const Vec3& a )
    {
        Vec3 v;
        v.x = x - a.x;
        v.y = y - a.y;
        v.z = z - a.z;
        return v;
    }

    const Vec3 operator*( float a ) const
    {
        Vec3 v;
        v.x = x * a;
        v.y = y * a;
        v.z = z * a;
        return v;
    }
};

float Dot( Vec3 a, Vec3 b )
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

inline const Vec3 Cross( const Vec3& a, const Vec3& b )
{
    return Vec3(
        (a.y * b.z) - (b.y * a.z),
        (b.x * a.z) - (a.x * b.z),
        (a.x * b.y) - (b.x * a.y)
    );
}

const Vec3 Normalize( Vec3 v )
{
    return v * (1.0f / std::sqrt( Dot( v, v ) ));
}

const int stackSize = 1024 * 8;
Vec3 in[ stackSize ];
int ip = 0;
Vec3 out[ stackSize ];
int op = 0;
Vec3 octahedron[ 6 ];

void Subdivide( void )
{
    op = 0;

    for ( int i = 0; i < ip; i += 3 )
    {
        Vec3 a = in[ i ];
        Vec3 b = in[ i + 1 ];
        Vec3 c = in[ i + 2 ];

        Vec3 ab = (a + b) * 0.5f;
        Vec3 bc = (b + c) * 0.5f;
        Vec3 ca = (c + a) * 0.5f;

        out[ op++ ] = b;
        out[ op++ ] = bc;
        out[ op++ ] = ab;

        out[ op++ ] = c;
        out[ op++ ] = ca;
        out[ op++ ] = bc;

        out[ op++ ] = a;
        out[ op++ ] = ab;
        out[ op++ ] = ca;

        out[ op++ ] = ab;
        out[ op++ ] = bc;
        out[ op++ ] = ca;

        assert( op <= stackSize );
    }

    // Snap every vertex back onto the unit sphere after each subdivision
    for ( int i = 0; i < op; ++i )
        in[ i ] = Normalize( out[ i ] );

    ip = op;
}

int main( )
{
    octahedron[ 0 ] = Vec3( 1.0f, 0.0f, 0.0f );
    octahedron[ 1 ] = Vec3( 0.0f,-1.0f, 0.0f );
    octahedron[ 2 ] = Vec3(-1.0f, 0.0f, 0.0f );
    octahedron[ 3 ] = Vec3( 0.0f, 1.0f, 0.0f );
    octahedron[ 4 ] = Vec3( 0.0f, 0.0f, 1.0f );
    octahedron[ 5 ] = Vec3( 0.0f, 0.0f,-1.0f );

    in[ ip++ ] = octahedron[ 2 - 1 ];
    in[ ip++ ] = octahedron[ 1 - 1 ];
    in[ ip++ ] = octahedron[ 5 - 1 ];

    in[ ip++ ] = octahedron[ 3 - 1 ];
    in[ ip++ ] = octahedron[ 2 - 1 ];
    in[ ip++ ] = octahedron[ 5 - 1 ];

    in[ ip++ ] = octahedron[ 4 - 1 ];
    in[ ip++ ] = octahedron[ 3 - 1 ];
    in[ ip++ ] = octahedron[ 5 - 1 ];

    in[ ip++ ] = octahedron[ 1 - 1 ];
    in[ ip++ ] = octahedron[ 4 - 1 ];
    in[ ip++ ] = octahedron[ 5 - 1 ];

    in[ ip++ ] = octahedron[ 1 - 1 ];
    in[ ip++ ] = octahedron[ 2 - 1 ];
    in[ ip++ ] = octahedron[ 6 - 1 ];

    in[ ip++ ] = octahedron[ 2 - 1 ];
    in[ ip++ ] = octahedron[ 3 - 1 ];
    in[ ip++ ] = octahedron[ 6 - 1 ];

    in[ ip++ ] = octahedron[ 3 - 1 ];
    in[ ip++ ] = octahedron[ 4 - 1 ];
    in[ ip++ ] = octahedron[ 6 - 1 ];

    in[ ip++ ] = octahedron[ 4 - 1 ];
    in[ ip++ ] = octahedron[ 1 - 1 ];
    in[ ip++ ] = octahedron[ 6 - 1 ];

    Subdivide( );
    Subdivide( );

    FILE* fp = fopen( "out.txt", "w" );

    for ( int i = 0; i < ip; i += 3 )
    {
        Vec3 a = in[ i ];
        Vec3 b = in[ i + 1 ];
        Vec3 c = in[ i + 2 ];
        fprintf( fp, "%7.4f, %7.4f, %7.4f,\n%7.4f, %7.4f, %7.4f,\n%7.4f, %7.4f, %7.4f,\n\n",
            a.x, a.y, a.z, b.x, b.y, b.z, c.x, c.y, c.z );
    }

    fprintf( fp, "%d\n", op / 3 );

    for ( int i = 0; i < ip; i += 3 )
    {
        Vec3 a = in[ i ];
        Vec3 b = in[ i + 1 ];
        Vec3 c = in[ i + 2 ];
        Vec3 n = Normalize( Cross( b - a, c - a ) );
        fprintf( fp, "%7.4f, %7.4f, %7.4f,\n", n.x, n.y, n.z );
    }

    fclose( fp );
}
```

Given the inertia tensor of a cylinder and sphere the inertia tensor of a capsule can be calculated with the help of the parallel axis theorem. The parallel axis theorem can shift the origin that an inertia tensor is defined relative to, given just a translation vector. We care about the tensor form of the parallel axis theorem since it is probably easiest to understand from a computer science perspective:
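In tensor form, the theorem reads:

\begin{equation}
J = I + m\left[(R \cdot R)E_3 - R \otimes R\right]
\end{equation}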

J is the final transformed inertia. I is the initial inertia tensor. m is the mass of the object in question. R is the translation vector to shift with. E3 is the standard basis, or the identity matrix. The cross symbol is the outer product (see next paragraph).

Assuming readers are familiar with the dot product and outer product, computing the change in an inertia tensor isn’t too difficult.

The center of mass of a hemisphere lies 3/8 * radius above the center of its spherical base. Knowing this and understanding the parallel axis theorem, the inertia tensor of one of the hemispheres can be easily calculated. My own derivation led me to conclude that the inertia tensor of a capsule is:

```
float r = radius;
float r2 = r * r;
float l = length; // length is the height of the cylinder
float l2 = l * l;
float mass = 1/2 * volume_sphere * density

// Parallel axis sphere to its own COM, then translate to the
// end of the capsule
float x = mass * ( 3/8 r + l/2 )^2 - mass * ( 3/8 r )^2

// The above can be simplified to
float x = mass * [(3 r + 2 l) / 8] * l

float z = x
float y = 2/5 * mass * r2;

// Final inertia tensor I for a single shifted hemisphere
[ x, 0, 0 ]
[ 0, y, 0 ]
[ 0, 0, z ]

// We can add I with the tensor J, where J is the inertia of a cylinder
final = J + 2 * I
```

Please note the final inertia tensor assumes the capsule is oriented such that the cylinder is aligned along the y axis. A rotation matrix R can be used to transform the final inertia tensor I into world space like so: R * I * R^T. To learn why this form is used read this post.

Special thanks to Dirk Gregorius for emailing me with an error in the original draft of this post! He kindly provided the public with a nice document showing his derivation of the inertia tensor of a capsule.

To follow suit and identify the source of my error, I ended up writing my own derivation down in PDF format.

In the past I frequented a website called StarEdit.net in order to learn how to make scenario files for the game Starcraft: Brood War (SCBW). The scenario files could be generated with an editor, and had a simple trigger system. A lot of fun can be had making small games within these scenario files, and the games can be played over Blizzard’s Battle.net servers.

Blizzard’s stock editor was very limiting so some fan-made solutions popped up over the years. Creating triggers in these editors involved manually navigating a point-and-click GUI. The GUI was super cumbersome to use, and ended up being unruly for large numbers of triggers.

In an effort to revisit some fond memories of creating SCBW scenarios I decided to create a tool for generating triggers en masse (windows only). This post explains the details of how to use LIT.

Lua is a pretty good choice for creating a tool to generate SCBW triggers. Since all that is needed from my tool is to generate some text, creating a super small and simple Lua API doesn’t take much development time, and it can be very useful when making scenario files.

LIT is comprised of a small executable that embeds the Lua language, and a few Lua files that represent the LIT API.

I’ve written a short post here on StarEdit.net about how to setup and run LIT from the command line.

Since a SCBW trigger involves conditions and actions, using LIT involves laying out conditions and actions. There are two main resources for learning how to use LIT and they both come in the downloaded ZIP file. Inside of *ActionsExample.lua* and *ConditionsExample.lua* I’ve written out a short demonstration for how to use each unique condition and action in SCBW.

In order to learn how to use any particular condition or action, just consult these two example files.

However, since LIT is written with Lua the entire Lua language can be used to generate SCBW triggers! Since many people interested in using LIT will probably not know how to write Lua code, I recommend reading a simple tutorial on Lua before getting started. Here’s a decent looking one.

Let’s take a look at writing a Binary Countoff using LIT:

```lua
-- Create a player group for creating triggers
p = PlayerGroup( 1, 2, 3 )

-- Create a death counter
d = Deaths( )
d:SetUnit( "Terran Marine" )
d:SetPlayer( 8 )

-- Create another death counter
d2 = Deaths( )
d2:SetUnit( "Terran Marine" )
d2:SetPlayer( 7 )

-- Generate the binary countoff triggers with a loop
i = 1
exponent = 1
while i < 12 do
    d:SetCount( exponent )
    d2:SetCount( exponent )

    p:Conditions( )
        d:AtLeast( )
    p:Actions( )
        d:Subtract( )
        d2:Add( )

    i = i + 1
    exponent = exponent * 2
end
```

The triggers generated by running this Lua file with LIT are: click to view.

As you can see there’s a little bit of state stored in a couple of *Deaths* objects. That state is a number, a player and a unit. Using this state, text is output in a straightforward manner.

My favorite part about LIT is the *include* function! When writing out a LIT file, another entire LIT file can be included into it. Look at this example:

```lua
p = PlayerGroup( 1, 2, 3 )

d = Deaths( )
d:SetUnit( "Terran Marine" )
d:SetPlayer( 8 )

d2 = Deaths( )
d2:SetUnit( "Terran Marine" )
d2:SetPlayer( 7 )

-- Copy paste ALL of the AnotherFile.lua right here
include( "AnotherFile.lua" )
```

As the comment says, the *include* function will copy paste an entire Lua file straight into the spot where the *include* function is. A file included into another file can also include files. Files can be included through different folders, and the *include* function supports relative paths.

Let’s assume that *AnotherFile.lua* holds the rest of the code from the first Binary Countoff example. If this is the case, then the output triggers will be exactly the same!

Say we have a file *A* and a file *B*. If *A* includes *B* and *B* includes *A*, then LIT will crash. Similarly, if any chain of files forms a circular inclusion, LIT will crash.

This lets users of LIT organize their Lua files in any manner they wish. One of the biggest drawbacks of using a GUI editor to create triggers is that once a couple thousand triggers exist in a single scenario file it becomes nearly impossible to navigate them efficiently. Using your operating system’s folders and file structure, organizing LIT files is simple and powerful.

Here’s a link to download a demonstration map. The map contains a concept of screen-sized rooms that the player can move around with. It’s implemented with a few burrowed units on each room. The download link comes with the map file and the LIT files used to generate the map’s triggers. The most interesting file is RoomLogic.lua.

For example, certain hardware doesn’t have virtual memory support, or the virtual memory support can be quite lacking. A lack of virtual memory means raw allocations from the OS return real addresses to the hardware RAM. Usually virtual memory can alleviate some effects of memory fragmentation through a level of indirection, but when dealing with physical memory yourself no such alleviation exists.

This is just one example of how a software memory manager can be written and used to control memory fragmentation in a way that makes sense for the application.

There are a few main types of allocators that I myself have found pretty useful: paging, stack and heap based allocations. Each one makes specific assumptions about the types of allocations and how the memory ought to be used. Due to these assumptions significant performance boosts can be reaped in ways that may not have been realistic with raw operating system allocations.

My favorite type of allocation involves the use of a simple stack. The idea is to make one large call to *malloc* or *new* and hold this piece of memory. The *Stack* itself just holds a pointer to this large chunk of memory, and an integer index into the stack, measured in bytes.

Here is what a *Stack* implementation might look like (in pseudo code):

```cpp
class Stack
{
public:
    Stack( )
    {
        m_memory = (byte*)malloc( STACK_SIZE );
        m_index = 0;
    }

    ~Stack( )
    {
        assert( m_index == 0 );
        free( m_memory );
    }

    void* Allocate( int size );
    void Free( void* data, int size );

private:
    byte* m_memory;
    int m_index;
};
```

Allocation works by moving *m_index* forward in the stack; deallocation moves it backwards. Notice that the *Free* function requires the user to pass back in the size of the allocation! This can be avoided by storing the *size* parameter from *Allocate* inside the *m_memory* array itself, just before the location of the returned address. Upon deallocation this value can be retrieved by moving the *data* parameter of *Free* back in memory by 4 bytes.

The advantage of the stack allocator is that it’s extremely fast and simple to implement. The limitation is that deallocations must be performed in the reverse order of allocations, since the stack itself is in LIFO order. This makes the use cases for the stack allocator pretty limited. Usually resources, like images, level files, sounds, models, etc. can be loaded into memory with a stack based allocator. Anything that has a very clear and non-variable lifespan should be able to be allocated on a stack.

One last trick is that the last allocation can be trivially resized! Often an algorithm will require a lot of temporary scratch memory to perform some calculations, or store some state. An initial guess for how much memory is needed can be the worst-case scenario. Once the algorithm finishes, this scratch memory can be shrunk to the size actually used, provided it is the last allocation on the stack. Resizing the last stack allocation just means moving the index backwards in memory.

Implementing your own heaps is pretty similar to the stack based allocator. A heap allocator will use the operating system to allocate a large chunk of memory. Subsequent calls to the heap’s *Allocate* and *Free* methods will just dip into this chunk and fetch a piece.

The heap is more versatile and general purpose than a stack allocator. The heap can be implemented with a linked list of nodes. Each node represents a piece of memory. A node can either be allocated or free. To keep track of these linked list pointers, allocation state, and size of the memory block some memory itself is required! This stuff can be stored in a separate array, or right inside the large raw chunk of memory (just like with the stack allocator).

Usually it is preferable to add a small header to each allocation to store this information. A heap node might look something like this:

```cpp
struct HeapHeader
{
    HeapHeader* next;
    HeapHeader* prev;
    int size;
    bool allocated;
};
```

When the heap is first constructed it will contain a linked list of *HeapHeader* structs, but only a single header will be present, and it holds the entire piece of raw memory originally allocated by the OS upon the *Heap* allocator’s construction.

Allocating from the heap involves splitting a free *HeapHeader* into an allocated piece and a new *HeapHeader* for the leftover space. The details of this lie mostly in the linked list implementation, and are not the focus of this article.

In order to reduce memory fragmentation it is a good idea to merge adjacent free *HeapHeader* links into a single link. This ought to be handled in the *Heap::Free* function; again, the details of merging free links lie mostly in the linked list implementation.

Here’s an example of what the *Heap* may look like in implementation:

```cpp
class Heap
{
public:
    void* Allocate( int size )
    {
        // Search the linked list for a free link that can fit size
        header->allocated = true;
        // Split header into two headers
        // Mark the new header as not allocated
        return data + sizeof( HeapHeader );
    }

    void Free( void* data )
    {
        HeapHeader* header = (HeapHeader*)data - 1;
        header->allocated = false;

        if ( header->next is free )
        {
            // Merge header into header->next
            merged = true;
        }

        if ( header->prev is free )
        {
            // Merge header into header->prev
            merged = true;
        }
    }

private:
    HeapHeader* m_memory;
};
```

When *Heap::Allocate* is called a free link of appropriate size must be searched for. This has a time complexity of O( N ), and a lot of memory must be fetched into the cache as the list itself is traversed. There are tricks to improve the allocation performance of heaps; a simple one is to cache a single pointer to a free block in the heap itself. This pointer can be cached in *Heap::Free*, *Heap::Allocate*, or both. When a new call to *Heap::Allocate* is made this cached pointer can be tested first to see if it is an appropriate size.

There are two common ways to search through the links for an allocation: first fit and best fit. First fit will return the user with the first piece of memory large enough to hold the allocation. Best fit will return a chunk of memory that came from a *HeapHeader* with the smallest size that is still large enough to hold the requested allocation size.

First fit can be preferable for cache coherency, as it may prefer to allocate from the beginning of the heap and try to keep things closer together in memory. Best fit may be preferable for keeping the heap as un-fragmented as possible.

The heap based allocator intends to fight memory fragmentation through fitting links to allocation sizes, and by merging adjacent free memory blocks. This type of fragmentation is called *external fragmentation*. Another type of memory fragmentation is called *internal fragmentation*.

Internal memory fragmentation occurs when an allocated piece of memory given to the user actually holds more memory than the user requested. The user is assumed not to know about this extra memory. This can provide an advantage to the allocator: all allocations can be of a fixed size, and any allocation larger than this fixed size is denied.

This lets the allocator act like an array. When an allocation is requested an empty element can be returned to the user. Upon freeing a piece of memory, the element is simply marked as free and placed into a free list.

The free list is a linked list of array elements. The memory in the free elements themselves should be used to store the pointer of each subsequent free element.

Allocation and deallocation become constant in time complexity and there is zero external memory fragmentation. In this way external memory fragmentation is traded for internal memory fragmentation.

The term “pages” comes into play when the array is filled up. Once an array is full of allocated elements another array can be allocated. Once this array is filled up, another one is allocated. Each array (aka page) can be stored in a singly linked list of pages.

The free list itself can point across multiple pages without any problems.

A page containing only free elements can be deleted entirely, though this feature might not need to be supported.

A paged allocator can also hold an array of singly linked lists of pages. Each element of this array can hold a list of pages that corresponds to a different element size. This can allow the paged allocator to fit different allocation requests into the most appropriate page list. A common tactic is to have pages that represent arrays with an element size of 2^N bytes, where N is usually at least 2, and smaller than some value K.

The biggest advantage of a paged allocator is zero external fragmentation. The tradeoff is internal fragmentation, which will probably lower your cache line utilization: how much of each cache line fetched from main memory into the CPU cache is actually used. Since internal fragmentation is a feature of a paged allocator, cache line utilization will probably suffer.

The unused memory in the pages can be reduced drastically on a per-application basis; if the users of the allocator are able to specify the element sizes of different page lists, then zero internal fragmentation can be achieved.

Instead of thinking of a paged allocator in terms of separate arrays, one might think of a simpler allocator that holds just a single array. If the elements within this array are of POD nature, the array elements can be referenced by index. This lets the array grow or shrink in size as necessary, since a resized array can still be accessed by an old index.

Whenever the user wants a pointer to an element they first give the array an index, and a pointer is returned. This pointer is never stored anywhere! Continuous translation from index to pointer occurs; this allows the internal array itself to be moved around in memory as necessary.

Users might need a little more power to refer to elements than a simple integer. Some type of handle might be needed to translate from index to pointer. Read more about handles here.

Given these three types of allocators an application should have all the variety of memory allocation necessary to run with pretty good performance. More advanced allocation techniques definitely exist, and some are just combinations of the three basic allocators presented in this article.

Each allocator can be quite simple in isolation! I myself implemented a stack in about 100 lines, a paged allocator in 150, and a heap in about 250 lines of C++ code.

Further reading might include topics such as: cache coherency, memory alignment, garbage collection, virtual memory, page files (operating system pages).

The dot product comes from the law of cosines. Here’s the formula:

\begin{equation}

c^2 = a^2 + b^2 - 2ab\cos\gamma

\label{eq1}

\end{equation}

This is just an equation that relates the cosine of an angle within a triangle to its various side lengths *a*, *b* and *c*. The Wikipedia page (link above) does a nice job of explaining this. Equation \eqref{eq1} can be rewritten as:

\begin{equation}

c^2 - a^2 - b^2 = -2ab\cos\gamma

\label{eq2}

\end{equation}

The right hand side of equation \eqref{eq2} is interesting! Let’s say that instead of writing the equation with side lengths *a*, *b* and *c*, it is written with two vectors: *u* and *v*. The third side can be represented as *u - v*. Rewriting equation \eqref{eq2} in vector notation yields:

\begin{equation}

|u - v|^2 - |u|^2 - |v|^2 = -2|u||v|\cos\gamma

\label{eq3}

\end{equation}

Which can be expressed in scalar form as:

\begin{equation}

(u_x - v_x)^2 + (u_y - v_y)^2 + (u_z - v_z)^2 \;- \\

(u_{x}^2 + u_{y}^2 + u_{z}^2) - (v_{x}^2 + v_{y}^2 + v_{z}^2) \;= \\

-2|u||v|\cos\gamma

\label{eq4}

\end{equation}

By crossing out redundant terms and dividing both sides by -2, this ugly equation can be turned into a much more approachable version:

\begin{equation}

u_x v_x + u_y v_y + u_z v_z = |u||v|\cos\gamma

\label{eq5}

\end{equation}

Equation \eqref{eq5} is the equation for the dot product. If both *u* and *v* are unit vectors then the equation will simplify to:

\begin{equation}

dot(\;\hat{u},\;\hat{v}\;) = \cos\gamma

\label{eq6}

\end{equation}

If *u* and *v* are **not** unit vectors equation \eqref{eq5} says that the dot product between both vectors is equal to *cos( γ )* that has been scaled by the lengths of *u* and *v*. This is a nice thing to know! For example: the squared length of a vector is just itself dotted with itself.

If *u* is a unit vector and *v* is not, then *dot( u, v )* will return the distance in which *v* travels in the *u* direction. This is useful for understanding the plane equation in three dimensions (or any other dimension):

\begin{equation}

ax\;+\;by\;+\;cz\;-\;d\;=\;0

\end{equation}

The normal of a plane would be the vector: { *a, b, c* }. If this normal is a unit vector, then *d* represents the distance to the plane from the origin. If the normal is **not** a unit vector then *d* is scaled by the length of the normal.

To compute the distance of a point to this plane any point can be substituted into the plane equation, assuming the normal of the plane equation is of unit length. This operation is computing the distance along the normal a given point travels. The subtraction by *d* can be viewed as “translating the plane to the origin” in order to convert the distance along the normal, to a distance to the plane.

The simplest way to compute the distance to a plane (or a line in 2D) is to place a point into the plane equation. In 2D this means computing the line’s plane equation if all we have are two points representing the line; in 3D we need a different tactic altogether, since planes in 3D are not lines.

In my own experience I’ve found it most common to have a line in the form of two points in order to represent the parametric equation of a line. Two points can come from a triangle, a mesh edge, or two pieces of world geometry.

To set up the problem, let’s outline the function to be created:

```cpp
float DistancePtLine( Vec2 a, Vec2 b, Vec2 p )
{
}
```

The two parameters *a* and *b* are used to define the line segment itself. The direction of the line would be the vector *b – a*.

After a brief visit to the Wikipedia page for this exact problem I quickly wrote down my own derivation of the formula they have on their page. Take a look at this image I drew:

The problem of finding the distance of a point to a line makes use of finding the vector *d* that points from *p* to the closest point on the line *ab*. From the above picture, a simple way to calculate this vector is to subtract away the portion of *a - p* that travels along the vector *ab*.

The part of *a – p* that travels along *ab* can be calculated by projecting *a – p* onto *ab*. This projection is described in the previous section about the dot product intuition.

Given the vector *d*, the distance from *p* to *ab* is just *sqrt( dot( d, d ) )*. The *sqrt* operation can be omitted entirely to compute a squared distance. Our function may now look like:

```cpp
float DistancePtLine( Vec2 a, Vec2 b, Vec2 p )
{
    Vec2 n = b - a;
    Vec2 pa = a - p;
    Vec2 c = n * (Dot( pa, n ) / Dot( n, n ));
    Vec2 d = pa - c;
    return sqrt( Dot( d, d ) );
}
```

This function is quite nice because it will never return a negative number. There is a popular version of this function that performs a division operation. Given a very small line segment as input for *ab* it is entirely possible to have the following function return a negative number:

```cpp
float SqDistancePtLine( Vec2 a, Vec2 b, Vec2 p )
{
    Vec2 ab = b - a, ap = p - a, bp = p - b;
    float e = Dot( ap, ab );
    return Dot( ap, ap ) - e * e / Dot( ab, ab );
}
```

It’s very misleading to have a function called “square distance” or “distance” to return a negative number. Passing in the result of this function to a *sqrt* function call can result in *NaN*s and be really nasty to deal with.

A full discussion of barycentric coordinates is way out of scope here. However, they can be used to compute distance from a point to line *segment*. The segment portion of the code just clamps a point projected into the line within the bounds of *a* and *b*.

Assuming readers are a little more comfortable with the dot product than I was when I first started programming, the following function should make sense:

```cpp
float SqDistancePtSegment( Vec2 a, Vec2 b, Vec2 p )
{
    Vec2 n = b - a;
    Vec2 pa = a - p;

    float c = Dot( n, pa );

    // Closest point is a
    if ( c > 0.0f )
        return Dot( pa, pa );

    Vec2 bp = p - b;

    // Closest point is b
    if ( Dot( n, bp ) > 0.0f )
        return Dot( bp, bp );

    // Closest point is between a and b
    Vec2 e = pa - n * (c / Dot( n, n ));
    return Dot( e, e );
}
```

This function can be adapted pretty easily to compute the closest point on the line segment to *p* instead of returning a scalar. The idea is to use the vector from *p* to the closest position on *ab* to project *p* onto the segment *ab*.

The above function works by computing barycentric coordinates of *p* relative to *ab*. The coordinates are scaled by the length of *ab* so the second if statement must be adapted slightly. If the direction *ab* were normalized then the second if statement would be a comparison with the value 1, which should make sense for barycentric coordinates.

Here’s a sample program you can try out yourself:

```cpp
#include <cstdio>

struct Vec2
{
    float x;
    float y;

    const Vec2 operator-( const Vec2& a )
    {
        Vec2 v;
        v.x = x - a.x;
        v.y = y - a.y;
        return v;
    }

    const Vec2 operator*( float a ) const
    {
        Vec2 v;
        v.x = x * a;
        v.y = y * a;
        return v;
    }
};

float Dot( const Vec2& a, const Vec2& b )
{
    return a.x * b.x + a.y * b.y;
}

int main( )
{
    Vec2 a, b, p;
    a.x = 1.0f; a.y = 1.0f;
    b.x = 5.0f; b.y = 2.0f;
    p.x = 3.0f; p.y = 3.0f;

    Vec2 n = b - a;
    Vec2 pa = a - p;
    Vec2 c = n * (Dot( n, pa ) / Dot( n, n ));
    Vec2 d = pa - c;
    float d2 = Dot( d, d );
    printf( "Distance squared: %f\n", d2 );
}
```

The output is: “Distance squared: 2.117647”.



- Pointers are dangerous
- Pointers are ambiguous and confusing
- NULL pointers lead to undefined behavior and crashes

Just google “pointers and references” and you’ll see bad advice everywhere. A new programmer seeing these bullet points is likely to get hyped about using references everywhere. Seeing advice like this upsets some part of me, perhaps because when the above statements are plastered onto websites they are stated as fact.

In an effort not to make the same annoying mistake as every other article on the internet, I'll present my opinion *as an opinion*. By stating something as an opinion, the reader will immediately begin to read with a certain amount of skepticism. This might coax newer readers into thinking for themselves, which ought to be the goal of writing an educational article in the first place. Writing step-by-step instructions on how not to use "dangerous pointers" is the worst way to write on the topic of pointers and references.

I know I sound pretty bitter. I recall a time when I browsed the internet and looked for advice on this exact topic. It takes time to unlearn bad things, and so this post was born.

Where things are in memory is a big deal. Memory access is commonly a bottleneck in real-time applications, and code with ambiguous memory access patterns upsets me. Imagine peering into a large function that is operating on some kind of data. Scanning the middle of the function, a few lines of code are encountered:

...
Matrix a = b * c;

// Vector d is assigned to a's second column
d = a.Column( 2 );
...

Just by reading this code it isn't immediately clear what the variable *d* is. What is happening here? In order to know the scope of *d*, some scrolling or manual code navigation will ensue. How long will this take? How much does it cut down on focus while the reader is just trying to understand the code?

Oftentimes member variables of an internal class or struct are given the *m_* prefix. This is nice, as readers immediately know the scope and implications of every use of a member variable: there's an implicit *this* pointer being accessed somehow, and the variable's scope is relative to the class's definition.

In this case there's no such nice prefix. *d* can be either a reference or a value type. There's no way to know without some kind of intellisense. Mousing over a variable to see its type, given a nice IDE, is not really that big of a deal. The big deal is having to mouse over something at all just to get an idea of what sort of memory it represents. Just take a look at this code:

...
Matrix a = b * c;

// Vector d is assigned to a's second column
d->Set( a.Column( 2 ) );
...

What sort of questions might the user have about this code? Clearly *d* is a pointer to some memory. This immediately informs the reader about the nature of the code. *d* is likely to be some kind of output, or perhaps a specific element in an array. It is definitely not just a lone variable on the stack. No intellisense is needed to get this information; no mousing over or code navigation is needed just to understand the idea of assigning a value to some non-local stack memory. The programmer reading this code might be able to focus slightly better on understanding it due to the use of a pointer.

Sure, *d* could technically be a *NULL* pointer (gasp), but in reality this is a non-issue. The only time checking for NULL pointers is important is when user input is being handled. A lot of code doesn't deal with user input! For internal code I'd make the argument that memory not in the local stack scope (local to at least the currently executing function) should almost always be referred to by pointer. Internal code can make assumptions about how pointers are used and not care about the NULL case. Internal code often solves difficult problems and needs to be efficient (in the scope of real-time applications). Anything that fragments reader focus is bad, even taking a moment to mouse over a variable to see whether or not it's a reference.

AABB joined;
MergeAABBs( a, b, joined );

In the above snippet imagine that the *joined* AABB is being written to, by finding the AABB that bounds both *a* and *b*. Perhaps in this specific case it is fairly obvious that *joined* is memory being written to by the *MergeAABBs* function. This is probably because *joined* is quite well named, despite being passed by reference to *MergeAABBs*. However, this function might have been written in a way that returns a new AABB entirely by value, operating only on a local stack copy of *joined*. In that case the code would compile and run perfectly fine, but *joined* would contain uninitialized memory. This might lead to a crash or assert, lowering iteration time and fragmenting programmer focus.

Now let's look at the use of a "dangerous" pointer:

AABB c;
MergeAABBs( a, b, &c );

In this code snippet, no matter what the third parameter is named, it is as obvious as possible that the function *MergeAABBs* operates on some memory passed to it, and does not return anything useful in this particular use-case. The contents of *MergeAABBs* are probably obvious as well; I can imagine how it's implemented without even looking. There's just no ambiguity.

The name of variables should be meaningful to the problem the code is solving. Requiring a naming convention for code clarity simply because of an arbitrary reference function parameter is an unnecessary constraint! Naming things is hard enough without random convention limitations.

Sure, if some idiot passed a NULL pointer to *MergeAABBs* it will crash, but how often does this happen in practice? What kind of programmers are you hiring that actually get caught up in this kind of problem? Real-life competent engineers aren't going to pass in a NULL pointer, and will appreciate good code written with "dangerous and ambiguous" pointers.

When a function takes only floats and writes to a float, it's pretty much the worst case for reference ambiguity. Which float is being written to in the next code snippet?

BarycentricTriangle( a, b, c, u, v, w );

Is the triangle actually *{a, b, c}*, or some other combination of parameters? Which of the arguments are float arrays (vectors), and which are just floats? Which ones are written to, and which are read-only? Some kind of code navigation is needed to know for sure. By convention *uvw* might name the barycentric coordinates of a triangle, but perhaps this specific piece of code solves a problem whose derivation named them something else? It's just ambiguous without a pre-defined naming convention, one imposed in the middle of unrelated algorithms.

Here’s the pointer version; note how non-ambiguous this code is:

BarycentricTriangle( a, b, c, &u, &v, &w );

I currently know of a single use of references that I really like, and that’s a const reference passed to a function as an argument, and sometimes the returning of a const reference.

Passing a const reference to a function means that this is a read-only value, and is definitely not pointing to an array. It is also a pretty common convention for operator overloading. The only downside is that the dot access operator may be mistaken as a value-type access, instead of a pointer access.

Returning a const reference might make sense sometimes, but usually I’m of the opinion that a pointer is better. Returning a pointer just abides by all my previous points about memory access. If the user retrieves a const pointer from a function, the explicit -> access makes it very clear that this memory came from somewhere else!

References are also able to capture temporary rvalues. This can make the life-time of such temporary values more explicit.

Sometimes a million dereferences is just too many. In some cases the lack of the dereference operator is nice and adds to code readability. However, in this case references are just an aid and live only on the stack. The pointer is what is actually kept and stored, in order to keep the code clean and up-front. Here’s an example:

void AffineTransform( vec3* vecs, int n, const Matrix& m, const vec3& b )
{
    for ( int i = 0; i < n; ++i )
    {
        vec3& v = vecs[ i ];
        v = m * v + b;
    }
}

An equivalent, but more verbose and annoying version can be constructed with pointers:

void AffineTransform( vec3* vecs, int n, const Matrix* m, const vec3* b )
{
    for ( int i = 0; i < n; ++i )
    {
        vec3* v = vecs + i;
        *v = *m * *v + *b;
    }
}

“Advanced C++” features (in quotations for sarcasm, like much of the rest of the article) are useful sometimes, there’s no doubt about it. Templates, classes, and all the weirdness therein are sometimes necessary. The amount of code duplication and boilerplate these features can remove makes them important.

However, a lot of code is type-static and very specific. Code often solves very particular, specific problems. A lot of newer students (me included) and colleagues get all caught up in the features and just end up wasting their time. When I say wasting time, I mean they were actually trying to finish a project, instead of just learning about C++ and its uses.

One might view C++ from the perspective that all the “advanced features” can egg a programmer on into over-engineering their code into a weird mess of indirection and verbose templated code. Crazy inlined callbacks, type-agnostic code, verbose namespaces and whatnot. Many times it’s just useless cruft, and a specific implementation for a single problem will be simpler, easier to read, and smaller in code size.

All of this ranting comes down to the point: references let code operate in a slightly more agnostic manner, which is great for templates. The dot operator does it all! This makes sense for code that *needs* templating, but often doesn’t make sense for a lot of code (which is what the previous portions of the article pointed out!).

However, *good* generic code is so incredibly difficult to design and come up with that hardly anybody should be doing it. Good code that is used by multiple people is at least an order of magnitude harder to write than good code that has a single specific use case. Templates, references, classes, these things might make it tempting to try out all the features to make a “generic program” that “can be re-used in the future”. I’ll tell it how it is: simple code that is type static, specialized for the specific problem it is solving, and doesn’t use advanced features is probably an order of magnitude more reusable (and performant) than “generic” code, simply because it’s easier to write.

As a reader, think for yourself and make your own judgment calls based on your own experience. This means that a good way to take advantage of the knowledge of an experienced programmer is to try out their advice with an almost skeptical attitude. Just don’t look for step-by-step instructions on how to be a computer scientist. Nobody wants to work with a mindless programmer that writes bad code, because then the good programmers will be busy cleaning it up.

]]>