Author Archives: Randy Gaul

Code Duplication – C Macros

Sometimes extremely repetitive code needs to be written. With the advent of C++11, variadic templates resolve a lot of these problems. However, some compiler vendors are pretty lazy when it comes to actually implementing such features.

Beyond practicality there’s also something about learning new things simply out of some kind of hedonism for knowledge. This led me to look up Boost’s implementation of its preprocessor macros in order to cut down on code duplication.

Imagine we need to write these functions 16 times or so:

The above example only shows 3 of the 16 overloads necessary for our imaginary purposes. Writing these things by hand is super tedious and error prone. It also gives me headaches. Instead, let’s uncover a way to write a single function and let the C preprocessor duplicate it 16 times over.

First, let’s just duplicate some text in a semi-recursive manner with a LOOP macro:

The above code is the basis needed in order to perform some looping logic. Clearly the above LOOP macro has an upper limit of 3 and just pastes a number after some text. It has few use cases.
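As a rough sketch of such a LOOP macro (the helper names LOOP_1 through LOOP_3 and the variables v1, v2, v3 are assumptions made for illustration, not necessarily the original listing), each step expands the step below it and then pastes its own number onto the text:

```cpp
// Each LOOP_N expands the one below it, then pastes its own number on the end.
#define LOOP_1( TEXT ) TEXT##1
#define LOOP_2( TEXT ) LOOP_1( TEXT ) TEXT##2
#define LOOP_3( TEXT ) LOOP_2( TEXT ) TEXT##3

// Pasting N onto LOOP_ selects where the "loop" starts. The upper limit is 3.
#define LOOP( TEXT, N ) LOOP_##N( TEXT )

// Hypothetical usage: three variables to sum up.
constexpr int v1 = 1, v2 = 10, v3 = 100;

// Expands to: 0 + v1 + v2 + v3
constexpr int sum = 0 LOOP( + v, 3 );
```

Raising the upper limit means adding more LOOP_N lines by hand, which is the price of faking recursion in the preprocessor.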

The above can be used to generate the functions themselves and represents a single semi-recursive loop. It will duplicate some text N times, where there is an upper limit to N (of 3 in the example). No interior loops are allowed. Interior loops are essential to our original goal of creating function overloads.

In order to support interior loops some additional macros can be used:

The ITERATE macro takes an invokable macro (a function-like macro, one that is invoked with the () parentheses) that takes the parameter N. The macro is invoked N times, and each time N is decremented by one. This lets N be passed into an interior loop called LOOP, where it is used N times decremented by 1 each time.

Each time N is passed to an invokable macro another layer of interior loops can be executed. In our case, we wanted a single layer of interior loops.

Currently the LOOP macro is quite limiting since it only appends N after some text. We can’t add in custom delimiters or anything. Parentheses and commas are not supported in the looped text.

Here’s the final version that supports looping of arbitrary pieces of text during interior loops:

The new LOOP macro works by taking an invokable macro itself. This lets the invokable macro passed to LOOP only be evaluated when necessary. This lets any arbitrary text reside within a LOOP macro, including parentheses and commas. The downside is that the user must define a macro for each usage of the LOOP macro.

A new PRINT macro has also been shown which is identical to LOOP except that it doesn’t print any commas. The above code generates the aforementioned function overloads, but only 3 of them. LOOP, PRINT and ITERATE macros can be copy/pasted up to 16 and the line ITERATE( FUNC, 3 ) can be modified to: ITERATE( FUNC, 16 ).
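A minimal sketch of how the three macros might fit together, assuming an upper limit of 3. The numbered helper macros, the PARAM and ADD macros and the generated Sum function are all hypothetical stand-ins for whatever function actually needs duplicating:

```cpp
// LOOP invokes M( 1 ), M( 2 ), ... M( N ), separated by commas.
#define LOOP_1( M ) M( 1 )
#define LOOP_2( M ) LOOP_1( M ), M( 2 )
#define LOOP_3( M ) LOOP_2( M ), M( 3 )
#define LOOP( M, N ) LOOP_##N( M )

// PRINT is identical to LOOP except that no commas are emitted.
#define PRINT_1( M ) M( 1 )
#define PRINT_2( M ) PRINT_1( M ) M( 2 )
#define PRINT_3( M ) PRINT_2( M ) M( 3 )
#define PRINT( M, N ) PRINT_##N( M )

// ITERATE invokes M( N ), M( N - 1 ), ... M( 1 ).
#define ITERATE_1( M ) M( 1 )
#define ITERATE_2( M ) M( 2 ) ITERATE_1( M )
#define ITERATE_3( M ) M( 3 ) ITERATE_2( M )
#define ITERATE( M, N ) ITERATE_##N( M )

// One small macro per piece of looped text (the downside mentioned above).
#define PARAM( N ) int a##N
#define ADD( N ) + a##N

// The single function that gets duplicated, once per value of N.
#define FUNC( N ) \
  constexpr int Sum( LOOP( PARAM, N ) ) { return 0 PRINT( ADD, N ); }

// Generates Sum( int a1, int a2, int a3 ), Sum( int a1, int a2 ) and Sum( int a1 ).
ITERATE( FUNC, 3 )
```

Each call such as Sum( 1, 2 ) then resolves to the overload with the matching number of parameters.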

Success.

Nearly Geodesic Sphere – Mesh Creation


This post will let me store some useful code and possibly share it with others. I had originally found this wonderful explanation on Stack Overflow on how to generate triangles in a super simple manner for a nice sphere mesh.

The idea is to take an octahedron, or icosahedron, and subdivide the triangles on the surface of the mesh, such that each triangle creates 4 new triangles (each triangle creates a Triforce symbol).

One thing the Stack Overflow page didn’t describe is intermittent normalization between each subdivision. If you imagine subdividing an octahedron over and over without any normalization, the final resulting sphere will have triangles that vary in size quite a bit. However, if after each subdivision every single vertex is normalized then the vertices will snap to the unit sphere more often. This results in a final mesh that has triangles of closer to uniform geodesic area.

The final mesh isn’t purely geodesic and there will be variation in the size of the triangles, but it will be hardly noticeable. Sphere meshes will look super nice and also behave well when simulating soft bodies with Matyka’s pressure volume.

Here’s an example program you can use to perform some subdivisions upon an octahedron (click here to view the program’s output):
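The following is only a sketch of the idea, not the original program: it stores the mesh as a plain list of triangles (no shared vertices), and the names Vec3, Triangle, Octahedron and Subdivide are assumptions. Every new midpoint is normalized immediately, so the vertices are back on the unit sphere after each subdivision pass.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };
struct Triangle { Vec3 a, b, c; };

inline Vec3 Normalize( Vec3 v )
{
  float len = std::sqrt( v.x * v.x + v.y * v.y + v.z * v.z );
  assert( len > 0.0f );
  return { v.x / len, v.y / len, v.z / len };
}

// Midpoint of an edge, snapped back onto the unit sphere.
inline Vec3 MidpointOnSphere( Vec3 a, Vec3 b )
{
  return Normalize( { ( a.x + b.x ) * 0.5f, ( a.y + b.y ) * 0.5f, ( a.z + b.z ) * 0.5f } );
}

// Unit octahedron: 6 vertices on the axes, 8 faces wound counter-clockwise
// when viewed from outside.
inline std::vector<Triangle> Octahedron( )
{
  Vec3 px = { 1, 0, 0 }, nx = { -1, 0, 0 };
  Vec3 py = { 0, 1, 0 }, ny = { 0, -1, 0 };
  Vec3 pz = { 0, 0, 1 }, nz = { 0, 0, -1 };
  return {
    { py, pz, px }, { py, px, nz }, { py, nz, nx }, { py, nx, pz },
    { ny, px, pz }, { ny, nz, px }, { ny, nx, nz }, { ny, pz, nx },
  };
}

// Split every triangle into 4: three corner triangles and the center one.
inline std::vector<Triangle> Subdivide( const std::vector<Triangle>& in )
{
  std::vector<Triangle> out;
  out.reserve( in.size( ) * 4 );
  for ( const Triangle& t : in )
  {
    Vec3 ab = MidpointOnSphere( t.a, t.b );
    Vec3 bc = MidpointOnSphere( t.b, t.c );
    Vec3 ca = MidpointOnSphere( t.c, t.a );
    out.push_back( { t.a, ab, ca } );
    out.push_back( { ab, t.b, bc } );
    out.push_back( { ca, bc, t.c } );
    out.push_back( { ab, bc, ca } );
  }
  return out;
}
```

Calling Subdivide three times on Octahedron( ) gives 8 * 4 * 4 * 4 = 512 triangles. A real mesh would probably want an index buffer with shared vertices rather than this triangle soup.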

Inertia Tensor – Capsule


A capsule is defined by two hemispheres and a cylinder. Capsules are pretty useful geometry for collision detection since they are easily defined mathematically (implicitly defined). This means that minimal information can be stored in memory to represent a capsule in 3D space. Given a radius value and position information only a few floating point values are necessary. Algorithms like GJK also work well with implicitly defined shapes since support mappings can be computed in constant time.

Given the inertia tensor of a cylinder and sphere the inertia tensor of a capsule can be calculated with the help of the parallel axis theorem. The parallel axis theorem can shift the origin that an inertia tensor is defined relative to, given just a translation vector. We care about the tensor form of the parallel axis theorem since it is probably easiest to understand from a computer science perspective:

\begin{equation}
J = I + m \left[ (R \cdot R) E_3 - R \otimes R \right]
\end{equation}

J is the final transformed inertia. I is the initial inertia tensor. m is the mass of the object in question. R is the translation vector to shift with. E3 is the standard basis, or the identity matrix. The cross symbol is the outer product (see next paragraph).

Assuming readers are familiar with the dot product and outer product, computing the change in an inertia tensor isn’t too difficult.
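As a minimal sketch, the tensor form of the parallel axis theorem, J = I + m * ( dot( R, R ) * E3 - outer( R, R ) ), can be written out element by element. The Vec3 and Mat3 types and the name ShiftInertia are assumptions for illustration, and I is assumed to be expressed about the center of mass:

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };
struct Mat3 { float m[3][3]; };

// Shift an inertia tensor I (about the center of mass) by the translation R:
// J = I + mass * ( dot( R, R ) * E3 - outer( R, R ) )
inline Mat3 ShiftInertia( const Mat3& I, float mass, Vec3 R )
{
  assert( mass >= 0.0f );
  const float r[3] = { R.x, R.y, R.z };
  const float rr = R.x * R.x + R.y * R.y + R.z * R.z; // dot( R, R )

  Mat3 J = I;
  for ( int i = 0; i < 3; ++i )
  {
    for ( int j = 0; j < 3; ++j )
    {
      const float identity = ( i == j ) ? 1.0f : 0.0f; // element of E3
      J.m[i][j] += mass * ( rr * identity - r[i] * r[j] ); // r[i] * r[j] is the outer product
    }
  }
  return J;
}
```

Shifting along the y axis leaves the yy term untouched and adds mass times distance squared to the xx and zz terms, which matches the familiar scalar version of the theorem.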

The center of mass of a hemisphere is 3/8 * radius above the center of the spherical base. Knowing this and understanding the parallel axis theorem, the inertia tensor of one of the hemispheres can be easily calculated. My own derivation brought me to the conclusion that the inertia tensor of a capsule is:

Please note the final inertia tensor assumes the capsule is oriented such that the cylinder is aligned along the y axis. A rotation matrix R can be used to transform the final inertia tensor I into world space like so: R * I * R^T. To learn why this form is used read this post.

Special thanks to Dirk Gregorius for emailing me with an error in the original draft of this post! He kindly provided the public with a nice document showing his derivation of the inertia tensor of a capsule.

Following suit to identify the source of my error, I ended up writing my own derivation down in PDF format.

LIT (Lua Interpreted Triggers) – Starcraft: Brood War

Download LIT here

In the past I frequented a website called StarEdit.net in order to learn how to make scenario files for the game Starcraft: Brood War (SCBW). The scenario files could be generated with an editor, and had a simple trigger system. A lot of fun can be had making small games within these scenario files, and the games can be played over Blizzard’s Battle.net servers.

Blizzard’s stock editor was very limiting so some fan-made solutions popped up over the years. Creating triggers in these editors involved manually navigating a point-and-click GUI. The GUI was super cumbersome to use, and ended up being unruly for large numbers of triggers.

In an effort to revisit some fond memories of creating SCBW scenarios I decided to create a tool for generating triggers en masse (Windows only). This post explains the details of how to use LIT.

Lua Interpreted Triggers (LIT)

Lua is a pretty good choice for creating some kind of tool to generate SCBW triggers. Since all that is needed from my tool is to generate some text, creating a super small and simple Lua API doesn’t take much development time, and can be very useful to use when making scenario files.

LIT is comprised of a small executable that embeds the Lua language, and a few Lua files that represent the LIT API.

Setting Up LIT

I’ve written a short post here on StarEdit.net about how to set up and run LIT from the command line.

LIT Examples

Since a SCBW trigger involves conditions and actions, using LIT involves laying out conditions and actions. There are two main resources for learning how to use LIT and they both come in the downloaded ZIP file. Inside of ActionsExample.lua and ConditionsExample.lua I’ve written out a short demonstration for how to use each unique condition and action in SCBW.

In order to learn how to use any particular condition or action, just consult these two example files.

However, since LIT is written with Lua the entire Lua language can be used to generate SCBW triggers! Since anyone that is interested in using LIT will probably not know how to write Lua code, I recommend reading some simple tutorial on using Lua before getting started. Here’s a decent looking one.

Let’s take a look at writing a Binary Countoff using LIT:

The triggers generated by running this Lua file with LIT are: click to view.

As you can see there’s a little bit of state stored in a couple of Deaths objects. That state is a number, a player and a unit. Using this state, text is output in a straightforward manner.

Include Function

My favorite part about LIT is the include function! When writing out a LIT file, another entire LIT file can be included into it. Look at this example:

As the comment says, the include function will copy paste an entire Lua file straight into the spot where the include function is. A file included into another file can also include files. Files can be included through different folders, and the include function supports relative paths.

Let’s assume that AnotherFile.lua holds the rest of the code from the first Binary Countoff example. If this is the case, then the output triggers will be exactly the same!

Say we have a file A and a file B. If A includes B and B includes A, then LIT will crash. Similarly, if any chain of files all form a circular inclusion, LIT will crash.

This lets users of LIT organize their Lua files in any manner they wish. One of the biggest drawbacks of using a GUI editor to create triggers is that once a couple thousand triggers exist in a single scenario file it becomes nearly impossible to efficiently navigate them. Using your operating system’s folders and file structure, organizing LIT is simple and powerful.

Demonstration Map

Here’s a link to download a demonstration map. The map contains a concept of screen-sized rooms that the player can move around with. It’s implemented with a few burrowed units on each room. The download link comes with the map file and the LIT files used to generate the map’s triggers. The most interesting file is RoomLogic.lua.

Memory Management

Any competent software engineer will have spent significant time working with low level memory management. Even though the operating system that code is written for will often provide some kind of allocation and deallocation mechanism, application specific assumptions can be made to increase memory related performance.

For example certain hardware doesn’t have virtual memory support, or the virtual memory support can be quite lacking. A lack of virtual memory means raw allocations from the OS return real addresses to the hardware RAM. Usually virtual memory can alleviate some effects of memory fragmentation through a level of indirection, though when dealing with physical memory yourself no such alleviation exists.

This is just one example of how a software memory manager can be written and used to control memory fragmentation in a way that makes sense for the application.

Types of Allocators

There are a few main types of allocators that I myself have found pretty useful: paging, stack and heap based allocations. Each one makes specific assumptions about the types of allocations and how the memory ought to be used. Due to these assumptions significant performance boosts can be reaped in ways that may not have been realistic with raw operating system allocations.

Stack Based Allocation

My favorite type of allocation involves the use of a simple stack. The idea is to make one large call to malloc or new and hold this piece of memory. The Stack itself just holds a pointer to this large chunk of memory, and an integer representing an index into the stack with an element size in bytes.

Here is what a Stack implementation might look like (in pseudo code):

Allocation can work by moving the m_memory pointer forward in the stack. Deallocation can work by moving the m_memory pointer backwards in the stack. Notice that the Free function requires the user to pass back in the size of the allocation! This can be avoided by storing this size parameter from Allocate inside of the m_memory array itself, just before the location of the returned address. Upon deallocation this value can be retrieved by moving the data parameter of Free back in memory by 4 bytes.

The advantage of the stack allocator is that it’s extremely fast and deceptively simple to implement. The limitation is that deallocations must be performed in the reverse order of allocations, since the stack itself is in LIFO order. This makes the use cases for the stack allocator pretty limited. Usually resources, like images, level files, sounds, models, etc. can be loaded into memory with a stack based allocator. Anything that has a very clear and non-variable lifespan should be able to be allocated on a stack.

One last trick is that the last allocation can be trivially resized! Often times an algorithm will require a lot of temporary scratch memory to perform some calculations, or store some state. An initial guess as to how much memory is needed can often be calculated as the worst-case scenario. Once an algorithm finishes this scratch memory can be reduced to the size actually used, if it is the last allocation on the stack. Resizing the last stack allocation involves moving the index backwards in memory.
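A minimal sketch of such a stack, under a few assumptions: it tracks an index into the chunk rather than moving the base pointer, it ignores alignment, it makes the caller pass sizes back in, and the names Used and ShrinkLast are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

class Stack
{
public:
  explicit Stack( std::size_t capacity )
    : m_memory( static_cast<char*>( std::malloc( capacity ) ) )
    , m_capacity( capacity )
    , m_index( 0 )
  {
    assert( m_memory );
  }

  ~Stack( ) { std::free( m_memory ); }
  Stack( const Stack& ) = delete;
  Stack& operator=( const Stack& ) = delete;

  // Hand out the next size bytes, or nullptr if the stack is full.
  void* Allocate( std::size_t size )
  {
    if ( size > m_capacity - m_index ) return nullptr;
    void* data = m_memory + m_index;
    m_index += size;
    return data;
  }

  // Only the most recent allocation may be freed (LIFO order).
  void Free( void* data, std::size_t size )
  {
    assert( static_cast<char*>( data ) + size == m_memory + m_index );
    m_index -= size;
  }

  // Shrink the most recent allocation, e.g. scratch memory sized for the worst case.
  void ShrinkLast( void* data, std::size_t oldSize, std::size_t newSize )
  {
    assert( newSize <= oldSize );
    assert( static_cast<char*>( data ) + oldSize == m_memory + m_index );
    m_index -= oldSize - newSize;
  }

  std::size_t Used( ) const { return m_index; }

private:
  char* m_memory;
  std::size_t m_capacity;
  std::size_t m_index;
};
```

The asserts are what enforce the reverse-order rule here; freeing anything but the top of the stack trips them in a debug build.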

Heap Allocation

Implementing your own heaps is pretty similar to the stack based allocator. A heap allocator will use the operating system to allocate a large chunk of memory. Subsequent calls to the heap’s Allocate and Free methods will just dip into this chunk and fetch a piece.

The heap is more versatile and general purpose than a stack allocator. The heap can be implemented with a linked list of nodes. Each node represents a piece of memory. A node can either be allocated or free. To keep track of these linked list pointers, allocation state, and size of the memory block some memory itself is required! This stuff can be stored in a separate array, or right inside the large raw chunk of memory (just like with the stack allocator).

Usually it is preferable to add a small header to each allocation to store this information. A heap node might look something like this:

When the heap is first constructed it will contain a linked list of HeapHeader structs, but only a single header will be present, and it holds the entire piece of raw memory originally allocated by the OS upon the Heap allocator’s construction.

Allocating from the heap involves splitting a free HeapHeader into an allocated piece, and a new HeapHeader for the leftover space. The details of this lie mostly in the linked list implementation, and are not the focus of this article.

In order to reduce memory fragmentation it is a good idea to merge adjacent free HeapHeader links into a single link. This ought to be handled in the Heap::Free function. The details of merging free links lie mostly in the linked list implementation, and are not the focus of this article.

Here’s an example of what the Heap may look like in implementation:

When Heap::Allocate is called a free link of appropriate size must be searched for. This has the time complexity of O( N ), and a lot of memory must be fetched into the cache upon allocation as the list itself is traversed. There are tricks to improve allocation performance of heaps, and a simple one would be to cache a single pointer to a free block in the heap itself. This pointer can be cached in Heap::Free, Heap::Allocate, or both. Once a new call to Heap::Allocate is made this cached pointer can be tested first to see if it is an appropriate size.
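To make the splitting and merging a bit more concrete, here is one possible sketch. It is an assumption-heavy simplification: instead of explicit linked list pointers, each header finds its neighbor by stepping over its own payload, allocation is first fit, the merge pass walks the whole heap, and the cached free pointer trick is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Header stored in front of every block, free or allocated.
struct alignas( std::max_align_t ) HeapHeader
{
  std::size_t size; // bytes of payload that follow this header
  bool free;
};

class Heap
{
public:
  explicit Heap( std::size_t bytes )
    : m_memory( static_cast<char*>( std::malloc( bytes ) ) )
    , m_end( m_memory + bytes )
  {
    assert( m_memory && bytes > sizeof( HeapHeader ) );
    // One free block spans the whole chunk to begin with.
    First( )->size = bytes - sizeof( HeapHeader );
    First( )->free = true;
  }

  ~Heap( ) { std::free( m_memory ); }
  Heap( const Heap& ) = delete;
  Heap& operator=( const Heap& ) = delete;

  void* Allocate( std::size_t size )
  {
    const std::size_t align = alignof( std::max_align_t );
    size = ( size + align - 1 ) / align * align; // keeps later headers aligned
    for ( HeapHeader* h = First( ); h; h = Next( h ) )
    {
      if ( !h->free || h->size < size ) continue; // first fit
      if ( h->size - size > sizeof( HeapHeader ) )
      {
        // Split: the leftover space becomes a new free block.
        HeapHeader* rest = reinterpret_cast<HeapHeader*>( Payload( h ) + size );
        rest->size = h->size - size - sizeof( HeapHeader );
        rest->free = true;
        h->size = size;
      }
      h->free = false;
      return Payload( h );
    }
    return nullptr;
  }

  void Free( void* data )
  {
    HeapHeader* h = reinterpret_cast<HeapHeader*>( static_cast<char*>( data ) - sizeof( HeapHeader ) );
    assert( !h->free );
    h->free = true;
    // Merge every run of adjacent free blocks into a single block.
    for ( HeapHeader* a = First( ); a; )
    {
      HeapHeader* b = Next( a );
      if ( a->free && b && b->free ) a->size += sizeof( HeapHeader ) + b->size;
      else a = b;
    }
  }

private:
  HeapHeader* First( ) const { return reinterpret_cast<HeapHeader*>( m_memory ); }
  static char* Payload( HeapHeader* h ) { return reinterpret_cast<char*>( h + 1 ); }
  HeapHeader* Next( HeapHeader* h ) const
  {
    char* next = Payload( h ) + h->size;
    return next < m_end ? reinterpret_cast<HeapHeader*>( next ) : nullptr;
  }

  char* m_memory;
  char* m_end;
};
```

A doubly linked list of headers would let Free merge with just its two neighbors instead of walking every block.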

There are two common ways to search through the links for an allocation: first fit and best fit. First fit will return to the user the first piece of memory large enough to hold the allocation. Best fit will return a chunk of memory that came from a HeapHeader with the smallest size that is still large enough to hold the requested allocation size.

First fit can be preferable for cache coherency, as it may prefer to allocate from the beginning of the heap and try to keep things closer together in memory. Best fit may be preferable for keeping the heap as un-fragmented as possible.
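The difference between the two searches is easy to see on a plain list of free block sizes. This sketch uses hypothetical FirstFit and BestFit functions that return the index of the chosen block, or -1 if nothing is large enough:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index of the first free block large enough for the request, or -1.
inline int FirstFit( const std::vector<std::size_t>& freeSizes, std::size_t request )
{
  assert( request > 0 );
  for ( std::size_t i = 0; i < freeSizes.size( ); ++i )
    if ( freeSizes[i] >= request ) return static_cast<int>( i );
  return -1;
}

// Index of the smallest free block that is still large enough, or -1.
inline int BestFit( const std::vector<std::size_t>& freeSizes, std::size_t request )
{
  assert( request > 0 );
  int best = -1;
  for ( std::size_t i = 0; i < freeSizes.size( ); ++i )
  {
    if ( freeSizes[i] < request ) continue;
    if ( best < 0 || freeSizes[i] < freeSizes[best] ) best = static_cast<int>( i );
  }
  return best;
}
```

With free blocks of 64, 512, 32 and 128 bytes and a request for 100 bytes, first fit picks the 512 byte block while best fit picks the 128 byte block. First fit can stop early; best fit always has to look at every free block.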

Paged Allocation

The heap based allocator intends to fight memory fragmentation through fitting links to allocation sizes, and by merging adjacent free memory blocks. This type of fragmentation is called external fragmentation. Another type of memory fragmentation is called internal fragmentation.

Internal memory fragmentation occurs when an allocated piece of memory is given to the user that actually holds more memory than the user requested. The user is assumed to not know about this extra piece of memory. This can provide an advantage to the allocator: all allocations can be of a fixed size, and any allocation larger than this fixed size is denied.

This lets the allocator act like an array. When an allocation is requested an empty element can be returned to the user. Upon freeing a piece of memory, the element is simply marked as free and placed into a free list.

The free list is a linked list of array elements. The memory in the free elements themselves should be used to store the pointer of each subsequent free element.

Allocation and deallocation become constant in time complexity and there is zero external memory fragmentation. In this way external memory fragmentation is traded away in exchange for internal memory fragmentation.
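A single fixed-size array with an embedded free list might be sketched as follows. The class name Pool is an assumption, and the sketch covers just one array; growing by pages is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

class Pool
{
public:
  Pool( std::size_t elementSize, std::size_t count )
  {
    assert( count > 0 );
    // Every element must be able to hold (and align) a free list pointer.
    const std::size_t p = sizeof( void* );
    m_elementSize = ( ( elementSize < p ? p : elementSize ) + p - 1 ) / p * p;
    m_memory = static_cast<char*>( std::malloc( m_elementSize * count ) );
    assert( m_memory );

    // Thread the free list through the elements; element 0 ends up at the head.
    m_freeList = nullptr;
    for ( std::size_t i = count; i-- > 0; )
      Free( m_memory + i * m_elementSize );
  }

  ~Pool( ) { std::free( m_memory ); }
  Pool( const Pool& ) = delete;
  Pool& operator=( const Pool& ) = delete;

  // Constant time: pop the head of the free list. Oversized requests are denied.
  void* Allocate( std::size_t size )
  {
    if ( size > m_elementSize || !m_freeList ) return nullptr;
    void* element = m_freeList;
    m_freeList = *static_cast<void**>( element );
    return element;
  }

  // Constant time: the freed element stores the old head and becomes the new head.
  void Free( void* data )
  {
    *static_cast<void**>( data ) = m_freeList;
    m_freeList = data;
  }

private:
  char* m_memory;
  std::size_t m_elementSize;
  void* m_freeList;
};
```

A request for 8 bytes from a pool of 24 byte elements still consumes a whole element; that wasted space is the internal fragmentation.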

Pages!

The term “pages” comes into play when the array is filled up. Once an array is full of allocated elements another array can be allocated. Once this array is filled up, another one is allocated. Each array (aka page) can be stored in a singly linked list of pages.

The free list itself can point across multiple pages without any problems.

A page containing only free elements can be deleted entirely, though this feature might not need to be supported.

A paged allocator can also hold an array of singly linked lists of pages. Each element of this array can hold a list of pages that corresponds to a different element size. This can allow the paged allocator to fit different allocation requests into the most appropriate page list. A common tactic is to have pages that represent arrays with an element size of 2^N bytes, where N is usually at least 2, and smaller than some value K.
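Picking the page list for a request can be as simple as finding the smallest power of two that fits. In this sketch the constants are hypothetical: N starts at 2 (4 byte elements) and K is arbitrarily set to 11, so the largest element size is 2^10 = 1024 bytes.

```cpp
#include <cassert>
#include <cstddef>

const int kMinShift = 2;    // smallest element size is 2^2 = 4 bytes
const int kShiftLimit = 11; // K: element sizes are 2^N for kMinShift <= N < K

// Index into the array of page lists for a request, or -1 if it is too large.
inline int SizeClass( std::size_t size )
{
  assert( size > 0 );
  for ( int n = kMinShift; n < kShiftLimit; ++n )
  {
    const std::size_t elementSize = static_cast<std::size_t>( 1 ) << n;
    if ( size <= elementSize ) return n - kMinShift;
  }
  return -1;
}
```

A 5 byte request lands in the 8 byte list (index 1), wasting 3 bytes per element, and anything above 1024 bytes would have to fall back to some other allocator.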

The biggest advantage of a paged allocator is zero external fragmentation. The internal fragmentation does make memory more non-homogeneous. This type of allocator will probably lower your cache line utilization. Cache line utilization would be how much memory in each cache line fetched from main memory to the CPU cache is actually used. Since internal fragmentation is a feature of a paged allocator, cache line utilization will probably suffer.

The unused memory in the pages can be reduced drastically on a per-application basis; if the users of the allocator are able to specify the element sizes of different page lists, then zero internal fragmentation can be achieved.

Handle-Based Array

Instead of thinking of a paged allocator in terms of separate arrays, one might think of a simpler allocator that holds just a single array. If the elements within this array are of POD nature the array elements can be referenced by index. This lets the array grow or shrink in size as necessary, as new sized arrays can still be accessed by an old index.

Whenever the user wants a pointer to an element they first give the array an index, and a pointer is returned. This pointer is never stored anywhere! Continuous translation from index to pointer occurs; this allows the internal array itself to be moved around in memory as necessary.
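A tiny sketch of the index to pointer translation, using std::vector as the growable storage. The IndexedArray name and its interface are assumptions; a fuller version would hand out handles rather than raw indices and would support removal.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <typename T>
class IndexedArray
{
public:
  // Returns an index that stays valid even if the storage is moved.
  std::size_t Add( const T& value )
  {
    m_data.push_back( value );
    return m_data.size( ) - 1;
  }

  // Translate an index into a pointer. The pointer is only good until the
  // next Add, so callers should re-translate instead of storing it.
  T* Get( std::size_t index )
  {
    assert( index < m_data.size( ) );
    return &m_data[index];
  }

private:
  std::vector<T> m_data;
};
```

Adding a thousand more elements will very likely move the storage in memory, yet Get on an old index still finds the right element.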

Users might need a little more power to refer to elements than a simple integer. Some type of handle might be needed to translate from index to pointer. Read more about handles here.

Conclusion

Given these three types of allocators an application should have all the variety of memory allocation necessary to run with pretty good performance. More advanced allocation techniques definitely exist, and some are just combinations of the three basic allocators presented in this article.

Each allocator can be quite simple in isolation! I myself implemented a stack in about 100 lines, a paged allocator in 150, and a heap in about 250 lines of C++ code.

Further reading might include topics such as: cache coherency, memory alignment, garbage collection, virtual memory, page files (operating system pages).

Distance Point to Line Segment


Understanding the dot product will enable one to come up with an algorithm to compute the distance between a point and line in two or three dimensions without much fuss. This is a math problem but discussion and focus is on writing a function to compute distance.

Dot Product Intuition

The dot product comes from the law of cosines. Here’s the formula:

\begin{equation}
c^2 = a^2 + b^2 - 2ab \cos\gamma
\label{eq1}
\end{equation}

This is just an equation that relates the cosine of an angle within a triangle to its various side lengths a, b and c. The Wikipedia page (link above) does a nice job of explaining this. Equation \eqref{eq1} can be rewritten as:

\begin{equation}
c^2 - a^2 - b^2 = -2ab \cos\gamma
\label{eq2}
\end{equation}

The right hand side of equation \eqref{eq2} is interesting! Let’s say that instead of writing the equation with side lengths a, b and c, it is written with two vectors: u and v. The third side can be represented as u - v. Re-writing equation \eqref{eq2} in vector notation yields:

\begin{equation}
|u - v|^2 - |u|^2 - |v|^2 = -2|u||v| \cos\gamma
\label{eq3}
\end{equation}

Which can be expressed in scalar form as:

\begin{equation}
\begin{split}
&(u_x - v_x)^2 + (u_y - v_y)^2 + (u_z - v_z)^2 \\
&- (u_x^2 + u_y^2 + u_z^2) - (v_x^2 + v_y^2 + v_z^2) \\
&= -2|u||v| \cos\gamma
\end{split}
\label{eq4}
\end{equation}

By crossing out some redundant terms, and getting rid of the -2 on each side of the equation, this ugly equation can be turned into a much more approachable version:

\begin{equation}
u_x v_x + u_y v_y + u_z v_z = |u||v| \cos\gamma
\label{eq5}
\end{equation}

Equation \eqref{eq5} is the equation for the dot product. If both u and v are unit vectors then the equation will simplify to:

\begin{equation}
dot(\hat{u}, \hat{v}) = \cos\gamma
\label{eq6}
\end{equation}

If u and v are not unit vectors equation \eqref{eq5} says that the dot product between both vectors is equal to cos( γ ) that has been scaled by the lengths of u and v. This is a nice thing to know! For example: the squared length of a vector is just itself dotted with itself.

If u is a unit vector and v is not, then dot( u, v ) will return the distance in which v travels in the u direction. This is useful for understanding the plane equation in three dimensions (or any other dimension):
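In code these facts come out very short. The Vec3 type and the function names below are assumptions for illustration:

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline float Dot( Vec3 u, Vec3 v )
{
  return u.x * v.x + u.y * v.y + u.z * v.z;
}

// The squared length of a vector is just itself dotted with itself.
inline float Length( Vec3 v )
{
  return std::sqrt( Dot( v, v ) );
}

// Signed distance v travels in the direction of the unit vector unitDir.
inline float DistanceAlong( Vec3 unitDir, Vec3 v )
{
  assert( std::fabs( Dot( unitDir, unitDir ) - 1.0f ) < 1e-4f );
  return Dot( unitDir, v );
}
```

For example, the vector { 2, 3, 4 } travels 3 units along the direction { 0, 1, 0 }.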

\begin{equation}
ax\;+\;by\;+\;cz\;-\;d\;=\;0
\end{equation}

The normal of a plane would be the vector: { a, b, c }. If this normal is a unit vector, then d represents the distance to the plane from the origin. If the normal is not a unit vector then d is scaled by the length of the normal.

To compute the distance of a point to this plane any point can be substituted into the plane equation, assuming the normal of the plane equation is of unit length. This operation is computing the distance along the normal a given point travels. The subtraction by d can be viewed as “translating the plane to the origin” in order to convert the distance along the normal, to a distance to the plane.
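Substituting a point into the plane equation might look like this sketch. The Plane struct is an assumption, and the normal is expected to be of unit length:

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline float Dot( Vec3 u, Vec3 v )
{
  return u.x * v.x + u.y * v.y + u.z * v.z;
}

// Plane in the form ax + by + cz - d = 0, with normal = { a, b, c }.
struct Plane
{
  Vec3 normal; // assumed to be unit length
  float d;     // distance to the plane from the origin, along the normal
};

// Signed distance from p to the plane: positive on the side the normal points to.
inline float DistanceToPlane( Plane plane, Vec3 p )
{
  assert( std::fabs( Dot( plane.normal, plane.normal ) - 1.0f ) < 1e-4f );
  return Dot( plane.normal, p ) - plane.d;
}
```

The dot product measures how far p travels along the normal, and subtracting d converts that into a distance from the plane itself.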

Writing the Function: Distance Point to Line

The simplest function for computing the distance to a plane (or line in 2D) would be to place a point into the plane equation. This means that we’ll have to either compute the plane equation in 2D if all we have are two points to represent the plane, or in 3D find a new tactic altogether since planes in 3D are not lines.

In my own experience I’ve found it most common to have a line in the form of two points in order to represent the parametric equation of a line. Two points can come from a triangle, a mesh edge, or two pieces of world geometry.

To set up the problem let’s outline the function to be created as so:

The two parameters a and b are used to define the line segment itself. The direction of the line would be the vector b - a.

After a brief visit to the Wikipedia page for this exact problem I quickly wrote down my own derivation of the formula they have on their page. Take a look at this image I drew:

2014-07-23 00.16.37

The problem of finding the distance of a point to a line makes use of finding the vector that points from p to the closest point on the line ab. From the above picture: a simple way to calculate this vector would be to subtract away the portion of a - p that travels along the vector ab.

The part of a - p that travels along ab can be calculated by projecting a - p onto ab. This projection is described in the previous section about the dot product intuition.

Given the vector d the distance from p to ab is just sqrt( dot( d, d ) ). The sqrt operation can be omitted entirely to compute a distance squared. Our function may now look like:
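A sketch of such a function in 2D follows. The Vec2 helpers and the name SqDistPointLine are assumptions, and the points a and b must not coincide:

```cpp
#include <cassert>

struct Vec2 { float x, y; };

inline Vec2 Sub( Vec2 u, Vec2 v ) { return { u.x - v.x, u.y - v.y }; }
inline Vec2 Scale( Vec2 v, float s ) { return { v.x * s, v.y * s }; }
inline float Dot( Vec2 u, Vec2 v ) { return u.x * v.x + u.y * v.y; }

// Squared distance from p to the infinite line through a and b.
inline float SqDistPointLine( Vec2 a, Vec2 b, Vec2 p )
{
  Vec2 n = Sub( b, a );
  assert( Dot( n, n ) > 0.0f ); // a and b must be different points

  Vec2 pa = Sub( a, p );

  // Portion of a - p that travels along ab (projection of pa onto n).
  Vec2 c = Scale( n, Dot( pa, n ) / Dot( n, n ) );

  // What is left over points from p to the closest point on the line.
  Vec2 d = Sub( pa, c );
  return Dot( d, d );
}
```

Because the result is a vector dotted with itself, it is a sum of squares.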

This function is quite nice because it will never return a negative number. There is a popular version of this function that performs a division operation. Given a very small line segment as input for ab it is entirely possible to have the following function return a negative number:

It’s very misleading to have a function called “square distance” or “distance” to return a negative number. Passing in the result of this function to a sqrt function call can result in NaNs and be really nasty to deal with.

Barycentric Coordinates – Segments

A full discussion of barycentric coordinates is way out of scope here. However, they can be used to compute distance from a point to line segment. The segment portion of the code just clamps a point projected onto the line within the bounds of a and b.

Assuming readers are a little more comfortable with the dot product than I was when I first started programming, the following function should make sense:

This function can be adapted pretty easily to compute the closest point on the line segment to p instead of returning a scalar. The idea is to use the vector from p to the closest position on ab to project p onto the segment ab.

The above function works by computing barycentric coordinates of p relative to ab. The coordinates are scaled by the length of ab so the second if statement must be adapted slightly. If the direction ab were normalized then the second if statement would be a comparison with the value 1, which should make sense for barycentric coordinates.
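One way the segment version could be sketched, with the two if statements testing unnormalized projections against zero. The helper names are assumptions and a and b must not coincide:

```cpp
#include <cassert>

struct Vec2 { float x, y; };

inline Vec2 Sub( Vec2 u, Vec2 v ) { return { u.x - v.x, u.y - v.y }; }
inline Vec2 Scale( Vec2 v, float s ) { return { v.x * s, v.y * s }; }
inline float Dot( Vec2 u, Vec2 v ) { return u.x * v.x + u.y * v.y; }

// Squared distance from p to the segment ab.
inline float SqDistPointSegment( Vec2 a, Vec2 b, Vec2 p )
{
  Vec2 n = Sub( b, a );
  assert( Dot( n, n ) > 0.0f ); // a and b must be different points
  Vec2 pa = Sub( a, p );

  // p projects onto the line before a: the closest point is a.
  float c = Dot( n, pa );
  if ( c > 0.0f ) return Dot( pa, pa );

  // p projects onto the line past b: the closest point is b.
  Vec2 bp = Sub( p, b );
  if ( Dot( n, bp ) > 0.0f ) return Dot( bp, bp );

  // Otherwise the closest point lies between a and b.
  Vec2 e = Sub( pa, Scale( n, c / Dot( n, n ) ) );
  return Dot( e, e );
}
```

Neither if statement needs the length of ab, so the division only happens when the closest point falls strictly inside the segment.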

Sample Program

Here’s a sample program you can try out yourself:

The output is: “Distance squared: 2.117647”.

What is there to Hate about References?

I find most usage of references annoying cruft. Often the arguments I see or hear that are “pro-reference” make the same lame points that most of the internet makes:

  • Pointers are dangerous
  • Pointers are ambiguous and confusing
  • NULL pointers lead to undefined behavior and crashes

Just google “pointers and references” and you’ll see bad advice everywhere. A new programmer seeing these bullet points is likely to get hyped about using references everywhere. Seeing advice like this just sort of upsets some part of me. Perhaps it’s because when the above statements are plastered onto websites they state them as fact.

In an effort to not make the same annoying mistake as every other article on the internet I’ll present my opinion as an opinion. By stating something as an opinion the reader will immediately begin to read with a certain amount of skepticism. This might coax newer readers into thinking for themselves, which ought to be the goal of writing an educational article in the first place. Writing step-by-step instructions on how not to use “dangerous pointers” is the worst way to write on the topic of pointers and references.

I know I sound pretty bitter. I recall a time when I browsed the internet and looked for advice on this exact topic. It takes time to unlearn bad things, and so this post was born.

Memory Matters

Where things are in memory is a big deal. Memory access is commonly a bottleneck in real-time applications, and code that has ambiguous memory access patterns upsets me. Imagine peering into a large function that is operating on some kind of data. Scanning the middle of the function a few lines of code are encountered:

Just by reading this code it isn’t immediately clear what the variable d is. What is happening here? In order to know what the scope of d is some scrolling or manual code navigation will ensue. How long will this take? How much does it cut down on focus while the reader is just trying to understand the code?

Oftentimes member variables of an internal class or struct will be given the m_ prefix. This is nice as readers immediately know the scope and implications of all uses of a member variable. There’s an implicit this pointer being accessed somehow, and the variable’s scope is relative to this class’s definition.

In this case there’s no such nice prefix. d can either be a reference or value type. There’s no way to know without some kind of intellisense. Mousing over a variable to see what the type is, given a nice IDE, is not really that big of a deal. The big deal here is that you have to mouse over something just to get an idea of what sort of memory it represents. Just take a look at this code:

What sort of questions might the user have about this code? Clearly d is a pointer to some memory. This immediately informs the reader about the nature of the code. d is likely to be some kind of output, or perhaps a specific element in an array. It is definitely not just a lone variable on the stack. No intellisense is needed to get this information, no mousing over or code navigation is needed just to understand the idea of assigning a value to some non-local stack memory. The programmer reading this code might be able to focus slightly better on understanding the code due to the use of a pointer.

Sure d could technically be a *NULL* pointer (gasp), but in reality this is a non-issue. The only times checking for NULL pointers is important is when user-input is being handled. A lot of code doesn’t deal with user input! For internal code I’d make the argument that memory not on the local stack scope (local to at least the function currently executing) should almost always be referred to by pointer. Internal code can make assumptions on how pointers are used and not care about the NULL case. Internal code often solves difficult problems, and needs to be efficient (in the scope of real-time applications). Anything that fragments reader focus is bad, even taking a moment to mouse over a variable to see if it’s a reference or not.

Another Example

In the above snippet imagine that the joined AABB is being written to, by finding the AABB that bounds both a and b. Perhaps in this specific case it is fairly obvious that joined is memory that is being written to by the MergeAABBs function. This is probably because joined was quite well named, despite being passed by reference to MergeAABBs. However this function might have been written in a way that returns a new AABB entirely by value, and only operates on a local stack copy of joined. In this case the code would compile and run perfectly fine, but joined would have uninitialized memory. This might lead to a crash or assert, thus hurting iteration time and programmer focus.

Now let’s look at the use of a “dangerous” pointer:

In this code snippet, no matter what the third parameter is named, it is as obvious as possible that the function MergeAABBs is operating on some memory passed to it, and does not return anything useful in this particular use-case. The contents of the function MergeAABBs are probably obvious as well. I know I can imagine how it’s implemented without even looking; there’s just no ambiguity.
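A pointer-based MergeAABBs might look something like the sketch below. The AABB layout and the exact signature are assumptions; the inputs are taken by const reference and only the output is a pointer.

```cpp
#include <algorithm>

// Hypothetical 2D AABB; the layout is assumed for this sketch.
struct AABB
{
    float minX, minY;
    float maxX, maxY;
};

// The output is a pointer, so every call site must spell out &something.
// Internal code: the pointer is assumed to be valid and is not NULL-checked.
void MergeAABBs(const AABB& a, const AABB& b, AABB* joined)
{
    joined->minX = std::min(a.minX, b.minX);
    joined->minY = std::min(a.minY, b.minY);
    joined->maxX = std::max(a.maxX, b.maxX);
    joined->maxY = std::max(a.maxY, b.maxY);
}
```

The call becomes MergeAABBs(a, b, &joined), and the & tells the reader which argument is written to regardless of how it is named.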

The name of variables should be meaningful to the problem the code is solving. Requiring a naming convention for code clarity simply because of an arbitrary reference function parameter is an unnecessary constraint! Naming things is hard enough without random convention limitations.

Sure, if some idiot passes a NULL pointer to MergeAABBs it will crash, but how often does this happen in practice? What kind of programmers are you hiring that actually get caught up in this kind of problem? Real-life competent engineers aren’t going to pass in a NULL pointer, and will appreciate good code written with “dangerous and ambiguous” pointers.

When a function takes only floats and writes to a float it’s pretty much worst-case for reference ambiguity. Which float is being written to in the next code snippet?

Is the triangle actually {a, b, c}, or some other combination of parameters? Which of the arguments are float arrays (vectors) or just floats? Which ones are written to, and which are read only? Some kind of code navigation is needed to know for sure. By convention uvw might represent the name of barycentric coordinates for a triangle, but perhaps this specific piece of code was solving a particular problem where the derivation named them something else? It’s just ambiguous without a pre-defined naming convention, and such a convention ends up being imposed in the middle of unrelated algorithms.

Here’s the pointer version; note how non-ambiguous this code is:
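As a rough sketch of the contrast, assume a, b, c and p are 2-element float arrays (2D points) and u, v, w are single floats. The signatures and the BarycentricRef name are assumptions made for this example; the math is the usual dot-product formulation of barycentric coordinates.

```cpp
// Pointer flavor. a, b, c and p are assumed to be 2-element float arrays;
// u, v and w are single floats that are written to.
void Barycentric(const float* a, const float* b, const float* c,
                 const float* p, float* u, float* v, float* w)
{
    // Edge vectors from a, and the vector from a to the query point p.
    const float v0x = b[0] - a[0], v0y = b[1] - a[1];
    const float v1x = c[0] - a[0], v1y = c[1] - a[1];
    const float v2x = p[0] - a[0], v2y = p[1] - a[1];

    const float d00 = v0x * v0x + v0y * v0y;
    const float d01 = v0x * v1x + v0y * v1y;
    const float d11 = v1x * v1x + v1y * v1y;
    const float d20 = v2x * v0x + v2y * v0y;
    const float d21 = v2x * v1x + v2y * v1y;

    // Assumes a non-degenerate triangle, so denom is non-zero.
    const float denom = d00 * d11 - d01 * d01;

    *v = (d11 * d20 - d01 * d21) / denom; // weight of b
    *w = (d00 * d21 - d01 * d20) / denom; // weight of c
    *u = 1.0f - *v - *w;                  // weight of a
}

// Reference flavor of the same function, included only to compare call sites.
void BarycentricRef(const float* a, const float* b, const float* c,
                    const float* p, float& u, float& v, float& w)
{
    Barycentric(a, b, c, p, &u, &v, &w);
}
```

The reference call reads BarycentricRef(a, b, c, p, u, v, w), which says nothing about which arguments are outputs. The pointer call reads Barycentric(a, b, c, p, &u, &v, &w), so the three written floats stand out immediately.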

Useful References

I currently know of a single use of references that I really like, and that’s a const reference passed to a function as an argument, and sometimes the returning of a const reference.

Passing a const reference to a function means that this is a read-only value, and is definitely not pointing to an array. It is also a pretty common convention for operator overloading. The only downside is that the dot access operator may be mistaken as a value-type access, instead of a pointer access.

Returning a const reference might make sense sometimes, but usually I’m of the opinion that a pointer is better. Returning a pointer just abides by all my previous points about memory access. If the user retrieves a const pointer from a function, the explicit -> access makes it very clear that this memory came from somewhere else!

References are also able to capture temporary rvalues. This can make the life-time of such temporary values more explicit.
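A small sketch of that behavior: binding a const reference to a temporary extends the temporary's life-time to that of the reference. MakeName and NameLength are hypothetical names used only for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical factory that returns a temporary by value.
std::string MakeName()
{
    return std::string("temporary");
}

std::size_t NameLength()
{
    // The temporary returned by MakeName is bound to a const reference, so it
    // stays alive until name goes out of scope at the end of this function.
    const std::string& name = MakeName();
    return name.size();
}
```

The reference gives the temporary a name and a visible scope, which is the sense in which its life-time becomes more explicit.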

Sometimes a million dereferences is just too many. In some cases the lack of the dereference operator is nice and adds to code readability. However, in this case references are just an aid and live only on the stack. The pointer is what is actually kept and stored, in order to keep the code clean and up-front. Here’s an example:

An equivalent, but more verbose and annoying version can be constructed with pointers:
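Both variants can be sketched side by side. The Vec2 and Body types and the function names are assumptions for illustration; the point is only that the references are short-lived stack aliases, while a pointer (body) is what gets passed around and stored.

```cpp
// Hypothetical math and body types, assumed for this sketch.
struct Vec2 { float x, y; };

inline Vec2 operator+(const Vec2& a, const Vec2& b)
{
    Vec2 r = { a.x + b.x, a.y + b.y };
    return r;
}

inline Vec2 operator*(const Vec2& v, float s)
{
    Vec2 r = { v.x * s, v.y * s };
    return r;
}

struct Body
{
    Vec2 position;
    Vec2 velocity;
};

// References used purely as local aliases to keep the math readable.
void Integrate(Body* body, float dt)
{
    Vec2& p = body->position;
    const Vec2& v = body->velocity;
    p = p + v * dt;
}

// The same logic with pointers only: every use needs a dereference.
void IntegrateWithPointers(Body* body, float dt)
{
    Vec2* p = &body->position;
    const Vec2* v = &body->velocity;
    *p = *p + *v * dt;
}
```

With one line of math the difference is small, but it grows quickly as the expressions get longer.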

“Advanced C++” and Generic Programming

“Advanced C++” features (in quotations for sarcasm, like much of the rest of the article) are useful sometimes, there’s no doubt about it. Templates, classes, and all the weirdness therein are sometimes necessary. The amount of code duplication and boilerplate that these features can remove makes them important.

However, a lot of code is type-static and very specific. Code often solves very particular, specific problems. A lot of newer students (me) and colleagues get all caught up in the features and just end up wasting their time. When I say waste time, I mean they were actually trying to finish a project, instead of just learning about C++ and the uses thereof.

One might view C++ from the perspective that all the “advanced features” can just egg on a programmer into over-engineering their code into a weird mess of indirection and verbose templated code. Crazy inlined callbacks, type-agnostic code, verbose namespaces and whatnot. Many times it’s just useless cruft, and a specific implementation for a single problem will be simpler, easier to read, and smaller in code size.

All of this ranting comes down to one point: references let code operate in a slightly more agnostic manner, which is great for templates. The dot operator does it all! This makes sense for code that needs templating, but often doesn’t make sense for a lot of code (which was what the previous portions of the article pointed out!).
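One way to read “the dot operator does it all” is sketched below: a template that only uses dot access compiles against any type with the right members, whether the argument started life as a value or as a reference. The types and the SumXY function are assumptions for illustration.

```cpp
// Two unrelated hypothetical types that happen to share x and y members.
struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };

// Works for any T that exposes .x and .y. If T were a pointer type the body
// would need -> instead, so the template could not stay this uniform.
template <typename T>
float SumXY(const T& value)
{
    return value.x + value.y;
}
```

SumXY accepts a Vec2 or a Vec3 without any changes, which is the kind of type-agnostic code that references make convenient.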

However, good generic code is so incredibly difficult to design and come up with that hardly anybody should be doing it. Good code that is used by multiple people is at least an order of magnitude harder to write than good code that has a single specific use case. Templates, references, classes, these things might make it tempting to try out all the features to make a “generic program” that “can be re-used in the future”. I’ll tell it how it is: simple code that is type static, specialized for the specific problem it is solving, and doesn’t use advanced features is probably an order of magnitude more reusable (and performant) than “generic” code, simply because it’s easier to write.

Form an Opinion

As a reader, think for yourself and make your own judgment calls based on your own experience. This means that a good way to take advantage of the knowledge of an experienced programmer is to try out their advice with an almost skeptical attitude. Just don’t look for step-by-step instructions on how to be a computer scientist. Nobody wants to work with a mindless programmer that writes bad code, because then the good programmers will be busy cleaning it up.