Category Archives: Articles

Single File Libraries

Sean T. Barrett makes a lot of very cool single-file libraries in C. Recently he’s also been compiling a big list of other single-file libraries (or two-file libraries with a source/header pair) that he likes.

The great thing about Sean’s libraries is that they contain functions that do exactly what they intend to accomplish, nothing more and nothing less. They don’t contain any complexity beyond what is needed to get the job done. This keeps his libraries free of external dependencies, meaning they can be dropped into a pre-existing project as a single file, without linking to external libraries or requiring additional headers.

These qualities make libraries like Sean’s extremely easy to hook up and get going. If you want to learn how to make quality libraries, just look at any of the STB libraries.

Writing Libraries

Sean writes libraries in an old version of Visual Studio (I think VC6?), and codes in C. He also keeps all of his code inside of a single file while writing it (I’ve seen him do it). For the rest of us who aren’t as crazy or hardcore as Sean, there is a Perl script written by an unknown author (probably Richard Mitton). I found the script inside of Mitton’s single-file library “tigr”, which stands for Tiny Graphics Library. Mitton uses it to recursively include a bunch of files into one larger C file. Check out the script yourself:

The idea is to make a dummy C file that includes other source files, like this:

The script can then be run from the command line (assuming you have Perl installed) like so:

The script prints some quick text to stdout showing which files were visited, and packages the entire source tree into a single source file. The output file contains nice comments indicating the beginning and ending of each file, like this excerpt from one of my own single-file libraries:
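The bundling idea itself is small enough to sketch. The following is not Mitton’s Perl script; it is a hypothetical C++ stand-in that works on an in-memory map of file names to contents (the names `FileMap` and `Bundle` are my own assumptions), recursively inlining local includes and wrapping each file in begin/end comments.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Hypothetical in-memory "file system": file name -> file contents.
using FileMap = std::map<std::string, std::string>;

// Recursively expands lines of the form  #include "name"  whose target exists
// in `files`. Each file is inlined at most once; all other lines pass through.
std::string Bundle(const FileMap& files, const std::string& name,
                   std::set<std::string>* visited)
{
    assert(visited);
    if (!visited->insert(name).second)
        return "";  // already inlined earlier

    std::istringstream in(files.at(name));
    std::string out = "// begin " + name + "\n";
    std::string line;
    const std::string prefix = "#include \"";
    while (std::getline(in, line)) {
        const std::string::size_type end = line.find('"', prefix.size());
        const bool local_include =
            line.compare(0, prefix.size(), prefix) == 0 &&
            end != std::string::npos;
        if (local_include) {
            const std::string target =
                line.substr(prefix.size(), end - prefix.size());
            if (files.count(target)) {
                out += Bundle(files, target, visited);
                continue;
            }
        }
        out += line + "\n";
    }
    return out + "// end " + name + "\n";
}
```

Includes that are not found in the map (system headers, for example) are left untouched, which is roughly what one would want from a bundler of this kind.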


Mitton writes about the old joys of using the incbin command in assembly, where the assembler would embed the binary contents of a file straight into your program. It sounds like this gets dubious when dealing with linkers nowadays (I’ve had some really long link times by including large files into source code…), though it still happens from time to time in small single-file libraries.

An example is in Mitton’s tigr library, where he uses a makeshift Perl script to embed shaders and a png file (containing raster font glyphs) straight into C source code. This concept can also be seen in Omar Cornut’s imgui library, where some font files are embedded into the source. Omar seems to have used a small C program to generate the binary data as C source.

Again, I’m not sure who originally wrote this script but it was probably Mitton himself.
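To make the embedding idea concrete, here is a minimal sketch of a generator that turns raw bytes into a C array definition. It is neither Mitton’s script nor Omar’s program; the function name `BytesToCSource` and the `_size` suffix are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Formats `bytes` as a C array definition named `symbol` (a name chosen by
// the caller), twelve values per row, followed by a size constant.
std::string BytesToCSource(const std::vector<unsigned char>& bytes,
                           const std::string& symbol)
{
    std::string out = "const unsigned char " + symbol + "[] = {\n";
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        char buf[8];
        std::snprintf(buf, sizeof buf, "0x%02x,",
                      static_cast<unsigned>(bytes[i]));
        out += (i % 12 == 0) ? "    " : " ";
        out += buf;
        if (i % 12 == 11 || i + 1 == bytes.size())
            out += "\n";
    }
    out += "};\nconst unsigned int " + symbol + "_size = " +
           std::to_string(bytes.size()) + ";\n";
    return out;
}
```

The generated text can be written to a .c or .h file and compiled like any other source, so no linker tricks are needed. The sketch assumes a non-empty input, since an empty initializer list is not valid C.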


What is there to Hate about References?

I find most usage of references annoying cruft. Often the arguments I see or hear that are “pro-reference” make the same lame points that most of the internet makes:

  • Pointers are dangerous
  • Pointers are ambiguous and confusing
  • NULL pointers lead to undefined behavior and crashes

Just google “pointers and references” and you’ll see bad advice everywhere. A new programmer seeing these bullet points is likely to get hyped about using references everywhere. Seeing advice like this just sort of upsets some part of me. Perhaps it’s because when the above statements are plastered onto websites they state them as fact.

In an effort to not make the same annoying mistake as every other article on the internet I’ll present my opinion as an opinion. By stating something as an opinion the reader will immediately begin to read with a certain amount of skepticism. This might coax newer readers into thinking for themselves, which ought to be the goal of writing an educational article in the first place. Writing step-by-step instructions on how not to use “dangerous pointers” is the worst way to write on the topic of pointers and references.

I know I sound pretty bitter. I recall a time when I browsed the internet and looked for advice on this exact topic. It takes time to unlearn bad things, and so this post was born.

Memory Matters

Where things are in memory is a big deal. Memory access is commonly a bottleneck in real-time applications, and code that has ambiguous memory access patterns upsets me. Imagine peering into a large function that is operating on some kind of data. Scanning the middle of the function, a few lines of code are encountered:

Just by reading this code it isn’t immediately clear what the variable d is. What is happening here? In order to know what the scope of d is, some scrolling or manual code navigation will ensue. How long will this take? How much does it cut down on focus while the reader is just trying to understand the code?

Often times member variables of an internal class or struct will be given the m_ prefix. This is nice as readers immediately know the scope and implications of all uses of a member variable. There’s an implicit this pointer being accessed somehow, and the variable’s scope is relative to this class’s definition.

In this case there’s no such nice prefix. d can either be a reference or value type. There’s no way to know without some kind of intellisense. Mousing over a variable to see what the type is, given a nice IDE, is not really that big of a deal. The big deal here is that you have to mouse over something at all just to get an idea of what sort of memory it represents. Just take a look at this code:

What sort of questions might the user have about this code? Clearly d is a pointer to some memory. This immediately informs the reader about the nature of the code. d is likely to be some kind of output, or perhaps a specific element in an array. It is definitely not just a lone variable on the stack. No intellisense is needed to get this information; no mousing over or code navigation is needed just to understand the idea of assigning a value to some non-local stack memory. The programmer reading this code might be able to focus slightly better on understanding the code due to the use of a pointer.

Sure d could technically be a *NULL* pointer (gasp), but in reality this is a non-issue. The only time checking for NULL pointers is important is when user input is being handled. A lot of code doesn’t deal with user input! For internal code I’d make the argument that memory not on the local stack scope (local to at least the function currently executing) should almost always be referred to by pointer. Internal code can make assumptions about how pointers are used and not care about the NULL case. Internal code often solves difficult problems, and needs to be efficient (in the scope of real-time applications). Anything that fragments reader focus is bad, even taking a moment to mouse over a variable to see if it’s a reference or not.
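To make the contrast concrete, here is a hypothetical pair of functions (the names `ScaleRef` and `ScalePtr` are mine, not from any real code base). The bodies do the same work; only the pointer version shows, at the line doing the write, that the memory lives somewhere else.

```cpp
#include <cassert>

// Reference version: inside the body, d reads exactly like a local value.
void ScaleRef(float& d, float a, float b)
{
    d = a * b;
}

// Pointer version: the dereference makes the non-local write visible.
void ScalePtr(float* d, float a, float b)
{
    assert(d);  // internal code: state the assumption once, then move on
    *d = a * b;
}
```

The assert is optional; it documents the "never NULL" assumption for internal code without any branching in release builds.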

Another Example

In the above snippet imagine that the joined AABB is being written to, by finding the AABB that bounds both a and b. Perhaps in this specific case it is fairly obvious that joined is memory that is being written to by the MergeAABBs function. This is probably because joined was quite well named, despite being passed by reference to MergeAABBs. However this function might have been written in a way that returns a new AABB entirely by value, and only operates on a local stack copy of joined. In this case the code would compile and run perfectly fine, but joined would hold uninitialized memory. This might lead to a crash or assert, hurting iteration time and programmer focus.

Now let’s look at the use of a “dangerous” pointer:

In this code snippet, no matter what the third parameter is named, it is as obvious as possible that the function MergeAABBs is operating on some memory passed to it, and does not return anything useful in this particular use-case. The contents of the function MergeAABBs are probably obvious as well; I know I can imagine how it’s implemented without even looking. There’s just no ambiguity.

The name of variables should be meaningful to the problem the code is solving. Requiring a naming convention for code clarity simply because of an arbitrary reference function parameter is an unnecessary constraint! Naming things is hard enough without random convention limitations.

Sure, if some idiot passed in a NULL pointer to MergeAABBs it will crash, but how often does this happen in practice? What kind of programmers are you hiring that actually get caught up in this kind of problem? Real-life competent engineers aren’t going to pass in a NULL pointer and will appreciate good code written with “dangerous and ambiguous” pointers.
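For reference, a pointer-output MergeAABBs might look something like the following sketch. The `Vec2` and `AABB` layouts are assumptions made for illustration; only the function name comes from the discussion above.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical 2D types; a real engine would have its own.
struct Vec2 { float x, y; };
struct AABB { Vec2 min, max; };

// Writes the box bounding both a and b into *joined.
// Inputs are read-only const references; the output is a pointer.
void MergeAABBs(const AABB& a, const AABB& b, AABB* joined)
{
    assert(joined);  // internal code assumes a valid pointer
    joined->min.x = std::min(a.min.x, b.min.x);
    joined->min.y = std::min(a.min.y, b.min.y);
    joined->max.x = std::max(a.max.x, b.max.x);
    joined->max.y = std::max(a.max.y, b.max.y);
}

// At the call site the & marks the argument that gets written:
//     MergeAABBs(a, b, &joined);
```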

When a function takes only floats and writes to a float it’s pretty much worst-case for reference ambiguity. Which float is being written to in the next code snippet?

Is the triangle actually {a, b, c}, or some other combination of parameters? Which of the arguments are float arrays (vectors) or just floats? Which ones are written to, and which are read only? Some kind of code navigation is needed to know for sure. By convention uvw might represent the name of barycentric coordinates for a triangle, but perhaps this specific piece of code was solving a particular problem where the derivation named them something else? It’s just ambiguous without a pre-defined naming convention, and such a convention ends up imposed in the middle of unrelated algorithms.

Here’s the pointer version; note how non-ambiguous this code is:
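A hypothetical pointer-based version along these lines is sketched below. The function name `Barycentric` and the 2D float-array layout are my assumptions; the math is the standard dot-product formulation for barycentric coordinates.

```cpp
#include <cassert>

// Computes barycentric coordinates (u, v, w) of point p with respect to the
// 2D triangle {a, b, c}. Inputs are read-only float[2] arrays; the three
// outputs are written through pointers.
void Barycentric(const float a[2], const float b[2], const float c[2],
                 const float p[2], float* u, float* v, float* w)
{
    assert(u && v && w);
    const float v0[2] = { b[0] - a[0], b[1] - a[1] };
    const float v1[2] = { c[0] - a[0], c[1] - a[1] };
    const float v2[2] = { p[0] - a[0], p[1] - a[1] };
    const float d00 = v0[0] * v0[0] + v0[1] * v0[1];
    const float d01 = v0[0] * v1[0] + v0[1] * v1[1];
    const float d11 = v1[0] * v1[0] + v1[1] * v1[1];
    const float d20 = v2[0] * v0[0] + v2[1] * v0[1];
    const float d21 = v2[0] * v1[0] + v2[1] * v1[1];
    const float denom = d00 * d11 - d01 * d01;  // zero for a degenerate triangle
    *v = (d11 * d20 - d01 * d21) / denom;
    *w = (d00 * d21 - d01 * d20) / denom;
    *u = 1.0f - *v - *w;
}

// Call site: the arrays are passed by name, the outputs carry an &.
//     Barycentric(a, b, c, p, &u, &v, &w);
```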

Useful References

I currently know of a single use of references that I really like, and that’s a const reference passed to a function as an argument, and sometimes the returning of a const reference.

Passing a const reference to a function means that this is a read-only value, and is definitely not pointing to an array. It is also a pretty common convention for operator overloading. The only downside is that the dot access operator may be mistaken as a value-type access, instead of a pointer access.

Returning a const reference might make sense sometimes, but usually I’m of the opinion that a pointer is better. Returning a pointer just abides by all my previous points about memory access. If the user retrieves a const pointer from a function, the explicit -> access makes it very clear that this memory came from somewhere else!

Const references (and rvalue references) are also able to bind to temporaries, which extends the temporary’s lifetime to that of the reference. This can make the lifetime of such temporary values more explicit.

Sometimes a million dereferences is just too many. In some cases the lack of the dereference operator is nice and adds to code readability. However, in this case references are just an aid and live only on the stack. The pointer is what is actually kept and stored, in order to keep the code clean and up-front. Here’s an example:

An equivalent, but more verbose and annoying version can be constructed with pointers:
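A minimal sketch of both styles, using made-up `Vec2` and `Body` types: in the first function the references are short-lived aliases on the stack, while the stored handle is still the `Body*` pointer.

```cpp
#include <cassert>

struct Vec2 { float x, y; };
struct Body { Vec2 position; Vec2 velocity; };

// Local references as a readability aid; nothing here outlives the function.
void IntegrateRef(Body* body, float dt)
{
    assert(body);
    Vec2& p = body->position;
    const Vec2& v = body->velocity;
    p.x += v.x * dt;
    p.y += v.y * dt;
}

// The same work with local pointers: equivalent, just noisier.
void IntegratePtr(Body* body, float dt)
{
    assert(body);
    Vec2* p = &body->position;
    const Vec2* v = &body->velocity;
    p->x += v->x * dt;
    p->y += v->y * dt;
}
```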

“Advanced C++” and Generic Programming

“Advanced C++” features (in quotations for sarcasm, like much of the rest of the article) are useful sometimes, there’s no doubt about it. Templates, classes, and all the weirdness therein are sometimes necessary. The amount of code duplication and boilerplate that these features can remove makes them important.

However, a lot of code is type-static and very specific. Code often solves very particular, specific problems. A lot of newer students (me) and colleagues get all caught up in the features and just end up wasting their time. When I say waste time, I mean they were actually trying to finish a project, rather than just learning about C++ and the uses thereof.

One might view C++ from the perspective that all the “advanced features” can egg on a programmer into over-engineering their code into a weird mess of indirection and verbose templated code. Crazy inlined callbacks, type agnostic code, verbose namespaces and whatnot. Many times it’s just useless cruft, and a specific implementation for a single problem will be simpler, easier to read, and smaller in code size.

All of this ranting comes down to this point: references let code operate in a slightly more agnostic manner, which is great for templates. The dot operator does it all! This makes sense for code that needs templating, but often doesn’t make sense for a lot of code (which was what the previous portions of the article pointed out!).
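As a small illustration of the "dot operator does it all" point, here is a hypothetical template (the `Rect`, `Image` and `Area` names are invented for this sketch). Because the parameter is a reference, the body uses the same member syntax for every type that happens to have `w` and `h` members.

```cpp
#include <cassert>

struct Rect  { float w, h; };
struct Image { int w, h; };

// Works for any type with w and h members; the return type follows the
// member type, so Rect yields a float and Image yields an int.
template <typename T>
auto Area(const T& shape) -> decltype(shape.w * shape.h)
{
    return shape.w * shape.h;
}
```

Whether this genericity is worth having is exactly the judgment call discussed here; for a single concrete type a plain function is simpler.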

However, good generic code is so incredibly difficult to design and come up with that hardly anybody should be doing it. Good code that is used by multiple people is at least an order of magnitude harder to write than good code that has a single specific use case. Templates, references, classes, these things might make it tempting to try out all the features to make a “generic program” that “can be re-used in the future”. I’ll tell it how it is: simple code that is type static, specialized for the specific problem it is solving, and doesn’t use advanced features is probably an order of magnitude more reusable (and performant) than “generic” code, simply because it’s easier to write.

Form an Opinion

As a reader, think for yourself and make your own judgment calls based on your own experience. This means that a good way to take advantage of the knowledge of an experienced programmer is to try out their advice with an almost skeptical attitude. Just don’t look for step-by-step instructions on how to be a computer scientist. Nobody wants to work with a mindless programmer that writes bad code, because then the good programmers will be busy cleaning it up.


Sane Usage of Components and Entity Systems

A previous article prompted some discussion about how to actually implement some sort of component system for a game engine, without vague theory or dogma. That discussion showed a need for some higher level perspective, and so this article arose.

In general an aggregation model is often useful when piecing together bits of functionality or data to create something new. The ability to do so is very useful for writing game-specific gameplay code due to the flexibility of code granted by aggregation. However as of late there’s been tremendous talk about OOP, Entity Systems, Inheritance, and blah blah blah within the online indie development community. More and more buzzwords get tossed around by big name writers and the audience really just looks for some guidelines to follow in hopes of writing good code.

Sadly there isn’t going to be a set of step by step rules for writing a game engine or coming up with a good architecture. Like many have said before me, writing a game is a specific task requiring specific solutions. Why do you think game engine developers such as Epic or the Unity guys have so many people working on the product? Because a generic game engine is a huge piece of software that requires a lot of features. Some features exist simply to let users add in custom features easily.

Components, aggregation, Entity Component Systems, Entity systems, these are just words and have various definitions (depending on who you ask).

Some Definitions

To hopefully avoid silly arguments and confusion let’s define some terms. If you don’t like the definitions here feel free to say so; I’m all up for criticism and debate.

    • Component Based Architecture
      • A preference for aggregation over inheritance. Is just a concept and does not lead to a single specific implementation. A game object is a collection of components. A component defines data and/or functionality for a concept.
    • Entity Component System (ECS)
      • A specific implementation of Component Based Architecture. A game object would be an ID (an integer). The ID is used to form an aggregate. Usually an ECS implies an implementation similar to a database, where components are entries into a database that are looked up through some identifier. The main goals of this implementation are efficiency and simplicity. Often times the term “ECS” is used just to describe a Component Based Architecture, often leading to confusion.
    • Aggregation
      • I like to think of this as a “has-a” relationship over an “is-a” relationship. Aggregation refers to one object “having” another object, which implies an aggregate is a collection (data structure) of other objects.

Some Truth and History

Aggregation is useful from a game design perspective. It frees functionality from arbitrary classification (classes and inheritance). Classes were originally added to C++ to let a programmer tie together a piece of data and some functionality to represent some sort of real-life concept. This is in simplest terms the essence of Object Oriented Programming (OOP). Over time more features were added to help engineer relationships between classes; one such feature came in the form of inheritance.

There’s nothing inherently wrong with OOP and it makes sense in a lot of code. Problems can arise when there’s a mis-application of OOP whose implications aren’t fully understood at the time of implementation and that causes negative effects down the road. I’m sure we’ve all seen the code migration and mega-class example so commonly thrown around in articles arguing against OOP and inheritance abuse.

In response to such an abuse a new paradigm became popularized which focused on aggregation of functionality to form an object. This might be called a “component based architecture”. In general aggregation can be considered an appropriate alternative to inheritance.

OOP Diatribe

Usually when an article spews forth caustic attacks against OOP it’s directed at naive implementations that disregard implications of how memory is accessed. Perhaps in the past the bottleneck of most everything was processor speed, so a lot of literature focuses on this. Nowadays CPUs on the PC have an architecture that has ridiculous computational power with extremely limited memory access. In general one might consider accessing memory from RAM 300 times slower than multiplying two floats together. Of course this last statement is extremely anecdotal without any evidence, but exists just to give a rough perspective of reality in many current (2014) cases.

If objects with associated code (classes) are just allocated and deallocated on the heap at will then a performance bottleneck of memory access is going to rear its ugly face, likely long before other performance issues are even on the radar. This is where much of the diatribe comes from.

It should be noted that pretty much all code bases that make use of the C++ language use classes and structures in some form or another. As long as a programmer has an understanding of memory, how it’s accessed, and what implications arise from given implementations, nothing will go wrong. Alas, actually doing these things and writing good code is super hard. It doesn’t matter if a class has some implementation code within it, so long as that bit of code makes sense for the purposes it is serving.

Implementing Components, a First Draft

The most immediate implementation would be to make use of multiple inheritance. This has a clear definition of where the data goes, and it all goes in one class: the derived class. Multiple inheritance itself can get a bit tricky when dealing with pointer typecasting between derived and base types, though the C++ language itself handles the details much of the time.

Inheritance alone doesn’t provide a good mechanism to query whether a base class is a part of a specific derived aggregation, and so the dynamic_cast operator was born. Since dynamic_cast is a branching operation, usually implemented (afaik) by inspecting the vtable, it is avoided in general.

Multiple inheritance also does all sorts of work to member function pointers, and is just a sad part of C++. Additionally there isn’t any language feature that allows for dynamic dispatch for combinations of base classes, so if the need arises a custom solution will need to be implemented anyway.

Memory accessing, although defined, isn’t ideal. Multiple inheritance forms a blob of different data, and usually only a single piece of the blob is needed at any given time, meaning locality of reference will be poor in general. This leads to the idea of inheriting from multiple interfaces in order to decouple memory aggregation from functionality aggregation, which leads to the next draft.
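A minimal sketch of this first draft, with invented class names: the derived class aggregates its "components" through multiple inheritance, and `dynamic_cast` answers the question "does this object also have that base?".

```cpp
#include <cassert>

// Hypothetical base "components". The virtual destructors make the types
// polymorphic, which dynamic_cast requires.
struct Renderable { virtual ~Renderable() {} int sprite_id = 0; };
struct Collidable { virtual ~Collidable() {} float radius = 1.0f; };

// The aggregate is fixed at compile time: all data lands in one blob.
struct Player : Renderable, Collidable { int health = 100; };
struct Cloud : Renderable {};

// Cross-cast from one base to a sibling base; null if the object lacks it.
Collidable* AsCollidable(Renderable* r)
{
    return dynamic_cast<Collidable*>(r);
}
```

Note that a `Player` carries the data of both bases even when a system only needs one of them, which is the locality problem described above.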

Second Draft – Run-Time Aggregation

Instead of using multiple inheritance on interfaces, which is a compile-time feature, run-time support can be added. Object aggregates can be formed during run-time, and modified thereafter. This is appealing for data driven applications, and game-design friendly development iteration speed.

So let’s assume that some programmer wants to implement components, but doesn’t think much about memory access patterns or the implications thereof. Using a vector of pointers an implementation of components becomes super simple. Each pointer can point to an interface exposing a few functions like Update, Init and Shutdown.

Searching for a particular component is as simple as linearly looping over each pointer until a matching type is found. If these pointers are ordered in some way a search can be performed, perhaps a binary search could suffice. If the identifier of a component is hashable a hash table lookup can be used.

The implementation so far is an excellent one except that there is no definition of how memory is allocated and accessed! In the most naive of implementations each game object and each component will be allocated on the heap with separate calls to malloc.

Despite having no clear memory definition there are some nice benefits that have arisen. Data driving the composition of an aggregate becomes quite trivial as each component of an aggregation can have an entirely isolated lifetime. Adding, removing, modifying, or even creating new components at run-time are all now possibilities. This dynamic aggregate architecture is great for improving game development and design iteration time!
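The second draft can be sketched roughly as follows. All names here (`Component`, `GameObject`, `Timer`, `Find`) are hypothetical, and the sketch deliberately makes the naive choice of one heap allocation per component.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Interface every component implements.
struct Component {
    virtual ~Component() {}
    virtual const char* Name() const = 0;
    virtual void Init() {}
    virtual void Update(float dt) { (void)dt; }
    virtual void Shutdown() {}
};

// A game object is just a run-time collection of component pointers.
struct GameObject {
    std::vector<std::unique_ptr<Component>> components;

    void Add(std::unique_ptr<Component> c)
    {
        c->Init();
        components.push_back(std::move(c));
    }

    // Linear search by name; a sorted vector or hash table could replace it.
    Component* Find(const std::string& name) const
    {
        for (const auto& c : components)
            if (name == c->Name())
                return c.get();
        return nullptr;
    }

    void Update(float dt)
    {
        for (auto& c : components)
            c->Update(dt);
    }
};

// Example component: accumulates elapsed time.
struct Timer : Component {
    float elapsed = 0.0f;
    const char* Name() const override { return "Timer"; }
    void Update(float dt) override { elapsed += dt; }
};
```

Components can be added or removed at any time, which is where the data-driven flexibility comes from; the cost is that nothing is said about where the components live in memory.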

Aggregation and Components and the Entity System Paradigm (ES/ECS)

As stated in the definitions section, an ECS is just a specific implementation of a component based architecture. A component based game engine architecture could be viewed as a custom implementation of multiple inheritance. A clearly defined ECS can impose restrictions on how a component architecture is implemented and used in hopes of avoiding poor memory access patterns, or in hopes of keeping code simple and orderly.

If a component is designed as a piece of memory without any code, and a game object is defined as an integer ID, then performance specifications can be easily imposed. Rules about where in memory components lie, and how components are actually accessed, can be clearly defined in simple terms. Code can be written that operates upon arrays of components, transforming arrays linearly. This idea is actually a type of Data Oriented Design (DOD), which makes sense as DOD is just an idea! ECS is an application of the idea of DOD.

So with this type of implementation the benefits of dynamic composition can be paired with well-defined memory layout and access patterns. Suddenly prefetching and parallelism become much simpler to support.
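A very small sketch of that idea, under the simplifying assumption that every entity has both component types and that the entity ID is simply an index into each array (real implementations usually need a sparser mapping). The `World`, `Position`, `Velocity` and `Integrate` names are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Entity = std::uint32_t;  // a game object is only an ID

// Components are plain data with no code attached.
struct Position { float x, y; };
struct Velocity { float x, y; };

// Each component type lives in its own contiguous array.
struct World {
    std::vector<Position> positions;
    std::vector<Velocity> velocities;

    Entity Create(Position p, Velocity v)
    {
        positions.push_back(p);
        velocities.push_back(v);
        return static_cast<Entity>(positions.size() - 1);
    }
};

// A "system": a linear pass that transforms whole arrays.
void Integrate(World* world, float dt)
{
    assert(world);
    for (std::size_t i = 0; i < world->positions.size(); ++i) {
        world->positions[i].x += world->velocities[i].x * dt;
        world->positions[i].y += world->velocities[i].y * dt;
    }
}
```

Because the loop walks memory front to back, the access pattern is easy to reason about and easy to split across threads.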

Aggregatize all the Things!

There’s a problem. Blindly shoving the idea of an ECS implementation into every nook and cranny of an engine (or of any complex system, not just game engines or libraries) during development is just silly. Often times a particular system is not best implemented with a component or aggregate paradigm in mind.

An obvious case is that of a physics engine. Often times a physics engine developer is worried about collision detection, solving systems of linear equations, rigid body mechanics and allowing the engine to easily be integrated into existing code bases. These details involve a lot of math and good API design. A developer of a physics engine is going to have their focus employed in full force in solving problems specific to physics engines. The engineer’s focus is finite, so the implementation that is best is one that the engineer can actually bring to completion. An implementation that can come to completion is one that makes sense for the specific details of whatever is going on inside the physics engine. The specific paradigms used are often not aggregation or component based!

In order for a physics engine to run fast it needs efficient memory access patterns and memory usage, which on modern PC hardware requires some form of DOD. Since this complex (often black boxed) physics engine will have its own specific implementation and optimization it doesn’t make sense to force a component based model to its very core with some sort of idealistic zeal. It gets really bad when strict rules are imposed (like banning all code from classes and structures that define components) on the component model (like with an ECS) and the rules start permeating the deep recesses of the entire code base.

The same thing goes for any sort of complex system. The core facilities of a game engine often times just don’t really care about components or aggregation. This means that an engine architecture that implements components will usually have to deal with middleware graphics/physics engines/libraries that don’t subscribe to a component based model (simply because it’s easier to use a library than to write your own custom things, especially if those custom things religiously follow some silly methodology like ECS or even OOP). In practice light wrapper components can be created to let the functionality of such systems be presented in a component format, ready to be used in an aggregate object.
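A light wrapper component might look like the sketch below. The `physics` namespace is a stand-in for a black-box middleware library and does not correspond to any real API; the point is only that the component holds a handle and forwards calls, while the library keeps its own internal layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a middleware physics library (hypothetical API).
namespace physics {
struct Body { float x, vx; };

class World {
public:
    int CreateBody(float x, float vx)
    {
        bodies_.push_back(Body{x, vx});
        return static_cast<int>(bodies_.size() - 1);
    }
    void Step(float dt)
    {
        for (Body& b : bodies_)
            b.x += b.vx * dt;
    }
    const Body& Get(int id) const
    {
        return bodies_[static_cast<std::size_t>(id)];
    }
private:
    std::vector<Body> bodies_;  // the library owns and arranges its own data
};
}  // namespace physics

// The wrapper component: a handle plus thin accessors, nothing more.
struct RigidBodyComponent {
    physics::World* world;
    int body;

    float X() const { return world->Get(body).x; }
};
```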

What does this all mean? What should we all do?

Use components where it makes sense in code. Use inheritance where it makes sense in code. Use databases where they make sense. Use all the things where they should be used. This is a pretty sad answer but it’s the right one. There is no silver bullet paradigm that solves all the problems in the game engine architecture world, and there are no steps to follow to achieve a result that works in all cases. Specific problems require specific solutions. Good code is hard to write, and will require a lot of judgement calls. In order to make good judgement calls a lot of experience and perspective is required.

I recommend using aggregation where it really matters. Dynamic aggregation is important for gameplay specific code. Gameplay specific code, in this article, would refer to code that would not easily apply or work at all in a different game. It’s code that is your game and doesn’t define an isolated system or functionality.

Dynamic aggregation and the component based model are extremely important for game and object editors. Game design flourishes best when iteration times are driven to zero, and the ability to create new things from a composition of fundamentals is very valuable! Clearly composition is useful, but how it’s to be used is the hard part.

What Components to Make?

I recommend making components that provide access to game-independent functionality quite large. Every 3D game engine has a concept of a mesh, and will usually have some sort of file format to associate with, like FBX. Every 2D game engine will have the concept of a sprite. Each game using Box2D will have colliders and rigid bodies, and possibly joints. These fundamental pieces of functionality don’t change very often, so static compile-time relationships aren’t a bad thing since iteration time isn’t really all that relevant.

A 3D game might have a single Mesh component for example. A Mesh component can have renderable vertices, and possibly all the skeletal and animation information as well. There may be a single Rigid Body component, which encapsulates the idea of colliders or shapes, as well as the functionality of rigid body mechanics. The Rigid Body component might even contain all necessary code and data to hold multiple joints! Or joints may be a component themselves.

For high level and gameplay related features components can become much more granular (or not if you so choose). Gameplay should be iterated, tested and changed frequently, so having small and decomposed components will probably make a lot of sense in a lot of cases. Large components that encompass more broad ideas will be useful in many cases too. Even in the gameplay world judgement calls are essential.

Usually efficiency isn’t so important for much gameplay code, so any implementation that is decently performant will suffice. Scripting languages, dynamic memory allocation and virtual dispatch, or what have you can all work. The decisions of what requires flexibility, what requires performance and everything in between can be difficult to make. Please see the references section for some concrete examples.

Further Readings

We live in a world of opinions and it takes time to sift through them! If you have recommendations please comment below :)

Reference Source Code

The best reference I know of is an open source game engine in progress (stalled until I graduate) I myself am developing. Please do send me your recommendations on references!


Component Based Design – Lua Components and Coroutines

Welcome to the second post in a series of blog posts about how to implement a custom game engine in C++. As reference I’ll be using my own open source game engine SEL. Please refer to its source code for implementation details not covered in this article.

I would like to start this post off with a big thank you to Trent Reed for providing great advice in implementing various aspects of Lua integration for a game engine based upon component based design.

If you’re unfamiliar with component based design in general I advise doing a little research before reading the rest.

Since Lua is such an awesome scripting language it would be great if we could give every game object a component whose sole purpose is just to hold some code. This code could be a type of AI brain, or a little bit of game logic.

I always bring up this one particular back and forth patrol AI as a great example. Imagine you could do this in C++:

When this component is updated, the code within a script runs in place of C++ code. Imagine if you could write AI code like so:

This enemy patrols left, right and then throws a bomb. This type of code is actually realistic with a simple feature of Lua’s called Coroutines. I am sadly unable to find my original reference for creating a nice C++ Coroutine wrapper, but I do have a nice wrapper within SEL you can look at. UPDATE: Game Programming Gems 5 has the exact article I used when building my own Coroutine implementation. I believe the section is called “Building Lua into Games”.

The entire purpose of a Coroutine is to allow Lua to pause the state of a function call and resume it exactly where it left off from at a later point in time. This lets you write code that can take pauses and resume later, allowing for extremely readable and easy to create game logic.
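For contrast, here is what the same patrol behavior tends to look like without coroutines: the "where was I?" bookkeeping has to be written by hand as a state machine. This is a hypothetical C++ sketch (the `PatrolAI` type, the one-second walk duration and the unit walking speed are assumptions), not code from SEL.

```cpp
#include <cassert>

// Hand-written state machine for: walk left, walk right, throw a bomb, repeat.
struct PatrolAI {
    enum State { WalkLeft, WalkRight, ThrowBomb };

    State state = WalkLeft;
    float timer = 0.0f;   // time spent in the current walking state
    float x = 0.0f;       // position along the patrol axis
    int bombs_thrown = 0;

    void Update(float dt)
    {
        switch (state) {
        case WalkLeft:
            x -= dt;
            timer += dt;
            if (timer >= 1.0f) { timer = 0.0f; state = WalkRight; }
            break;
        case WalkRight:
            x += dt;
            timer += dt;
            if (timer >= 1.0f) { timer = 0.0f; state = ThrowBomb; }
            break;
        case ThrowBomb:
            ++bombs_thrown;
            state = WalkLeft;
            break;
        }
    }
};
```

With a coroutine the `state` and `timer` members disappear: the script is written as straight-line code, and the paused position inside the function plays the role of the state variable.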

Creating and using Coroutines is pretty simple and there exist a lot of references on the internet of how to do so. If you like you can view my own implementation within my SEL game engine.

However there does come a time when one actually thinks about how to store these scripts as components in a simple way. As recommended in one of my other posts, your GameObject should look something like this in an engine utilizing Component Based Design:

There arises an issue of storing components whose only difference is a string representing the script name; what if we want to hold any number of these? One simple idea is to create a single LuaComponent type in C++ that is stored in a slightly different manner: a separate vector of LuaComponents can be used to separate the game logic components from the rest of the core engine components.

This allows LuaComponents to be accessed via string lookup:
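A minimal sketch of such a lookup, assuming a `LuaComponent` that stores only its script name and a `GameObject` that keeps its Lua components in their own vector. The member and function names are hypothetical and are not taken from SEL.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One C++ type for all script components; the script name tells them apart.
struct LuaComponent {
    std::string script_name;
};

struct GameObject {
    // Core engine components would be stored elsewhere (omitted here).
    std::vector<LuaComponent> lua_components;

    // Linear string lookup; returns null when no such script is attached.
    LuaComponent* GetLuaComponent(const std::string& script_name)
    {
        for (LuaComponent& c : lua_components)
            if (c.script_name == script_name)
                return &c;
        return nullptr;
    }
};
```

The returned pointer is only valid until the vector is next resized, so callers should not hold on to it across component additions.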

Start Update and Finish

It would be really nice if each component’s script name just referred to a single Lua file. If this were true then a naming convention can be established: a Lua component might only be a .lua file that contains functions Start, Update and Finish.

This sounds nice, but a method for calling these various functions must be concocted. One simply can’t place all LuaComponent function definitions for Start, Update and Finish into the global Lua environment, since every script would overwrite the previous one’s definitions.

The idea is to create a unique Lua table for each GameObject that contains a LuaComponent. Within this table an isolated Lua environment can be constructed to define the 3 base functions. This also adds the benefit that all global variables within a LuaComponent file are local to each individual LuaComponent instance. This is especially important when you have lots of LuaComponents of a single script type. You don’t want globals in the LuaComponent file to be shared between all instances.

Implementing this is pretty easy if your game objects already have unique identifiers (preferably integers). I’ll take a slight detour on unique ids for a moment.

Unique Object IDs

The simplest way to implement unique IDs for your game objects is to keep track of a single integer. This integer starts at 0, and each time a new object is created the integer is incremented after assigning the object’s ID as the value of the integer.

This works as long as the counter never realistically overflows. Luckily a 32-bit unsigned integer allows for over four billion IDs, which is more than enough for practically any game.
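A minimal sketch of such a counter, assuming a 32-bit unsigned ID. The class name ObjectIdGenerator is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hands out 0, 1, 2, ... in creation order.
class ObjectIdGenerator
{
public:
    std::uint32_t Next()
    {
        // Guard against wrapping back around to IDs that are already in use.
        assert(m_next != UINT32_MAX);
        return m_next++;  // assign the current value, then increment
    }

private:
    std::uint32_t m_next = 0;
};
```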

This can be taken farther with handles as detailed in one of my other posts.

Implementing Script Environments

Given a unique id for a game object a table in Lua can be constructed especially for this object:

The idea here is to create a new table instance for a game object if no table already exists. Then a new environment is created within this table (Environments in Lua are just tables).

The next step is to somehow get our .lua file definitions into this environment. In Lua 5.1 (and some earlier versions) there is a nice setfenv function which sets the environment of a Lua function. This is perfect for our cause, as code loaded from .lua files is made into chunks, which are just nameless function objects! All that needs to be done is to load the script, set its environment to the fresh new environment given to our object instance, and run the loaded chunk.

In Lua 5.2 and beyond there’s no nice setfenv function. Instead we must change the first upvalue of the chunk, which is the environment of the chunk. There are a couple ways to do this and I ended up choosing the easiest to implement. Here is my finished loader in Lua:

I decided to make use of the debug library. This allows me to inject a chunk’s definitions into an environment without fetching data from file. First implementations are likely to make use of Lua’s loadfile function, as it actually does have a parameter to specify an environment. However loading from file is really slow, so ideally one would just keep a reference to the loaded chunk and run it on different environments as needed.

Coroutines, or Not?

I myself haven’t experienced this, but around the internet and through word of mouth I’ve heard that coroutines aren’t as fast as we’d all like them to be. This is too bad, but can be dealt with. One way is with recycling of coroutines. I will likely mess with this myself later (when I need to), but I haven’t yet found it necessary.

Instead my LuaComponents in SEL contain a boolean to determine if the component will be run as a coroutine or as a normal Lua function call. A normal Lua function cannot have fancy WaitSeconds or WaitFrames calls, but it is still Lua. This way developers can have control over the amount of overhead a given LuaComponent actually imposes.

Lua File System

Since I wrote about loading Lua chunks and holding them for later use, it would be helpful to know how to load all files from a folder in Lua. This lets you drop a .lua file into a specified folder and suddenly your engine has access to a new LuaComponent type. This can be coupled with asset hot-loading! There’s a great article by Noel Llopis on asset hot-loading in Game Programming Gems 6.

I highly recommend using Lua File System (LFS). LFS is extremely small and has the same license as Lua itself. This is great in case you need to modify the source. It’s also extremely useful.

I recommend compiling LFS into Lua itself, whether you’re making a dynamic library or a static library. I had good results doing this myself.

Here’s an example of some of my code used to load all scripts within a folder (traverse all sub-folders recursively):

There’s great support for querying file extensions, names, paths and differentiating between files and folders.

I actually use LFS as my standard file directory traversal tool in general, not just for LuaComponents.


Hopefully this article provides some insight into creating dynamic game logic components with Lua! If I wrote this correctly someone out there is excited to try Coroutines with Component Based Architecture.

Protip: How to be a Physics Dev – Debug Rendering

Today I figured out the single most important tip I could ever give anyone in regards to being a competent physics engine developer. I will share this tip right here on my humble blog:

Your skills as a physics developer will grow at a rate proportional to how good your debug draw information is. I’d actually argue that your skills as a physics developer are proportional to how good your debug draw is.

I feel I’m at a decent place in my skills, as I feel my debug draw has become decent. Let’s get some screenshots going!

Initial simplex during Quick Hull.

Stack of polyhedra (OBB meshes). The blue and red contacts are warm started (red is warm started consistently over many frames).

Spent a lot of time creating Maya-like camera controls with orbiting.

I digress. I didn’t figure out this simple fact on my own; I gleaned this wisdom from my friend Nathan Carlson. He pointed out that the reason a lot of students at DigiPen have trouble with computational geometry and general physics engine development is a lack of debug rendering.

Just take a look at Dirk Gregorius’s sample demo from his GDC 2013 lecture. He fully renders Gauss maps and a Minkowski difference, and has glorious camera controls. I actually copied his camera design and implemented it myself. You can see that he has AntTweakBar integrated as well, for various tweaking tasks. Erin Catto has lots and lots of different tests, asserts, and most important of all tons of debug drawing for his Box2D library. The Zero team on the third floor of DigiPen definitely has the craziest debug draw I’ve ever heard of!

Hopefully I can achieve something at least in the likeness of all these other developers, and I recommend readers do the same; intuition is often best built visually when it comes to geometry.

Custom Physics Engine – Part 1: Impulse Resolution


This series ended up on Tuts+, and these are sort of deprecated. Please visit Tuts+ to see the finalized articles.

Personal Update

Beyond C++ reflection I’ve taken a huge liking to physics programming for games. Reflection in C++ is a largely “solved” area of study. Though resources on the internet may be few and far between for creating custom introspection, people that know how to create such systems have explored much of what is to be accomplished.

However physics is another story entirely. Up until just a few years ago physics for games was in a highly primitive state; techniques just hadn’t been created due to the inability of hardware to perform necessary computation. There are a lot of new and exciting problems to solve.

So this next year while attending DigiPen I’ll be focusing my studies around engine architecture, data oriented design and optimization, C++ introspection and game physics. This last topic is the one I’m going to start an article series on.

Game Physics High Level Detail

A game physics engine performs some functionality for ensuring that shapes behave in a particular way. To state this one could say: a physics engine removes degrees of freedom from moving bodies, and enforces these imposed rules.

From here on out, until specified otherwise I’ll be talking specifically about convex rigid bodies within a physics engine. A rigid body is just a shape that is implicitly rigid; the engine does not naturally support deformation of a set of points that make up a shape. Using this implicit definition the mathematics behind manipulating such shapes can be known intuitively and implemented efficiently.

Here’s a short feature list of some features a custom physics system may provide, assuming the engine uses rigid bodies:

  • Collision detection
  • Collision resolution
  • Linear movement, angular rotation and integration
  • Raycasting
  • Islanding/sleeping/portals
  • Friction simulation
  • Spatial partitioning/bounding volumes
  • Fracturing/splitting
  • Multi-part bodies
  • Various types of shapes
  • Advanced mathematical constraints (joints, motors, contact constraints, etc.)

Raycasting from Box2D.

In this article series I will attempt to talk about all of the above topics in my own time. I’ve learned a lot about physics and physics engines in a short amount of time and by documenting what I have learned I hope to solidify my understanding, as well as help out others to achieve what I have. I would not have learned nearly as much as I currently know without help from others so it is only natural to want to do the same.

The best example of an open source physics engine that employs much of the feature list shown above would be Box2D by Erin Catto. The rest of this article series will be detailing the physics engine that I myself have written. There are of course other methods I choose not to talk about, or just don’t know about.


There are two main objects that make up a physics engine: shapes and bodies. A body is a small package of data that defines intrinsic properties of a physics object. Properties such as density, restitution (elasticity), friction coefficients, inertia tensors, along with any other flags or settings should be stored within the body. These bits of data are properties that can be isolated away from the shape definition. The shape itself is contained within the body through a pointer.

The shape definition defines what sort of shape a physics object represents. Some physics engines can attach multiple shapes to a body, which are referred to as fixtures. A shape stores any data required to define the shape itself, and provides various functions to operate on the shape, such as testing for line or point intersection, or generating a bounding volume.

Together the body and shape represent a physics object, which by the end of this article series will be able to perform a lot of interesting interactions with other physics objects.

A tower of stacked oriented boxes within a custom physics engine called iiD.

All bodies should be contained within what is known as a scene or world. I refer to this object as a scene. The scene should contain a list of all live objects, as well as functionality for inserting or removing bodies from the scene. The scene should also have callbacks for processing shape or ray queries. A query just checks to see if any bodies intersect with something like a point or ray.

The scene has one particular function called step, which steps the scene forward in time by a single delta time (dt). This step function steps all objects forward in time by integration. The integration step just moves the objects forward by using their velocity, position and acceleration to determine their next position.
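A minimal sketch of the integration part of a step, assuming semi-implicit (symplectic) Euler: velocity is updated from acceleration first, then position from the new velocity. The choice of integrator and the names Body and Integrate are assumptions made for illustration.

```cpp
#include <cassert>

struct Vec2 { float x, y; };

// Illustrative body: only the state the integrator touches.
struct Body
{
    Vec2 position;
    Vec2 velocity;
    Vec2 acceleration;
};

// Moves one body forward in time by dt seconds.
void Integrate(Body& body, float dt)
{
    assert(dt >= 0.0f);

    // Velocity first, so the position update sees the new velocity.
    body.velocity.x += body.acceleration.x * dt;
    body.velocity.y += body.acceleration.y * dt;

    body.position.x += body.velocity.x * dt;
    body.position.y += body.velocity.y * dt;
}
```

A scene's step function would call something like this once per live body, before or after collision handling depending on how the engine orders its phases.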

During the step, collisions are detected and then resolved. Oftentimes a broadphase of some sort is used to detect possible collisions, and expensive collision operations are only used when really needed.

All of this organization allows the user of your physics engine to worry about three main operations: creating and removing bodies, and stepping the scene. The rest is automated and taken care of within the physics system’s internals.

The last major isolated system would be the broadphase. There are two major phases in collision detection: the broad and narrow phases. The narrow phase is the one which determines if two shapes intersect or not. The broad phase determines if two shapes can possibly be intersecting or not. The broadphase should be constructed such that intersection queries are very fast and simple. An axis-aligned bounding box (AABB) test will suffice perfectly.

Once the broadphase chooses which objects could perhaps be colliding, it sends them off to the narrow phase. The narrow phase performs much more intensive operations to determine if two objects are colliding or not. The whole point of this is to reduce the number of times the narrow phase has to be used, as it is expensive to run.
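A minimal sketch of the kind of cheap test a broadphase can use, assuming each shape can report an axis-aligned bounding box. The AABB layout and the function names are illustrative.

```cpp
#include <cassert>

// Axis-aligned bounding box stored as min and max corners.
struct AABB
{
    float minX, minY;
    float maxX, maxY;
};

inline bool IsValid(const AABB& box)
{
    return box.minX <= box.maxX && box.minY <= box.maxY;
}

// True when the boxes overlap or touch on both axes.
// A false result means the pair can skip the narrow phase entirely.
bool Overlaps(const AABB& a, const AABB& b)
{
    assert(IsValid(a) && IsValid(b));

    // Separated along x?
    if (a.maxX < b.minX || b.maxX < a.minX)
        return false;

    // Separated along y?
    if (a.maxY < b.minY || b.maxY < a.minY)
        return false;

    return true;
}
```

The test only says that two shapes might be colliding; the narrow phase still has to confirm it for the pairs that pass.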

Once the narrow phase finds a pair of bodies to be colliding, information about the collision is gathered and placed into a small container called the manifold. Do not ask why it is called a manifold, for I have no idea and neither does anyone else seem to! The manifold contains three important pieces of information:

  • Collision penetration depth
  • Direction to resolve in
  • Points of contact

These pieces of information are used to fill in formulas that are used to resolve the penetration and solve for a single unknown: the magnitude of the resolution vector. Here’s a small diagram:


c – Collision contact
n – Resolution normal
d – Penetration distance

It is also useful to store pointers or handles to the two objects that formed this collision info. This allows some useful functions to be placed into the manifold object directly: solve and resolve. Solve would be the function to collect the three pieces of collision information. If no contact points are found, then there is no collision. If contact points are found, then resolving performs a resolution operation to force both objects to not be intersecting after integration.


Complex physics manipulation is performed on the velocities of objects. Trying to manipulate the positions of objects directly is very difficult, as it poses a problem that isn’t linear. By using derived position equations for velocity, the problem is thus simplified. Most of the time we will be only dealing with velocity manipulation.

Impulse Resolution in 2D (No Rotation or Friction)

The act of resolving collisions is something that isn’t covered very well on the internet. There are some scattered documents and details, though the information isn’t always easy to find. Most of what I know I learned by talking with other people, but I know most people will not have such an opportunity. Hopefully this article series can provide a strong resource for understanding and constructing a simple physics engine.

The toughest part to code is, in my opinion, collision resolution. There is a ton of information on collision detection and broadphase, and thus creating these portions of a physics engine is in my opinion not too difficult. Some resources for collision detection are: Real-Time Collision Detection by Christer Ericson, and Game Physics Engine Development by Ian Millington. I have both of these books right next to me as I write this :)

Generating contact manifolds and resolving such manifolds are what most programmers get caught up in. So let’s hit the ground running and tackle a portion of code that will bring your entire physics system to life: contact resolution.

The best type of contact resolution to start with is impulse resolution. The idea behind impulse resolution is to take your contact manifold and solve for a single velocity vector that can be added to an object’s current velocity, so that it no longer penetrates the other object during the next frame. The reason for starting with impulse resolution is that it’s quite simple and easy to get up and running, and more complicated and advanced techniques require you to understand impulse resolution anyway.

2D impulse simulation of AABBs without rotation.

Now the contact manifold is assumed to contain the direction of the velocity vector we are solving for. I will cover how to generate the contact manifold in another article in this series. The only unknown left to solve for is the magnitude of this vector. It so happens that it’s possible to solve for this magnitude in relation to both objects. If this magnitude is known, you add velocity along the normal scaled by the magnitude for one object, and you do the same operation to the other object in the direction opposite to the manifold normal. Let’s start from the top.

We have two objects moving along in our scene, and there exists a relative velocity between the two, with object A as the reference object at the origin:

The relative velocity can be expressed in terms of the collision normal (from the collision manifold) with a dot product:

This can be thought of as the relative normal velocity between the two objects. The next thing to include in this derivation is the coefficient of restitution (elasticity factor). Newton’s Law of Restitution states that:

Oftentimes the coefficient of restitution is denoted by e, or the epsilon symbol. Knowing this it’s fairly simple to include it within our current equation:
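One way to write the relations described so far in symbols. The notation is an assumption chosen for illustration: v_A and v_B are the velocities before the collision, primes mark the velocities after it, n is the unit collision normal from the manifold, and e is the coefficient of restitution.

```latex
\begin{align*}
\mathbf{v}_{AB} &= \mathbf{v}_{B} - \mathbf{v}_{A} \\
\mathbf{v}_{AB} \cdot \mathbf{n} &= \left(\mathbf{v}_{B} - \mathbf{v}_{A}\right) \cdot \mathbf{n} \\
\left(\mathbf{v}'_{B} - \mathbf{v}'_{A}\right) \cdot \mathbf{n} &= -e \left(\mathbf{v}_{B} - \mathbf{v}_{A}\right) \cdot \mathbf{n}
\end{align*}
```

The last line is the restitution law applied along the normal: the separating speed after the collision is e times the approach speed before it, with the sign flipped.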

Now we need to go to another topic and model an impulse. I have said the term “impulse” quite a few times without defining it, so here is the definition I use: an impulse is an instantaneous change in velocity. We will use an impulse to change the velocity of an object. Here’s how you could use an impulse to modify the velocity of a single object:

Here Impulse would be a scalar floating point value. This isn’t too useful however, as it only scales an object’s current velocity, and thus makes the object move slower or faster along the positive or negative direction of the vector.

What is needed is a way to do component-wise modification of the vector, so we can make it point slightly in one direction or another, allowing objects to make turns and change directions.

In the above equation we can take a direction vector with a magnitude of 1, and scale it by our impulse. By adding this new vector to our velocity we can then modify the velocity in any way we wish. Our end goal is to solve for our Impulse scalar that will separate two objects from collision, so in the end we’ll need to distribute this scalar across two equations in terms of velocity.

Let’s start with a simple momentum equation:

An impulse is defined to be a change in momentum. Thus we get:

To isolate our velocity after we can rearrange into:

Now let’s change equation 4 into one that contains velocities under the influence of impulses. However we’ll want to express our VelocityNew as one that is acted upon by impulse, and substitute in equation 9:

Remember that the impulse is a scalar. Also note that all values on the right hand side of the equation are all known, including the Direction which was solved for by the collision detection.
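The substitution can be worked through as follows. This is a sketch under stated assumptions: n is a unit vector pointing from A toward B, j is the scalar impulse, m_A and m_B are the masses, e is the coefficient of restitution, and primes mark the velocities after the collision. The symbols are chosen for illustration.

```latex
\begin{align*}
\mathbf{v}'_{A} &= \mathbf{v}_{A} - \frac{j}{m_{A}}\,\mathbf{n},
\qquad
\mathbf{v}'_{B} = \mathbf{v}_{B} + \frac{j}{m_{B}}\,\mathbf{n} \\
\left(\mathbf{v}'_{B} - \mathbf{v}'_{A}\right) \cdot \mathbf{n}
  &= \left(\mathbf{v}_{B} - \mathbf{v}_{A}\right) \cdot \mathbf{n}
   + j \left(\frac{1}{m_{A}} + \frac{1}{m_{B}}\right) \\
-e \left(\mathbf{v}_{B} - \mathbf{v}_{A}\right) \cdot \mathbf{n}
  &= \left(\mathbf{v}_{B} - \mathbf{v}_{A}\right) \cdot \mathbf{n}
   + j \left(\frac{1}{m_{A}} + \frac{1}{m_{B}}\right) \\
j &= \frac{-(1 + e)\,\left(\mathbf{v}_{B} - \mathbf{v}_{A}\right) \cdot \mathbf{n}}
          {\dfrac{1}{m_{A}} + \dfrac{1}{m_{B}}}
\end{align*}
```

The second line uses the fact that n has unit length, so n dotted with itself is 1. The third line replaces the left side using the restitution law.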

All that is left here is to distribute this scalar impulse over each object’s velocity vector proportional to the total mass of the system (system being collision between both objects).

The total mass is MassA + MassB, so to get an even distribution you do: impulse * Mass / TotalMass. To simplify this one could use the following:

This can be done twice, once per object. The total impulse will be applied, except only a portion of the impulse will be applied to each object. This ensures smaller objects move more than larger ones during impact.

One thing you must ensure is that if the velocities are separating (objects moving away from one another) that you do nothing. Here’s a sample version of finalized code for impulse resolution in 2D without rotation or friction:
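A minimal sketch of such a routine, assuming each body stores its inverse mass and the manifold normal is unit length and points from A toward B. The names Body, Manifold and ResolveCollision are illustrative, and taking the smaller of the two restitution values is one common convention rather than something the derivation dictates.

```cpp
#include <algorithm>
#include <cassert>

struct Vec2 { float x, y; };

inline float Dot(Vec2 a, Vec2 b) { return a.x * b.x + a.y * b.y; }

// Illustrative body: only the fields this routine needs.
struct Body
{
    Vec2  velocity;
    float inverseMass;  // 1 / mass; 0 means infinite mass (immovable)
    float restitution;  // coefficient of restitution e, in [0, 1]
};

// Illustrative manifold: normal is unit length and points from A toward B.
struct Manifold
{
    Body* a;
    Body* b;
    Vec2  normal;
};

void ResolveCollision(const Manifold& m)
{
    assert(m.a != nullptr && m.b != nullptr);
    Body& a = *m.a;
    Body& b = *m.b;

    const float inverseMassSum = a.inverseMass + b.inverseMass;
    if (inverseMassSum == 0.0f)
        return;  // two immovable bodies: nothing to solve

    // Relative velocity of B as seen from A, projected onto the normal.
    const Vec2  relative{ b.velocity.x - a.velocity.x,
                          b.velocity.y - a.velocity.y };
    const float velocityAlongNormal = Dot(relative, m.normal);

    // Objects are already separating: do nothing.
    if (velocityAlongNormal > 0.0f)
        return;

    // Assumption: use the smaller restitution of the pair.
    const float e = std::min(a.restitution, b.restitution);

    // Scalar impulse magnitude.
    const float j = -(1.0f + e) * velocityAlongNormal / inverseMassSum;

    // Apply in opposite directions; lighter bodies change velocity more.
    a.velocity.x -= j * a.inverseMass * m.normal.x;
    a.velocity.y -= j * a.inverseMass * m.normal.y;
    b.velocity.x += j * b.inverseMass * m.normal.x;
    b.velocity.y += j * b.inverseMass * m.normal.y;
}
```

Storing inverse mass avoids a division per body and lets a value of zero stand in for immovable objects such as floors.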


Now that we have covered resolution without these two factors, adding them in to our final equation (equation 10) will be quite a bit simpler. We will need to understand the concept of “mass” in terms of rotations. The inertia tensor of an object describes how “hard it is to turn” about some axis. In 2D there’s only a single axis of rotation (the z axis), so a single scalar is all that’s needed. Here’s a new version of equation 11.

For the sake of brevity here’s the final equation, where r is the vector from center of mass to a point on the body (contact point). The velocity of a point on the body relative to its center is given by:

Final equation (Direction substituted for n):
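In symbols, the standard form of these two results looks like the following. The notation is an assumption for illustration: ω is the angular velocity (a scalar in 2D), r_A and r_B run from each center of mass to the contact point, I_A and I_B are the scalar moments of inertia, n is the unit normal pointing from A toward B, and in 2D the cross product of two vectors is a scalar.

```latex
\begin{align*}
\mathbf{v}_{P} &= \mathbf{v} + \omega \times \mathbf{r} \\
\mathbf{v}_{rel} &= \left(\mathbf{v}_{B} + \omega_{B} \times \mathbf{r}_{B}\right)
                 - \left(\mathbf{v}_{A} + \omega_{A} \times \mathbf{r}_{A}\right) \\
j &= \frac{-(1 + e)\,\mathbf{v}_{rel} \cdot \mathbf{n}}
          {\dfrac{1}{m_{A}} + \dfrac{1}{m_{B}}
           + \dfrac{\left(\mathbf{r}_{A} \times \mathbf{n}\right)^{2}}{I_{A}}
           + \dfrac{\left(\mathbf{r}_{B} \times \mathbf{n}\right)^{2}}{I_{B}}}
\end{align*}
```

The term ω × r is the part of the point velocity that is relative to the center of mass. Setting both r vectors to zero recovers the impulse formula without rotation.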

And there we have it! This will solve for a separating impulse given a specific contact point. Now you might have noticed there are a couple cross products that are kinda strange in 2D. I won’t cover what they are here, but they still hold true. I’ll just give them to you:
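A sketch of the usual 2D conventions for these cross products, treating a 2D vector as a 3D vector with zero z and a scalar as a vector along z. The function names are illustrative.

```cpp
#include <cassert>

struct Vec2 { float x, y; };

// Vector x vector: the z component of the 3D cross product.
inline float Cross(Vec2 a, Vec2 b)
{
    return a.x * b.y - a.y * b.x;
}

// Scalar x vector: (0, 0, s) x (v.x, v.y, 0).
inline Vec2 Cross(float s, Vec2 v)
{
    return Vec2{ -s * v.y, s * v.x };
}

// Vector x scalar: (v.x, v.y, 0) x (0, 0, s).
inline Vec2 Cross(Vec2 v, float s)
{
    return Vec2{ s * v.y, -s * v.x };
}

// Small self-check of the conventions, usable in a debug build.
inline void CheckCrossConventions()
{
    const Vec2 xAxis{ 1.0f, 0.0f };
    const Vec2 yAxis{ 0.0f, 1.0f };

    assert(Cross(xAxis, yAxis) == 1.0f);   // x cross y points along +z
    assert(Cross(yAxis, xAxis) == -1.0f);  // swapping operands flips the sign

    const Vec2 rotated = Cross(1.0f, xAxis);  // +z cross x gives +y
    assert(rotated.x == 0.0f && rotated.y == 1.0f);
}
```

With these, the velocity of a contact point is the body's linear velocity plus Cross(angularVelocity, r).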


Friction is the simplest thing to do in this entire resolution article, assuming we’re dealing with 2D! I actually haven’t derived this portion myself, so again I’m going to just throw the equations at you. Assuming we have our resolution direction in the manifold (the collision normal), we can calculate a tangent vector, which is a vector perpendicular to the collision normal. Just replace all instances of n in equation 14 with this tangent vector.

To solve for the tangent vector you do:

Again, just take the above TangentVector and replace it in equation 14 for n.

There is something I have missed. When solving for the force of friction there’s a max, which is the magnitude of the normal impulse * the coefficient of friction (from the body definition). This is due to the Coulomb friction law. Treat the coefficient of friction * normal impulse as your “friction cap” when solving for your tangential impulse. This is just a simple capping of the final impulse vector.
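A minimal sketch of both pieces, assuming the tangent is taken as the part of the relative velocity left over after removing its component along the normal, and that the cap is applied to the scalar tangential impulse. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };

inline float Dot(Vec2 a, Vec2 b) { return a.x * b.x + a.y * b.y; }

// Unit tangent: relative velocity with its normal component removed.
// Returns the zero vector when there is no sliding motion to oppose.
Vec2 TangentFromRelativeVelocity(Vec2 relativeVelocity, Vec2 normal)
{
    const float alongNormal = Dot(relativeVelocity, normal);
    const Vec2 tangent{ relativeVelocity.x - alongNormal * normal.x,
                        relativeVelocity.y - alongNormal * normal.y };

    const float length = std::sqrt(Dot(tangent, tangent));
    if (length < 1e-6f)
        return Vec2{ 0.0f, 0.0f };

    return Vec2{ tangent.x / length, tangent.y / length };
}

// Coulomb cap: the tangential impulse may not exceed mu * normal impulse.
float ClampFrictionImpulse(float tangentImpulse, float normalImpulse, float mu)
{
    assert(mu >= 0.0f && normalImpulse >= 0.0f);

    const float cap = mu * normalImpulse;
    if (tangentImpulse > cap)
        return cap;
    if (tangentImpulse < -cap)
        return -cap;
    return tangentImpulse;
}
```

The clamped value is then applied along the tangent in the same way the normal impulse is applied along the normal.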

Penetration Resolution

Due to floating point error, energy will always be lost from the system over time. As such, when objects start coming to a rest they will sink into each other due to gravity. How to solve? Well the solution is to actually just handle the leftover penetration manually by just moving the objects apart directly. This will prevent objects from sinking into one another, though it doesn’t add any energy into the system.

To resolve penetration, simply move each object along the manifold normal in opposite directions. However there’s an art to doing so; you need to be gentle. If you just resolve 100% of the penetration each frame objects underneath other objects will jitter violently due to the naive penetration resolution scheme I am presenting. To avoid this, try only resolving a percentage of the leftover penetration, perhaps 20 to 80%.

Additionally you only want to resolve penetration if the penetration depth after the impulse is applied is above some constant arbitrary (but small) threshold, (try 0.01 to 0.1). If the penetration is below this, then don’t move either object.

This method of penetration resolution is called linear projection. Here’s a snippet of C++ code demonstrating this:
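A minimal sketch of linear projection, assuming bodies store their inverse mass and the normal points from A toward B. The default values for percent and slop sit at the low end of the ranges suggested above, and splitting the correction by inverse mass (so heavier objects move less) is an assumption of this sketch.

```cpp
#include <cassert>

struct Vec2 { float x, y; };

// Illustrative body: only the fields this routine needs.
struct Body
{
    Vec2  position;
    float inverseMass;  // 0 means immovable
};

// Pushes two penetrating bodies apart along the manifold normal.
void PositionalCorrection(Body& a, Body& b, Vec2 normal, float penetration,
                          float percent = 0.2f,  // fraction resolved per step
                          float slop = 0.01f)    // allowed leftover depth
{
    assert(percent >= 0.0f && percent <= 1.0f);

    const float inverseMassSum = a.inverseMass + b.inverseMass;
    if (inverseMassSum == 0.0f)
        return;  // neither body can move

    // Ignore penetration at or below the threshold to avoid jitter.
    const float depth = penetration - slop;
    if (depth <= 0.0f)
        return;

    const float magnitude = depth / inverseMassSum * percent;

    a.position.x -= a.inverseMass * magnitude * normal.x;
    a.position.y -= a.inverseMass * magnitude * normal.y;
    b.position.x += b.inverseMass * magnitude * normal.x;
    b.position.y += b.inverseMass * magnitude * normal.y;
}
```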

Iterative Solving

There is one additional tweak you can do to increase the believability of your physics simulation: iterate over all contacts and solve + resolve the impulses many times (perhaps 5 to 20 iterations). Since the large equation 14 has relative velocity within it, each iteration will feed in the previous result to come up with a new one.


This will allow the step to propagate energy throughout multiple objects contacting one another within a single timestep. This is essential for allowing the “billiards balls” effect to ensue.

This simple iteration is a very easy way to vastly improve the results of resolution.
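A minimal sketch of the iteration loop, assuming the per-contact resolve step is passed in as a callable. The template and its name are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Runs the resolve step over every contact, several times per timestep.
// Each pass sees the velocities left behind by the previous pass.
template <typename Contact, typename ResolveFn>
void SolveContacts(std::vector<Contact>& contacts, int iterations, ResolveFn resolve)
{
    assert(iterations > 0);

    for (int i = 0; i < iterations; ++i)
        for (Contact& contact : contacts)
            resolve(contact);
}
```

More iterations cost more time per step, so the count is a knob to tune against how tall the stacks in a scene get.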



Game Programming Primer

I’ve spent quite a bit of time writing a fairly long article over at a website I frequent, TeamLiquid.

Now it’s time to share the article in finished form here for all to see! The article is called Game Programming Primer, and here’s an excerpt from the introduction detailing what the article is all about:

Hello! My name is Randy Gaul, and I am a computer science student. I study at DigiPen Institute of Technology, and am majoring in Real-Time Interactive Simulation (a fancy way of saying game programming). I’d like to share my know-how in the ways of game programming as a profession for anyone interested in learning. I encourage anyone interested in programming or creating games, no matter how little knowledge you have in either topic, to check out this article.

And here’s the finished article:

Download (PDF, Unknown)