Category Archives: C++

Automated Lua Binding

Capture

Welcome to the fifth post in a series of blog posts about how to implement a custom game engine in C++. As reference I’ll be using my own open source game engine SEL. Please refer to its source code for implementation details not covered in this article. The folder of interest would be LuaInterface.


Introduction

Binding things to Lua is twofold: objects and functions must be able to be sent to and retrieved from Lua. Functions can be either static C or struct/class methods. Objects can be sent “by value” or “by reference”. As you can imagine it is important to be able to unify and simplify the binding process as much as possible to reduce all manual dev-work and upkeep.

Generic C++ Functor

As with many things in a modern C++ game engine it is critical to have a generic C++ functor. Ideally this functor can wrap around class/struct methods (not only static functions). It is also possible have this functor able to refer to a function within Lua as well.

Please see my article and slides on C++ Function Binding for implementation details not covered here.

Prerequisites

This article is on the topic of automatic Lua binding; if you’re unfamiliar with how to bind simple C functions to Lua please do a little research and come back later. The deep end of the pool is actually pretty deep!

I also suggest a working knowledge of C++ templates before trying to implement these sort of features. A working knowledge of Lua is also essential.

Setting the Boundaries

With a scripting language it’s important to clearly define what you want to expose to script. Is the entire game in Lua? Are only specific parts accessible? What are the boundaries. It’s all too easy to get very caught up in what to send, what to implement, what not to do. Having clear boundaries of exactly what you want to do is the best way to start coding.

Passing Objects to Lua

Objects can be passed to Lua by reference and value. A reference would consist of 4 bytes of memory to contain a pointer to some C++ memory. This allows Lua to store a “reference” to an object in C++. Most of the work involved in this type of object binding is in allow Lua to call C++ methods or functions on the pointer its storing.

The benefits of this approach are such that: calling class methods is pretty fast and shouldn’t be a worry; fairly simple to implement as most of the work is finished by creating a generic functor in C++; no hassle or upkeep when wanting to send new types of objects to Lua -each object is just a 4 byte pointer.

Passing by Reference with lightuserdata

There are two ways I’d recommend to pass an object to Lua with: userdata and lightuserdata. A lightuserdata represents a void * in Lua and can hold a reference to an object in C++.

Here’s how one might send and retrieve lightuserdata from Lua:

This method is very fast, simple to implement and has very minimal memory overhead. Additionally lightuserdata can be compared to one another, and are equal if the underlying address is equivalent. However, one cannot attach metatables to lightuserdata and there is no sense of type safety what so ever. A lack of type safety means that if someone passes a lightuserdata into an incorrect C function the host program will likely crash.

With lightuserdata the following code is possible:

This solution will work for one, maybe two people working on a smaller project or minimal amount of code. I can imagine that the lack of type safety will be the biggest issue as time goes on.

Reflection for Type Safety

It is possible to implement type-safety in Lua. However this requires Lua code to be maintaining type information. Lua is a scripting language meaning it ought best be used to script things. Something so integral and common as type-safety might better be implemented in lower-level C++ code.

Implementing type safety on the C++ side has two benefits: efficiency of implementation; type-safety can optionally be compiled away in release mode.

I highly recommend building yourself a simple, custom introspection library in C++. All that is really needed to start is the ability to query a type’s size and name efficiently. Please see my older article on custom Introspection or the game engine SEL for examples on how to implement such a system.

With a simple macro-based registration system one can register and lookup type information via introspection like so:

After this is complete and working (if you don’t have an implementation of introspection yet this is fine, just think of it as a black box) a small generic Variable object ought to be created. Sample code of a functional Variable object is in this post.

A Variable can be used like so:

It is important to note that the Variable itself is not a templated type!

When passing an object to Lua we can send a pointer to a Variable. As long as the Variable exist in memory in C++ the lightuserdata within Lua will point to a valid Variable. Upon retrieval of the Lua object back to C++ a type assertion can be run:

Generic Static Function Binding

Bind C-style static functions in a generic way makes heavy use of custom introspection. The way I was originally taught was to just throw the entire binding function (in C++) at you all at once and let you suffer. Prepare to suffer as I did!

This function isn’t doing the bind, it’s what is bound. Every time a function in C++ is called from Lua, this function is called first.

An upvalue in Lua is akin to static variables in C. Using this we can attach a pointer to a generic functor to a bound C function within Lua. As Lua calls a C function this upvalue is retrieved and eventually used to actually call the C function.

The rest is just a matter of handling variables to/from Lua. In the above example the Variable object contains some helper functions call ToLua and FromLua. The nice thing about my implementation of this within SEL is that no heap memory is used during this entire process! All this code boils down to a very efficient method of generically calling C functions.

I will leave binding C++ methods as an exercise for the reader. By now you ought to have an idea of where to look for example implementation! The idea is to handle type information for the “this pointer” of the method, and pass around an actual “this pointer” to call the method.

Calling Methods from Lua

Lets say you have an implementation that allows Lua code like the following:

A few things need to happen here. The first is that the object in question should only call methods that are actually methods of that specific type of class; one cannot simply bind all C++ methods and place functions in Lua within the global scope. Any object type could call any method type making for a lack of type-safety and dangerous code.

At this point the lightuserdata will have to be upgraded to a full userdata. Full userdata in Lua enjoy benefits such as the ability to set and modify metatables. If you’re not familiar with Lua metatables please do a little research on the topic and come back later.

A full userdata allows us to place a copy of a Variable within Lua memory, instead of just a void *. This means a temporary Variable can be used to call ToLua, instead of requiring that the Variable sent stays valid in C++ for the duration of usage within Lua.

Currently a way to create metatables for all of our C++ types is required. Assuming a linked list of all TypeInfo objects from the introspection system is available:

This loop is just creating metatables given the string names of what each metatable should be called.

After the tables are created the actual C++ methods and functions should be bound. This turns out to be really simple! It is assumed that each function and method registered within the introspection system can be passed to the function at some point (perhaps during registration of the type information):

And that’s really all there is to it! The idea here is to make sure that a type with methods sent to Lua has its userdata fixed with a metatable containing the available methods to call. When the __index metamethod is called it will search within the metatable itself for an appropriate member. Members of the metatable are the functions we bound to Lua. After they are fetched they can be called. This is what happens behind the scenes when we do:

Passing Object by Value

Passing objects by value is actually much more difficult. The idea is to utilize tables to to store representations of the members associated with a class or struct. A table can be used to represent state of an object.

The __index and __newindex metamethods of a userdata should be set to look into the state table first. This lets users assign new values, and lets your ToLua and FromLua functions copy members from C++ to/from this Lua state table.

If a member is not found in the state table the metatable can then be searched by setting the __index metamethod of the state table to refer to the proper metatable.

All of this table indirection does incur significant overhead, however it allows objects in Lua to be used like so:

I myself have not implemented this type of Lua binding, though it is entirely possible and can be quite nice to work with. I reiterate that adding this many tables incurs both memory and performance overhead not seen with the other styles. This seems to be the only drawback.

Conclusion

Well this post turned out longer than I expected -over 2k words! Hopefully the information was clear. It’s really nice being able to refer people to a complete and working example such as the SEL engine; it makes writing articles much easier and simpler.

Hopefully this can help someone out there! As always feel free to ask questions or provide comments right here on this page.

Please see Game Programming Gems 6 ch. 4.2 for more information about binding C++ objects to Lua,

Simple Sprite Batching

Capture

Welcome to the fourth post in a series of blog posts about how to implement a custom game engine in C++. As reference I’ll be using my own open source game engine SEL. Please refer to its source code for implementation details not covered in this article. Files of interest are Graphics.cpp and Graphics.h.


Batching and Batches

Did I say “simple” sprite batching? I meant dead simple!

Modern graphics card drivers (except Mantle???) do a lot of stuff, and it makes for passing information over to the GPU (or retrieving it) really slow on the PC platform. Apparently this is more of a non-issue on consoles, but meh we’re not all working with consoles now are we?

The solution: only send data one time. It’s latency that kills performance and not so much the amount of data. This means that we’ll do better at utilizing a GPU if we send a lot of data all at once as opposed to sending many smaller chunks.

This is where “batching” comes in. A batch can be thought of as a function call. In OpenGL you’ll see something like DrawArrays, and in DirectX something else. These types of functions send chunks of data to the GPU in a “batch”

Sprites and 2D

Luckily it’s really easy to draw sprites in 2D: you can use a static quad buffer and instance by sending transforms to the GPU, or you can precompute transformed quads on the CPU and pass along a large vertex  buffer, or anything in-between.

However computing the batches is slightly trickier. For now lets assume we have sprites with different textures and different zOrders.

Computing Batches

In order to send a batch to the GPU we must only draw sprites with the same texture. This is because we can only render instances (or a large vertex array) with a given texture in order to lower draw calls. So we must gather up all sprites with the same texture and render in the correct order according to their zOrders.

If you are able to store your sprites in pod-structures then you’ll be in luck: you can in-place sort a large array of pods really easily using std::sort. If not, then you can at least make an array of pointers or handles and sort those. You’ll have extra indirection, but so be it.

Using std::sort requires STL compatible iterators, and you’ll want one with random access  (array index-style access). Here’s an example with a std::vector:

The sort implementation within your package of the STL is likely going to be quicksort.

This sort will sort by zOrder first, and if zOrders are matching then sort by texture. This packs a lot of the sprites with similar textures next to each other in the draw list.

From here it’s just a matter of looping over the sprite array and finding the beginning/end of each segment with the same texture.

Optimizations

A few simple operations can be done here to ensure that computing batches goes as fast as possible. The first is to get all of your sprites together in a single array and sort the array in-place. This is easily done if your sprites are mere pods. This ensures very high locality of reference when transforming the sprite array.

The second optimization is to only sort the sprite array when it has been modified. If a new sprite is added to the list you can sort the whole thing. However there is no need to sort the draw list (sprite array) every single frame if no changes are detected.

Conclusion

Like I said, sprite batching is super simple in 2D. It can get much more complex if you add in texture atlasing into the mix. If you wish to see an OpenGL example please see the SEL source code.

I was able to render well over 8k dynamic sprites on-screen in my own preliminary tests. I believe I actually ended up being fill-rate bound instead of anything else. This is much more than necessary for any game I plan on creating.

Component Based Design – Lua Components and Coroutines

logo

Welcome to the second post in a series of blog posts about how to implement a custom game engine in C++. As reference I’ll be using my own open source game engine SEL. Please refer to its source code for implementation details not covered in this article.


I would like to start this post off with a big thank you to Trent Reed for providing great advice in implementing various aspects of Lua integration for a game engine based upon component based design.

If you’re unfamiliar with component based design in general I advise doing a little research before reading the rest.

Since Lua is such an awesome scripting language it would be great if we could give every game object a component whose sole purpose is just to hold some code. This code could be a type of AI brain, or a little bit of game logic.

I always bring up this one particular back and forth patrol AI as a great example. Imagine you could do this in C++:

When this component is updated the code within a script is substituted for C++ code. Imagine if you could write AI code like so:

This enemy patrols left, right and then throws a bomb. This type of code is actually realistic with a simple feature of Lua’s called Coroutines. I am sadly unable to find my original reference for creating a nice C++ Coroutine wrapper, but I do have a nice wrapper within SEL you can look at. UPDATE: Game Programming Gems 5 has the exact article I used when building my own Coroutine implementation. I believe the section is called “Building Lua into Games”.

The entire purpose of a Coroutine is to allow Lua to pause the state of a function call and resume it exactly where it left off from at a later point in time. This lets you write code that can take pauses and resume later, allowing for extremely readable and easy to create game logic.

Creating and using Coroutines is pretty simple and there exist a lot of references on the internet of how to do so. If you like you can view my own implementation within my SEL game engine.

However there does come a time when one actually thinks about how to store these scripts as components in a simple way. As recommended in one of my other posts, your GameObject should look something like this in an engine utilizing Component Based Design:

There rises an issue of storing components whose only difference is a string representing the script name; what if we want to hold any number of these? One simple idea is to create a single LuaComponent type in C++ of which is stored in a slightly different manner; a separate vector of LuaComponents can be utilized to separate the game logic components from the rest of the core engine components.

This allows LuaComponents to accessed via string lookup:

Start Update and Finish

It would be really nice if each component’s script name just referred to a sinle Lua file. If this were true then a naming convention can be established: a Lua component might only be a .lua file that contains functions Start, Update and Finish.

This sounds nice but a method for calling these various functions must be concocted. One simple can’t place all LuaComponent function defintions for Start, Update and Finish into the global Lua environment.

The idea is to create a unique Lua table for each GameObject that contains a LuaComponent. Within this table an isolated Lua environment can be constructed to define the 3 base functions. This also adds the benefit that all global variables within a LuaComponent file are local to each individual LuaComponent instance. This is especially important when you have lots of LuaComponents of a single script type. You don’t want globals in the LuaComponent file to be shared between all instances.

Implementing this is pretty easy if your game objects already have unique identifiers (preferable integers). I’ll take a slight detour on unique ids for a moment.

Unique Object IDs

The simplest way to implement unique IDs for your game objects is to keep track of a single integer. This integer starts at 0, and each time a new object is created the integer is incremented after assigning the object’s ID as the value of the integer.

This works so long as your integer overflow is very high. Luckily 32 bits of precision is more than enough for any game.

This can be taken farther with handles as detailed in one of my other posts.

Implementing Script Environments

Given a unique id for a game object a table in Lua can be constructed especially for this object:

The idea here is to create a new table instance for a game object if no table already exists. Then a new environment is created within this table (Environments in Lua are just tables).

The next step is to somehow get our .lua file definitions into this environment. In Lua 5.1 (and some lesser versions) there is a nice setfenv function which sets the environment of a Lua function. This is perfect for our cause as files loaded from .lua files are made into chunks, which are just nameless function objects! All that needs be done is to load the script and set it’s environment to the fresh new environment given to our object instance, and run the loaded chunk.

In Lua 5.2 and beyond there’s no nice setfenv function. Instead we must change the first upvalue of the chunk, which is the environment of the chunk. There are a couple ways to do this and I ended up choosing the easiest to implement. Here is my finished loader in Lua:

I decided to make use of the debug library. This allows me to inject a chunk’s definitions into an environment without fetching data from file. First implementations are likely to make use of lua’s loadfile function, as it actually does have a parameter to specify an environment. However loading from file is really slow, so ideally one would just keep a reference to the loaded chunk and run it on different environments as needed.

Coroutines, or Not?

I myself haven’t experienced this, but around the internet and through word of mouth I’ve heard that coroutines aren’t as fast as we’d all like them to be. This is too bad, but can be dealt with. One way is with recycling of coroutines. I will likely mess with this myself later (when I need to), but I haven’t yet found it necessary.

Instead my LauComponents in SEL contain a boolean to determine if the component will be run as a coroutine or normal Lua function call. A normal Lua function cannot have fancy WaitSeconds or WaitFrames calls, but it is still Lua. This way developers can have control over the amount of overhead a given LuaComponent actually  imposes.

Lua File System

Since I wrote about loading Lua chunks and holding them for later use, it would be helpful to know how to load all files from a folder in Lua. This lets you drop a .lua file into a specified folder and suddenly your engine has access to a new LuaComponent type. This can be coupled with asset hot-loading! There’s a great article by Noel Llopis on asset hot-loading in Game Programming Gems 6.

I highly recommend using Lua File System (LFS). LFS is extremely small and has the same license as Lua itself. This is great in case you need to modify the source. It’s also extremely useful.

I recommend compiling LFS into Lua itself, whether or not you’re making a dynamic library or static library. I had good results doing this  myself.

Here’s an example of some of my code used to load all scripts within a folder (traverse all sub-folders recursively):

There’s great support for querying file extensions, names, paths and differentiating between files and folders.

I actually use LFS as my standard file directory traversal tool in general, not just for LuaComponents.

Adieu

Hopefully this article provides some insight into creating dynamic game logic components with Lua! If I wrote this correctly someone out there is excited to try Coroutines with Component Based Architecture.

C++ Function Binding

Capture

Welcome to the first post in a series of blog posts about how to implement a custom game engine in C++. As reference I’ll be using my own open source game engine SEL. Please refer to its source code for implementation details not covered in this article.

I would like to thank John Edwards for his contribution to my education in the areas of reflection and function binding. You can thank him too by checking out their games at thatgamecompany!


Function binding in C++ is the act of being able to trigger a function given some form of input. Usually this applies to C or C++ by means of calling any function in a generic way. This can be achieved easily during compile-time in C++ by using some templates along with decltype. This is useful for:

  • Script binding
  • Advanced messaging
  • Advanced editor support
  • Many others

The idea is to capture the pointer to a function (or method) and pass its type around in code as a template parameter. An instance of the template type is created as a template constant. This is possible with the usual compilers as function pointers are of integral type.

The rest of the work involves creating a nice wrapper to pack arguments together and get them to the template constant pointer in a generic fashion.

I’ve created some nice slides on the topic and some demo source code. There is one slide with a video that will not play from the pdf, I attached the video to the bottom of this post. Hope this helps someone out there.

Download (PDF, Unknown)

Source code demo is here. If you wish to view a fully featured example, please see my project SEL.

Live Enumeration Editing in C++

Taron Millet, the programmer for Volgarr the Viking, created an interesting enumeration editor for their game editor used in the creation of Volgarr the Viking. This enumeration editor sparked my interest as somehow enumerations could contain within them an enumeration type. This forms a sort of tree hierarchy of enumerations! I actually emailed Taron about the editor, and he threw together a quick demo for me! If you’d like to see the demo just email me and I can send it to you.

Imagine you have an enumeration of types of items, things like breast plates, helmets, boots. Now imagine within each enumeration, lies another enumeration. You can enumerate types of helmets, types of boots and types of breast plates. Now imagine that this tree-like hierarchy is recursive with no depth boundary!

Not only was this enumeration tree really cool, but it also could be live-editted and commit back to C++ code. This is a very interesting idea and can be applied to custom editors for C++ game engines.

I’ve created my own terminal enumeration editor for a proof of concept. Here’s a video demo:

This sort of editor could be implemented in a fully featured editor, perhaps like the one Volgarr the Viking used! This is great for quick changes in gameplay and the like, and can greatly reduce the time required to setup type-safe enumerations. I myself use this editor to also reflect all constructed enumerations within a custom C++ introspection database. This allows all enumeration types to be passed to/from scripting languages, and serialized.

The implementation of such is actually super simple, and a proof of concept can be seen here: https://github.com/RandyGaul/Serialization_C. The idea is to use a single data file full of macro calls. This data file is then intentionally imported into multiple locations. Each time this import occurs different definitions of the macros are defined, thus interpreting the data in various ways upon each import. For more information about this see the link within this paragraph.

Powerful C++ Messaging

A prerequisite to this information is most of the previous C++ type introspection stuff I have been writing about for a while now. Assuming the previous information has been covered, lets move on:

There exists a design of messaging, specifically for C++, of which has minimal downsides and many positive advantages. Ideally messaging should not involve any polling or implicitly required searching (as in searching through game space to see who to message, which requires expensive collision queries). It should also have a very intuitive usage, and not be very complex to work with.

If such a messaging system can be achieved then inter-object communication can be setup, to create game logic, within a scripting language.

Here are some slides I wrote on this topic for my university, but are available for public viewing:

Download (PDF, Unknown)

C++ Reflection Part 6: Lua Binding

Binding C/C++ functions to Lua is a tedious, error prone and time consuming task when done by hand. A custom C++ introspection system can aide in the automation of binding any callable C or C++ function or method a breeze. Once such a functor-like object exists the act of binding a function to Lua can look like this, as seen in a CPP file:

The advantage of such a scheme is that only a single CPP file would need to be modified in order to expose new functionality to Lua, allowing for efficient pipe-lining of development cycles.

Another advantage of this powerful functor is that communication and game logic can be quickly be created in a script, loaded from a text file, or even setup through a visual editor. Here is a quick example of what might be possible with a good scripting language:

In the above example a simple enemy is supposed to follow some target object. If the target is close enough then the enemy damages it. If the target dies, the enemy flashes a bright color and then acquires a new target.

The key here is the message subscription within the initialization routine. During run-time objects can subscribe to know about messages emitted by any other object!

So by now hopefully one would have seen enough explanation of function binding to understand how powerful it is. I’ve written some slides on the topic available in PDF format here (do note that these slides were originally made for a lecture at my university):

Download (PDF, Unknown)

Dynamic AABB Tree

After studying Box2D, Bullet and some slides an alumnus (thanks to Nathan Carlson!) from DigiPen created, I put together a dynamic AABB (axis aligned bounding box) tree broadphase. A Dynamic AABB Tree is a binary search tree for spatial partitioning. Special thanks to Nathanael Presson for the original authoring of the tree within the Bullet engine source code (see comments of this post).

A broadphase in a physics simulation has the job of reporting when bodies are potentially colliding. Usually this is done through cheap intersection tests between simple bounding volumes (AABBs in this case).

There are some interesting properties of a dynamic AABB tree that make the data structure rather simple in terms of implementation.

There are a couple main functions to implement outlined here:

  • Insert
  • Remove
  • Update

The underlying data structure ought to be a huge array of nodes. This is much more optimal in terms of cache performance than many loose heap-allocated nodes. This is very important as the entire array of nodes is going to be fetched from memory every single time the broadphase is updated.

The node of the AABB tree can be carefully constructed to take up minimal space, as nodes are always in one of two states: branches and leaves. Since nodes are stored in an array this allows for the nodes to be referenced by integral index instead of pointer. This allows the internal array to grow or shrink as necessary without fear of leaving dangling pointers anywhere.

The idea of the tree is to allow user data to be stored only within leaf nodes. All branch nodes contain just a single AABB enveloping both of its children. This leads to a short description of invariants for the data structure:

  • All branches must have two valid children
  • Only leaves contain user data

The first rule allows for some simplification of operations like insert/remove. No additional code branching is required to check for NULL children, increases code performance.

Insert

Insertion involves creating a fat AABB to bound the userdata associated with the new leaf node. A fat AABB is just an AABB that is slightly larger than a tight fitting AABB. Once a new leaf node is created with its fat AABB a tree traversal is required in order to find a sibling to pair up with, and a new parent branch is constructed.

Traversing the tree should be done by following some sort of cost heuristic. The best seems to be a cost involving the surface area of involved nodes. See Box2D for a resource in cost heuristics.

After a new parent is created and a sibling found, the bounding volume hierarchy of all parent nodes are invalid. A traversal up the tree, correcting all the bounding AABBs and node heights is required. This hierarchy correction can be abstracted into a simple function:

Remove

Removing nodes from binary search trees can involve a stack to trace back a traversal path, however due to the few invariants of the dynamic AABB tree removal quite simple. Since all branches must contain two valid children, and only leaves contain userdata, the only nodes that require deletion are leaves.

Remove the leaf’s parent and replace it with the leaf’s sibling. After this operation the parent hierarchy needs to be recalculated (just as in insert).

Update

Since this tree is dynamic it needs a way to handle moving objects, and not just static level geometry. Since fat AABBs are created upon insertion, objects can move around a little bit before the AABB within the tree is invalid. The fatness factor, or how much to fatten each AABB, can be fine-tuned for performance. I myself use an arbitrary distance (about 5% of the average scale of objects). It is also possible to fatten the AABB based upon the object’s previous frame’s displacement (see Box2D for details on displacement predictions).

The update function of the tree checks to make sure that the current tight-fitting AABB of a shape is still contained by the AABB within the tree. In order for this operation to be constant time the node index in the tree needs to be intrusively stored within the shape itself. This will allow a shape to provide its node index along with an up to date tight fitting AABB to the tree, in order to see if its AABB is still within the fat AABB’s bounds.

If the tight AABB is not contained by the fat AABB the shape needs to be removed and reinserted into the tree.

Balancing the Tree

Since this sort of spatial partitioning involves a binary search tree, the tree will perform optimally (in terms of collision queries) so long as the tree is balanced.

However, how you balance the tree matters. I’ve heard (through rumors) that dynamic AABB trees ought to be balanced in terms of surface area and not tree height. Though this makes sense conceptually, I was not sure how to balance the tree based of surface area. Knowing this, I just went with something I’m more comfortable with, which is balancing based upon tree height.

Balancing a single node involves taking a look at its two children. The height of each child should be compared, and if one is higher than the other by a height of two or more a rotation should be performed. In other words, the child with a larger height ought to be promoted, wherein promotion is a way of rotating a node up the tree hierarchy.

This sort of balancing (AVL balancing) can be performed at the beginning of the hierarchy synchronization loop. This allows the tree to self-balance itself upon insertion and removal of leaves.

Free List

The tree needs to maintain an internal free list of unused nodes. Constructing a free list involves looping over all the nodes of the tree, upon initial tree construction, linking each node to the next with indices. This process is really simple, though readers may be unfamiliar with lists of indices as opposed to lists of pointers.

Here is a helper function I’ve constructed for myself to setup free lists (useful upon construction of the tree, and growing of the internal array):

Taking nodes from the free list involves setting the current m_freeList index to the next node in the list. Inserting nodes is just as trivial. You can think of this sort of list as a singly linked list, except the links are through indices rather than pointers.

Queries

Querying volumes against the tree is highly efficient, so long as collision checks against an AABB are fast. Sphere, AABB and raycasts are all very fast.

Querying the tree involves detecting collisions with each node in the tree, starting with the root, traversing to all children until the tree is exhausted. Traversal to a child should be performed if there is a collision with the parent.

This sort of query can easily be done with recursion (note: the callback returns bool, signifying continue or termination of the query):

However it may be more favorable to do so with an iterative algorithm. Iterative algorithms are generally a little harder to implement, but take much less memory (and thus are more efficient):

Culling Queries Further

It might be a good idea to only query rigid bodies that moved last frame. This is much better than querying all dynamic bodies unconditionally. If you have a contact graph or constraint graph, or any sort of islanding implementation then you can easily cull objects from queries. The idea is that if a rigid body is static or not placed within an island, then it didn’t move at all the previous frame. This is very simple and elegant.

Usually only active (awake) dynamic bodies are placed into an island (and any objects they have contact constraints with). This is because objects that are sleeping don’t need to be considered at all when forming new islands, as they are asleep. Knowing this a small flag can be set within a shape when it is placed into an island. This flag can later be checked to know whether or not it needs to be queried at all within the broadphase.

Though this step is not specific to the dynamic AABB tree, it is a nice trick to know. One caveat is that when contact constraints are updated, they must first have each involved shape’s bounding volume checked against each other. This will allow old contact constraints to be removed once there is no potential for the narrow phase to return true.

Conclusion

The dynamic AABB tree is an extremely fast spatial partitioning data structure, and is ideal for both large unbounded worlds, and lots of dynamic rigid bodies. Querying the tree has the time complexity of a binary search. I can’t imagine much better of a broadphase for rigid body simulation. There’s an old saying of “no broadphase is perfect, and neither is this one.” when speaking of broadphases. However the dynamic AABB tree has completely impressed me.

In terms of performance, my simulation would grind to a halt and bottleneck with an N^2 broadphase at just over 100 dynamic bodies. Now I can easily simulate around 5000 rigid bodies with the broadphase taking about 5% of the simulation time. My current bottleneck is cache misses during constraint solving.

Additional Resources

Object Linking with Static Libraries in Visual Studio

One of my favorite features of C++ is the ability for constructors to run code. This allows for the creation of some interesting objects whose sole purpose is only to run a little bit of code. Here is an example:

Using this SAY_WORD macro is quite easy. It can be placed at global scope anywhere in a cpp file, and the word it contains will indeed be said.

Important registration operations can be performed by global objects like the one above. Often times the global registrator object may never even be referenced in the program as its only purpose is to perform some registration upon program start. This is perfectly fine if you’re working in Visual Studio with a single project in your solution. The problem is when you want to start spreading code over multiple projects. Microsoft’s linker is quite over-zealous and strips away objects that don’t seem to be referenced anywhere throughout the entire solution.

A way to force linking of such objects is needed in order to ensure that global registration objects aren’t stripped by the linker. As a result of such need I came up with a compiler-specific solution that works for all my needs on Visual Studio 2010 and 2012 (probably works with earlier versions too).

The above macro can be placed into a header and it will force the compiler to link the object. In order to force linking the linker flag /INCLUDE needs to be used. However, the name of the symbol representing an object is required to be passed in. In Visual Studio a single underscore is appended to object names, and a stringizing macro can be used to generate the name of whatever global variable you’re instantiating.

To further ensure that the macro works I tried to prevent any C++ name mangling by externing the object as a C style object.

I have made prolific use of this style of macro in my custom C++ reflection, along with factory registration, along with some other things. Hopefully this trick can be of use to others as well.