Monthly Archives: May 2013

Component Based Engine Design

What is Component Based Design?

Component based engine design was originally pioneered in order to avoid the annoying class hierarchies that inheritance introduces. The idea is to package all functionality of game objects into separate components. A single game object is just a collection of components, and so its components define its behavior, appearance and functionality. This is perfectly fine, and there are plenty of resources out there that talk about this topic. However I’d like to take a step back and start from the top.

It should be noted that the implementation presented here is just one way of going about things, and comes directly from my highly subjective opinion. Perhaps you as a reader can come up with or know of better solutions or designs.

Here is an example game object, note that the game object is generic and simply contains some components:
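A minimal sketch of the idea, assuming a simple polymorphic Component base; all names here (Component, GameObject, m_components) are illustrative, not prescribed:

```cpp
#include <vector>

// Hypothetical component base; concrete components (sprite, body,
// game logic, etc.) would derive from this.
struct Component
{
    virtual ~Component( ) { }
};

// A game object is nothing but a bag of components; it has no
// behavior of its own.
struct GameObject
{
    std::vector<Component *> m_components;
};
```

The point is that GameObject knows nothing about graphics, physics, or AI; it only aggregates whatever components it is given.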

The Actual Engine

The engine of a game can be thought of as a manager of systems. As to what a system is, we’ll get to that later; for now think of a system as Physics, Graphics or AI. The engine ought to have a main loop function, as well as an update function. The update function calls update on all contained systems in a specific order. The main loop function is just a small infinite loop that calls update.

Often times the main loop will deal with timestepping itself. Have a look at the linked article to learn more about proper timestepping.
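As a rough sketch of the above (the System interface, the fixed 1/60 step, and all identifiers are illustrative assumptions, not a fixed design):

```cpp
#include <vector>

// Each system implements one slice of engine functionality.
struct System
{
    virtual ~System( ) { }
    virtual void Update( float dt ) = 0;
};

class Engine
{
public:
    void AddSystem( System *system ) { m_systems.push_back( system ); }

    // Exposed publicly so an external host (e.g. an editor) can drive
    // the engine without using MainLoop at all.
    void Update( float dt )
    {
        // Update every system in a specific, fixed order.
        for (unsigned i = 0; i < m_systems.size( ); ++i)
            m_systems[i]->Update( dt );
    }

    // Minimal fixed-timestep main loop.
    void MainLoop( )
    {
        const float dt = 1.0f / 60.0f;
        while (m_running)
            Update( dt );
    }

    bool m_running = true;

private:
    std::vector<System *> m_systems;
};
```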

It is important to have your engine expose its update function, as sometimes your engine will need to be compiled as a static library and linked to externally. In this case the main loop of your simulation may reside outside of the engine library altogether. A common usage I’ve seen for this sort of design choice is when creating an editor of some sort, perhaps a level or content editor. Often these editors will have a viewport to preview the game, and in order to do so they need access to some sort of engine update function.

Beyond containing and calling update, the Engine also forwards global messages to all the systems. More on messaging later.


Creating more than one instance of an engine should never happen. The question of “should I make this a singleton” will sometimes arise. In my experience the answer is often no. Unless you’re working on a large team of programmers where the odds are high of someone carelessly making an instance of an Engine or System class, making things a singleton is just a waste of time, especially if retrieving data from that singleton incurs a little bit of overhead.


Each system in the engine corresponds to one type of functionality. This idea is best shown by example, so here are the various systems I usually have in engines I work on:

  • Graphics
  • Physics
  • GameLogic
  • Windowing/Input
  • UI
  • Audio
  • ObjectFactory – Creates objects and components from string or integral ID

Each system’s primary functionality is to operate upon game objects. You can think of a system as a transform function: data is input, modified somehow, and then output. The data passed to each system should be a list of game objects (or of components). However a system only updates the components related to that system; for example the graphics system would only update sprite or graphics related components.

Here’s an example header file for a system:
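A system header might look something like the following sketch; GraphicsSystem and its member names are assumptions on my part, not a fixed interface:

```cpp
// GraphicsSystem.h -- illustrative; a real system would hold renderer
// state, component storage, and so on.
#include <vector>

struct GameObject;

class GraphicsSystem
{
public:
    // Called once per frame by the Engine, in a fixed order relative
    // to the other systems.
    void Update( float dt, std::vector<GameObject *> &objects );

    // Receives global messages forwarded by the Engine.
    void SendMessage( int messageID );
};
```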

The naive approach to a system update would be to pass a list of game objects like so:
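A hedged sketch of this naive approach, with minimal stand-in types (SpriteComponent and the sprite member are assumptions made for illustration):

```cpp
#include <vector>

// Minimal stand-ins; a real engine would have richer versions.
struct SpriteComponent
{
    float x = 0.0f;
    void Update( float dt ) { x += dt; } // placeholder behavior
};

struct GameObject
{
    SpriteComponent *sprite = nullptr; // null if the object has no sprite
};

struct GraphicsSystem
{
    // Naive update: the system is handed every game object and picks
    // out only the component it cares about.
    void Update( float dt, std::vector<GameObject *> &objects )
    {
        for (unsigned i = 0; i < objects.size( ); ++i)
            if (objects[i]->sprite)
                objects[i]->sprite->Update( dt );
    }
};
```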

The above code is assuming the ObjectFactory contains all game objects, and can be accessed somehow (perhaps by pointer). This code will work, and it’s exactly what I started with when I wrote my first engine. However you’ll soon realize the folly involved here.

Cache is King

That’s right, those who have the gold make the rules; the golden rule. More seriously, cache is king due to the gap between processing speed and memory access speed. The bottleneck in every engine I have seen or touched in the past couple of years has been poor memory access patterns. Not a single serious bottleneck was due to computation.

Modern hardware computes very, very fast. Reaching for data in memory (RAM, not just hard disk) is orders of magnitude slower, so caches come to the rescue. A cache can be thought of, in a simplified sense, as a small chunk of memory right next to the CPU (or GPU). Accessing cache memory is way faster than going all the way out to main RAM. Whenever memory is fetched from RAM, the memory around that location is also placed into the CPU cache. The idea is that when you retrieve something from RAM, the likelihood of needing something very nearby soon is high, so all the data in that area is grabbed at once.

Long story short, if you place things that need to be accessed at around the same time next to each other in memory, huge performance benefits will be reaped. The best performance comes from traversing memory linearly, as if iterating over an array. This means that if we can stick things into arrays and traverse these arrays linearly, there will be no faster form of memory access.

Fetching memory that does not exist in the cache is called a cache miss.

Cache and Components

Let’s revisit the naive approach to updating systems. Assuming a system is handed a generic game object, that system must then retrieve its corresponding component(s) to update, like so:
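A sketch of such a retrieval loop, assuming a naive map-based component lookup (the "physics" key and all type names are illustrative):

```cpp
#include <map>
#include <string>
#include <vector>

struct Component { virtual ~Component( ) { } };

struct PhysicsComponent : Component
{
    float velocity = 0.0f;
    void Integrate( float dt ) { velocity += 9.8f * dt; } // placeholder
};

// Naive object: components allocated with new, looked up by name.
struct GameObject
{
    std::map<std::string, Component *> m_components;
};

// Each iteration chases two pointers -- the game object, then its
// heap-allocated component -- both of which can live anywhere.
void UpdatePhysics( float dt, std::vector<GameObject *> &objects )
{
    for (unsigned i = 0; i < objects.size( ); ++i)
    {
        std::map<std::string, Component *>::iterator it =
            objects[i]->m_components.find( "physics" );
        if (it != objects[i]->m_components.end( ))
            static_cast<PhysicsComponent *>( it->second )->Integrate( dt );
    }
}
```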

As this loop is run the memory of every game object and every component type that corresponds to the system is touched. A cache miss will likely be incurred over and over as the loop runs bouncing around in memory. This is even worse if the ObjectFactory is just creating random objects with new, as every memory access to every object and every component will likely incur cache misses.

What to do? The solution to all these memory access problems is to simplify the data into arrays.

Arrays of GameObjects + Components

I suggest having every game object exist within a single giant array. This array should probably be contained within the ObjectFactory. Usage of std::vector for such a task is recommended. This keeps game objects together in memory, and even though deletion of a game object is of O(n) complexity, that O(n) operation traverses an array, and usually will turn out to be unnoticeable. A custom vector or array class can avoid the O(n) operation entirely by taking the element at the end and placing it into the deleted slot. This can only be done if references into the array are translated through handles (more on handles momentarily).

Every component type should be in a giant array too. Each component array should be stored within its respective system (but can be “created” from the Factory). Again, an array-like data structure would be ideal.

This simple setup allows for linear traversal of most memory in the entire engine, so long as the update function of each system is redesigned slightly. Instead of handing a list of game objects to each system, the system can just iterate over its related components directly, since the components are stored within the systems already.

So, how are Game Objects “handled” now?

Since components have been moved into large arrays, and the game objects themselves are in a big array, what exactly should the relationship between a game object and a component be? In the naive implementation some sort of map would have worked perfectly fine, as the memory location of each component could be anywhere due to the use of new. With arrays, however, the relationship isn’t so carefree.

Since things are stored in arrays, it’s time to switch from pointer-centric relationships to handle-based relationships. A handle can be thought of, in its simplest form, as an index into an array. Since game objects and components are stored in arrays, it is only natural to access them by index. This allows these giant arrays to grow and shrink as necessary without leaving dangling pointers scattered through the rest of the program.

Here’s a code example:
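A sketch of the idea, assuming an enum of component types and plain int handles; all names here are illustrative:

```cpp
// A handle is just an index into a component array; -1 means
// "no component of this type".
typedef int Handle;

enum ComponentType
{
    CT_Sprite,
    CT_Body,
    CT_GameLogic,
    CT_MaxTypes
};

struct GameObject
{
    GameObject( )
    {
        for (int i = 0; i < CT_MaxTypes; ++i)
            m_components[i] = -1;
    }

    // One slot per component type: by design a game object holds at
    // most one component of each type.
    Handle m_components[CT_MaxTypes];
};
```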

As you can see, an array of handles is stored to represent the containment of components. There is one slot in the array for each type of component. By design this limits each component to be of unique type within a game object. Each handle is an index into a large array of components. This index is used to retrieve components that correspond to a game object. A special value (perhaps -1, or by some other mechanism) can be used to denote “no component”.

Handles can get quite a bit more versatile than just a plain ol’ integer. I myself created a HandleManager for translating an integer into a pointer. Here’s a great resource for creating your own handle manager.

The idea of translating a handle into a pointer is that once the pointer is used, it is not kept around; just let it be reclaimed back into the stack. Every time a pointer is required, the handle is translated anew into the pointer’s current location in memory. This constant translation allows the underlying pointer value to be swapped for another at any time without fear of leaving dangling pointers.

Where does the Code go?

Code for update routines can be put into either systems or components. The choice is yours entirely. A more data oriented approach would put as much code into systems as possible, and just use components as buckets of data. I myself prefer this approach. However once you hit game logic components virtual functionality is likely to be desired, and so code will likely be attached directly to such components.

The last engine I built used the naive approach to component based design, and it worked wonderfully. I used a block allocator so cache misses weren’t as high as with raw new calls.

The point is, do what makes most sense and keep things simple. If you want to store routines within your components and use virtual function calls, then you’ll probably have trouble storing things in an array unless you place all memory in the base class. If you can externalize as much code from your components as possible, it may be simpler to keep all your components in an array.

There is a tradeoff between flexibility and efficiency. My personal preference is to store performance sensitive components in huge arrays, and keep AI and game logic related things together in memory as much as possible, but not really stress too much about it. Game logic and AI should probably just be rather flexible, and so details about memory locations aren’t so important. One might just allocate these types of components with a block allocator and call it good.


The last major devil in an engine is messaging. Messaging is transferring data from one location to another. In this sense the most basic form of messaging is a simple function call.

Taking this a step further, we’d like to be able to send messages over a connection in a generic manner. It should not matter what type of message we send; all messages should be sent the same way to reduce code duplication and complexity. The most basic form of this is dynamic dispatch, or virtual function calls. An ID is introduced to the message so the internal data can be typecasted to the correct type.

Still, we can do better. Let’s imagine we have a GameLogic component or system of some sort. We need a way to send a message to this object that contains some data. Let’s not focus much on memory access patterns, as simplicity and flexibility are key here. Take a look at this code:
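As a sketch of the technique (MSG, NearbyMSG and TreasureChest are illustrative names I am assuming, built around the treasure chest scenario discussed here):

```cpp
// Message IDs; the MSG carries its ID so the receiver knows what
// concrete type to cast the message to.
enum MessageType
{
    eNearby,
    eActivate
};

struct MSG
{
    MSG( MessageType type ) : m_type( type ) { }
    MessageType m_type;
};

// A message with a payload: the sender's position.
struct NearbyMSG : MSG
{
    NearbyMSG( float x, float y ) : MSG( eNearby ), m_x( x ), m_y( y ) { }
    float m_x, m_y;
};

struct TreasureChest
{
    bool m_glimmering = false;
    bool m_open = false;

    // Respond to the few messages we care about; ignore the rest.
    void SendMessage( MSG *msg )
    {
        switch (msg->m_type)
        {
        case eNearby:
        {
            NearbyMSG *nearby = static_cast<NearbyMSG *>( msg );
            // Could scale the glimmer by distance to (m_x, m_y) here.
            (void)nearby;
            m_glimmering = true;
            break;
        }
        case eActivate:
            m_open = true;
            break;
        default:
            break; // blissfully ignore everything else
        }
    }
};
```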

This code highlights the usage of messaging quite well. Say the player emits some messages as it walks around, perhaps something like “I’m here” to all nearby things in a level. The player can blindly send these messages through the SendMessage function without caring about whether the receiver will respond or do anything. As you can see, the implementation of the SendMessage function ignores most message types and responds to a few.

In this example when the player nears the treasure chest it will glimmer a bit. Perhaps when the player gets closer it glimmers brighter and brighter. In order to do so, the MSG object sent to the treasure chest ought to contain the player coordinates, and so it can be typecasted to the appropriate message type.

The eActivate message may be emitted by the player whenever they hit the “e” button. Anything that can respond to an eActivate message will do so, and the rest of the objects receiving the message will safely ignore it.

This type of messaging is simple and easy to implement, quite efficient (if a block allocator or stack memory is used for the messages), and rather powerful.

A more advanced version of messaging makes heavy use of C++ introspection. A future article will likely be devoted to this topic, as it’s a hefty topic altogether. Edit: Here’s a link to a slideshow I presented at my university.

Conclusion and Resources

This was a rather whirlwind tour through a lot of different information; I hope it all came out coherent. Please don’t hesitate to ask any questions or add comments; I may even just update this post with more information or clarifications!



Object Linking with Static Libraries in Visual Studio

One of my favorite features of C++ is the ability for constructors to run code. This allows for the creation of some interesting objects whose sole purpose is only to run a little bit of code. Here is an example:
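A sketch of such a SAY_WORD macro (the WordSayer type and its static tracking member are my own illustrative additions):

```cpp
#include <cstdio>

// An object whose sole purpose is to run a bit of code in its constructor.
struct WordSayer
{
    WordSayer( const char *word )
    {
        std::printf( "%s\n", word );
        s_lastWord = word; // recorded only so the side effect is observable
    }

    static const char *s_lastWord;
};

const char *WordSayer::s_lastWord = 0;

// SAY_WORD creates a uniquely named global WordSayer; its constructor
// runs before main is ever entered.
#define SAY_WORD( word ) static WordSayer g_sayer_##word( #word );

SAY_WORD( Hello )
```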

Using this SAY_WORD macro is quite easy. It can be placed at global scope anywhere in a cpp file, and the word it contains will indeed be said.

Important registration operations can be performed by global objects like the one above. Often times the global registrator object may never even be referenced in the program as its only purpose is to perform some registration upon program start. This is perfectly fine if you’re working in Visual Studio with a single project in your solution. The problem is when you want to start spreading code over multiple projects. Microsoft’s linker is quite over-zealous and strips away objects that don’t seem to be referenced anywhere throughout the entire solution.

A way to force linking of such objects is needed in order to ensure that global registration objects aren’t stripped by the linker. Out of this need I came up with a compiler-specific solution that works for all my needs on Visual Studio 2010 and 2012 (and probably earlier versions too).
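A macro in this spirit might look like the following sketch; FORCE_LINK, STRINGIZE and Registrator are illustrative names, and the __pragma form of #pragma comment is MSVC-specific (guarded here so the snippet still compiles elsewhere):

```cpp
#define STRINGIZE_( x ) #x
#define STRINGIZE( x ) STRINGIZE_( x )

// MSVC-specific: /INCLUDE tells the linker to treat the symbol as
// referenced even if no code uses it. The leading underscore matches
// the C symbol decoration on 32-bit MSVC builds.
#ifdef _MSC_VER
#define FORCE_LINK( name ) \
    __pragma( comment( linker, "/include:_" STRINGIZE( name ) ) )
#else
#define FORCE_LINK( name ) // no-op on other toolchains
#endif

bool g_registered = false;

// A registration object: its constructor runs at program start.
struct Registrator
{
    Registrator( ) { g_registered = true; }
};

// extern "C" prevents C++ name mangling so the symbol name is predictable.
extern "C" { Registrator g_registrator; }
FORCE_LINK( g_registrator )
```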

The above macro can be placed into a header and it will force the linker to keep the object. In order to force linking, the linker flag /INCLUDE needs to be used, and it requires the name of the symbol representing the object. In Visual Studio a single underscore is prepended to C-style symbol names, and a stringizing macro can be used to generate the name of whatever global variable you’re instantiating.

To further ensure that the macro works I tried to prevent any C++ name mangling by externing the object as a C style object.

I have made prolific use of this style of macro in my custom C++ reflection, in factory registration, and in some other things. Hopefully this trick can be of use to others as well.

Data Oriented Design : Starting Up

Personal Update

It has been some time since I have tended to my personal blog, so I feel like I’m coming back to visit my neglected pet as a young child. That sounds sort of strange, but I do feel a little bit of shame in procrastinating for so long! I have so many things to do; tech demos, update resume, refactor the Portfolio section of the website, write some more posts, write posts for GameDev.TutsPlus. So many things. Anyway, it feels good to have the time in the Summer semester to do all these things.

Data Oriented Design

Data oriented design is a newer way of thinking that seems to have come forth primarily from the contrast between the processing power of modern hardware and abysmally low memory access speeds. Both an average CPU and GPU are ridiculously powerful, and memory access patterns have been the cause of every performance bottleneck I have encountered in my programming endeavors thus far.

A natural tendency of a game engine during development is for over-complication to arise. Over-complicating a solution to a problem is a tell-tale sign of a bad designer, so in architecting a game engine simplicity is something to strive for. However simplicity alone needs to be coupled with performance and brevity (nobody likes thousands and thousands of lines of simple code, either). The ideal game engine is one that is highly efficient and simple. Out of these needs Data Oriented Design has arisen.

Benefits of Simplicity

Sure, a simple program can easily be read. It can also easily be modified and extended as a result of being read. This is a no-brainer for most of us; what may not be immediately realized is that the ability to read and modify simple code leads to something interesting: if one manages to solve a complex problem in a simple way, it can be acceptable for that solution to be highly specific or specialized.

A generic solution to any problem would of course be best, but oftentimes coming up with a generic solution takes too much time to be realistic. Even so, large amounts of time are often justified for creating generic solutions within complex systems: if you’re going to add code to a complex system it will inherently take time, and reusability becomes a necessity.

In this case a more specific, less reusable solution can suffice. If such a specific solution is incredibly simple, efficient and brief, then it suddenly becomes much less of a big deal that the solution is not generic. On top of this, if the solution is also self-contained and has no side effects, it can be parallelized trivially.

Benefits of DOD

As my various links describe, DOD provides simple, optimized and trivially parallelized code. If this is true, then DOD is something that every programmer should consider employing whenever optimization is a necessity.

Back to Data Oriented Design

So what is Data Oriented Design (DOD)? There are not very many solid articles on actually implementing DOD, but I can do a quick link roll of what I have collected so far:

I myself am currently writing large portions of a 3D game engine, and in doing so I want to make use of DOD in some key areas, namely physics, graphics, and game objects/components. So how would one actually use DOD in these areas?

I like to think of DOD like this: Keep all of your data in large contiguous arrays within memory. Then, whenever an operation needs to be performed on some of the data, think in terms of performing this operation on each piece of data sequentially, all in one traversal. Hardware accesses memory fastest when it is already in the CPU cache, so traversing memory linearly allows for the best cache performance, especially when prefetches are used (more on prefetches later).
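As a tiny sketch of this mindset, here is a hypothetical particle update with one contiguous array per field; the operation is a single linear sweep over memory:

```cpp
#include <vector>

// Each field lives in its own contiguous array ("structure of arrays").
struct Particles
{
    std::vector<float> x;  // positions
    std::vector<float> vx; // velocities
};

// One pass transforms the whole data set; ideal for the cache and
// trivial to parallelize, since each element is independent.
void Integrate( Particles &p, float dt )
{
    for (unsigned i = 0; i < p.x.size( ); ++i)
        p.x[i] += p.vx[i] * dt;
}
```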

Contrast such memory access with a traditional Object Oriented approach: an object updates each of its components, and each component has memory stored in an entirely different area. Therefore each game object, upon update, touches memory in physics, graphics, and AI related areas. The AI related area also likely touches back into the physics memory (many times). All of this memory jumping and cache missing is incurred many times per object, and every object does this. It would be much more efficient to update each segment of memory one at a time, rather than updating the abstract idea of a “game object” one at a time.

This last paragraph highlights the entire contrast between DOD and OOP. OOP focuses on identifying data with an abstract name and associating functionality with that name. DOD focuses on identifying what memory can be laid out contiguously, and what operations can be done on the entire set of data in a linear fashion. So in reality, implementing DOD is merely a matter of changing how you think when you write code.

I decided to store all my objects and components within a simple std::vector-like container called a SlotAllocator. I have no idea what the actual name for such a data structure is though. Growing the internal array requires memory copies from one location to another, so the ability to copy objects from one memory location to another should not break your program. Using array indices instead of pointers is a great way to implement such an idea.

Since DOD is all about transforming data sets from one state to another in as linear as possible a fashion, some specialized data structures are going to be needed.

DOD Data Structures


I have already mentioned my SlotAllocator, which is pretty much a std::vector with some slight modifications (or in other words, it is an array). The biggest difference is that whenever an object is deleted, instead of copying every subsequent object one index backwards, I take the rightmost object and swap it into the freed slot. This simple operation keeps all the active objects in the container contiguous in memory.
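The swap-with-last removal can be sketched like this (a deliberately simplified SlotAllocator; a real version would track free slots and patch up handles when the rightmost element moves):

```cpp
#include <vector>

template <typename T>
struct SlotAllocator
{
    std::vector<T> m_data;

    int Add( const T &value )
    {
        m_data.push_back( value );
        return (int)m_data.size( ) - 1; // the index acts as a handle
    }

    // O(1) removal: move the rightmost object into the freed slot so
    // the live objects stay contiguous. Note this invalidates the
    // index of the moved element -- handle translation fixes that.
    void Remove( int index )
    {
        m_data[index] = m_data.back( );
        m_data.pop_back( );
    }
};
```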

I will be trying this container out to store game objects and components. I do not currently have enough of a working engine all together in order to do meaningful performance tests, but that time will eventually come.


Another structure I ran into a need for is a hash table. I have two types of hash tables: intrusive and non-intrusive. Let’s start with the intrusive table.

This here is the best intrusive hash table I have ever seen. It was written by Patrick Wyatt, and is awesome. The reason an intrusive hash table is so great is that the node used to link an object into the table is stored within the object itself. This allows the data to be stored somewhere else in the program (in an array) with the node right there alongside it. A more traditional hash table that uses chaining resolution has to allocate nodes upon collision; in an intrusive scheme the node already exists within the object. I created my own intrusive hash table in the image of Patrick Wyatt’s.

However there is sometimes a need to hash something that cannot contain a node, for example pointers. If a need for hashing pointers arises, either a non-intrusive table must be used, or a wrapper object around both a node and the pointer must be created and inserted into an intrusive table. As such, a standard non-intrusive table is required.

Making use of the aforementioned SlotAllocator, I fashioned a simple table with chaining resolution. The nodes themselves are stored in a contiguous array (due to the nature of the SlotAllocator), and instead of pointers I use array indices. Traversing the internal chains is a matter of jumping around with array accesses. This allows the table to grow and shrink to accommodate different load factors as desired.
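A sketch of such an index-chained table, simplified to int keys with no removal or rehashing (keys are assumed non-negative; my real version sits on top of the SlotAllocator):

```cpp
#include <vector>

struct IntTable
{
    struct Node
    {
        int key;
        int value;
        int next; // index of the next node in this chain, -1 if none
    };

    std::vector<int> m_buckets; // index of the first node per bucket
    std::vector<Node> m_nodes;  // all nodes, contiguous in memory

    explicit IntTable( int bucketCount )
        : m_buckets( bucketCount, -1 )
    {
    }

    void Insert( int key, int value )
    {
        int bucket = key % (int)m_buckets.size( );
        Node node = { key, value, m_buckets[bucket] };
        m_nodes.push_back( node );
        m_buckets[bucket] = (int)m_nodes.size( ) - 1;
    }

    // Walk the chain by hopping between array indices, not pointers.
    bool Find( int key, int *outValue ) const
    {
        int bucket = key % (int)m_buckets.size( );
        for (int i = m_buckets[bucket]; i != -1; i = m_nodes[i].next)
        {
            if (m_nodes[i].key == key)
            {
                *outValue = m_nodes[i].value;
                return true;
            }
        }
        return false;
    }
};
```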

Does such a table actually have performance benefits over slightly different styles? I’m sure the contiguous memory is a great plus, but actual profiling can’t be done just quite yet.


The last thing I would like to discuss is the handle manager. It is perfect for managing anything in pointer form that needs to be created and destroyed regularly, which is just about everything active in a game. The handle manager translates a 32 bit integer (handle) into a pointer. Handles can be passed around the engine freely, and when a pointer is required the handle is translated into the desired pointer. The translated pointer is then used and immediately thrown away, not stored. This continual translation from handle to pointer allows the actual pointer address within the manager to be changed, and all subsequent translations will retrieve the updated address.

Implementation is a matter of creating a bit field within the 32 bits. A good explanation and example implementation can be found on Noel Llopis’s blog. I myself created my own version of his handle manager with some tweaks and the like.
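The bit field itself might be sketched like so; the field widths are illustrative, and Llopis’s version differs in detail:

```cpp
#include <cstdint>

// A 32-bit handle: an index into the manager's pointer table, plus a
// counter that detects stale handles after a slot has been reused.
struct Handle
{
    uint32_t m_index   : 16; // which entry in the manager
    uint32_t m_counter : 14; // bumped every time the entry is reused
    uint32_t m_type    : 2;  // optional type bits
};
```

On translation, the manager checks that the handle’s counter matches the counter stored in the slot; a mismatch means the slot was reused and the handle is stale.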

Moving Forward

Now that the preliminary data structures have been created it is time for me to really start using them within our 3D game engine. Future posts will likely contain all the juicy details of such ventures! Wish me luck.