Essentials of Software Engineering – With a Game Programming Focus

I’ve been commissioned to write a big document about what I think it means, and what is necessary, to become a good software engineer! If anyone finds the document interesting please do email me (email on my resume) with any questions or comments, even if you hate it and deeply disagree with me! This is the first release, so some sections may be a bit skimpy and there may be a bit of false info here and there.

I’m a fairly young engineer so please do take all my opinions with a healthy dose of skepticism. Nonetheless I hope the document is useful for someone! Many of my opinions come from my interpretation of much more experienced engineers I’ve come in direct contact with, and this PDF is in many ways a distillation of what others have taught me.

As time goes on I may make edits or add new chapters; when I do I’ll bump the version number (just above the table of contents). Here is the PDF:

Download (PDF)


Preprocessed Strings for Asset IDs

Mick West posted up on his site a really good overview of some different methods for hashing string ids and gave good motivation for optimizing this area early on in a project. Please do review his article as it’s a prerequisite to this post, and his article is just really good.

I’ve been primarily concerned with memory management of strings as they are extremely hairy to work with. For me specifically I’ve ruled out the option of a string class — they hide important details and it’s too easy to write poor (but functional) code with them. This is just my opinion. Based on that opinion I’d like to achieve these points:

  • Avoid all dynamic memory allocation, or de-allocation
  • Avoid complicated string data structures
  • Little to no run-time string traversals (like strcmp)
  • No annoying APIs that clutter the thought-space while writing/reading code
  • Allow assets to refer to one another while on-disk or in-memory without complicated pointer translations

There’s a lot I wanted to avoid here, and for good reason. If code makes use of dynamic memory, complicated data structures, etc., that code is likely to suck in terms of both performance and maintenance. I’d like fewer features, less code, and some specific features to solve my specific problem: strings suck. The solution is to not use strings whenever possible, and when forced to use strings hide them under the rug. Following the example Jason Gregory outlined about Naughty Dog’s code base in his book “Game Engine Architecture 2nd Ed”, I implemented the following solution, best shown via gif:

[Animated gif: the SID preprocessor hashing strings in a source file]

The SID (string id) is a macro that looks like (along with a typedef):
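A minimal sketch of the idea (the exact names and hash function here are illustrative, not necessarily what I use):

#include <stdint.h>

// String ids are just 32-bit hashes.
typedef uint32_t SID_t;

SID_t HashString( const char* str ); // djb2, FNV-1a, or similar

// Until the custom preprocessor runs, SID simply hashes at run-time. The
// preprocessor pass replaces each instance with the precomputed constant.
#define SID( str ) HashString( str )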

I’ve implemented a preprocessor in C that takes an input file, finds the SID macro instances, reads the string, hashes it and then inserts the hash along with the string stored as a comment. Pretty much exactly what Mick talked about in his article.
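After the pass a line of source ends up looking something like this (the output format and hash value shown here are illustrative):

// Hand-written source:
SID_t animID = SID( "player_walk" );

// After the preprocessor runs: the literal is the precomputed hash, and the
// original string survives only as a comment for human readers.
SID_t animID = 0x5AD33C29; /* "player_walk" */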

Preprocessing files is fairly easy, though maintaining this preprocessing step might be pretty difficult, especially if there are a lot of source files that need to be preprocessed. Modifying build steps can be risky and sink a ton of time. This is another reason to hammer in this sort of optimization early on in a project: the benefits are reaped for a longer period of time, and heavily laid-in-stone systems don’t have to be adjusted after the fact.

Bundle Away the Woes

I’ve “stolen” Mitton’s bundle.pl program for use in my current project to recursively grab source files and create a single unified cpp file. This large cpp can then be fed to the compiler and compiled as a “unity build”. Since the bundle script looks for instances of the “#include” directive, code can be written in almost the exact same manner as normal C++ development. Just make CPP files, include headers, and don’t worry about it.

The only real gotcha is if someone tries to do fancy inclusions by defining macros outside of a file that affect which files get included. Since the bundle script only looks for the #include directive (by the way, it also comments out unnecessary code inclusions in the output bundled code) and isn’t running a full-blown C preprocessor, this can sometimes cause confusion.

It seems like a large relief on the linker and leaves me thinking that C/C++ really ought to be used as single-translation-unit languages, while leaving the linker mostly for hooking together separate libraries/code bases…

Compiling code can now look more or less like this:

First collect all source into a single CPP, then preprocess the hash macros, and finally send the rest off to the compiler. Compile times should shrink, and I’ve even caught wind that modern compilers have an easier time with certain optimizations when fed only a single file (rumors! I can’t confirm this myself, at least not for a while).

Sweep it Under the Rug

Once some sort of string id is implemented in-game, strings themselves don’t really need to be used all that often from a programmer’s perspective. However for visualization, tools, and editors strings are essential.

One good option I’ve adopted is to place strings for these purposes into a global table in designated debug memory. This table can then be turned off or compiled away whenever the product is released. The idea I’ve adopted is to allow tools and debug visualization to use strings fairly liberally, albeit stored inside the debug table. The game code itself, along with the assets, refers to identifiers in hash-form. This allows product code to perform tiny translations from fully hashed values to asset indices, which is much faster and easier to manage compared to strings.
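One possible shape for such a table, as a rough sketch (the names, sizes, and linear lookup are all illustrative):

#include <stdint.h>

typedef uint32_t SID_t;

#ifdef ENABLE_DEBUG_STRINGS

// Debug-only table living in designated debug memory; compiled away in release.
#define MAX_DEBUG_STRINGS 4096

static SID_t       g_debugHashes[ MAX_DEBUG_STRINGS ];
static const char* g_debugStrings[ MAX_DEBUG_STRINGS ];
static int         g_debugStringCount;

void RegisterDebugString( SID_t hash, const char* str )
{
  g_debugHashes[ g_debugStringCount ] = hash;
  g_debugStrings[ g_debugStringCount ] = str;
  ++g_debugStringCount;
}

const char* DebugString( SID_t hash )
{
  for ( int i = 0; i < g_debugStringCount; ++i )
    if ( g_debugHashes[ i ] == hash )
      return g_debugStrings[ i ];
  return "<unknown sid>";
}

#else

// In release builds the table, and the strings themselves, simply vanish.
#define RegisterDebugString( hash, str ) ((void)0)
#define DebugString( hash ) "<stripped>"

#endif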

This can even be taken a step further; if all tools and debug visualizations are turned off and all that remains is a bunch of integer hash IDs, assets can then be “locked” for release. All hashed values can be translated directly into asset IDs such that no run-time translation is ever needed. For me specifically I haven’t quite thought through how to implement such a system, and decided this level of optimization does not really give me a significant benefit.

Parting Thoughts

There are a couple of downsides to doing this style of compile-time preprocessing:

  • Additional complexity in the build-step
  • Layer of code opacity via SID macro

Some benefits:

  • Huge optimization in terms of memory usage and CPU efficiency
  • Can run switch statement on SID strings
  • Uniquely identify assets in-code and on-disk without costly or complicated translation

If the costs can be mitigated through implementing some kind of code pre-processor/bundler early on then it’s possible to be left with just a bunch of benefits :)

Finally, I thought it was super cool how hashes like djb2 and FNV-1a use an initializer value to start the hashing, typically a carefully chosen prime. This allows one to hash a prefix string and then feed the result in as the initializer for hashing the suffix. Mick explains in his article how this idea of combining hashed values is a useful feature for supporting tools and assets. This can be implemented at either compile- or run-time (though I haven’t quite thought of a need to do this at compile-time yet):
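Here’s a small sketch of the run-time version using FNV-1a (the constants are the standard 32-bit FNV offset basis and prime; the chaining usage at the bottom is the interesting part):

#include <stdint.h>

// FNV-1a, parameterized on the starting basis so that a previously computed
// hash can be fed back in to continue hashing a suffix.
uint32_t FNV1a( const char* str, uint32_t basis = 0x811C9DC5u )
{
  uint32_t hash = basis;
  while ( *str )
  {
    hash ^= (uint32_t)(unsigned char)*str++;
    hash *= 0x01000193u;
  }
  return hash;
}

// Hashing "player_" and then feeding the result in to hash "walk" yields the
// same value as hashing "player_walk" in one go:
//   uint32_t prefix = FNV1a( "player_" );
//   uint32_t full   = FNV1a( "walk", prefix ); // == FNV1a( "player_walk" )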

 

C++ Keyword inline and .inl Files

While at the bar a group of friends jokingly mocked some of the more silly features of C++. The initial banter consisted of how the STL implemented everything including the kitchen sink, though it forgot to implement std::girlfriend.

Wouldn’t std::girlfriend be great? We can plug in any type of girlfriend we want into the template parameters and the compiler will just generate one for us! Why in the world would std::girlfriend be omitted from the STL?

Oh of course, std::girlfriend was never implemented because everyone is just going to put in way too many specific template types (super hot, not crazy) and it’ll just end in a bunch of “failed to specialize template” error messages. And then the moment too many of the template parameters are removed we’ll just get a bunch of “multiple symbols defined” linker errors! Maybe it was a good idea to never implement std::girlfriend in the first place. After all, a girlfriend prefixed with std might make one think of something other than C++…

Jokes aside I brought up the fact that inline is totally useless for inlining. The only real reason to use the inline keyword (in my opinion) is to be able to define functions within a header. Well, I brought it up as a joke, but not really a joke, and that’s the joke.

The inline keyword and .inl files can actually be a pretty nice organizational tool for code, and I’ve found it helps users that didn’t write the implementation understand the code.

Say we are implementing some kind of algorithm that stores elements in an array. Elements need to refer to one another (perhaps to build intrusive linked lists), although these arrays ought to be relocatable in memory without requiring any complex copy routines; a single memcpy should yield a new and valid copy.

One way to do so is to make use of array indices instead of pointers. A myriad of small helper functions usually arises to clean up all of the array indexing that ensues shortly after this kind of code crops up. It’s a huge pain to look into a .cpp and have to continually navigate past a lot of tiny and trivial helper functions just to understand the algorithm.

These small helpers can be swept to the side into a .inl file. The .inl file signature immediately tells the user what kind of code resides within (either templates or inlined functions), and usually this kind of code isn’t very necessary to understand the more heavy duty code within the .cpp file.

Here’s a mock example:
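(The file names and the toy data structure below are made up purely for illustration.)

// ---- list.h : the public interface; the heavier routine lives in list.cpp ----

#include <stdint.h>

struct NodeList
{
  int32_t* next;     // index-based links, -1 means "none"
  float*   values;
  int32_t  capacity;
  int32_t  head;
};

void InsertSorted( NodeList* list, float value ); // the interesting algorithm

// ---- list.inl : trivial index helpers, swept out of the way ----

inline int32_t Next( const NodeList* list, int32_t index )
{
  return list->next[ index ];
}

inline float ValueAt( const NodeList* list, int32_t index )
{
  return list->values[ index ];
}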

Aren’t these example files pretty easy to read? I’m sure you at least scanned the .inl file briefly, and will probably never really need to look at it again. Time will be well spent in the .cpp file with less code to clog your brain. And who knows, maybe the compiler (or perhaps the linker) actually cares a little bit when we type the inline keyword.

Sane Usage of Components and Entity Systems

With some discussion going on in a previous article about how to actually implement some sort of component system for a game engine, without vague theory or dogma, a need for some higher level perspective was reached, and so this article arose.

In general an aggregation model is often useful when piecing together bits of functionality or data to create something new. The ability to do so is very useful for writing game-specific gameplay code due to the flexibility granted by aggregation. However as of late there’s been tremendous talk about OOP, Entity Systems, Inheritance, and blah blah blah within the online indie development community. More and more buzzwords get tossed around by big name writers and the audience really just looks for some guidelines to follow in hopes of writing good code.

Sadly there isn’t going to be a set of step by step rules for writing a game engine or coming up with a good architecture. Like many have said before me, writing a game is a specific task requiring specific solutions. Why do you think game engine developers such as Epic or the Unity guys have so many people working on the product? Because a generic game engine is a huge piece of software that requires a lot of features. Some features exist simply to let users add in custom features easily.

Components, aggregation, Entity Component Systems, Entity systems, these are just words and have various definitions (depending on who you ask).

Some Definitions

To hopefully avoid silly arguments and confusion let’s define some terms. If you don’t like the definitions here feel free to say so; I’m all up for criticism and debate.

    • Component Based Architecture
      • A preference for aggregation over inheritance. Is just a concept and does not lead to a single specific implementation. A game object is a collection of components. A component defines data and/or functionality for a concept.
    • Entity Component System (ECS)
      • A specific implementation of Component Based Architecture. A game object would be an ID (an integer). The ID is used to form an aggregate. Usually an ECS implies an implementation similar to a database, where components are entries into a database that are looked up through some identifier. The main goals of this implementation are efficiency and simplicity. Often times the term “ECS” is used just to describe a Component Based Architecture, often leading to confusion.
    • Aggregation
      • I like to think of this as a “has-a” relationship over an “is-a” relationship. Aggregation refers to one object “having” another object, which implies an aggregate is a collection (data structure) of other objects.

Some Truth and History

Aggregation is useful from a game design perspective. It frees functionality from arbitrary classification (classes and inheritance). Classes were originally created in C++ to let a programmer tie together a piece of data and some functionality to represent some sort of real-life concept. This is in simplest terms the essence of Object Oriented Programming (OOP). Over time more features were added to help engineer relationships between classes, one such feature came in the form of inheritance.

There’s nothing inherently wrong with OOP and it makes sense in a lot of code. Problems can arise when OOP is misapplied in ways whose implications aren’t fully understood at the time of implementation, causing negative effects down the road. I’m sure we’ve all seen the code migration and mega-class example so commonly thrown around in articles arguing against OOP and inheritance abuse.

In response to such an abuse a new paradigm became popularized which focused on aggregation of functionality to form an object. This might be called a “component based architecture”. In general aggregation can be considered an appropriate alternative to inheritance.

OOP Diatribe

Usually when an article spews forth caustic attacks against OOP it’s directed at naive implementations that disregard implications of how memory is accessed. Perhaps in the past the bottleneck of most everything was processor speed, so a lot of literature focuses on this. Nowadays CPUs on the PC have an architecture with ridiculous computational power but extremely limited memory access speed. In general one might consider accessing memory from RAM 300 times slower than multiplying two floats together. Of course this last statement is extremely anecdotal without any evidence, but exists just to give a rough perspective of reality in many current (2014) cases.

If objects with associated code (classes) are just allocated and deallocated on the heap at will then a performance bottleneck of memory access is going to rear its ugly face, likely long before other performance issues are even on the radar. This is where much of the diatribe comes from.

It should be noted that pretty much all code bases that make use of the C++ language use classes and structures in some form or another. As long as a programmer has an understanding of memory, how it’s accessed, and what implications arise from given implementations, nothing will go wrong. Alas, actually doing these things and writing good code is super hard. It doesn’t matter if a class has some implementation code within it, so long as that bit of code makes sense for the purposes it is serving.

Implementing Components, a First Draft

The most immediate implementation would be to make use of multiple inheritance. This has a clear definition of where the data goes, and it all goes in one class: the derived class. Multiple inheritance itself can get a bit tricky when dealing with pointer typecasting between derived and base types, though the C++ language itself handles the details much of the time.

Inheritance alone doesn’t provide a good mechanism to query whether a base class is a part of a specific derived aggregation, and so the dynamic cast operator is born. Since the dynamic cast is a branching operation, usually implemented (afaik) by inspecting the vtable, it is avoided in general.

Multiple inheritance also does all sorts of work to member function pointers, and is just a sad part of C++. Additionally there isn’t any language feature that allows for dynamic dispatch for combinations of base classes, so if the need arises a custom solution will need to be implemented anyway.

Memory accessing, although defined, isn’t ideal. Multiple inheritance forms a blob of different data, and usually only a single piece of the blob is needed at any given time, meaning locality of reference will be poor in general. This leads to the idea of inheriting from multiple interfaces in order to decouple memory aggregation from functionality aggregation, which leads to the next draft.
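Before moving on, here’s a tiny sketch of this first draft (purely illustrative):

#include <cstdio>

struct Position   { float x = 0, y = 0; };
struct Renderable { int spriteID = 0; };
struct Collidable { float radius = 1.0f; virtual ~Collidable( ) { } };

// The aggregate is formed at compile-time by inheriting every piece of data.
struct Crate : public Position, public Renderable, public Collidable { };

void DrawIfRenderable( Collidable* object )
{
  // Querying "is this thing also a Renderable?" requires dynamic_cast, which
  // inspects run-time type information and branches.
  if ( Renderable* r = dynamic_cast<Renderable*>( object ) )
    std::printf( "drawing sprite %d\n", r->spriteID );
}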

Second Draft – Run-Time Aggregation

Instead of using multiple inheritance on interfaces, which is a compile-time feature, run-time support can be added. Object aggregates can be formed during run-time, and modified thereafter. This is appealing for data driven applications, and game-design friendly development iteration speed.

So let’s assume that some programmer wants to implement components, but doesn’t think much about memory access patterns or the implications therein. Using a vector of pointers, an implementation of components becomes super simple. Each pointer can point to an interface exposing a few functions like Update, Init and Shutdown.

Searching for a particular component is as simple as linearly looping over each pointer until a matching type is found. If these pointers are sorted in some way a faster search can be performed; perhaps a binary search could suffice. If the identifier of a component is hashable, a hash table lookup can be used.
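A rough sketch of this second draft (the interface and type id scheme are illustrative):

#include <vector>

struct Component
{
  virtual ~Component( ) { }
  virtual void Init( ) { }
  virtual void Update( float dt ) = 0;
  virtual void Shutdown( ) { }
  virtual int Type( ) const = 0; // identifier used for lookups
};

struct GameObject
{
  std::vector<Component*> components;

  // Linear search by type id; fine for small aggregates.
  Component* Find( int type )
  {
    for ( Component* c : components )
      if ( c->Type( ) == type )
        return c;
    return 0;
  }

  void Update( float dt )
  {
    for ( Component* c : components )
      c->Update( dt );
  }
};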

The implementation so far is an excellent one except that there is no definition of how memory is allocated and accessed! In the most naive of implementations each game object and each component will be allocated on the heap with separate calls to malloc.

Despite having no clear memory definition there are some nice benefits that have arisen. Data driving the composition of an aggregate becomes quite trivial as each component of an aggregation can have an entirely isolated lifetime. Adding, removing, modifying, or even creating new components at run-time are all now possibilities. This dynamic aggregate architecture is great for improving game development and design iteration time!

Aggregation and Components and the Entity System Paradigm (ES/ECS)

As stated in the definitions section, an ECS is just a specific implementation of a component based architecture. A component based game engine architecture on its own could be something as loose as the multiple inheritance draft from earlier. A clearly defined ECS can impose restrictions on how a component architecture is implemented and used in hopes of avoiding poor memory access patterns, or in hopes of keeping code simple and orderly.

If a component is designed as a piece of memory without any code, and a game object is defined as an integer ID, then performance specifications can be easily imposed. Rules about where in memory components lie, and how components are actually accessed, can be clearly defined in simple terms. Code can be written that operates upon arrays of components, transforming arrays linearly. This idea is actually a type of Data Oriented Design (DOD), which makes sense as DOD is just an idea! ECS is an application of the idea of DOD.
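A small sketch of what that looks like (the component and system here are illustrative):

#include <stdint.h>

typedef uint32_t EntityID;

// A component is pure data, no code; just a plain struct in a big array.
struct Velocity
{
  EntityID entity;
  float x, y;   // position
  float vx, vy; // velocity
};

// A "system" transforms a whole array linearly; memory access is predictable
// and friendly to prefetching and parallelism.
void IntegrateVelocities( Velocity* components, int count, float dt )
{
  for ( int i = 0; i < count; ++i )
  {
    components[ i ].x += components[ i ].vx * dt;
    components[ i ].y += components[ i ].vy * dt;
  }
}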

So with this type of implementation the benefits of dynamic composition can be paired with well-defined memory layout and access patterns. Suddenly prefetching and parallelism become much simpler to support.

Aggregatize all the Things!

There’s a problem. Blindly shoving the idea of an ECS implementation into every nook and cranny of an engine (or of any complex system, not just game engines or libraries) during development is just silly. Often times a particular system is not best implemented with a component or aggregate paradigm in mind.

An obvious case is that of a physics engine. Often times a physics engine developer is worried about collision detection, solving systems of linear equations, rigid body mechanics and allowing the engine to easily be integrated into existing code bases. These details involve a lot of math and good API design. A developer of a physics engine is going to have their focus employed in full force in solving problems specific to physics engines. This means that the engineer’s focus is finite, so the implementation that is best is one that the engineer can actually bring to completion. An implementation that can come to completion is one that makes sense for the specific details of whatever is going on inside the physics engine. The specific paradigms used are often not aggregation or component based!

In order for a physics engine to run fast it needs to have efficient memory access patterns and memory usage, which, on modern PC hardware, requires some form of DOD. Since this complex (often black boxed) physics engine will have its own specific implementation and optimization it doesn’t make sense to force a component based model to its very core with some sort of idealistic zeal. It gets really bad when strict rules are imposed (like banning all code from classes and structures that define components) on the component model (like with an ECS) and the rules start permeating the deep recesses of the entire code base.

The same thing goes for any sort of complex system. The core facilities of a game engine often times just don’t really care about components or aggregation. This means that an engine architecture that implements components will usually have to deal with middleware graphics/physics engines/libraries that don’t subscribe to a component based model (simply because it’s easier to use a library than to write your own custom things, especially if those custom things religiously follow some silly methodology like ECS or even OOP). In practice light wrapper components can be created to let the functionality of such systems be presented in a component format, ready to be used in an aggregate object.

What does this all mean? What should we all do?

Use components where it makes sense in code. Use inheritance where it makes sense in code. Use databases where they make sense. Use all the things where they should. This is a pretty sad answer but it’s the right one. There is no silver bullet paradigm that solves all the problems in the game engine architecture world, and there are no steps to follow to achieve a result that works in all cases. Specific problems require specific solutions. Good code is hard to write, and will require a lot of judgement calls. In order to make good judgement calls a lot of experience and perspective is required.

I recommend using aggregation where it really matters. Dynamic aggregation is important for gameplay specific code. Gameplay specific code, in this article, would refer to code that would not easily apply or work at all in a different game. It’s code that is your game and doesn’t define an isolated system or functionality.

Dynamic aggregation and the component based model are extremely important for game and object editors. Game design flourishes best when iteration times are driven to zero, and the ability to create new things from a composition of fundamentals is very valuable! Clearly composition is useful, but how it’s to be used is the hard part.

What Components to Make?

I recommend making components that provide access to game-independent functionality quite large. Every 3D game engine has a concept of a mesh, and will usually have some sort of file format to associate with it, like FBX. Every 2D game engine will have the concept of a sprite. Each game using Box2D will have colliders and rigid bodies, and possibly joints. These fundamental pieces of functionality don’t change very often, so static compile-time relationships aren’t a bad thing since iteration time isn’t really all that relevant.

A 3D game might have a single Mesh component for example. A Mesh component can have renderable vertices, and possibly all the skeletal and animation information as well. There may be a single Rigid Body component, which encapsulates the idea of colliders or shapes, as well as the functionality of rigid body mechanics. The Rigid Body component might even contain all necessary code and data to hold multiple joints! Or joints may be a component themselves.

For high level and gameplay related features components can become much more granular (or not if you so choose). Gameplay should be iterated, tested and changed frequently, so having small and decomposed components will probably make a lot of sense in a lot of cases. Large components that encompass more broad ideas will be useful in many cases too. Even in the gameplay world judgement calls are essential.

Usually efficiency isn’t so important for much gameplay code, so any implementation that is decently performant will suffice. Scripting languages, dynamic memory allocation and virtual dispatch, or what have you can all work. The decisions of what requires flexibility, what requires performance and all between can be difficult to make. Please see the references section for some concrete examples.

Further Readings

We live in a world of opinions and it takes time to sift through them! If you have recommendations please comment below :)

Reference Source Code

The best reference I know of is an open source game engine in progress (stalled until I graduate) that I myself am developing. Please do send me your recommendations on references!

Protip: How to be a Physics Dev – Debug Rendering

Today I figured out the single most important tip I could ever give anyone in regards to being a competent physics engine developer. I will share this tip right here on my humble blog:

Your skills as a physics developer will grow at a rate proportional to how good your debug draw information is. I’d actually argue that your skills as a physics developer are proportional to how good your debug draw is.

I feel I’m at a decent place in my skills, as my debug draw has become decent. Let’s get some screenshots going!

Initial simplex during Quick Hull.

Stack of polyhedron (OBB meshes). The blue and red contacts are warm started (red is warm started consistently over many frames).

Spent a lot of time creating Maya-like camera controls with orbiting.

I digress. I didn’t figure out this simple fact on my own; I gleaned this wisdom off of my friend Nathan Carlson. He pointed out that the reason a lot of students at DigiPen have trouble with computational geometry and general physics engine development is a lack of debug rendering.

Just take a look at Dirk Gregorius’s sample demo from his GDC 2013 lecture. He fully renders Gauss maps, a Minkowski difference, and has glorious camera controls. I actually copied his camera design and implemented it myself. You can see that he has AntTweakBar integrated as well, for various tweaking tasks. Erin Catto has lots and lots of different tests, asserts, and most important of all tons of debug drawing for his Box2D library. The Zero team on the third floor of DigiPen definitely has the craziest debug draw I’ve ever heard of!

Hopefully I can achieve something at least in the likeness of all these other developers, and I recommend readers do the same; intuition is often best built visually when it comes to geometry.

Component Based Engine Design

What is Component Based Design?

Component based engine design was originally pioneered in order to avoid the annoying class hierarchies that inheritance introduces. The idea is to package all functionality of game objects into separate objects. A single game object is just a collection of components, and so the components of a game object define its behavior, appearance and functionality. This is perfectly fine, and there are plenty of resources out there that talk about this topic. However I’d like to take a step back and start from the top.

It should be noted that the implementation presented here is just one way of going about things, and comes directly from my highly subjective opinion. Perhaps you as a reader can come up with or know of better solutions or designs.

Here is an example game object, note that the game object is generic and simply contains some components:
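(A bare-bones sketch; the real thing will grow a bit.)

#include <vector>

class Component; // Graphics, Physics, GameLogic components, etc. derive from this

// A game object is nothing but a generic bag of components.
class GameObject
{
public:
  void AddComponent( Component* component )
  {
    m_components.push_back( component );
  }

private:
  std::vector<Component*> m_components;
};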


The Actual Engine

The engine of a game can be thought of as a manager of systems. As to what a system is, we’ll get to that later; for now think of a system as either Physics, Graphics or AI. The engine ought to have a main loop function, as well as an update function. The update function calls update on all contained systems in a specific order. The main loop function is just a small infinite loop that calls update.

Often times the main loop will deal with timestepping itself. Have a look at the linked article to learn more about proper timestepping.

It is important to have your engine expose the update function, as sometimes your engine will need to be compiled as a static library and linked to externally. In this case the main loop of your simulation may reside outside of the engine library altogether. A common usage I’ve seen for this sort of design choice is when creating an editor of some sort, perhaps a level or content editor. Often times these editors will have a viewport to preview the game, and in order to do so access to some sort of engine update function is necessary.

Beyond containing and calling update, the Engine also forwards global messages to all the systems. More on messaging later.


Singletons?

Creating more than one instance of an engine should never happen. The question of “should I make this a singleton” will sometimes arise. In my experience the answer is often no. Unless you’re working on a large team of programmers where the chances are high that some moron will make a stray instance of some Engine or System class, making things a singleton is just a waste of time. Especially if retrieving data from that singleton incurs a little bit of overhead.


Systems

Each system in the engine corresponds to one type of functionality. This idea is best shown by example, so here are the various systems I usually have in engines I work on:

  • Graphics
  • Physics
  • GameLogic
  • Windowing/Input
  • UI
  • Audio
  • ObjectFactory – Creates objects and components from string or integral ID

Each system’s primary functionality is to operate upon game objects. You can think of a system as a transform function: data is input, modified somehow, and then data is output. The data passed to each system should be a list of game objects (or of components). However a system only updates components on game objects, and the components to be updated are the ones related to that system. For example the graphics system would only update sprite or graphics related components.

Here’s an example header file for a system:
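(The class and method names below are illustrative stand-ins, not a prescribed interface.)

// GraphicsSystem.h
#pragma once

#include <vector>

class GameObject;
struct MSG; // see the messaging section below

class GraphicsSystem
{
public:
  void Initialize( void );

  // Called once per frame by the Engine, in a fixed order relative to the
  // other systems. The naive version receives the full game object list.
  void Update( std::vector<GameObject*>& objects, float dt );

  // Global messages forwarded by the Engine.
  void SendMessage( MSG* msg );
};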

The naive approach to a system update would be to pass a list of game objects like so:
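(A sketch, assuming an Engine class that owns its systems and reaches the ObjectFactory through a pointer; GetObjectList is an illustrative name.)

void Engine::Update( float dt )
{
  // Hand the factory's entire list of game objects to every system.
  std::vector<GameObject*>& objects = m_objectFactory->GetObjectList( );

  m_physics.Update( objects, dt );
  m_graphics.Update( objects, dt );
  m_gameLogic.Update( objects, dt );
}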

The above code is assuming the ObjectFactory contains all game objects, and can be accessed somehow (perhaps by pointer). This code will work, and it’s exactly what I started with when I wrote my first engine. However you’ll soon realize the folly involved here.


Cache is King

That’s right, those who have the gold make the rules; the golden rule. In a more serious sense, cache is king due to processing speed compared to memory access speed. The bottleneck in all engines I have ever seen or touched in the past couple years has been due to poor memory access patterns. Not a single serious bottleneck was due to computation.

Modern hardware computes very, very fast. Reaching for data in memory (RAM, not just hard disk) is orders of magnitude slower. So caches come to the rescue. A cache can be thought of, in a simplified sense, as a small chunk of memory right next to the CPU (or GPU). Accessing the cache memory is way faster than going all the way out to main RAM. Whenever memory is fetched from RAM, memory around that location is also placed into the CPU cache. The idea here is that when you retrieve something from RAM the likelihood of needing to fetch something very nearby is high, so all the data in that area is grabbed all at once.

Long story short, if you place things that need to be accessed at around the same time next to each other in memory, huge performance benefits will be reaped. The best performance comes from traversing memory linearly, as if iterating over an array. This means that if we can stick things into arrays and traverse these arrays linearly, there will be no faster form of memory access.

Fetching memory that does not exist in the cache is called a cache miss.


Cache and Components

Let’s revisit the naive approach to updating systems. Assuming a system is handed a generic game object, that system must then retrieve its corresponding component(s) to update, like so:
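(GetComponent, COMPONENT_GRAPHICS, and GraphicsComponent below are illustrative names, not a specific API.)

void GraphicsSystem::Update( std::vector<GameObject*>& objects, float dt )
{
  for ( unsigned i = 0; i < objects.size( ); ++i )
  {
    // Touch the game object's memory, then hop to wherever its graphics
    // component happens to live on the heap.
    GraphicsComponent* sprite =
      (GraphicsComponent*)objects[ i ]->GetComponent( COMPONENT_GRAPHICS );

    if ( sprite )
      sprite->Draw( dt );
  }
}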

As this loop is run the memory of every game object and every component type that corresponds to the system is touched. A cache miss will likely be incurred over and over as the loop runs bouncing around in memory. This is even worse if the ObjectFactory is just creating random objects with new, as every memory access to every object and every component will likely incur cache misses.

What do? The solution to all these memory access problems is to simplify the data into arrays.


Arrays of GameObjects + Components

I suggest having every game object exist within a single giant array. This array should probably be contained within the ObjectFactory. Usage of std::vector for such a task is recommended. This keeps game objects together in memory, and even though deletion of a game object is of O(n) complexity, that O(n) operation traverses an array, and usually will turn out to be unnoticeable. A custom vector or array class can be created that avoids the O(n) operation entirely by taking the element at the end, and placing it into the deleted slot. This can only be done if references into the array are translated handles (more on handles momentarily).
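The swap-with-last trick looks something like this (a sketch):

#include <vector>

// Remove the element at 'index' in O(1) by overwriting it with the last
// element. Only safe when outside references are handles that get
// re-translated, since the moved element changes position.
template <typename T>
void SwapRemove( std::vector<T>& array, unsigned index )
{
  array[ index ] = array.back( );
  array.pop_back( );
}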

Every component type should be in a giant array too. Each component array should be stored within its respective system (but can be “created” from the Factory). Again, an array-like data structure would be ideal.

This simple setup allows for linear traversal of most memory in the entire engine, so long as the update function of each system is redesigned slightly. Instead of handing a list of game objects to each system, the system can just iterate over its related components directly, since the components are stored within the systems already.


So, how are Game Objects “handled” now?

Since components have been moved into large arrays, and the game objects themselves are in a big array, what exactly should the relationship between a game object and a component be? In the naive implementation some sort of map would have worked perfectly fine, as the memory location of each component could be anywhere due to the use of new calls. However the relationship can no longer be so carefree.

Since things are stored in arrays it’s time to switch from pointer-centric relationships to handle based relationships. A handle can be thought of, in its simplest form, as an index into an array. Since game objects and components are stored in arrays, it is only natural that to access a game object you do so by index. This allows these giant arrays to grow and shrink as necessary without obliterating dangling pointers in the rest of the program.

Here’s a code example:
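(A sketch; the component type enum is illustrative.)

#include <stdint.h>

typedef int32_t Handle; // index into a component array; -1 means "no component"

enum ComponentType
{
  COMPONENT_TRANSFORM,
  COMPONENT_GRAPHICS,
  COMPONENT_PHYSICS,
  COMPONENT_COUNT
};

struct GameObject
{
  // One handle slot per component type; by design each object can hold at
  // most one component of any given type.
  Handle components[ COMPONENT_COUNT ];
};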

As you can see, an array of handles is stored to represent the containment of components. There is one slot in the array for each type of component. By design this limits each component to be of unique type within a game object. Each handle is an index into a large array of components. This index is used to retrieve components that correspond to a game object. A special value (perhaps -1, or by some other mechanism) can be used to denote “no component”.

Handles can get quite a bit more versatile than just a plain ol’ integer. I myself created a HandleManager for translating an integer into a pointer. Here’s a great resource for creating your own handle manager.

The idea of translating a handle into a pointer is such that once the pointer is used it is not kept around. Just let it be reclaimed back into the stack. This makes it so that every time a pointer is required there is a translation of the handle to a single pointer somewhere in memory. This constant translation allows the underlying pointer to be swapped for another pointer at any time without fear of leaving dangling pointers behind.
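The usage pattern ends up looking something like this (HandleManager, Get, and the component type are illustrative stand-ins, not a specific API; Handle is the array-index type from the sketch above):

struct HealthComponent { float hitPoints; };

struct HandleManager
{
  void* Get( Handle handle ); // translates a handle into a raw pointer
};

// Translate on every use and never cache the raw pointer; the manager is then
// free to relocate the underlying component whenever it likes.
void ApplyDamage( HandleManager& handles, Handle target, float amount )
{
  HealthComponent* health = (HealthComponent*)handles.Get( target );

  if ( health )
    health->hitPoints -= amount;

  // The pointer goes out of scope here and is never stored anywhere.
}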


Where does the Code go?

Code for update routines can be put into either systems or components. The choice is yours entirely. A more data oriented approach would put as much code into systems as possible, and just use components as buckets of data. I myself prefer this approach. However once you hit game logic components virtual functionality is likely to be desired, and so code will likely be attached directly to such components.

The last engine I built used the naive approach to component based design, and it worked wonderfully. I used a block allocator so cache misses weren’t as high as with raw new calls.

The point is, do what makes most sense and keep things simple. If you want to store routines within your components and use virtual function calls, then you’ll probably have trouble storing things in an array unless you place all memory in the base class. If you can externalize as much code from your components as possible, it may be simpler to keep all your components in an array.

There is a tradeoff between flexibility and efficiency. My personal preference is to store performance sensitive components in huge arrays, and keep AI and game logic related things together in memory as much as possible, but not really stress too much about it. Game logic and AI should probably just be rather flexible, and so details about memory locations aren’t so important. One might just allocate these types of components with a block allocator and call it good.


Messaging

The last major devil in an engine is messaging. Messaging is transferring data from one location to another. In this sense the most basic form of messaging is a simple function call.

Taking this a step further, we’d like to be able to send messages over a connection in a generic manner. It should not matter what type of message we send; all messages should be sent the same way to reduce code duplication and complexity. The most basic form of this is dynamic dispatch, or virtual function calls. An ID is introduced to the message so the internal data can be typecasted to the correct type.

Still, we can do better. Let’s imagine we have a sort of GameLogic component or system. We need a way to send a message to this object that contains some data. Let’s not focus much on memory access patterns, as simplicity and flexibility are key here. Take a look at this code:
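(A sketch; the treasure chest, its handler functions, and the message ids are illustrative stand-ins.)

enum MessageType
{
  eNearby,   // "I'm here", broadcast by the player as it walks around
  eActivate, // the player pressed the "e" button
  eDamage
};

struct MSG
{
  MessageType type;
};

struct NearbyMSG : public MSG
{
  float playerX, playerY; // where the player currently is
};

struct TreasureChest
{
  void SendMessage( MSG* msg );
  void GlimmerAt( float x, float y );
  void Open( void );
};

void TreasureChest::SendMessage( MSG* msg )
{
  switch ( msg->type )
  {
  case eNearby:
  {
    // Glimmer brighter the closer the player gets.
    NearbyMSG* nearby = (NearbyMSG*)msg;
    GlimmerAt( nearby->playerX, nearby->playerY );
    break;
  }

  case eActivate:
    Open( );
    break;

  default:
    break; // every other message type is safely ignored
  }
}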

This code highlights the usage of messaging quite well. Say the player emits some messages as it walks around, perhaps something like “I’m here” to all nearby things in a level. The player can blindly send these messages through the SendMessage function without caring about whether or not the receiver will respond or do anything. As you can see, the implementation of the SendMessage function ignores most message types and responds to a few.

In this example when the player nears the treasure chest it will glimmer a bit. Perhaps when the player gets closer it glimmers brighter and brighter. In order to do so, the MSG object sent to the treasure chest ought to contain the player coordinates, and so it can be typecasted to the appropriate message type.

The eActivate message may be emitted by the player whenever they hit the “e” button. Anything that could possibly respond to an eActivate message will do so, and the rest of the objects receiving the message will safely ignore it.

This type of messaging is simple and easy to implement, quite efficient (if a block allocator or stack memory is used for the messages), and rather powerful.

A more advanced version of messaging makes heavy use of C++ introspection. A future article will likely be devoted to this topic, as it’s a hefty topic altogether. Edit: Here’s a link to a slideshow I presented at my university.


Conclusion and Resources

This was a rather whirlwind tour through a lot of different information; I hope it all came out coherent. Please don’t hesitate to ask any questions or add comments; I may even just update this post with more information or clarifications!

Resources:

Game Program Structuring/Design: Game State Manager

Not too long ago I created my first game from scratch in pure C. I struggled most with program design. I’d like to share what I’ve learned about a proper game state manager.

A game state manager is a method of creating a highly organized main game loop, which is generalized to the point that it can apply to any situation without pre-runtime modification of code. The game should always be in a state during runtime. That state could be the main menu, a pause screen, or a specific level or scenario. A state is divided up into multiple categories, and these categories are generalized enough to apply to every state the game can be in.

The idea is to use function pointers, and a specific set of them. There are six different functions to know:
  • Load
  • Initialize
  • Update
  • Draw
  • Free
  • Unload
These are the six functions that will make up your main loop, and once constructed your main loop should need little to no modification throughout the development of your game. Each function represents a category of functionality during a state. Every state consists of these categories. Each state has its own six functions for the main loop’s function pointers to point to. This way your main loop will simply point to different states with its six pointers whenever you want to switch from one state to another.
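In C this boils down to a handful of globals; a minimal sketch (the typedef and names are illustrative):

/* One function pointer per category. GSM_Update re-points all six whenever
   currentState changes; each state supplies matching functions such as
   Level1_Load, Level1_Initialize, and so on. */
typedef void (*StateFunction)( void );

StateFunction Load;
StateFunction Initialize;
StateFunction Update;
StateFunction Draw;
StateFunction Free;
StateFunction Unload;

/* The three state indicators used by the main loop and GSM_Update. */
int previousState;
int currentState;
int nextState;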

The load function takes care of loading all of a state’s necessary data. Load should be called only once per state, not again even if the state restarts. Load also initializes the loaded data. This function is called first when a state starts.

The initialize function prepares the state’s data to be used initially. It should not load any data, only prepare it. This allows a fast restart of a state in the event a restart is required: no loading or unloading is involved in a state restart.

The update function uses a change in time (dt) to hand off to the necessary functions to update the game in realtime. dt is defined as the time elapsed since the last call to update. Input should be gathered once per update call before calculations are made. All gameplay logic should happen in this function, and all live objects should be updated here.

The draw function renders all required images onto the screen, and additionally plays any desired sound effects. A well organized program will send data off to a graphics manager in this function, allowing further decoupling of major system and logic components.

The free function frees any objects or data no longer required, and sets the state up for switching or restarting. No data is unloaded (image sources, sound sources, meshes, etc.). The idea is to set everything up to be initialized cleanly again.

The unload function is called during state termination, and unloads all data loaded by the load function. Here is an example of a properly set up game flow of a main loop:

Initialize system components
GSM_Initialize( firstState )
while not quitting
  if currentState is quitting
    nextState is Quit
  if currentState is restart
    currentState is previousState
    nextState is previousState
  else
    GSM_Update( )
    Load( )
  Initialize( )
  while currentState is nextState
    Update
      gather input
      get dt
      update game logic
    Draw( )
  Free( )
  if nextState is Restart
    previousState is currentState
    currentState is nextState
  else
    Unload( )
    previousState is currentState
    currentState is nextState
Unload

By analyzing the setup of the above game flow you should be able to see how it works. To change a state, you simply modify the global variables currentState and nextState. previousState is then kept on-hand automatically. GSM_Update is responsible for updating the function pointers Load, Initialize, Update, Draw, Free and Unload whenever a state is started. In the event the global variable currentState changes, these function pointers will then change to their appropriate values via a switch statement. This switch statement lies within the GSM_Update function. The switch runs on the value of currentState, and once it finds a match it assigns the function pointers in the main loop to the appropriate matching state. Here is an example of a GSM_Update function:

// Update the Game State Manager by syncing the three state indicators to their
// corresponding function pointers (all six of them).
int GSM_Update( void )
{
  switch(currentState)
  {
  case Level_1:
    Load = &Level1_Load;
    Initialize = &Level1_Initialize;
    Update = &Level1_Update;
    Draw = &Level1_Draw;
    Free = &Level1_Free;
    Unload = &Level1_Unload;
    break;
  case Level_2:
    Load = &Level2_Load;
    Initialize = &Level2_Initialize;
    Update = &Level2_Update;
    Draw = &Level2_Draw;
    Free = &Level2_Free;
    Unload = &Level2_Unload;
    break;
  case MapEditor:
    Load = &MapEditor_Load;
    Initialize = &MapEditor_Initialize;
    Update = &MapEditor_Update;
    Draw = &MapEditor_Draw;
    Free = &MapEditor_Free;
    Unload = &MapEditor_Unload;
    break;
  case Presentation:
    Load = &PresentationLoad;
    Initialize = &PresentationInitialize;
    Update = &PresentationUpdate;
    Draw = &PresentationDraw;
    Free = &PresentationFree;
    Unload = &PresentationUnload;
    break;
  /*case Template:
    Load = &Load;
    Initialize = &Initialize;
    Update = &Update;
    Draw = &Draw;
    Free = &Free;
    Unload = &Unload;
    break;*/
  case Restart:
    break;
  case Quit:
    break;
  }
  return RETURN_SUCCESS;
}

And there you have it: a proper organizational setup for managing game states. This sort of organization gives each state a universal format, which makes modifying, adding, and deleting states during development easy and time-efficient.