# Essentials of Software Engineering – With a Game Programming Focus

I’ve been commissioned to write a big document about what I think it means, and what is necessary, to become a good software engineer! If anyone finds the document interesting please do email me (email on my resume) with any questions or comments, even if you hate it and deeply disagree with me! This is the first release so some sections may be a bit skimpy and I might have some false info in there here and there.

I’m a fairly young engineer so please do take all my opinions with a healthy dose of skepticism. Nonetheless I hope the document is useful for someone! Many of my opinions come from my interpretation of much more experienced engineers I’ve come in direct contact with, and this PDF is sort of a creation of what others taught me.

As time goes on I may make edits or add new chapters, when I do I’ll modify the version number (just above the table of contents). Here is the PDF:

Share

# Circular Linked Lists and Branching

Since linked lists are such an essential topic I’ve taken some extra care to learn efficient ways of using them. The simplest kind of linked list to conceptualize is the singly linked list. There are tons of online resources for learning the basics about linked lists, so I’ll assume readers are familiar with the concept.

Here’s a quick mock header of some linked list nodes for reference:

In general singly linked lists are more complicated to manage once removal of nodes is required. Since no explicit prev pointer is stored in memory a temporary variable is often kept on the stack while traversing a singly linked list. This means more complicated code that clogs the user’s focus.

Even though a doubly linked list requires twice the memory they are usually still preferred over singly linked lists, even when a singly linked list could get the job done without any additional time complexity. Often times linked lists are useful in complex algorithms, and if there’s a chance to simplify the implementation of a complex algorithm by using a doubly linked list, then that chance is probably worth the taking.

When I first implemented a doubly linked list and tested its performance out against std::list I couldn’t quite get it to perform well.

Naive insertion and removal of list nodes often has to check for NULL pointers, which represent the front and back of the linked list. Here’s an example of what removal might look like to give you the idea of how many if-statements could be necessary (code not tested, I just typed it out here on the spot):

There are two if statements hit every single time this function is called. When the CPU comes across a branch is loads instructions based on which path of execution it deems most likely. This is called branch prediction. If this prediction is incorrect the loaded code must be unloaded, and then the appropriate code must be re-loaded.

This branch missing probably going to be a very fast CPU operation since executing code is almost definitely in the L-1 code cache. Despite it being fast modern CPU still operate through a pipeline, and branch misses can still garble up whatever pipelining is happening. In the end a branch miss is a performance hit, and should be avoided when appropriate.

A common linked list optimization is to use a dummy head and tail node. These nodes sit in memory along with the list data structure. Upon list initialization they point their next and previous pointers to one another, and NULL out the pointers to represent the front and back of the list.

With this optimization the only case that user nodes will ever encounter is the case in the first two if statements (assuming both were true). The removal code can now look something like (again, not tested):

This is one kind of optimization the std implements. After doing this myself my list performed evenly with the std’s implementation.

## Intrusive Lists

Intrusively linked lists invert the definition of what a node is. Traditionally a linked list node contains some data. An intrusive list has the data contain the node:

This scheme is nice since now nodes do not need to be allocated separately from the data. If the number of data elements is known, then the exact number of nodes needed can also be known.

C++ templates can be used to create a generic intrusively linked list implementation, able to define nodes inside of any data type. C macros can also be used to the same effect. In this way an intrusively linked list can be used in pretty much the same way a normal linked list is.

One major downside to intrusively linked lists is that they add in extra memory to your data. This can be a big deal if some code is very performance sensitive. If cache line utilization is important, then the percentage of data actually used in each line becomes important. Sometimes these pointers get in the way and clutter the lines. This cluttering is something to be aware of.

On the flip side many algorithms can run on arrays of data. Instead of storing explicit pointers to represent prev and next connections, indices into an array can be used. This can make entire data structures memcpy-able, or serializable just by dumping bits to a stream. Additionally, the pointers stored directly within data will often be accessed as the exact same time (depending on the algorithm), which results in very high cache line utilization.

It all depends on the scenario.

## Circular Lists (Sentinel)

When dealing with intrusive linked lists it can often be really weird to define where in memory dummy nodes would reside. Are we to create dummy pieces of Data? What if the algorithm needs lists to be constantly created and destroyed? What if the algorithm can have as many lists as there are nodes? Suddenly an algorithm might need twice as many dummy nodes as actual nodes!

It is possible to remove the dummy nodes in some cases. Data elements can be initialized to point to themselves. In this way each element is itself a doubly linked list with one node. To insert a second node is a matter of making both nodes point to each other. Inserting a third node should use the exact same code as inserting the second node (and not require any branching since NULL indices/pointers do not exist), and so on.

In many cases an intrusive circular doubly linked list (boy, isn’t that a mouthful) can be the perfect solution to a hard problem! I will leave it as an exercise to research or implement this circular style of linked list.

Another name for this type of list would be a “sentinel intrusive list”, where a sentinel node can be used to bound a list traversal. Since our linked lists are circular we can start at any node, traverse the list, and once we reach the node we started upon our traversal is complete.

# Small C++ Reflection Demo

I created a small demonstration program that explains the core ideas behind implementing a custom reflection system for C++. More might be written in this post in the future — for now I’m just storing the demo right here on this webpage:

# Sane Usage of Components and Entity Systems

With some discussion going in a previous article about how to actually implement some sort of component system for a game engine, without vague theory or dogma, a need for some higher level perspective was reached, and so this article arose.

In general an aggregation model is often useful when piecing together bits of functionality or data to create something new. The ability to do so is very useful for writing game-specific gameplay code due the flexibility of code granted by aggregation. However as of late there’s been tremendous talk about OOP, Entity Systems, Inheritance, and blah blah blah within the online indie development community. More and more buzzwords get tossed around by big name writers and the audience really just looks for some guidelines to follow in hopes of writing good code.

Sadly there isn’t going to be a set of step by step rules for writing a game engine or coming up with a good architecture. Like many of said before me, writing a game is a specific task requiring specific solutions. Why do you think game engine developers such as Epic or the Unity guys have so many people working on the product? Because a generic game engine is a huge piece of software that requires a lot of features. Some features exist simply to let users add in custom features easily.

Components, aggregation, Entity Component Systems, Entity systems, these are just words and have various definitions (depending on who you ask).

## Some Definitions

To hopefully avoid silly arguments and confusion lets define some terms. If you don’t like the definitions here feel free to express so, I’m all up for criticism and debate.

• Component Based Architecture
• A preference for aggregation over inheritance. Is just a concept and does not lead to a single specific implementation. A game object is a collection of components. A component defines data and/or functionality for a concept.
• Entity Component System (ECS)
• A specific implementation of Component Based Architecture. A game object would be an ID (an integer). The ID is used to form an aggregate. Usually an ECS implies an implementation similar to a database, where components are entries into a database that are looked up through some identifier. The main goals of this implementation are efficiency and simplicity. Often times the term “ECS” is used just to describe a Component Based Architecture, often leading to confusion.
• Aggregation
• I like to think of this as a “has-a” relationship over an “is-a” relationship. Aggregation refers to one object “having” another object, which implies an aggregate is a collection (data structure) of other objects.

## Some Truth and History

Aggregation is useful from a game design perspective. It frees functionality from arbitrary classification (classes and inheritance). Classes were originally created in C++ to let a programmer tie together a piece of data and some functionality to represent some sort of real-life concept. This is in simplest terms the essence of Object Oriented Programming (OOP). Over time more features were added to help engineer relationships between classes, one such feature came in the form of inheritance.

There’s nothing inherently wrong with OOP and it makes sense in a lot of code. Problems can arise when there’s a mis-application of OOP that has implications that aren’t fully understood at the time of implementation that cause negative affects down the road. I’m sure we’ve all seen the code migration and mega-class example so commonly thrown around in articles arguing against OOP and inheritance abuse.

In response to such an abuse a new paradigm became popularized which focused on aggregation of functionality to form an object. This might be called a “component based architecture”. In general aggregation can be considered an appropriate alternative to inheritance.

## OOP Diatribe

Usually when an article spews forth caustic attacks against OOP it’s directed at naive implementations that disregard implications of how memory is accessed. Perhaps in the past the bottleneck of most everything was processor speed, so a lot of literature focuses on this. Nowadays CPUs on the PC have an architecture that have ridiculous computational power with extremely limited memory access. In general one might consider accessing memory from RAM 300 times slower than multiplying two floats together. Of course this last statement is extremely anecdotal without any evidence, but exists just to give a rough perspective of reality in many current (2014) cases.

If objects with associated code (classes) are just allocated and deallocated on the heap at will then a performance bottleneck of memory access is going to rear its ugly face, likely long before other performance issues are even on the radar. This is where much of the diatribe comes from.

It should be noted that pretty much all code bases that make use of the C++ language use classes and structures in some form or another. As long as a programmer has an understanding of memory, how it’s accessed, and what implications arise from given implementations, nothing will go wrong. Alas, actually doing these things and writing good code is super hard. It doesn’t matter if a class has some implementation code within it, so long as that bit of code makes sense for the purposes it is serving.

## Implementing Components, a First Draft

The most immediate implementation would be to make use of multiple inheritance. This has a clear definition of where the data goes, and it all goes in one class -the derived class. Multiple inheritance itself can get a bit tricky when dealing with pointer typecasting between derived and base types, though the C++ language itself handles the details much of the time.

Inheritance alone doesn’t provide a good mechanism to query whether a base class is apart of a specific derived aggregation and so the dynamic cast operator is born. Since the dynamic cast is a branching operation, usually implemented (afaik) by inspecting the vtable, it is avoided in general.

Multiple inheritance also does all sorts of work to member function pointers, and is just a sad part of C++. Additionally there isn’t any language feature that allows for dynamic dispatch for combinations of base classes, so if the need arises a custom solution will need to be implemented anyway.

Memory accessing, although defined, isn’t ideal. Multiple inheritance forms a blob of different data, and usually only a single piece of the blob is needed at any given time, meaning locality of reference will be poor in general. This leads to the idea of inheriting from multiple interfaces in order to decouple memory aggregation from functionality aggregation, which leads to the next draft.

## Second Draft – Run-Time Aggregation

Instead of using multiple inheritance on interfaces, which is a compile-time feature, run-time support can be added. Object aggregates can be formed during run-time, and modified thereafter. This is appealing for data driven applications, and game-design friendly development iteration speed.

So lets assume that some programmer wants to implement components, but doesn’t think much about memory access patterns the implications therein. Using a vector of pointers an implementation of components becomes super simple. Each pointer can point to an interface exposing a few functions like Update, Init and Shutdown.

Searching for a particular component is as simple as linearly looping over each pointer until a matching type is found. If these pointers are ordered in some way a search can be performed, perhaps a binary search could suffice. If the identifier of a component is hashable a hash table lookup can be used.

The implementation so far is an excellent one except that there is no definition of how memory is allocated and accessed! In the most naive of implementation each game object and each component will be allocated on the heap with separate calls to malloc.

Despite having no clear memory definition there are some nice benefits that have arisen. Data driving the composition of an aggregate becomes quite trivial as each component of an aggregation can have an entirely isolated lifetime. Adding, removing, modifying, or even creating new components at run-time are all now possibilities. This dynamic aggregate architecture is great for improving game development and design iteration time!

## Aggregation and Components and the Entity System Paradigm (ES/ECS)

As stated in the definitions section, an ECS is just a specific implementation of a component based architecture. A component based architecture game engine architecture would be a custom implementation of multiple inheritance. A clearly defined ECS can impose restrictions on how a component architecture is implemented and used in hopes of avoided poor memory access patterns, or in hopes of keeping code simple and orderly.

If a component is designed as a piece of memory without any code, and a game object defined as an integer ID then performance specifications can be easily imposed. Rules about where in memory components lay, and how components are actually accessed can be clearly defined in simple terms. Code can be written that operates upon arrays of components, transforming arrays linearly. This idea is actually a type of Data Oriented Design (DOD), which makes sense as DOD is just an idea! ECS is an application of the idea of DOD.

So with this type of implementation the benefits of dynamic composition can be paired with well-defined memory layout and access patterns. Suddenly prefetching and parallelism become much simpler to support.

## Aggregatize all the Things!

There’s a problem. Blindly shoving the idea of an ECS implementation into every nook and cranny of an engine during development is just silly (or any complex system, not just game engines or libraries). Often times a particular system is not best implemented with a component or aggregate paradigm in mind.

An obvious case is that of a physics engine. Often times a physics engine developer is worried about collision detection, solving systems of linear equations, rigid body mechanics and allowing the engine to easily be integrated into existing code bases. These details involve a lot of math and good API design. A developer of a physics engine is going to have their focus employed in full force in solving problems specific to physics engines. This means that the engineer’s focus is finite, so the implementation that is best is one that the engineer can actually bring to completion. An implementation that can come to completion is one that makes sense for the specific details of whatever is going on inside the physics engine. The specific paradigms used are often not aggregation or component based!

In order for a physics engine to run fast it needs to have efficient memory access patterns and memory usage, on modern PC hardware, requires some form of DOD. Since this complex (often black boxed) physics engine will have it’s own specific implementation and optimization it doesn’t make sense to force a component based model to its very core with some sort of idealistic zeal. It gets really bad when strict rules are imposed (like banning all code from classes and structures that define components) on the component model (like with an ECS) and the rules start permeating the deep recesses of the entire code base.

The same thing goes for any sort of complex system. The core facilities of a game engine often times just don’t really care about components or aggregation. This means that an engine architecture that implements components will usually have to deal with middleware graphics/physics engines/libraries that don’t subscribe to a component based model (simply because it’s easier to use a library than to write your own custom things, especially if those custom things religiously follow some silly methodology like ECS or even OOP). In practice light wrapper components can be created to let the functionality of such systems be presented in a component format, ready to be used in an aggregate object.

## What does this all mean? What should we all do?

Use components where it makes sense in code. Use inheritance where it makes sense in code. Use databases where they make sense. Use all the things where they should. This is a pretty sad answer but it’s the right one. There is no silver bullet paradigm that solves all the problems in the game engine architecture world, and there are no steps to follow to achieve a result that works in all cases. Specific problems require specific solutions. Good code is hard to write, and will require a lot of judgement calls. In order to make good judgement calls a lot of experience and perspective is required.

I recommend using aggregation where it really matters. Dynamic aggregation is important for gameplay specific code. Gameplay specific code, in this article, would refer to code that would not easily apply or work at all in a different game. It’s code that is your game and doesn’t define an isolated system or functionality.

Dynamic aggregation and the component based model are extremely important for game and object editors. Game design flourishes best when iteration times are driven to zero, and the ability to create new things from a composition of fundamentals is very valuable! Clearly composition is useful, but how it’s to be used is the hard part.

## What Components to Make?

I recommend making components concerned with providing access to game-independent functionality to be quite large. Every 3D game engine has a concept of a mesh, and will usually have some sort of file format to associate with, like FBX. Every 2D game engine will have the concept of a sprite. Each game using Box2D will have colliders and rigid bodies, and possibly joints. These fundamental pieces of functionality don’t change very often, so static compile-time relationships aren’t a bad thing since iteration time isn’t really all that relevant.

A 3D game might have a single Mesh component for example. A Mesh component can have renderable vertices, and possibly all the skeletal and animation information as well. There may be a single Rigid Body component, which encapsulates the idea of colliders or shapes, as well as the functionality of rigid body mechanics. The Rigid Body component might even contain all necessary code and data to hold multiple joints! Or joints may be a component themselves.

For high level and gameplay related features components can become much more granular (or not if you so choose). Gameplay should be iterated, tested and changed frequently, so having small and decomposed components will probably make a lot of sense in a lot of cases. Large components that encompass more broad ideas will be useful in many cases too. Even in the gameplay world judgement calls are essential.

Usually efficiency isn’t so important for much gameplay code, so any implementation that is decently performant will suffice. Scripting languages, dynamic memory allocation and virtual dispatch, or what have you can all work. The decisions of what requires flexibility, what requires performance and all between can be difficult to make. Please see the references section for some concrete examples.

We live in a world of opinions and it takes time to sift through them! If you have recommendations please comment below :)

## Reference Source Code

The best reference I know of is an open source game engine in progress (stalled until I graduate) I myself am developing. Please do send me your recommendations on references!

# C++ Enumeration Reflection

Welcome to the third post in a series of blog posts about how to implement a custom game engine in C++. As reference I’ll be using my own open source game engine SEL. Please refer to its source code for implementation details not covered in this article. Files of interest are EnumData.h, Enum.cpp and Enum.h.

## Crazy Viking Studios

Lets thank the Crazy Viking Studios guys for their generous contribution in knowledge on this topic! One day as a student I emailed them about their enumeration editing in their awesome editor for Volgarr the Viking. They responded with a bunch of source code in a demo! The techniques here have been learned from Taron their programmer.

## Introduction

Enumerations in C++ are a pretty nice feature. They provide type safety and a very readable way to name a lot of various types of constants. However there could be so much more added on top of enumerations in C++ to make extremely useful.

Lets take a trip through our imagination and imagine a game editor. In this editor you can create arbitrary constants with a name and associated integral value. This would be great for some sort of scripting or game logic.

This here can be implemented in C++ (during coding time as a compile-time constant) through enumerations. However there are some features that can be added to this to allow an editor to manipulate things:

• Modify existing entries
• Delete entries

## Basic Enumeration Editing

In order for an editor to manipulate this information within a C or C++ file some run-time memory is required to store a representation of the actual enumerations in code. Data tables (structs) will work well for this. Lets imagine a structure to contain one of these enumerations; we’ll need string representations of all of the enumeration entries:

If an Enum instance were created to contain identical string representations of the entries with the Spells enum, then a constant-time conversion of enumeration to string could be achieved just by indexing the Enum vector with a value. 

Converting a string back to an enumeration would best be done with a small hash table. This will keep string to enum conversions const-time.

### Much to be Desired

This is all fine and good, however if a user creates a new entry as a string this won’t update the actual C++ enumeration entries -new entries only exist until the editor shuts off. Additionally there isn’t an easy way to lookup a particular Enum struct. It would be nice to be able to lookup an Enum struct in various ways, such as by string name or template type. It would also be cool to be able to serialize enumerations to/from file.

It might be fairly simple to actually modify the source code containing a particular enumeration in C++ whenever an entry is modified, deleted or added. This would let programmers actually use enumerations created in an editor within their code (after a recompile). It is also possible to hookup the new entries to be loaded in the Enum struct as a string literal.

## Automation

As you can imagine a lot of manual labor is going to be needed in order to upkeep all of this crazy editing and modifying of enumerations. Some generalization and automation is needed to keep dev-work at an absolute minimum.

This is the time when I reference an old project I created to demonstrate a simple idea for serialization in C. The trick is use a source file and include it multiple times with various macro definitions. The source file to be included fills out the macros, but the macros are interpreted differently depending on when it was included. This allows you to write data files and interpreters using the preprocessor.

This is exactly what we need for building up some automated reflection and editing of enumerations.

Imagine a data file like so (a header without multiple inclusion guards and some macro invocations):

Lets take this data file and create a normal enumeration:

As you can see the macros from the data file are going to interpret the data as an enumeration. It is important to just always #undef all the macros in case they were previously defined.

After the preprocessor runs and the macros expand we will end up with something like:

Now the key part comes with defining the macros again to interpret the data in an all new way. Here’s an example to automate the creation of the Enum struct containing string literals:

The idea here is to construct an array of const char * literals and pass them to the Enum struct’s constructor. The struct can loop over them until the sentinel NULL value is found. When expanded by the preprocessor this file might look like:

While the Enum struct is looping over the literals passed to it in the constructor, it can also be adding the strings to a hash table to lookup appropriate indices.

## Editor Support

Now that a great scheme for automation of generating the actual enumeration data is setup, all that is required is to make sure that an editor can easily find an appropriate C++ file to modify when entries are modified. My solution was just to cram all enumerations into a single C++ file. This C++ is detailed with a nice comment saying something like: WARNING: This file is auto-generated by the Enum Editor.

This actually works pretty well but has a single drawback: editing an enumeration causes a global recompile of the project. There are no separate namespaces or naming schemes in my own implementation, meaning that each enumeration has to be unique to avoid compilation errors.

From here it’s just a matter of writing to your C++ data file.

## Tree Heirarchy

Wouldn’t it be great to be able to say “This enum is a subset of this entry”? That might have sounded confusing, here’s an example:

The idea is to allow each enumeration entry to contain an enumeration by creating a tree hierarchy.

This would be great for all sorts of game logic or general organization! It’s also possible to implement a really fast IsA function, so you could go if(type->IsA( Dragon )). Implementing this would just be a matter of traversing the tree hierarchy.

## Enumeration Features

I implemented a bunch of rag-tag features in my game engine SEL and would like to cover a couple of the more useful ones. Just take a quick look at an example declaration of the Enum struct:

I’m sure most readers can imagine how these methods are useful and how to implement them.

However looking up a specific enumeration by name (useful for macros) or by template type is something that is a little harder to implement. Please see SEL for a working reference on how to accomplish these. The idea is to use the multiple-inclusion trick on the data file to define some template specializations.

## Serialization and Introspection Registration

Serializing enumerations should be really straightforward for both binary and string formats. For binary the numerical representations can be utilized. And string format uses the string arrays constructed at compile-time.

The rest is just a matter of writing some string to/from file routines.

Some introspection techniques rely on the user to register various types within the reflection system. In this case it turns out this registration can also be automated with multiple-file inclusion on the data file! Just define a routine to register each enumeration type. There’s not much to it!

## Conclusion

I certainly hope this helps someone out there! Please do comment or ask questions right here on the post, I always enjoy reading them.

# Slides: Intro to Engine Dev with Component Based Design

I just found some slides I made a few months ago. They cover the basics of component based engine design and how to take advantage of simple serialization to data drive the construction of objects. Hopefully these slides can be of use to someone:

# fscanf Power

I recently had to read in a text file in order to draw some geometry on screen. Take a quick look at the text file:

Click to Show SelectShow

Lets assume for some reason we need to parse this text file and grab the float values from within. This isn’t a very difficult thing to do, it’s really just a matter of how much code I want to write. The best way I could think of doing this with C++ file i/o was to write something like:

Now whether or not this code is good or bad doesn’t really matter so much. The point is, I had to write an annoying amount of code just to grab the first float, and no matter the C++ technique used I don’t think code will be any less annoying to write. In the end all I really want is to read in these floats and just quickly parse a text file with a similar format whenever needed. Some hardcoding is to be expected, but I didn’t want to have to take the time writing code to jump through file offsets by burning through extraction assignments or manipulating the file pointer by hand. C++ file i/o is quite verbose.

I decided to look into some fscanf details to find something simpler that takes less code to write. Here’s what I learned from various parts of the internet:

Knowing this small amount of information is actually all that is needed in order to easily grab each float from the text file we need to parse. Lets take a look at the negation flag in conjunction with the negation scanset specifier:

This single line of code will read all text within our file until it comes across an equal sign. The equal sign will be left within the file. This small bit of fscanf knowledge is all that is needed in order to grab floats from our text file! Actually, one could take the automated serialization written about previously and have each member name be preceded by an equal sign. This equal sign character will make for writing a deserialization routine very simple! One could just call fscanf and immediately jump right before a value to grab.

The only issue is that the equal sign is still left within our file. A slight modification to the fscanf call can fix this:

The preceding equal sign will then read in an equal sign from the file. Now all that is left is a single call to fscanf to read in the float. Optionally you can place everything into one call like so:

And that’s it; a single line of fscanf code is all that’s needed to read over all text within our file in order to grab a float. The negation flag along with the negation scanset operator are awesome for writing file i/o operations, and the best part is that it’s more flexible than anything I can think of doing in C++ without writing tremendous amounts of code. I’ll likely be using C file i/o from now on whenever I can due to how non-verbose it is compared to C++ i/o.