# memfile.h – Reading files from memory with fscanf

Often times it is extremely convenient to compile certain assets, or data, straight into C code. This can be nice when creating code for someone else to use. For example in the tigr graphics library by Richard Mitton various shaders and font files are directly included as character arrays into the C source. Another example is in dear imgui where some font files (and probably other things) are embedded straight into the source.

The reason this is useful is embedding things in source removes dependencies from code. The less dependencies the better. The easier it is for someone to grab your code and solve problems, the better your code is.

I’ve created a single header called memfile.h that implements a function for opening a file in memory, and also implements a bunch of fscanf overloads. Here is a link to the github repository.

memfile is a C header for opening files in memory by mimicking the functionaliy of fopen and fscanf. memfile is not compatible with standard C FILE pointers, but works pretty much the same way. Here’s a quick example:

Just include memfile.h and you’re good to go. I’ve created an example to demonstrate usage in main.c. As extra goodies I’ve included the incbin script I used to generate the poem symbol from main.c (incbin.pl stolen from Richard Mitton).

Here’s example usage:

Share

# Essentials of Software Engineering – With a Game Programming Focus

I’ve been commissioned to write a big document about what I think it means, and what is necessary, to become a good software engineer! If anyone finds the document interesting please do email me (email on my resume) with any questions or comments, even if you hate it and deeply disagree with me! This is the first release so some sections may be a bit skimpy and I might have some false info in there here and there.

I’m a fairly young engineer so please do take all my opinions with a healthy dose of skepticism. Nonetheless I hope the document is useful for someone! Many of my opinions come from my interpretation of much more experienced engineers I’ve come in direct contact with, and this PDF is sort of a creation of what others taught me.

As time goes on I may make edits or add new chapters, when I do I’ll modify the version number (just above the table of contents). Here is the PDF:

# Preprocessed Strings for Asset IDs

Mick West posted up on his site a really good overview of some different methods for hashing string ids and gave good motivation for optimizing this area early on in a project. Please do review his article as it’s a prerequisite to this post, and his article is just really good.

I’ve been primarily concerned with memory management of strings as they are extremely hairy to work with. For me specifically I’ve ruled out the option of a string class — they hide important details and it’s too easy to write poor (but functional) code with them. This is just my opinion. Based on that opinion I’d like to achieve these points:

• Avoid all dynamic memory allocation, or de-allocation
• Avoid complicated string data structures
• Little to no run-time string traversals (like strcmp)
• No annoying APIs that clutter the thought-space while writing/reading code
• Allow assets to refer to one another while on-disk or in-memory without complicated pointer translations

There’s a lot I wanted to avoid here, and for good reason. If code makes use of dynamic memory, complicated data strctures, etc. that code is likely to suck in terms of both performance and maintenance. I’d like less features, less code, and some specific features to solve my specific problem: strings suck. The solution is to not use strings whenever possible, and when forced to use strings hide them under the rug. Following Jason Gregory’s example he outlined about Naughty Dog’s code base from his book “Game Engine Architecture 2nd Ed” I implemented the following solution, best shown via gif:

The SID (string id) is a macro that looks like (along with a typedef):

I’ve implemented a preprocessor in C that takes an input file, finds the SID macro instances, reads the string, hashes it and then inserts the hash along with the string stored as comment. Pretty much exactly what Mick talked about in his article.

Preprocessing files is fairly easy, though supporting this function might be pretty difficult, especially if there are a lot of source files that need to be pre-processed. Modifying build steps can be risky and sink a ton of time. This is another reason to hammer in this sort of optimization early on in a project in order to reap the benefits for a longer period of time, and not have to adjust heavy laid-in-stone systems after the fact.

## Bundle Away the Woes

I’ve “stolen” Mitton’s bundle.pl program for use in my current project to recursively grab source files and create a single unified cpp file. This large cpp can then be fed to the compiler and compiled as a “unity build”. Since the bundle script looks for instances of the “#include” directive code can be written in almost the exact same manner as normal C++ development. Just make CPP files, include headers, and don’t worry about it.

The only real gotcha is if someone tries to do fancy inclusions by defining macros outside of files that affect the file inclusion. Since the bundle script is only looking for the #include directive (by the way it also comments out unnecessary code inclusions in the output bundled code) and isn’t running a full-blown C preprocessor, this can sometimes cause confusion.

It seems like a large relief on the linker and leaves me to thinking that C/C++ really ought to be used as single-translation unit languages, while leaving the linker mostly for hooking together separate libraries/code bases…

Compiling code can now look more or less like this:

First collect all source into a single CPP, then preprocess the hash macros, and finally send the rest off to the compiler. Compile times should shrink, and I’ve even caught wind that modern compilers have an easier time with certain optimizations when fed only a single file (rumors! I can’t confirm this myself, at least not for a while).

## Sweep it Under the Rug

Once some sort of string id is implemented in-game strings themselves don’t really need to be used all too often from a programmer’s perspective. However for visualization, tools, and editors strings are essential.

One good option I’ve adopted is to place strings for these purposes into global table in designated debug memory. This table can then be turned off or compiled away whenever the product is released. The idea I’ve adopted is to allow tools and debug visualization to use strings fairly liberally, albeit they are stored inside the debug table. The game code itself, along with the assets, refer to identifiers in hash-form. This allows product code to perform tiny translations from fully hashed values to asset indices, which is much faster and easier to manage compared to strings.

This can even be taken a step further; if all tools and debug visualizations are turned off and all that remains is a bunch of integer hash IDs, assets can then be “locked” for release. All hashed values can be translated directly into asset IDs such that no run-time translation is ever needed. For me specifically I haven’t quite thought how to implement such a system, and decided this level of optimization does not really give me a significant benefit.

## Parting Thoughts

There are a couple of downsides to doing this style of compile-time preprocessing:

• Additional complexity in the build-step
• Layer of code opacity via SID macro

Some benefits:

• Huge optimization in terms of memory usage and CPU efficiency
• Can run switch statement on SID strings
• Uniquely identify assets in-code and on-disk without costly or complicated translation

If the costs can be mitigated through implementing some kind of code pre-processor/bundler early on then it’s possible to be left with just a bunch of benefits :)

Finally, I thought it was super cool how hashes like djb2 and FNV-1a use an initializer value to start the hashing, typically a carefully chosen prime. This allows to hash a prefix string, and then feed the result off to hash the suffix. Mick explains this in his article this idea of combining hashed values as a useful feature for supporting tools and assets. This can be implemented both at compile or run-time (though I haven’t quite thought of a need to do this at compile-time yet):

# Small C++ Reflection Demo

I created a small demonstration program that explains the core ideas behind implementing a custom reflection system for C++. More might be written in this post in the future — for now I’m just storing the demo right here on this webpage:

# Automated Lua Binding

Welcome to the fifth post in a series of blog posts about how to implement a custom game engine in C++. As reference I’ll be using my own open source game engine SEL. Please refer to its source code for implementation details not covered in this article. The folder of interest would be LuaInterface.

## Introduction

Binding things to Lua is twofold: objects and functions must be able to be sent to and retrieved from Lua. Functions can be either static C or struct/class methods. Objects can be sent “by value” or “by reference”. As you can imagine it is important to be able to unify and simplify the binding process as much as possible to reduce all manual dev-work and upkeep.

## Generic C++ Functor

As with many things in a modern C++ game engine it is critical to have a generic C++ functor. Ideally this functor can wrap around class/struct methods (not only static functions). It is also possible have this functor able to refer to a function within Lua as well.

Please see my article and slides on C++ Function Binding for implementation details not covered here.

## Prerequisites

This article is on the topic of automatic Lua binding; if you’re unfamiliar with how to bind simple C functions to Lua please do a little research and come back later. The deep end of the pool is actually pretty deep!

I also suggest a working knowledge of C++ templates before trying to implement these sort of features. A working knowledge of Lua is also essential.

## Setting the Boundaries

With a scripting language it’s important to clearly define what you want to expose to script. Is the entire game in Lua? Are only specific parts accessible? What are the boundaries. It’s all too easy to get very caught up in what to send, what to implement, what not to do. Having clear boundaries of exactly what you want to do is the best way to start coding.

## Passing Objects to Lua

Objects can be passed to Lua by reference and value. A reference would consist of 4 bytes of memory to contain a pointer to some C++ memory. This allows Lua to store a “reference” to an object in C++. Most of the work involved in this type of object binding is in allow Lua to call C++ methods or functions on the pointer its storing.

The benefits of this approach are such that: calling class methods is pretty fast and shouldn’t be a worry; fairly simple to implement as most of the work is finished by creating a generic functor in C++; no hassle or upkeep when wanting to send new types of objects to Lua -each object is just a 4 byte pointer.

## Passing by Reference with lightuserdata

There are two ways I’d recommend to pass an object to Lua with: userdata and lightuserdata. A lightuserdata represents a void * in Lua and can hold a reference to an object in C++.

Here’s how one might send and retrieve lightuserdata from Lua:

This method is very fast, simple to implement and has very minimal memory overhead. Additionally lightuserdata can be compared to one another, and are equal if the underlying address is equivalent. However, one cannot attach metatables to lightuserdata and there is no sense of type safety what so ever. A lack of type safety means that if someone passes a lightuserdata into an incorrect C function the host program will likely crash.

With lightuserdata the following code is possible:

This solution will work for one, maybe two people working on a smaller project or minimal amount of code. I can imagine that the lack of type safety will be the biggest issue as time goes on.

## Reflection for Type Safety

It is possible to implement type-safety in Lua. However this requires Lua code to be maintaining type information. Lua is a scripting language meaning it ought best be used to script things. Something so integral and common as type-safety might better be implemented in lower-level C++ code.

Implementing type safety on the C++ side has two benefits: efficiency of implementation; type-safety can optionally be compiled away in release mode.

I highly recommend building yourself a simple, custom introspection library in C++. All that is really needed to start is the ability to query a type’s size and name efficiently. Please see my older article on custom Introspection or the game engine SEL for examples on how to implement such a system.

With a simple macro-based registration system one can register and lookup type information via introspection like so:

After this is complete and working (if you don’t have an implementation of introspection yet this is fine, just think of it as a black box) a small generic Variable object ought to be created. Sample code of a functional Variable object is in this post.

A Variable can be used like so:

It is important to note that the Variable itself is not a templated type!

When passing an object to Lua we can send a pointer to a Variable. As long as the Variable exist in memory in C++ the lightuserdata within Lua will point to a valid Variable. Upon retrieval of the Lua object back to C++ a type assertion can be run:

## Generic Static Function Binding

Bind C-style static functions in a generic way makes heavy use of custom introspection. The way I was originally taught was to just throw the entire binding function (in C++) at you all at once and let you suffer. Prepare to suffer as I did!

This function isn’t doing the bind, it’s what is bound. Every time a function in C++ is called from Lua, this function is called first.

An upvalue in Lua is akin to static variables in C. Using this we can attach a pointer to a generic functor to a bound C function within Lua. As Lua calls a C function this upvalue is retrieved and eventually used to actually call the C function.

The rest is just a matter of handling variables to/from Lua. In the above example the Variable object contains some helper functions call ToLua and FromLua. The nice thing about my implementation of this within SEL is that no heap memory is used during this entire process! All this code boils down to a very efficient method of generically calling C functions.

I will leave binding C++ methods as an exercise for the reader. By now you ought to have an idea of where to look for example implementation! The idea is to handle type information for the “this pointer” of the method, and pass around an actual “this pointer” to call the method.

## Calling Methods from Lua

Lets say you have an implementation that allows Lua code like the following:

A few things need to happen here. The first is that the object in question should only call methods that are actually methods of that specific type of class; one cannot simply bind all C++ methods and place functions in Lua within the global scope. Any object type could call any method type making for a lack of type-safety and dangerous code.

At this point the lightuserdata will have to be upgraded to a full userdata. Full userdata in Lua enjoy benefits such as the ability to set and modify metatables. If you’re not familiar with Lua metatables please do a little research on the topic and come back later.

A full userdata allows us to place a copy of a Variable within Lua memory, instead of just a void *. This means a temporary Variable can be used to call ToLua, instead of requiring that the Variable sent stays valid in C++ for the duration of usage within Lua.

Currently a way to create metatables for all of our C++ types is required. Assuming a linked list of all TypeInfo objects from the introspection system is available:

This loop is just creating metatables given the string names of what each metatable should be called.

After the tables are created the actual C++ methods and functions should be bound. This turns out to be really simple! It is assumed that each function and method registered within the introspection system can be passed to the function at some point (perhaps during registration of the type information):

And that’s really all there is to it! The idea here is to make sure that a type with methods sent to Lua has its userdata fixed with a metatable containing the available methods to call. When the __index metamethod is called it will search within the metatable itself for an appropriate member. Members of the metatable are the functions we bound to Lua. After they are fetched they can be called. This is what happens behind the scenes when we do:

## Passing Object by Value

Passing objects by value is actually much more difficult. The idea is to utilize tables to to store representations of the members associated with a class or struct. A table can be used to represent state of an object.

The __index and __newindex metamethods of a userdata should be set to look into the state table first. This lets users assign new values, and lets your ToLua and FromLua functions copy members from C++ to/from this Lua state table.

If a member is not found in the state table the metatable can then be searched by setting the __index metamethod of the state table to refer to the proper metatable.

All of this table indirection does incur significant overhead, however it allows objects in Lua to be used like so:

I myself have not implemented this type of Lua binding, though it is entirely possible and can be quite nice to work with. I reiterate that adding this many tables incurs both memory and performance overhead not seen with the other styles. This seems to be the only drawback.

## Conclusion

Well this post turned out longer than I expected -over 2k words! Hopefully the information was clear. It’s really nice being able to refer people to a complete and working example such as the SEL engine; it makes writing articles much easier and simpler.

# C++ Function Binding

Welcome to the first post in a series of blog posts about how to implement a custom game engine in C++. As reference I’ll be using my own open source game engine SEL. Please refer to its source code for implementation details not covered in this article.

I would like to thank John Edwards for his contribution to my education in the areas of reflection and function binding. You can thank him too by checking out their games at thatgamecompany!

Function binding in C++ is the act of being able to trigger a function given some form of input. Usually this applies to C or C++ by means of calling any function in a generic way. This can be achieved easily during compile-time in C++ by using some templates along with decltype. This is useful for:

• Script binding
• Many others

The idea is to capture the pointer to a function (or method) and pass its type around in code as a template parameter. An instance of the template type is created as a template constant. This is possible with the usual compilers as function pointers are of integral type.

The rest of the work involves creating a nice wrapper to pack arguments together and get them to the template constant pointer in a generic fashion.

I’ve created some nice slides on the topic and some demo source code. There is one slide with a video that will not play from the pdf, I attached the video to the bottom of this post. Hope this helps someone out there.

Source code demo is here. If you wish to view a fully featured example, please see my project SEL.

# Live Enumeration Editing in C++

Taron Millet, the programmer for Volgarr the Viking, created an interesting enumeration editor for their game editor used in the creation of Volgarr the Viking. This enumeration editor sparked my interest as somehow enumerations could contain within them an enumeration type. This forms a sort of tree hierarchy of enumerations! I actually emailed Taron about the editor, and he threw together a quick demo for me! If you’d like to see the demo just email me and I can send it to you.

Imagine you have an enumeration of types of items, things like breast plates, helmets, boots. Now imagine within each enumeration, lies another enumeration. You can enumerate types of helmets, types of boots and types of breast plates. Now imagine that this tree-like hierarchy is recursive with no depth boundary!

Not only was this enumeration tree really cool, but it also could be live-editted and commit back to C++ code. This is a very interesting idea and can be applied to custom editors for C++ game engines.

I’ve created my own terminal enumeration editor for a proof of concept. Here’s a video demo:

This sort of editor could be implemented in a fully featured editor, perhaps like the one Volgarr the Viking used! This is great for quick changes in gameplay and the like, and can greatly reduce the time required to setup type-safe enumerations. I myself use this editor to also reflect all constructed enumerations within a custom C++ introspection database. This allows all enumeration types to be passed to/from scripting languages, and serialized.

The implementation of such is actually super simple, and a proof of concept can be seen here: https://github.com/RandyGaul/Serialization_C. The idea is to use a single data file full of macro calls. This data file is then intentionally imported into multiple locations. Each time this import occurs different definitions of the macros are defined, thus interpreting the data in various ways upon each import. For more information about this see the link within this paragraph.

# Powerful C++ Messaging

A prerequisite to this information is most of the previous C++ type introspection stuff I have been writing about for a while now. Assuming the previous information has been covered, lets move on:

There exists a design of messaging, specifically for C++, of which has minimal downsides and many positive advantages. Ideally messaging should not involve any polling or implicitly required searching (as in searching through game space to see who to message, which requires expensive collision queries). It should also have a very intuitive usage, and not be very complex to work with.

If such a messaging system can be achieved then inter-object communication can be setup, to create game logic, within a scripting language.

Here are some slides I wrote on this topic for my university, but are available for public viewing:

# C++ Reflection Part 6: Lua Binding

Binding C/C++ functions to Lua is a tedious, error prone and time consuming task when done by hand. A custom C++ introspection system can aide in the automation of binding any callable C or C++ function or method a breeze. Once such a functor-like object exists the act of binding a function to Lua can look like this, as seen in a CPP file:

The advantage of such a scheme is that only a single CPP file would need to be modified in order to expose new functionality to Lua, allowing for efficient pipe-lining of development cycles.

Another advantage of this powerful functor is that communication and game logic can be quickly be created in a script, loaded from a text file, or even setup through a visual editor. Here is a quick example of what might be possible with a good scripting language:

In the above example a simple enemy is supposed to follow some target object. If the target is close enough then the enemy damages it. If the target dies, the enemy flashes a bright color and then acquires a new target.

The key here is the message subscription within the initialization routine. During run-time objects can subscribe to know about messages emitted by any other object!

So by now hopefully one would have seen enough explanation of function binding to understand how powerful it is. I’ve written some slides on the topic available in PDF format here (do note that these slides were originally made for a lecture at my university):