random.h is a header only random number generation library for games, image processing and procedural content generation.

A few years back I wrote some handy floating point random number generation routines for a raytracer.  Time and again, I have needed these routines for other programs I have been working on.  It seems like a good idea to wrap them up in a small header and release them in the public domain for others to use too.

Each RNG function takes an 64bit integer as a seed value and permutes it in place.  The resulting bits are used for the generation of the result.  I like this design as it keeps the RNG functions stateless and thread safe.  You just have to keep track of the uint64_t you have been using as your RNG stream.

Included are various 1, 2 and 3 dimentional random number generators with a range of different distributions.

Triangular random noise is very usefull when dithering audio and images during the process of bitdepth conversion.  Typicaly, triangular noise is added to the signal immediately before type casting (at the same magnitude as the LSB of the destination bit depth).

The 2d and 3d vector RNG functions generate statisticaly unbiased vectors inside and on the unit circle or sphere (Generating vectors by randomizing the components and normalizing would bias the vector to the corners of unit cube).  I have used these functions successfully in the past for generating ambient occlusion maps.

I wrote a little program to generate some images to display the output distribution of each function.  A 3d point is generated by filling its component from the RNG function under test, and the resulting point is projected in space, as well as projected on each component axis (the red, blue and green planes).  Hopefully the images help a little to convey the distribution of each function.

Again, you can find the library here:

randfu() - unsigned random 
range [0,1]


randfs() - signed random 
range [-1,+1]


trandfs() - triangular random 
range[-1,+1], tends to 0


grandfs() - gaussian random 
range ~[-1,+1], tends to 0


vrand3d() - 3d vector random 
magnitude range: [0, 1]


nvrand3d() - normalized 3d vector random 
magnitude = 1


vrand2d() - 2d vector random
magnitude range: [0,1]


nvrand2d() - normalized 2d vector random
magnitude = 1


Binary Heap

Despite lack of activity here I have been very busy programming cool stuff which I hope to get around documenting and sharing with you all.

I need to implement the A* algorythm for one of my projects. Its a fairly common to speed up of the implemention to use a binary heap for the open list because of its O(log(n)) insertion and O(1) removal properties.  As I code it up I thought it would be nice to share my binary heap implementation, something which can be a bit fiddly to get right in practice.  I wrote a little fuzz tester to nuke it, so that I can be fairly sure it works correctly.

Gist: BinaryHeap

I chose to make the size fixed at compile time for simplicity, and becaues I am happy to terminate a pathfinding search that exceeds a fixed open list size, as old school games used to do.  It should however not require much work at all to grow the heap size when a node is pushed when currently at full capacity.

I will also leave you with this picture from my latest project:


Fez PAK file Unpacker

I really like the game Fez for so many reasons and I am also the kind of guy that likes to take apart the things I like in order to understand them better.

This morning I decided to poke around at the data for Fez and see what I could discover.  As it turns out there are only like 3 files, large ones (~150mb), with the extension ‘.pak’.  For games, these large ‘.pak’ files are often used to collect many small game assets into large files, which has many advantages, such as easier distribution, faster loading, file system abstraction and patching.

Opening up ‘Essentials.pak’ in HxD hex editor, I saw the following:


The abundance of strings was a good sign tipping me off that this data is not compressed in any way, making my job of extracting the files vastly easier.  The first thing I went after was the path ‘other textures\fullwhite’ since it seems like the name of the first file in the ‘.pak’ file.  The string was missing a null terminator, so the size must lurk somewhere else.  Preceding the string, was a singe byte 0x18, which sure enough matched the length of the string.  I now understood the name of each file in the ‘.pak’, but I still need to know the length of the file itself.  Scanning down the file, I can see what looks like the start of another entry, which is good, since it gives me an idea of the length of the first file.  The next file starts at 0xE0, so the length of the file must be smaller, and smaller still when we remove the size of the name and ‘.pak’ file header.  In total I expect it to be around 0xC0 in size, and sure enough after the string there is 4 bytes with 0xCB.

Its worth noting that any assumptions I make I am sure to verify by looking at other entries in the ‘.pak’ file and seeing if the rules I can see still hold tight.  In this case, they do!

I could write a program to start extracting data out of the ‘.pak’ file until it fall off the end of the file, but there will most likely be a way to find the number of files within this ‘.pak’ file.  Most likely it is at the start of the file, generally known as a header.  The rest of the file has been very simple so I wouldn’t expect this to be any different, the first 4 bytes seem suspiciously close to being a file count.

Below I have colored the hex view to highlight the interesting parts.  In blue is the file count.  Light orange is the string length.  Green is the file name and path.  Darker orange is the file size in bytes.  The data between two entries is the file data itself.


It didn’t take long to code up a simple extractor for these ‘.pak’ files.  Some of the goodies inside are all of the stems for the music tracks stored as Vorbis (ogg) streams.  These have no file extension, but VLC is able to figure out how to play them just fine.  I will in time try to poke around at texture, trixel and the map data.

The source can be found here:  FezUnpack.cpp

An executable is here: FezUnpack.exe

Playing with MicroThreads

I recently bought Game Programming Gems 2 and was captivated by an article titled ‘Micro-Threads for Game Object AI’.

A really neat technique is describes that can performing very fast cooperative multitasking, also sometimes referred to as co routines.  The heart of this technique involves manipulating the processor stack pointer to switch between multiple contexts of execution.  Its very similar in concept to true multitasking, however each thread decides when to share the processor.  The concept is simple enough, but really so powerful and radically different from many other techniques for AI I have used.  I knew I would have to program up a text to play with this idea.  Using this method, each object in my game can pretend it is running in its own thread and I am sure this can find application elsewhere too.

The article provides a nice description, but doesn’t provide enough information to directly copy an implementation, and my copy arrived without the CD, but that is fine however and I am having fun of working it out my own.  I have made a few projects in the past which manipulated the CPU registers to perform various tricks, so I can build upon that knowledge.

My first implementation appeared to operate correctly until exit where my call to delete[] threw up an assert.  This is because the fake stack was getting sprayed with bad data somehow from within my virtual thread. I will be spending lots of time in the debugger until I find out the cause.

Update 1:

I believe that the debug run time version of printf() is writing past the end of the 1kb stack, causing a buffer overflow. Hmm….

update 2:

I went back and read the article all over again and there were many hints that I simply didn’t have enough stack space at all.  The author also proposed a very nice improvement, where we effectively extend the stack size by copying each micro thread into much larger stack before execution.  Thus the Micro Thread code can make use of a large stack space, but individually they only take up a small amount of space.  After execution, we extract the smaller part of the stack again.  This places the limitation that _yield() cant be called deep in some code in the thread, but that is hardly a restriction at all.

Here is my demo code for anyone that want to have a look at its implementation.  I wrapped it up with a little API as well so it should be somewhat readable.


update 3:

My brain must have been thinking of improvements overnight since I woke up with three ideas I had to experiment with. In each case it was to address the stability and protect against stack overflows. The most major improvement is the use of VirtualProtect(), which is used here to set the bottom of the shared stack region to a guard page. This wastes 4k of stack space, but it is well worth it since now a nice exception is thrown if we write to the stack anywhere in the bottom 4k.

The second improvement, is to check the value of the stack pointer upon return from a micro thread. If it falls within the space allocated to each thread then everything is fine, otherwise we will need to allocate more stack space for that thread. If this check fails, and esp lies outside of our thread space, then we can be sure that we wont be saving all of that threads state during the task switch. That would lead to very very bad things.

A final improvement is to address what happen when the micro thread function returns. Currently, there is no return address on the stack, so execution could jump literally anywhere.
The solution is to ensure that we have a valid address for this function to return to. We can set the return address to a custom function that can handle the situation on one of two ways. We can setup the stack and processor state so that it will execute the micro thread function once more, thus it will feel like our micro thread function is permanently called. Alternatively we can mark that thread as finished, switch back to our main thread, and then kill the micro thread cleanly. Since both options have merit, I will implement both of then and let the user specify the handling during thread creation.

update 4:

I am now playing with Vectored Exception handlers to see if I can recover from stack overflows, and automatically terminate a malfunctioning thread.  I have confirmed that an exception is being thrown when I intentionally overflow the stack from within a micro thread.  It shouldn’t be too hard to switch the context back from within a vectored exception handler.  I remember having played around with something similar, having self modifying code from an exception handler if a debugger is detected.  According to the original article, it states that structured exceptions no longer work correctly from micro threads since no handlers are on the stack, perhaps this is something I can look into addressing also.

update 5:

Here is the latest version of the library provided as a source release.  The vectored exception handler really works amazingly well, terminating any micro thread that attempts to overwrite its stack.  If a micro thread returns from its function, the thread will terminate as well.  There are many more stability improvements and validity check making this quite a viable option for use in any project now.  I have included a small test program also so that the functionality can be demonstrated.


Lode Storm – Implementing classic fog of war

Last night I couldn’t sleep, and one of the reasons was that my mind was busy trying to explore the problem of how to implement FOW in my game Lode Storm for this months 7 day mini Ludum Dare.  I got up again and at 1:30am set about solving this problem in a nice manner.  After about two hours, the results looked suitably retro and I had some fun designing a neat algorithm in the process.  Here is the result:


Any RTS game aspiring to the classics of the genre would not be complete without some Fog of war (FOW) effect.  Warcraft Orks and Humans is a game that very much rely on FOW, and has a great rendering style for it too.  I took this as my basis and set out to mimic the visual effect.

It is immediately obvious that in this case, the FOW is just another tile map, as an overlay to the main game map.  While each tile is either in fog or not (binary), some extra processing has occurred to make a visually smooth transition around the hard boarder.

The stages seem evident:

  1. Within the game logic update the binary FOW map
  2. Process the binary FOW map to derive a visual FOW map
  3. Render using the same process as a regular tile map

Stages 1 and 3 are quite clear and shouldn’t require much discussion.  Stage 2 is however much more illusive and I will show you how I solved that in just a moment.

For stage 3 we require a visual overlay that will be rendered with a transparency mask to provide a pixel dissolve effect for the transition.  I spent some time in Photoshop and knocked up the following map:


All of the tiles needed for a complete FOW effect are there, and in fact there is a little duplication in the corner tiles, but that is not important.  In stage two we need to take the binary FOW map and through analysis of each game time, produce and index corresponding to a tile in this image.

For stage 2, a few ideas to explore came to my mind.  I remember coding up a demo that performed terrain analysis using a Sobel filter, a very similar operation, however that code was written long ago and is on an eternal hard drive I don’t have access to at the moment.  Browsing over the Wikipedia page, it wasn’t immediately clear to me how to approach the problem with a Sobel filter.  The second option that came to mind is to employ a variation of the Marching Squares algorithm.  I haven’t programmed anything using this technique before, however it has always been something I have kept in mind to try one day.

As I understood it Marching squares assigns a unique code to each tile based on the tile values sampled at each corner.  To clarify this point, for each corner of the a tile we set it to true if it touches any other tile that is in FOW, otherwise false.  The image below shows that if any of the yellow squares have their bit set in the FOW map, then the corresponding code will be set to true.  This can be done by summing each tile around the corners using a bit wise OR operation.


Once these four Boolean values for the quadrants A, B, C and D have been found, we can proceed to produce the marching squares code.  This can simple be done by shifting and summing each of the bits, to produce 16 possible values.  Therefore the output value would be:

int out = (A<<3) | (B<<2) | (C<<1) | D;

The last step is to take the marching square code and use a translation table, mapping each marching squares code to a FOW image index.

In the end here is the code I came up with.  Please note that, ‘conv’ is the conversion table for the marching squares code to the image map.  In the current version of Lode Storm I used a modified version of the above FOW image, so it does not correspond to that.  You will notice that I perform a binary AND with 0x8, in the final step of my marching squares code computation, and this is because my FOW map is stored in one bit of a larger array of flags for each tile, so the 0x8 isolates only the FOW bit.

// tile byte flags layout
//    0xF                  0x0
//    Z  L  0  0  H  C  W  E
//  0x80 - Z - tile locked for units    (unit on tile)
//  0x40 - L - lode locked              (lode on tile)
//  0x08 - H - human fog of war         (binary, not the aesthetic map)
//  0x04 - C - computer fog of war        " "
//  0x02 - W - water                    (block move not LOS)
//  0x01 - E - elevation                (block move and LOS)
inline int fog_getIndex( int x, int y )
    // marching squares conversion table
    uint8 conv[] =
        13,  1,  0,  8,
         2, 13, 10,  4,
         3,  9, 13,  5,
        11,  7,  6, 12
    // convenience
    uint8 *t = tile_flags + (x + y * 32) - 33;
    // surrounding squares
    uint8 q[ 8 ] =
        t[00], t[01], t[02],
        t[32],        t[34],
        t[64], t[65], t[66]
    // sum to find out quadrant values
    int a  = q[0] | q[1] | q[3];
    int b  = q[2] | q[4] | q[1];
    int c  = q[7] | q[6] | q[4];
    int d  = q[5] | q[3] | q[6];
    // turn into marching square code
    int v  = (a & 0x8) >> 0;
        v |= (b & 0x8) >> 1;
        v |= (c & 0x8) >> 2;
        v |= (d & 0x8) >> 3;
    // index conversion table
    return conv[ v ];

Additionally, there is one small omission from the code above.  If a tile has its FOW bit set in the FOW map, then it should immediately skip this processing and be set to straight black.  Tiles should only be processed using the above method if they are not part of the hard FOW.

Lode Storm – 7 Day RTS Compo

I have for the last few days been squinting in the dark at my monitor in the early hours of the morning.  I couldn’t resist entering this months Mini Ludum Dare since the challenge was to make a RTS game in 7 days.  An RTS game is something that I have always have wanted to make, from a technical point of view as much as anything.  I spent countless hours playing C&C, Dune2 and dungeon keeper in my childhood and the AI in this genre in particular always fascinated me.

My initial research on RTS AI directed me to some great papers written about potential fields and their use in the RTS genre.  This is a method I had read about previously, but these papers really gave me the vision for just how powerful they could be.  The papers can be found a the bottom of this great article at AIGameDev. http://aigamedev.com/open/tutorials/potential-fields/.

All the programming I had done on my framework is now being tested to the limit for this project, and so far (touch wood) it hasn’t failed me. Additionally, I am particularly proud of my efforts on this project because I have made all of the graphics myself from scratch and this really is my most complete game to date, since I usually get sidetracked while working on the technical details of a project.  Provided below is a screenshot of the current work in progress.  Several potential fields are being displayed for debugging by eye in the lower left corners.  Top left shows the fog of war maps for both the human player and the AI player respectively.