Duke Nukem 3D: Chocolate Duke Nukem 3D (PART 4 OF 4)
Chocolate Duke Nukem 3D is a port of Duke Nukem 3D aimed
at education. The main goal is to clarify the code
so programmers can extract knowledge easily and get a better idea of what it was like to program game engines in the 90s.
Like an archeologist working on bones, it was important to keep things the way they were: only the "dust" has been removed, with a focus
on:
- Readability : Make the code easy to understand.
- Portability : Make the code easy to compile, run and tinker with.
Note : For the Windows release you may have to install the Visual C++ Redistributable 2010.
Crash reports : If you want to be able to report a crash:
- Configure Windows to save crash dumps with this .reg file.
- If the game crashes you can find a dump in the folder:
C:\Users\*yourUserName*\AppData\Local.
- Email me the dump.
Portability
The lack of portability was an issue. Chocolate Duke Nukem 3D now compiles on Windows and Intel Mac OS X,
and Linux is one makefile away. Here is what has been done:
- Usage of integral type aliases now guarantees the size of integers. The long type was used everywhere because it was thought during development that this type would always be 32 bits wide. It is one of the reasons the engine cannot be compiled in 64 bits mode. The code now uses int32_t from the standard inttypes.h.
- Removal of char for arithmetic operations: Since it can be signed or unsigned depending on the platform, using char for maths resulted in nasty wraparound; char should only be used for strings. For arithmetic, Build is now explicit with int8_t or uint8_t from inttypes.h, which guarantee signedness.
- Removal of platform dependent API. Back when SDL timer accuracy was average, the port had trouble replicating the mandatory 120 ticks/frame. Now the engine either uses SDL or provides a platform specific implementation for POSIX and Windows.
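The first two points of the list above can be illustrated with a small sketch. The helper names (widen_signed, widen_unsigned, palette_lookup) are hypothetical and do not come from the Chocolate Duke source; they only show why an explicit int8_t or uint8_t behaves the same on every platform where a plain char may not.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: widen a byte that is explicitly signed.
// The result is always in the range -128..127, on every platform.
int32_t widen_signed(int8_t value)
{
    return value;
}

// Hypothetical helper: widen a byte that is explicitly unsigned.
// The result is always in the range 0..255, on every platform.
int32_t widen_unsigned(uint8_t value)
{
    return value;
}

// Hypothetical helper: use a byte as an index into a 256 entry table.
// With a plain char index, the value 200 could turn into a negative
// index on platforms where char is signed.
uint8_t palette_lookup(const uint8_t table[256], uint8_t index)
{
    assert(table != 0);
    return table[index];
}
```

With a plain char the two widening helpers would be a single function whose result depends on the compiler, which is the kind of wraparound described above.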
The code is much more portable but still not 64 bits ready: More work is still necessary in the interface between the Engine Module and the Drawing Module, where memory addresses are manipulated as 32 bits integers. This part will require many hours and I am unsure I will be able to dedicate that much time.
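As a rough sketch of the 64 bits problem: an address kept in an int32_t loses its upper half on a 64 bits target, whereas intptr_t from stdint.h is wide enough to hold a pointer on mainstream 32 bits and 64 bits platforms. The two helpers below are hypothetical and are not the engine's actual interface; they only show the integer type that address arithmetic of this kind would need.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: advance a framebuffer address held as an integer.
// intptr_t can hold a full pointer, int32_t could not on a 64 bits target.
intptr_t advance_address(intptr_t address, int32_t bytes)
{
    return address + bytes;
}

// Hypothetical helper: write one pixel through an address kept as an integer.
void put_pixel(intptr_t address, uint8_t color)
{
    assert(address != 0);
    *(uint8_t *)address = color;
}
```

A caller would convert a real pointer once, with (intptr_t)framebuffer, and pass the integer around from there.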
Understandability
Most of the workload went into making the code easy to read. Here is a list of what was done:
Modules definition
The vanilla source code was essentially contained in three translation units:
- Engine.c: Accounting for 95% of the code.
- a.c: Containing a crude C implementation of what was once optimized ASM.
- cache1d.c: Containing the caching and GRP file systems.
The code has been redistributed in units that give a clear idea of what the code inside does :
- Engine.c: Now 50% of the code.
- display.c: SDL surfaces buffers where the screen is rendered, palette utilities.
- draw.c: The C implementation of the ASM routines.
- tiles.c: The sprite engine.
- filesystem.c: Anything abstracting the GRP filesystem.
- network.c: Multiplayer is not here.
- cache.c: The custom memory allocator and cache service.
- math.c: Most of the fixed arithmetic helper functions are here.
I was tempted to break down Engine.c into a frontend and backend: Mimicking the Quake3/Doom3 architecture with two
parts communicating via the bunch stack. In the end I judged it too far from the original spirit of the engine and dropped the idea.
Data structure
Build used struct to communicate with the Game Module via build.h, but internally everything was done with arrays of primitive data types: No struct and no typedef.
This has been modified, especially with regard to the Visual Surface Determination and the Filesystem:
Before:
long numgroupfiles = 0;
long gnumfiles[MAXGROUPFILES];
long groupfil[MAXGROUPFILES] = {-1,-1,-1,-1};
long groupfilpos[MAXGROUPFILES];
char *gfilelist[MAXGROUPFILES];
long *gfileoffs[MAXGROUPFILES];
char filegrp[MAXOPENFILES];
long filepos[MAXOPENFILES];
long filehan[MAXOPENFILES];
After:
// A typical GRP index entry:
// - 12 bytes for filename
// - 4 for filesize
typedef uint8_t grpIndexEntry_t[16];
typedef struct grpArchive_s{
int32_t numFiles ;//Number of files in the archive.
grpIndexEntry_t *gfilelist ;//Array containing the filenames.
int32_t *fileOffsets ;//Array containing the file offsets.
int32_t *filesizes ;//Array containing the file sizes.
int fileDescriptor ;//The fd used for open,read operations.
uint32_t crc32 ;//Hash to recognize GRP archives: Duke Shareware, Duke Plutonium etc...
} grpArchive_t;
//All opened GRP archives are in this structure
typedef struct grpSet_s{
grpArchive_t archives[MAXGROUPFILES];
int32_t num;
} grpSet_t;
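To show what the grouping buys, here is a minimal sketch of a filename lookup over these structures. The function grp_find is hypothetical and is not the port's actual API, and MAXGROUPFILES is assumed to be 4 based on the four-element initializer in the "Before" listing. The name comparison relies on the index entry layout given in the comment above: 12 bytes of filename followed by 4 bytes of size.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAXGROUPFILES 4   // Assumption: matches the {-1,-1,-1,-1} initializer.

typedef uint8_t grpIndexEntry_t[16];

typedef struct grpArchive_s {
    int32_t          numFiles;        // Number of files in the archive.
    grpIndexEntry_t *gfilelist;       // Array containing the filenames.
    int32_t         *fileOffsets;     // Array containing the file offsets.
    int32_t         *filesizes;       // Array containing the file sizes.
    int              fileDescriptor;  // The fd used for open,read operations.
    uint32_t         crc32;           // Hash to recognize GRP archives.
} grpArchive_t;

typedef struct grpSet_s {
    grpArchive_t archives[MAXGROUPFILES];
    int32_t      num;
} grpSet_t;

// Hypothetical helper: look for a filename in every opened archive.
// Returns the file index inside its archive and stores the archive index
// in *archiveIndex (when not NULL), or returns -1 if the name is unknown.
int32_t grp_find(const grpSet_t *set, const char *name, int32_t *archiveIndex)
{
    int32_t a, f;

    assert(set != 0 && name != 0);
    for (a = 0; a < set->num; a++) {
        const grpArchive_t *archive = &set->archives[a];
        for (f = 0; f < archive->numFiles; f++) {
            // Names are at most 12 bytes and are not always NUL terminated.
            if (strncmp(name, (const char *)archive->gfilelist[f], 12) == 0) {
                if (archiveIndex != 0)
                    *archiveIndex = a;
                return f;
            }
        }
    }
    return -1;
}
```

Everything the lookup needs travels through one pointer to a grpSet_t instead of nine parallel global arrays.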
Symbols name sanitization
Variable names have been modified when they provided little clue about their usage:
Before:
static long xb1[MAXWALLSB], yb1[MAXWALLSB], xb2[MAXWALLSB], yb2[MAXWALLSB];
static long rx1[MAXWALLSB], ry1[MAXWALLSB], rx2[MAXWALLSB], ry2[MAXWALLSB];
static short p2[MAXWALLSB], thesector[MAXWALLSB], thewall[MAXWALLSB];
After:
enum vector_index_e {VEC_X=0,VEC_Y=1};
enum screenSpaceCoo_index_e {VEC_COL=0,VEC_DIST=1};
typedef int32_t vector_t[2];
typedef int32_t coo2D_t[2];
// This is the structure emitted for each wall that is potentially visible.
// A stack of those is populated when the sectors are scanned.
typedef struct pvWall_s{
vector_t cameraSpaceCoo[2]; //Camera space coordinates of the wall endpoints. Access with vector_index_e.
int16_t sectorId; //The index of the sector this wall belongs to in the map database.
int16_t worldWallId; //The index of the wall in the map database.
coo2D_t screenSpaceCoo[2]; //Screen space coordinate of the wall endpoints. Access with screenSpaceCoo_index_e.
} pvWall_t;
// Potentially Visible walls are stored in this stack.
pvWall_t pvWalls[MAXWALLSB];
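A short sketch of how the named indices read at the call site. The two helpers (pvwall_column_span and pvwall_cross) are hypothetical and only illustrate field access; the sketch assumes endpoint 0 is the left one on screen and makes no claim about the sign convention the engine uses for facing tests.

```c
#include <assert.h>
#include <stdint.h>

enum vector_index_e {VEC_X=0,VEC_Y=1};
enum screenSpaceCoo_index_e {VEC_COL=0,VEC_DIST=1};
typedef int32_t vector_t[2];
typedef int32_t coo2D_t[2];

typedef struct pvWall_s{
    vector_t cameraSpaceCoo[2];  // Camera space coordinates of the wall endpoints.
    int16_t  sectorId;           // Index of the sector this wall belongs to.
    int16_t  worldWallId;        // Index of the wall in the map database.
    coo2D_t  screenSpaceCoo[2];  // Screen space coordinates of the wall endpoints.
} pvWall_t;

// Hypothetical helper: number of screen columns covered by the wall,
// assuming endpoint 0 is the left one and both columns are inclusive.
int32_t pvwall_column_span(const pvWall_t *wall)
{
    assert(wall != 0);
    return wall->screenSpaceCoo[1][VEC_COL] - wall->screenSpaceCoo[0][VEC_COL] + 1;
}

// Hypothetical helper: 2D cross product of the two endpoints in camera space.
// Its sign gives the winding of the endpoints as seen from the camera, which
// sits at the origin. 64 bits intermediates avoid overflow.
int64_t pvwall_cross(const pvWall_t *wall)
{
    const int32_t *a;
    const int32_t *b;

    assert(wall != 0);
    a = wall->cameraSpaceCoo[0];
    b = wall->cameraSpaceCoo[1];
    return (int64_t)a[VEC_X] * b[VEC_Y] - (int64_t)b[VEC_X] * a[VEC_Y];
}
```

Compared with xb1[i], rx1[i] or p2[i], an expression such as wall->screenSpaceCoo[1][VEC_COL] says what is being read without a trip to the declaration.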
Comments and documentation
- Documentation : Since the JoFo forum posts are gone, I hope the Build Internals page will help developers get an idea of the high level architecture of the engine.
- Comments : This is the point where I tried to invest most of the time. I am a huge believer in having a lot of comments in code (Dmap is a great example of source with more comments than statements).
Magic numbers
I haven't had the time to remove all the magic numbers. Replacing decimal literals with enum or #define
would improve readability a lot.
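As an illustration of the kind of change meant here, the sketch below names a few bit flags with an enum. The flag names and bit values are hypothetical: they are chosen for the example and are not taken from the Build source.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical flag names. The bit values are illustrative only.
enum wallFlags_e {
    WALL_BLOCKING = 1 << 0,
    WALL_MASKED   = 1 << 4,
    WALL_ONE_WAY  = 1 << 5
};

// With magic numbers this test would read: if (cstat & 48) ...
// With named flags the intent is visible at the call site.
int wall_needs_mask_pass(int16_t cstat)
{
    return (cstat & (WALL_MASKED | WALL_ONE_WAY)) != 0;
}

// Hypothetical helper: set one or more flags on a wall status word.
void wall_set_flags(int16_t *cstat, int16_t flags)
{
    assert(cstat != 0);
    *cstat = (int16_t)(*cstat | flags);
}
```

An enum also shows up by name in a debugger, which a #define does not.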
Memory allocation
Chocolate Duke attempts to avoid global variables, especially if they are used only for the lifetime of a
frame. In those cases the memory used will be on the stack:
Before:
long globalzd, globalbufplc, globalyscale, globalorientation;
long globalx1, globaly1, globalx2, globaly2, globalx3, globaly3, globalzx;
long globalx, globaly, globalz;
static short sectorborder[256], sectorbordercnt;
static char tablesloaded = 0;
long pageoffset, ydim16, qsetmode = 0;
After:
/*
FCS:
Scan through sectors using portals (a portal is wall with a nextsector attribute >= 0).
Flood is prevented if a portal does not face the POV.
*/
static void scansector (short sectnum)
{
//The stack storing sectors to visit.
short sectorsToVisit[256], numSectorsToVisit;
.
.
.
}
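The elided body can be pictured with the following sketch of a portal flood whose working memory lives entirely in the stack frame. It is not the engine's scansector: the sector type is a hypothetical simplification that keeps only portal links, the limit of four portals per sector is invented for the example, and the test that stops the flood at portals not facing the POV is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_SECTORS 256   // Same size as sectorsToVisit in the listing above.
#define MAX_PORTALS 4     // Hypothetical limit, for this sketch only.

// Hypothetical, simplified sector: only the portal links are kept.
typedef struct sketchSector_s {
    int16_t nextsector[MAX_PORTALS];   // -1 is a solid wall, >= 0 is a portal.
} sketchSector_t;

// Flood from startSector through portals. visited[] receives 1 for each
// sector reached. Returns the number of sectors reached.
int32_t flood_sectors(const sketchSector_t *sectors, int16_t numSectors,
                      int16_t startSector, uint8_t *visited)
{
    // The stack storing sectors to visit: local, so it only exists
    // for the duration of this call.
    int16_t sectorsToVisit[MAX_SECTORS];
    int32_t numSectorsToVisit = 0;
    int32_t reached = 0;
    int32_t i;

    assert(sectors != 0 && visited != 0);
    assert(numSectors > 0 && numSectors <= MAX_SECTORS);
    assert(startSector >= 0 && startSector < numSectors);
    memset(visited, 0, (size_t)numSectors);

    visited[startSector] = 1;
    sectorsToVisit[numSectorsToVisit++] = startSector;

    while (numSectorsToVisit > 0) {
        int16_t current = sectorsToVisit[--numSectorsToVisit];
        reached++;
        for (i = 0; i < MAX_PORTALS; i++) {
            int16_t next = sectors[current].nextsector[i];
            // Each sector is pushed at most once, so the stack cannot overflow.
            if (next >= 0 && next < numSectors && !visited[next]) {
                visited[next] = 1;
                sectorsToVisit[numSectorsToVisit++] = next;
            }
        }
    }
    return reached;
}
```

At 512 bytes the local array is small, so keeping it in the frame is safe.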
Note : Be careful when using a stack frame to store big variables. The following code ran well when compiled with clang
and gcc but failed with Visual Studio:
int32_t initgroupfile(const char *filename)
{
uint8_t buf[16] ;
int32_t i, j, k ;
grpArchive_t* archive ;
uint8_t crcBuffer[ 1 << 20] ;
printf("Loading %s ...\n", filename) ;
.
.
.
}
A stack overflow occurred because Visual Studio reserves only 1MB for the stack by default. Trying to use a 1MB buffer overflowed
the stack and that made chkstk very unhappy. This code ran fine with Clang on Mac OS X.
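One common way around this is to take the large scratch buffer from the heap. The sketch below is a minimal illustration of that approach, not the fix actually applied in the port: sum_stream is a hypothetical function, and a plain byte sum stands in for the real CRC computation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define CRC_BUFFER_SIZE (1 << 20)   // Same 1MB size as crcBuffer above.

// Hypothetical helper: sum every byte of a stream using a 1MB scratch buffer.
// The buffer comes from the heap, so the stack frame stays a few bytes wide.
// Returns 0 on success, -1 if the buffer could not be allocated.
int32_t sum_stream(FILE *stream, uint32_t *sum)
{
    uint8_t *crcBuffer;
    size_t   bytesRead, i;

    assert(stream != NULL && sum != NULL);

    crcBuffer = malloc(CRC_BUFFER_SIZE);
    if (crcBuffer == NULL)
        return -1;

    *sum = 0;
    while ((bytesRead = fread(crcBuffer, 1, CRC_BUFFER_SIZE, stream)) > 0) {
        for (i = 0; i < bytesRead; i++)
            *sum += crcBuffer[i];
    }

    free(crcBuffer);
    return 0;
}
```

Declaring the buffer static would also keep it off the stack, at the cost of making the function non-reentrant. The MSVC linker also has a /STACK option to reserve a larger stack, but that only moves the limit.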
Source code
The source code is available on github.
Comments (51)
It's a pity that they could not just refurbish the graphics, add some new maps and release a new Duke. What they released was a "let's get things done" game and called it "Duke Nukem Forever".
Maybe there will be a "Duke Nukem Community Edition" with all the cool gameplay...
Thanks for your work!
same problem when running Starcraft or Diablo II, the easiest solution is to kill explorer.exe before running the game (use taskmanager or create a batch command :)
May I ask you to write an in depth article on how you actually *read* code? Where do you start, how many notes you make, maybe you document the code you read, and so on. In my experience, reading code is an essential skill for any programmer, but there seem to be absolutely no resources explaining how to do it properly. I've tried reading quake3 code (three times) and tried reading eAthena code (open-source server engine for MMORPG Ragnarok Online) but after a couple of days I got lost in details and in the end had no clue about how things work. You, on the other hand, are extremely skilful at the task and your insights are very valuable for an open community.
I laugh about your note about variables/files with numbers, I totally agree with you. I always remind my coworkers of that (after some years developing seriously and reading books like Clean Code made me ponder the value of good names).
So, what's the next step? do you have any engine you want to review next?
It's a shame that none of the Unreal games have released their source code.
I can't read the article via Instapaper app, and the page is otherwise unreadable on mobile (and I don't have time to read at my desk)
I've always wondered, what software do you use to make your diagrams?
Excellent post btw, nice deep analysis of the source and render mechanics. I remember looking at ENGINE.C when it was first released and indeed did not feel encouraged to continue investigating :) But I respect that he got a post-DOOM engine working smoothly on such feeble hardware. It took some sacrifices, especially then. Compilers now can turn a ton of types, lambdas and whatnot into a tight loop but back then you really had to write high-level assembly to get stuff fast. Props to Ken and you!
As always a great read!
Once I saw it over at HackerNews while at work, I knew I had to make a short day :-)
One thing: Could you elaborate on why using cross products makes a difference to using a dot product if you have fixed-point/integer math?
Also, two small typos in the related source code snippet:
* If I am not color-blind, the one vector is blue and not green. ;-)
* The return-statement should compare against the 2D zero-vector not a scalar, right?
Cheers,
Daniel
http://web.archive.org/web/*/http://www.jonof.id.au/forum/
But why the hell would you use quicktime videos... :D
Chrome gives me an alert that is due to a plugin named : quick time
use html video tag or flash for video other solution are lame !
Thanks!
I wonder what tools you use to make your diagrams.
I appreciate to read your comments and notes on different games, its very nice of you to take the time and read and share it with all of us.
Have a good life
//David
Thanks.
I also enjoyed, very much, your article's page presentation.
Nicely done and pleasant to view.
I had no problem viewing any of the images or QT videos.
Depending on the editor, you have at most 24 lines of 80 columns on a VGA screen. The inside function you mention is 20 lines long - pretty close to that limit. The desire to keep logic sections within the size of the screen is much more important than comments for other people years down the line to enjoy.
Also, regarding globals: While I'm not sure about 486 specifically (I'm too young), a global array of a primitive type (1, 2, or 4 bytes per element) is very fast to access. I believe the address of the element could be calculated as part of the memory fetch instruction.
An array of structs, whether as a global or on the stack, would probably require 2 instructions to calculate the memory address, and a third for the actual fetch.
Additionally, the size of the stack would have been tightly constrained.
Any plans to mention the active Source Port EDuke32?
http://www.eduke32.com/
OpenGL is very much improved, there are now true Room over Room capabilities and much more.
The team has a very active forum
http://forums.duke4.net/
Best
Taamalus
* Putting as much code into a single file as possible was a good way to help the primitive compilers do some optimisations & inlining
* Arrays of scalars instead of structs are simply much faster to iterate through given limited CPU caches
* The example math method is straightforward bitmask operations on the sign bits
* Ahh, hand-coded assembly. I'll admit it takes the right kind of masochist to read it :)
* Nothing magic about those numbers! Just bitmasks.
* IDEs have no excuse to be slow these days. DJGPP etc handled huge files just fine back then!
I think you mean 120Hz, not 120 ticks per frame (that would be far too much IRQ overhead). But the PIT native frequency was 1193180Hz. You set a divisor value from 0-65535 which determines the frequency of IRQ0. You can't get 120Hz exactly (more like 120.002Hz) so it's unusual to quantize rendering / physics to this as you get graphics "tearing". What did they use this timer for?
Thanks for sharing all this knowledge.
I get C:\Duke3D\ChocolateDuke3D.exe is not a valid Win32 application when trying to start.
When I get a bit more time I will try and get this compiled locally, both on Win XP and Linux(Raspberry Pi)
The Videos are in the folder fd.fabiensanglard.net/duke3d/movies/
the 3 *.m4v-files
VLC plays them just fine
Maybe, if you're running out of code bases to review, you should turn to Aleph One. It's kind of "Chocolate Marathon" ( http://en.wikipedia.org/wiki/Marathon_2:_Durandal ).
It's surprising that most of my pet game engines of the past have similar algorithms (in fact, the "current sector" tracking is exactly the same) to Build.
Greetings from Rio ;-)
"The renderer does not rely on recursive function (the way Doom did when walking the BSP). Instead a stack and a loop is used (for sector flooding)" (from your raw notes)
There is no recursion based on machine stack, but rather on software stack. Maybe this is because (if my memory serves me well) Ken prototyped the engine in QuickBASIC and (from my faded memory again) it didn't have recursion until 4.0;
Or perhaps not...maybe was stack size thing from those old dark (and very fun) days?
QuickBasic was a fine "IDE" for some quick hacking. I grew making lousy 3D games in it myself. My biggest inspiration at the time was Ken, but I wasn't aware he used the very same tool I was;
Very interesting review as usual. I already took a look at the code some years ago and discovered most of the inner workings but some things remained mysterious; your review cleared them up.
I expected to see an explanation of how mirrors work (why they require an additional empty room with a special texture). Still unclear to me.
One interesting trivia : the way updatesector() works caused a well known glitch/cheat during the game, that was used a lot in multiplayer (warp glitch) : if you make the player run very fast in a place where sectors are very small (eg : stairs), updatesector gets lost and starts scanning all sectors. If there are two (or more) overlapping sectors where the player is, the game sometimes chooses the wrong sector (depending on their order), and it will teleport the player there.
see video : http://www.youtube.com/watch?v=VUOhZIdCSiI
Ken was a true whiz kid - imagine a threesome, hopefully limited to engine concerns, between Carmack, Abrash and Silverman in the 90s... the arch software rasterizer could have been born ^^
Thanks for your excellent review and bringing the code to life in classic fashion!
I want to play the Atomic Edition but I can't patch to 1.5, please make Chocolate Duke Nukem 3D work for v1.4!
Note that the intro demos work fine.
