Doom3 Source Code Review: Profiling (Part 4 of 6)
Xcode comes with a great profiling tool: Instruments. I used it in sampling mode during a play session (excluding game loading and level GPU pre-caching altogether):
Overview

The high-level view shows the three threads running in the process:
- Main thread, where game logic and rendering occur.
- Auxiliary thread, where inputs are collected and sound effects are mixed.
- Music thread (consuming 8% of resources), created by CoreAudio and calling idAudioHardwareOSX at regular intervals (note: sound effects are done with OpenAL but do not run in their own thread).
Main Thread
The Doom 3 MainThread runs... QuakeMain! Amusingly, the team that ported Quake 3 to Mac OS X must have reused some old code. Inside, the time breaks down as follows:
- 65% dedicated to graphics rendering (UpdateScreen).
- 25% dedicated to game logic: this is surprisingly high for an id Software game.
Game Logic
The game logic runs in gamex86.dll space (or game.dylib on Mac OS X):
The game logic accounts for 25% of the Main Thread time, which is unusually high. Two reasons:
- A.I.: The virtual machine is run and allows entities to think. All of the bytecode is interpreted and the scripting language seems to have been overused.
- The physics engine is more complex (LCP solvers) and hence more demanding than in previous games. It is run on each object and includes ragdoll and interaction solving.
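To give a feel for why interpreted bytecode is costly, here is a minimal sketch of an interpreter loop. It is not Doom 3's virtual machine: the `Op` opcodes, the `Instr` layout and `RunScript` are hypothetical names invented for this illustration, and a stack machine is assumed purely for brevity. The point is only that every scripted operation pays for a fetch, a dispatch and some stack traffic on top of the actual work.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcodes, for illustration only (not Doom 3's instruction set).
enum class Op : std::uint8_t { Push, Add, Mul, Halt };

struct Instr {
    Op  op;
    int operand;  // only meaningful for Op::Push
};

// Runs a program on a tiny stack machine and returns the top of the stack
// (or 0 if the stack is empty). Each instruction goes through a fetch and a
// switch dispatch before doing any useful work.
int RunScript(const std::vector<Instr>& program) {
    std::vector<int> stack;
    for (std::size_t pc = 0; pc < program.size(); ++pc) {
        const Instr& in = program[pc];  // fetch
        switch (in.op) {                // dispatch
        case Op::Push:
            stack.push_back(in.operand);
            break;
        case Op::Add: {
            assert(stack.size() >= 2);
            const int b = stack.back();
            stack.pop_back();
            stack.back() += b;
            break;
        }
        case Op::Mul: {
            assert(stack.size() >= 2);
            const int b = stack.back();
            stack.pop_back();
            stack.back() *= b;
            break;
        }
        case Op::Halt:
            return stack.empty() ? 0 : stack.back();
        }
    }
    return stack.empty() ? 0 : stack.back();
}
```

Feeding it the program Push 2, Push 3, Add, Push 4, Mul, Halt evaluates (2 + 3) * 4: six trips through the dispatch loop for what compiled code would do in a couple of instructions.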
Renderer
As previously described, the renderer is made of two parts:
- Frontend (idSessionLocal::Draw) accounting for 43.9% of the rendering process. Note that Draw is a pretty poor name since the frontend does not perform a single draw call to OpenGL!
- Backend (idRenderSystemLocal::EndFrame) accounting for 55.9% of the rendering process.
The load distribution is pretty much even and it is not that surprising since:
- The frontend performs a lot of calculation with regard to Visual Surface Determination.
- The frontend also performs model animation and shadow silhouette finding.
- The frontend uploads vertices to the GPU.
- The backend spends a lot of time setting up parameters for the shaders and communicating with the GPU (e.g. submitting triangle indices or per-vertex normal matrices for bump mapping in glDrawElements).
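The frontend and backend percentages are relative to the rendering time, not to the whole main thread. A small sketch of the conversion, using only the figures quoted in this article (the helper `ShareOfMainThread` and the constant names are made up for the example):

```cpp
#include <cassert>
#include <cmath>

// Converts a share expressed relative to a parent bucket into a share of the
// whole main thread. Arguments and result are fractions in [0, 1].
double ShareOfMainThread(double parentShare, double childShare) {
    assert(parentShare >= 0.0 && parentShare <= 1.0);
    assert(childShare >= 0.0 && childShare <= 1.0);
    return parentShare * childShare;
}

// Figures taken from the profile in this article.
const double kRenderingShare = 0.65;   // UpdateScreen, relative to the main thread
const double kFrontendShare  = 0.439;  // frontend, relative to rendering
const double kBackendShare   = 0.559;  // backend, relative to rendering
```

`ShareOfMainThread(kRenderingShare, kFrontendShare)` gives 0.28535 and `ShareOfMainThread(kRenderingShare, kBackendShare)` gives 0.36335, so the frontend takes roughly 28.5% of the main thread and the backend roughly 36.3%, to be compared with the 25% spent in game logic.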
Renderer: Frontend
No surprise here: most of the time (91%) is spent uploading data to the GPU in VBOs (R_AddModelSurfaces). A little bit of time (4%) is spent going through areas, trying to find all interactions (R_AddLightSurfaces). A minimal amount (2.9%) is spent in Visual Surface Determination: traversing the BSP and running the portal system.
Renderer: Backend
The backend obviously triggers a buffer swap (GLimp_SwapBuffers) and spends some time (10%) synchronizing with the screen since the game was running in a double-buffered environment.
5% is the cost of avoiding overdraw entirely, thanks to a first pass whose only job is to populate the Z-buffer (RB_STS_FillDepthBuffer).
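Why a depth-only first pass pays off can be shown with a small CPU-side simulation. This is a sketch, not the engine's code: `Fragment`, `ShadeSinglePass` and `ShadeWithDepthPrepass` are hypothetical names, and it assumes the shading pass only accepts fragments whose depth equals the value stored by the first pass.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// One fragment: which pixel it covers and how far it is from the camera.
struct Fragment {
    std::size_t pixel;
    float       depth;
};

// Single pass with a "less than" depth test: every fragment closer than the
// stored depth is shaded, even if a later fragment ends up covering it.
// Returns the number of (expensive) shading invocations.
std::size_t ShadeSinglePass(const std::vector<Fragment>& frags, std::size_t pixelCount) {
    std::vector<float> zbuffer(pixelCount, std::numeric_limits<float>::infinity());
    std::size_t shaded = 0;
    for (const Fragment& f : frags) {
        assert(f.pixel < pixelCount);
        if (f.depth < zbuffer[f.pixel]) {
            zbuffer[f.pixel] = f.depth;
            ++shaded;
        }
    }
    return shaded;
}

// Two passes: a cheap depth-only pass fills the Z-buffer, then only the
// fragments matching the stored depth are shaded.
std::size_t ShadeWithDepthPrepass(const std::vector<Fragment>& frags, std::size_t pixelCount) {
    std::vector<float> zbuffer(pixelCount, std::numeric_limits<float>::infinity());
    for (const Fragment& f : frags) {  // pass 1: depth only, no shading
        assert(f.pixel < pixelCount);
        if (f.depth < zbuffer[f.pixel]) {
            zbuffer[f.pixel] = f.depth;
        }
    }
    std::size_t shaded = 0;
    for (const Fragment& f : frags) {  // pass 2: shade visible fragments only
        if (f.depth == zbuffer[f.pixel]) {
            ++shaded;
        }
    }
    return shaded;
}
```

With three fragments stacked back to front on one pixel plus a single fragment on a second pixel, the single-pass version shades four fragments while the pre-pass version shades two, one per visible pixel. The saving grows with the cost of the shader, which matters when each light adds a pass.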
Flat stats

If you feel like loading the Instruments trace and exploring it yourself: here is the profile file.
Comments (75)
Very nice article, thanks for sharing your work, I hope you will write a review of id tech 3.
I cloned repo and tried to find unit and integration tests, but couldn't. Did I miss something?
It's quite amazing that such a big project could be maintained and developed with reasonable velocity without tests.
I found the way Doom 3 did the interfaces on the screens very impressive and immersive - maybe even more than the lighting effects. Can you elaborate a bit on how they did that?)
Of the many forks you mentioned. Can you name some that are actively improving the code and adding features?
"...Carmack and we was nice..."
I believe you meant to say "he was nice"
Great article, nonetheless!
Great read... again!
It is really astonishing that there is one render-pass per light and still run so fluid on 2004 graphics hardware.
Definitely looking forward to a review of idTech3. Apart from the write-up and nice graphics, it might be an "easy" job for you compared to anyone else. With the knowledge of Quake 1 and 2, this will also make it possible to put everything into perspective nicely.
Cheers,
Daniel
I've enjoyed your past code reviews so much I've gone back and read them more than once; I'm certain I will read this at least a few times!
I sincerely appreciate the time you took to write this. I hope that you have the time to review iDTech3 as well, especially since it will provide an opportunity to compare and contrast how the codebase has progressed since then.
You will get a cookie for that ^^
Can you explain how the use of statically instantiated instances of system level objects and the use of a pointer to their abstract base classes avoids the usual vtable lookup overhead?
However, id's Trinity is named after the Dallas Trinity River, as Carmack explained in this interview http://www.firingsquad.com/features/carmack/
You've got some very interesting stuff here :)
Great job for this very interesting article.
One question, you said :
"Abstraction and polymorphism are used a lot across the code. But a nice trick avoids the vtable performance hit on some objects."
Can you tell us a little more about this 'nice trick' ?
Thanks !
A Quake3 review would also be very nice (or even ioquake3?).
// the following is Mr.E's code
// Note from FAB WHO is Mr.E ?
FWIW I'm going to hazard a guess and say that Mr.E is "Mr. Elusive" a.k.a. Jon Paul van Waveren. =)
http://element61.blogspot.de/2005/08/looking-at-quake-3-source-part-1.html
http://element61.blogspot.de/2005/08/looking-at-quake-3-source-part-2.html
http://element61.blogspot.de/2005/09/looking-at-quake-3-source-part-3.html
A lot of work was done on ioquake, and there is a nicely improved codebase as well (normal maps, PNG textures and so on):
http://www.moddb.com/mods/etxreal
DOOM 1.3.1.1304 win-x86 Jun 11 2012 15:14:43
4081 MHz AMD CPU with MMX & SSE & SSE2 & SSE3 & HTT
15840 MB System Memory
0 MB Video Memory
Winsock Initialized
Found interface: {53C6CF71-C4AA-4430-9C4C-DAF112BEA668} Intel(R) 82583V Gigabit Network Connection - 192.168.1.130/255.255.255.0
Found interface: {2B7C12A1-7F58-4FAE-B3EA-1A35703E9055} VirtualBox Host-Only Ethernet Adapter - 192.168.56.1/255.255.255.0
Found interface: {0D903444-3D1B-4028-850D-2058C798DBBF} VMware Virtual Ethernet Adapter for VMnet1 - 192.168.88.1/255.255.255.0
Found interface: {4EB006CA-2A77-4C41-9349-05E7EDCEE56B} VMware Virtual Ethernet Adapter for VMnet8 - 192.168.109.1/255.255.255.0
Sys_InitNetworking: adding loopback interface
doom using MMX & SSE & SSE2 & SSE3 for SIMD processing
enabled Flush-To-Zero mode
enabled Denormals-Are-Zero mode
------ Initializing File System ------
Loaded pk4 C:\Program Files (x86)\DOOM 3\base\game00.pk4 with checksum 0xf07eb555
Loaded pk4 C:\Program Files (x86)\DOOM 3\base\pak000.pk4 with checksum 0x28d208f1
Loaded pk4 C:\Program Files (x86)\DOOM 3\base\pak001.pk4 with checksum 0x40244be0
Loaded pk4 C:\Program Files (x86)\DOOM 3\base\pak002.pk4 with checksum 0xc51ecdcd
Loaded pk4 C:\Program Files (x86)\DOOM 3\base\pak003.pk4 with checksum 0xcd79d028
Loaded pk4 C:\Program Files (x86)\DOOM 3\base\pak004.pk4 with checksum 0x765e4f8b
Current search path:
C:\Program Files (x86)\DOOM 3/base
C:\Program Files (x86)\DOOM 3\base\pak004.pk4 (5137 files)
C:\Program Files (x86)\DOOM 3\base\pak003.pk4 (4676 files)
C:\Program Files (x86)\DOOM 3\base\pak002.pk4 (6120 files)
C:\Program Files (x86)\DOOM 3\base\pak001.pk4 (8972 files)
C:\Program Files (x86)\DOOM 3\base\pak000.pk4 (2698 files)
C:\Program Files (x86)\DOOM 3\base\game00.pk4 (2 files)
game DLL: 0x0 in pak: 0x0
Addon pk4s:
file system initialized.
--------------------------------------
----- Initializing Decls -----
------------------------------
------- Initializing renderSystem --------
using ARB renderSystem
renderSystem initialized.
--------------------------------------
4966 strings read from strings/english.lang
Couldn't open journal files
execing editor.cfg
execing default.cfg
execing DoomConfig.cfg
"\\" isn't a valid key
couldn't exec autoexec.cfg
4966 strings read from strings/english.lang
----- Initializing Sound System ------
sound system initialized.
--------------------------------------
found DLL in pak file: C:\Program Files (x86)\DOOM 3\base\game00.pk4/gamex86.dll
copy gamex86.dll to C:\Program Files (x86)\DOOM 3\base\gamex86.dll
idRenderSystem::Shutdown()
Shutting down OpenGL subsystem
...shutting down QGL
wrong game DLL API version
Hope this helps, have fun!
I think it would be super cool if you did the review of idTech3 (for the sake of completeness? heh).
And yes, a Quake 3 code review would be awesome ;)
Keep going, you rock.
Nice to see someone actually took down some deep steps into Johns system.
Just inspirating :-)
There is still one thing I do not understand: the network update code (idAsyncNetwork::RunFrame()) seems to be called only once in the main loop.
That means someone with a pretty slow graphics card (e.g. only capable of 10 fps) will only send packets to the server at the same rate.
Other players would see him with "laggy movement" while it could be a lot better.
Why could idAsyncNetwork::RunFrame() not be placed in a separate thread (like input and sound mixing), so that packets would be sent to the server independently from updating the world and rendering?
IMAO mostly user commands are sent to the server, so there is no need to have the world updated (using game->runframe()) before sending this information.
1. Laggy movement is avoided via client position prediction.
2. There is little value in sending commands at 60Hz if the player cannot even see the result of their actions.
ftp://ftp.idsoftware.com/idstuff/doom3/source/CodeStyleConventions.doc
is there any other place this can be downloaded from?
http://fd.fabiensanglard.net/doom3/CodeStyleConventions.pdf
Big Thx
Thank you for sharing with us !
anyways... like it or not, good or bad, when your main target platform is MS Windows, your best option is Visual Studio !
Of course you can use C and C++ with VS.NET.
But when the .NET framework and the associated Visual Studio were released, they were strongly associated with C#, VB and J++ (see CLI Infrastructure). Developers were strongly encouraged to use things that would have tied the codebase to Windows, but the id Software dev team did not use any of those Microsoft-only features and I found it amusing.
Hi there, thanks for the reply :)
I do agree with your statement, Microsoft naturally intended VS to be used mainly for Windows-only purposes, but as I said in my first comment, in general, if your main target platform is MS Windows (or Xbox) or even if it is not, but you're developing on an MS Windows machine (personally I've been doing iOS and Android development on VS2010), your best option in terms of both the IDE and the C/C++ compiler is Visual Studio. In particular Visual Studio's profiler, performance analyzer, code analysis, and even debugger -at least on MS Windows- is truly unmatched.
I'm not saying it is the best option that could exist, of course people with another mindset than Microsoft's could come up with something much better, but the truth is that at least on the MS Windows front this hasn't happened yet. VS is the best option that does exist at the moment until someone or some company comes up with something better.
So I still don't get your point but that doesn't really matter :)
I'm truly amazed by this review and your notes, believe it or not I have made this page as one of my many browser's home pages ! again, thanks for posting your notes and this review. I really needed it and it has helped me a lot.
Thanks for digging into the source code to write this. I wish I took up programming in my earlier years to understand the more technical aspects of this post :)
Congratulations!