Doom3 Source Code Review: Scripting VM (Part 5 of 6)
From idTech1 to idTech3 the only thing that completely changed every time was the scripting system:
- idTech1: QuakeC running in a Virtual Machine.
- idTech2: C compiled to an x86 shared library (no virtual machine).
- idTech3: C compiled to bytecode with LCC, running in QVM (Quake Virtual Machine). On x86 the bytecode was converted to native instructions at load time.
idTech4 is no exception. Once again everything is different:
- The scripting is done via an object-oriented language similar to C++.
- The language is fairly limited (no typedef, five basic types).
- It is always interpreted via a virtual machine: there is no JIT conversion to native instructions like in idTech3 (John Carmack elaborated on this during our Q&A).
A good introduction is to read the Doom3 Scripting SDK notes.
Architecture
Here is the big picture:
Compilation: At load time the idCompiler is fed one predetermined .script file. A series of #include directives results in a script stack that contains all the script strings and the source code of every function. It is scanned by an idLexer that generates basic tokens. Tokens enter the idParser and one giant bytecode is generated and stored in the idProgram singleton: this constitutes the Virtual Machine RAM and contains both the .text and .data VM segments.
Virtual Machine: At runtime the engine allocates real CPU time to each idThread (one after another) until the end of the linked list is reached. Each idThread contains an idInterpreter that saves the state of the virtual CPU. Unless the interpreter goes wild and runs for more than 5,000,000 instructions, it will not be pre-empted: this is cooperative multitasking.
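To make the scheduling idea concrete, here is a minimal sketch of one cooperative scheduler pass. It is not the engine's code: `ScriptThread`, `RunFrame` and `kMaxInstructionsPerSlice` are hypothetical names, and the threads live in a `std::vector` rather than the linked list the engine uses. Only the 5,000,000 figure comes from the text above.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for an idThread: it runs until it yields or finishes.
struct ScriptThread {
    // Called with the instruction budget for this turn.
    // Returns true when the thread has finished, false when it merely yielded.
    std::function<bool(int budget)> run;
    bool finished = false;
};

// Runaway guard for a single turn (the article quotes 5,000,000 instructions).
constexpr int kMaxInstructionsPerSlice = 5000000;

// One scheduler pass: every live thread gets exactly one turn, in list order.
// Nothing interrupts a thread mid-turn; it hands control back by returning.
// Returns the number of threads still alive after the pass.
std::size_t RunFrame(std::vector<ScriptThread>& threads) {
    std::size_t alive = 0;
    for (ScriptThread& t : threads) {
        if (t.finished) {
            continue;  // dead threads are skipped
        }
        t.finished = t.run(kMaxInstructionsPerSlice);
        if (!t.finished) {
            ++alive;
        }
    }
    return alive;
}
```

A game loop would call `RunFrame` once per frame. The important property is that the scheduler never interrupts a thread: a script that never yields stalls everything behind it, which is why a runaway limit is needed at all.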
Compiler
The compilation pipeline is similar to what you find in any compiler, such as Google's V8 or Clang, except that there is no preprocessor.
Hence tasks such as comment skipping, macros and directives (#include, #if) have to be handled in the lexer and the parser.
Since the idLexer is reused all across the engine to parse every text asset (maps, entities, camera paths), it is very primitive. As an example, it only returns five types of tokens:
- TT_STRING
- TT_LITERAL
- TT_NUMBER
- TT_NAME
- TT_PUNCTUATION
So the parser actually has to perform much more than in a "standard" compiler pipeline.
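A rough sketch of what such a primitive lexer looks like is shown below. It reuses the five token names from the list, but everything else is an assumption made for illustration: the `Tokenize` function, the rule that double quotes give TT_STRING and single quotes give TT_LITERAL, and the very loose number handling are not taken from idLexer.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

enum TokenType { TT_STRING, TT_LITERAL, TT_NUMBER, TT_NAME, TT_PUNCTUATION };

struct Token {
    TokenType type;
    std::string text;
};

// Hypothetical helper: splits a source string into the five token categories.
std::vector<Token> Tokenize(const std::string& src) {
    std::vector<Token> out;
    std::size_t i = 0;
    const std::size_t n = src.size();
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        // No preprocessor: the lexer itself has to skip "//" comments.
        if (c == '/' && i + 1 < n && src[i + 1] == '/') {
            while (i < n && src[i] != '\n') ++i;
            continue;
        }
        // Quoted text: "..." is a string, '...' is a literal (assumption).
        if (c == '"' || c == '\'') {
            const std::size_t start = ++i;
            while (i < n && src[i] != static_cast<char>(c)) ++i;
            out.push_back({c == '"' ? TT_STRING : TT_LITERAL, src.substr(start, i - start)});
            if (i < n) ++i;  // skip the closing quote
            continue;
        }
        // Digits and dots, with no validation of the number format.
        if (std::isdigit(c)) {
            const std::size_t start = i;
            while (i < n && (std::isdigit(static_cast<unsigned char>(src[i])) || src[i] == '.')) ++i;
            out.push_back({TT_NUMBER, src.substr(start, i - start)});
            continue;
        }
        // Identifiers and keywords look the same at this level.
        if (std::isalpha(c) || c == '_') {
            const std::size_t start = i;
            while (i < n && (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) ++i;
            out.push_back({TT_NAME, src.substr(start, i - start)});
            continue;
        }
        // Everything else is a one-character punctuation token.
        out.push_back({TT_PUNCTUATION, std::string(1, src[i])});
        ++i;
    }
    return out;
}
```

Note that keywords such as `float` come out as plain TT_NAME tokens: telling a type name from a variable name is left to the parser, which is exactly the extra work mentioned above.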
At startup the idCompiler loads the first script, script/doom_main.script. A series of #include directives then builds a stack of scripts that are combined into one giant script.
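The include stack can be pictured with a small sketch. This is an assumption-heavy illustration, not how idCompiler is written: the scripts live in an in-memory map instead of on disk, the `ExpandScript` helper and the `ScriptFiles` alias are made up, and only lines that start with `#include "` are recognized.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical in-memory file system: script name -> script text.
using ScriptFiles = std::map<std::string, std::string>;

// Appends the text of `name` to `out`, descending into each #include as it is met.
// `stack` holds the scripts currently being expanded (the "script stack").
void ExpandScript(const ScriptFiles& files, const std::string& name,
                  std::vector<std::string>& stack, std::string& out) {
    for (const std::string& open : stack) {
        if (open == name) throw std::runtime_error("recursive #include: " + name);
    }
    const auto it = files.find(name);
    if (it == files.end()) throw std::runtime_error("missing script: " + name);

    stack.push_back(name);
    std::istringstream lines(it->second);
    std::string line;
    const std::string directive = "#include \"";
    while (std::getline(lines, line)) {
        if (line.compare(0, directive.size(), directive) == 0) {
            const std::size_t end = line.find('"', directive.size());
            if (end == std::string::npos) throw std::runtime_error("unterminated #include");
            // Push the included script, expand it fully, then resume this one.
            ExpandScript(files, line.substr(directive.size(), end - directive.size()), stack, out);
        } else {
            out += line;
            out += '\n';
        }
    }
    stack.pop_back();
}
```

Starting from a single root script, the output is one flat source text in which every included script appears at the position of its directive.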
The parser seems to be a standard top-down recursive descent parser. The scripting language grammar seems to be LL(1), requiring no backtracking (even though the lexer has the capability to "unread" up to one token). If you have ever had a chance to read the dragon book you will not be lost... otherwise this is a good reason to get started ;) !
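For readers who have not met recursive descent before, here is a minimal sketch of the technique on a toy arithmetic grammar. It has nothing to do with the Doom3 script grammar: the `ExprParser` class and its grammar are invented for illustration. Each grammar rule becomes one function, and a single symbol of lookahead (`Peek`) is enough to pick the next rule, so the parser never backtracks.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

class ExprParser {
public:
    explicit ExprParser(const std::string& src) : src_(src), pos_(0) {}

    // Parses the whole input and returns its integer value.
    int Parse() {
        const int value = Expr();
        if (Peek() != '\0') throw std::runtime_error("unexpected trailing input");
        return value;
    }

private:
    // One symbol of lookahead: skip spaces, look at the next character, consume nothing else.
    char Peek() {
        while (pos_ < src_.size() && std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
        return pos_ < src_.size() ? src_[pos_] : '\0';
    }
    // Consumes and returns the lookahead character.
    char Next() {
        const char c = Peek();
        if (c != '\0') ++pos_;
        return c;
    }

    // expr := term (('+' | '-') term)*
    int Expr() {
        int value = Term();
        while (Peek() == '+' || Peek() == '-') {
            const char op = Next();
            const int rhs = Term();
            value = (op == '+') ? value + rhs : value - rhs;
        }
        return value;
    }
    // term := factor ('*' factor)*
    int Term() {
        int value = Factor();
        while (Peek() == '*') {
            Next();
            value *= Factor();
        }
        return value;
    }
    // factor := NUMBER | '(' expr ')'
    int Factor() {
        if (Peek() == '(') {
            Next();
            const int value = Expr();
            if (Next() != ')') throw std::runtime_error("expected ')'");
            return value;
        }
        if (!std::isdigit(static_cast<unsigned char>(Peek()))) {
            throw std::runtime_error("expected number");
        }
        int value = 0;
        while (pos_ < src_.size() && std::isdigit(static_cast<unsigned char>(src_[pos_]))) {
            value = value * 10 + (src_[pos_++] - '0');
        }
        return value;
    }

    std::string src_;
    std::size_t pos_;
};
```

Calling `ExprParser("2 + 3 * (4 - 1)").Parse()` walks the three rules recursively and evaluates the expression with the usual precedence. Peeking without consuming plays the same role as a lexer that can "unread" one token.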
Interpreter
At runtime, events trigger the creation of idThread objects, which are not Operating System threads but Virtual Machine threads. They are given some runtime by the CPU. Each idThread has an idInterpreter that keeps track of the Instruction Pointer and the two stacks (one for the data/parameters and one to keep track of the function calls).
Execution occurs in idInterpreter::Execute until the interpreter relinquishes control of the Virtual Machine: this is cooperative multitasking.
idThread::Execute

    bool idInterpreter::Execute(void) {
        doneProcessing = false;
        while( !doneProcessing && !threadDying ) {
            instructionPointer++;
            st = &gameLocal.program.GetStatement( instructionPointer );
            // op is an unsigned short, so the VM can have up to 65,536 opcodes
            switch( st->op ) {
                .
                .
                .
            }
        }
    }
Once the idInterpreter relinquishes control, the next idThread::Execute method is called, until no more threads need execution time. The overall architecture reminded me a lot of the Another World VM design.
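The fetch/dispatch loop and the two stacks can be sketched in a few lines. The toy machine below is an assumption made for illustration: its six opcodes, the `Statement` layout and the `MiniInterpreter` type are invented and much simpler than the real idInterpreter. It also fetches before incrementing the instruction pointer, whereas the excerpt above increments first.

```cpp
#include <cstddef>
#include <vector>

// Invented opcode set; the real VM has many more.
enum Opcode : unsigned short { OP_PUSH, OP_ADD, OP_CALL, OP_RETURN, OP_YIELD, OP_HALT };

struct Statement {
    Opcode op;
    int arg;  // immediate value for OP_PUSH, target index for OP_CALL
};

struct MiniInterpreter {
    const std::vector<Statement>* program = nullptr;
    std::size_t instructionPointer = 0;  // index of the next statement to run
    std::vector<int> dataStack;          // operands and results
    std::vector<std::size_t> callStack;  // return addresses
    bool threadDying = false;

    // Runs until the script yields or halts. Returns true once the thread is done.
    bool Execute() {
        bool doneProcessing = false;
        while (!doneProcessing && !threadDying) {
            const Statement& st = (*program)[instructionPointer++];
            switch (st.op) {
            case OP_PUSH:
                dataStack.push_back(st.arg);
                break;
            case OP_ADD: {
                const int b = dataStack.back(); dataStack.pop_back();
                const int a = dataStack.back(); dataStack.pop_back();
                dataStack.push_back(a + b);
                break;
            }
            case OP_CALL:
                callStack.push_back(instructionPointer);  // resume after the call
                instructionPointer = static_cast<std::size_t>(st.arg);
                break;
            case OP_RETURN:
                instructionPointer = callStack.back();
                callStack.pop_back();
                break;
            case OP_YIELD:
                doneProcessing = true;  // hand control back to the scheduler
                break;
            case OP_HALT:
                threadDying = true;
                break;
            }
        }
        return threadDying;
    }
};
```

Because the instruction pointer and both stacks live in the interpreter object, a thread that yields can be resumed later simply by calling `Execute` again: the state of the virtual CPU is still there.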
Trivia: The bytecode is never converted to x86 instructions since it was not meant to be heavily used. But in the end too much was done via scripting, and Doom3 would probably have benefited immensely from a JIT x86 converter just like the one Quake3 had.
Recommended readings
A great way to understand more about the virtual machine is to read the classic Compilers: Principles, Techniques, and Tools.