AI and Games is a YouTube series made possible thanks to crowdfunding on Patreon. Support the show to have your name in video credits, vote for future episode topics, get early access and exclusive patron-only videos.
DOOM is the archetype of the first-person shooter genre and helped define an entire generation of games in the 1990s. While not the first of its kind, nor even the first by developers id Software, it was a game that changed the landscape of the industry. And it's an important topic to explore not just because of its popularity, but because - much like other games from the 90s we've explored to date - it achieved so much using limited hardware and without many of the tools and standards game developers and players now take for granted.
DOOM is, in essence, a dungeon crawler with guns, in which players must navigate complex non-linear environments to find keys and activate switches that will move them one step closer to the exit. Along the way, players fight a myriad of hellish and demonic creatures, ranging from possessed gun-wielding soldiers to fireball-tossing imps, flying enemies such as the Cacodemon and Lost Souls, and the brute strength of the Pinky Demon and Baron of Hell. The sequel DOOM II, released in 1994, added even more variation, with Hell Knights, Revenants, Arachnotrons, the Mancubus, Pain Elementals and the Archvile. All of these characters move through the world, respond to player behaviour and attack using their own internal logic. Perhaps the most impressive aspect is the sheer number of enemies that the game can accommodate at once, all the while handling pathfinding, line of sight tests, collision checks, interactions and more.
But DOOM was no slouch. The game capitalised on the huge changes happening in the home PC market of the period. Hardware was becoming more affordable and the landscape had shifted in a way that benefitted game development. Intel CPUs, notably the 486 from 1989, were becoming increasingly affordable, with the 486DX2 released in 1992 proving even more powerful. RAM was becoming cheaper and the move toward Windows 3.1 had pushed graphics adapters in chipsets to become more commonplace. This change in the market led to id Software throwing out pretty much everything they'd built for their previous game, 1992's Wolfenstein 3D. And as we'll see, there are numerous optimisations in the code that allow it to achieve things that similar games of the era would fail to replicate.
DOOM's enemy AI is just one kind of object known in the game as a 'thinker': any object that has to make decisions, typically ones that play out over a period of multiple frames. The original DOS build of the game runs its logic at 35 frames (or 'tics') per second, with each frame handling the logic for all four main classes of thinker. The first class is all of the actors: meaning the player, non-player characters and the level itself. Secondly, there's the status bar on the bottom of the screen, then the in-game map and lastly the rendering of the HUD itself.
The actual core logic of the enemies is a Finite State Machine: a simple but effective mechanism that states which behaviour a character executes when in a given state, and what conditions will force it to change. While released several years later, Valve's Half-Life popularised FSMs for modern shooters courtesy of the nature of its C++ codebase, while also providing mechanisms for characters to have goals they can try to satisfy. DOOM operates with a similar structure, but it lacks a lot of the same flourishes, as well as some of the more seamless integration. The big reason for this is that there's no scripting language in the DOOM engine that allows designers to configure the NPCs to do what they want. As we'll see shortly, the actual logic is all written in the C programming language, and there are a lot of very clever tweaks implemented to allow for the variety of different enemy designs, not to mention all the encounters players face in the game.
The full diagram is shown here now, and there are some interesting quirks in how it all works, so let's discuss the high-level behaviour and then I will burrow down into individual topics.
Enemies always start in the SPAWN state: a mechanism whereby they are idle and await an event that will force them to go into a more active behaviour. They're waiting to see or hear something that will force them to act, and will simply walk on the spot until then. Players of DOOM will have spotted this on occasion, though the most obvious instance is in Entryway: the opening level of DOOM II, where the zombiemen have their backs to you.
Upon receiving a sensory input to tell them the enemy is nearby, they go into the 'See' state, which allows them to move around the map and head towards their designated target. We'll go into more detail shortly on how all of that works. Provided the situation will allow it, an enemy can then attempt to attack the player, resulting in them moving into the Melee or Range states, and then reverting back to the See state once the attack is complete.
Meanwhile, the monster can only enter the pain state externally: which makes sense given it should only happen when the player hurts it. Otherwise, that would mean the enemies are essentially hurting themselves I guess. It then plays the corresponding animation frames and any sound effects, and gets back to the core logic. As we'll see in a second, NPCs don't go into the pain state every time they take damage, but instead, it's based on a probability that is encoded in the enemy's design.
Lastly, a monster can die in one of two ways. There is the simple DIE state, brought on by the monster receiving a total amount of damage that exceeds its starting health. However, there's a special case called XDIE, which is used when a monster is gibbed and turns into a pile of goop. Once again these states can only be transitioned into by external logic, rather than the monster itself deciding it should turn into freakin' confetti. The gib will only occur if the damage received at the time of death exceeds the remaining health of the monster, plus its original starting health.
So for example, an Imp starts with 60 health points. If it's been shot already and is down to 10 health, then in order to be gibbed, it has to receive a minimum of 71 damage on the next attack. Given the lower starting health values of the likes of the zombieman and shotgun guy (who have 20 and 30 starting health respectively), it's relatively easy to gib them with an exploding barrel, the rocket launcher or even the berserk pack. Meanwhile, enemies such as the Cacodemon or Hell Knight, which start with 400 and 500 health, are practically impossible to gib. That said, you can gib pretty much any monster if you can telefrag them (i.e. by teleporting into them, given that does 10,000 damage), such as in Perfect Hatred, the second mission of DOOM's Thy Flesh Consumed episode, where you can telefrag the Cyberdemon given it only has a meagre 4000 health.
You'll notice there's one last state: RAISE. This was introduced in DOOM II courtesy of the Archvile, the demonic priest that can resurrect all enemies with the exception of the Lost Souls, the Cyberdemon, the Spider Mastermind and other Archviles. As we'll see in a minute, this results in some weird behaviour during its actual execution. So let's get into that next: each enemy adheres to this state machine, but their individual behaviours are quite distinct. So how are they defined and how do they work?
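The gib rule described above boils down to a single comparison. Here's a minimal sketch of it in C; the names `Monster`, `DeathResult` and `apply_damage` are made up for illustration and are not taken from the id Software source.

```c
/* Hypothetical, simplified monster record for illustration only. */
typedef struct {
    int spawn_health; /* health this monster type starts with */
    int health;       /* health remaining right now */
} Monster;

typedef enum { STILL_ALIVE, STATE_DIE, STATE_XDIE } DeathResult;

/* Apply one hit and report which death state (if any) follows. */
DeathResult apply_damage(Monster *m, int damage)
{
    m->health -= damage;
    if (m->health > 0)
        return STILL_ALIVE;
    /* Gib only when the hit exceeds remaining health plus starting
       health, i.e. health has dropped below -spawn_health. */
    if (m->health < -m->spawn_health)
        return STATE_XDIE;
    return STATE_DIE;
}
```

With an imp at 10 health (60 at spawn), a hit of 70 gives the ordinary DIE result, while 71 is the first value that tips it into XDIE, matching the worked example.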
To execute the logic, any actor such as an enemy is spawned in with a thinker object attached to them. Each thinker is a struct written in the C programming language that has inside it a pointer to a function of game logic the enemy must execute, alongside a range of additional variables that help it take shape. This function helps each monster to determine the particular behaviour it should exhibit, based on its current state.
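To make the idea of a struct carrying a function pointer concrete, here is a stripped-down sketch. The `Thinker` layout, the singly linked list and the `count_tics` behaviour are all simplifying assumptions for illustration, not the engine's actual definitions.

```c
#include <stddef.h>

/* Hypothetical, simplified thinker: a function pointer plus a link
   so the game can walk every thinker once per tic. */
typedef struct Thinker Thinker;
typedef void (*ThinkFn)(Thinker *self);

struct Thinker {
    Thinker *next;       /* next thinker in the list, or NULL */
    ThinkFn  think;      /* logic to run this tic; NULL means "do nothing" */
    int      tics_alive; /* example of per-object data */
};

/* Example think function: counts how many tics the object has existed. */
void count_tics(Thinker *self)
{
    self->tics_alive++;
}

/* Run every thinker in the list once, as a game loop would each tic. */
void run_thinkers(Thinker *head)
{
    for (Thinker *t = head; t != NULL; t = t->next) {
        if (t->think != NULL)
            t->think(t);
    }
}
```

The appeal of this design is that the loop doesn't need to know what kind of object it is updating: a monster, a door or a moving platform can all sit in the same list, each with its own function.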
But the definition of each enemy in DOOM is actually achieved outside of the codebase. A text file from the project is imported and compiled into an 'info' file, which contains within it all of the enemy's specifications as well as the definitions for all states. The information from the text file is compiled into a struct that carries the following elements:
- Their internal ID number in the game.
- Their starting health.
- Their movement speed.
- Their probability of pain state interrupts.
- Their radius and height
- The specific properties of the NPC.
In the case of imps, they are solid objects you can't walk through (MF_SOLID), they can be shot (MF_SHOOTABLE) and damaged and their death counts towards the total number of enemies in the level (MF_COUNTKILL).
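As a rough picture of what such a compiled definition might look like, here is a sketch in C. The flag names, the 60 health and the pain chance of 200 come from this article; the field names, the bit values and the remaining numbers are assumptions chosen for illustration.

```c
/* Flag names are from the article; the bit values are illustrative. */
enum {
    MF_SOLID     = 1 << 0, /* blocks movement */
    MF_SHOOTABLE = 1 << 1, /* can be hit and damaged */
    MF_COUNTKILL = 1 << 2, /* counts toward the level's kill total */
    MF_AMBUSH    = 1 << 3  /* see the ambush discussion later on */
};

/* Hypothetical layout mirroring the list of elements above. */
typedef struct {
    int      id;           /* internal ID number */
    int      spawn_health; /* starting health */
    int      speed;        /* movement speed */
    int      pain_chance;  /* pain state interrupt probability */
    int      radius;
    int      height;
    unsigned flags;        /* properties of the NPC */
} MonsterInfo;

static const MonsterInfo imp_info = {
    .id           = 3001, /* illustrative value */
    .spawn_health = 60,   /* from the article */
    .speed        = 8,    /* illustrative value */
    .pain_chance  = 200,  /* from the article */
    .radius       = 20,   /* illustrative value */
    .height       = 56,   /* illustrative value */
    .flags        = MF_SOLID | MF_SHOOTABLE | MF_COUNTKILL
};

/* Returns 1 if the given property flag is set on this monster type. */
int has_flag(const MonsterInfo *info, unsigned flag)
{
    return (info->flags & flag) != 0;
}
```

Packing the properties into bit flags means a single integer can describe any combination of them, and a check like `has_flag(&imp_info, MF_SHOOTABLE)` is one bitwise AND.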
As mentioned previously, monsters do not always go into the pain state when hurt. Instead, it is governed by the pain state interrupt probability, which in the codebase is referred to as pain chance. Every time an enemy is hurt by a weapon, there is a probability check to decide whether the enemy will go into the PAIN state, thereby interrupting its movement and any attacks. The number ranges from 0 to 256 and is configured for each enemy type. In the case of the imp and shotgun guy, these are 200 and 170 respectively, meaning that any time the imp is hurt it has a 79.3% chance of being put in the PAIN state, while the shotgun guy only has a 67.6% chance. But the trick is that each individual hit from a weapon counts towards this. So a rocket counts as only one hit, while the shotgun technically has seven separate attacks courtesy of its spread. This is why the chaingun is such a useful weapon against the likes of the Cacodemon and Arachnotron: each monster has a pain chance of 128 (or 50%), but each consecutive bullet fired is another opportunity to trigger the pain state (or, as it was referred to in the original DOOM manual, making them "do the chaingun cha-cha"). This has a huge impact on weapon selection during gameplay - to the point that it's so intrinsic to the gunplay of DOOM that it was reimplemented in the 2016 reboot. Some enemies have incredibly low pain chances - with the Cyberdemon and Archvile both in single digits (5.5% and 3.1%) - while others like the Lost Souls have a 100% pain chance. Technically the barrels also have a pain chance, but it's set to 0 - which y'know, makes sense because they're barrels.
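To see why many small hits beat one big one, here is a small sketch of the per-hit check and of the odds across several hits. It assumes an idealised, uniform and independent roll between 0 and 255, so its percentages are approximations and need not match the figures quoted above exactly; the function names are hypothetical.

```c
/* One check per hit: `roll` is a random value in the range 0..255.
   Returns 1 if the monster should be interrupted into PAIN. */
int pain_triggers(int pain_chance, int roll)
{
    return roll < pain_chance;
}

/* Chance that at least one of `hits` separate hits triggers PAIN,
   under the simplifying assumption of uniform, independent rolls. */
double chance_any_pain(int pain_chance, int hits)
{
    double miss = 1.0 - pain_chance / 256.0;
    double all_miss = 1.0;
    for (int i = 0; i < hits; i++)
        all_miss *= miss;
    return 1.0 - all_miss;
}
```

Under these assumptions a pain chance of 128 gives 50% for a single hit but 75% across two hits, which is the intuition behind stun-locking a Cacodemon with a stream of bullets.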
On top of all of the information about the behaviour of the enemy itself, there are also individual action definitions for each state. These are custom definitions for each of the states in the state machine. So we can see that the imp uses the attack action for both the melee and ranged FSM states. Meanwhile the shotgun guy doesn't have a state defined for melee attacks.
Now as for the actions themselves, these are a lot more complicated. Each state is compiled down into a definition for each individual action frame that is executed in that state. So if we consider the attack for the imp, it blows up into even more information. This includes:
- The family of sprites that are read from when rendering this behaviour and the frames to be used.
- The number of in-game tics that frame executes for.
- The in-game action in the source code that it should execute on that action frame.
- The next action frame it should then execute.
So in the case of the imp attacking (see above) it will...
- Render the first two sprites for 8 tics each, and run the code that ensures the imp turns to face the player.
- Then draw the third frame for 6 tics and run the special TroopAttack function, which is designed specifically for the Imp.
- Once it has finished the third action frame, it will transition back to the main 'see' state for the imp.
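The steps above can be sketched as a small table of action frames, each pointing at the next. The durations (8, 8 and 6 tics) follow the description above; the type names, the stand-in action functions and the duration of the 'see' frame are assumptions for illustration.

```c
#include <stddef.h>

typedef struct Actor Actor;
typedef void (*ActionFn)(Actor *);

typedef enum { ST_ATK1, ST_ATK2, ST_ATK3, ST_SEE, NUM_STATES } StateId;

/* One action frame: what to draw, for how long, what to run, what follows. */
typedef struct {
    int      sprite_frame; /* index into the sprite family */
    int      tics;         /* how long this frame lasts */
    ActionFn action;       /* code to run on entering the frame, or NULL */
    StateId  next;         /* frame to move to afterwards */
} StateDef;

struct Actor {
    StateId state;
    int     tics_left;
    int     faced_target; /* counters so the example can be observed */
    int     attacks_made;
};

/* Stand-ins for the real actions. */
static void face_target(Actor *a)  { a->faced_target++; }
static void troop_attack(Actor *a) { a->attacks_made++; }

static const StateDef states[NUM_STATES] = {
    [ST_ATK1] = { 0, 8, face_target,  ST_ATK2 },
    [ST_ATK2] = { 1, 8, face_target,  ST_ATK3 },
    [ST_ATK3] = { 2, 6, troop_attack, ST_SEE  },
    [ST_SEE]  = { 3, 3, NULL,         ST_SEE  }  /* placeholder duration */
};

/* Enter a frame: reset its timer and run its action, if any. */
void set_state(Actor *a, StateId s)
{
    a->state = s;
    a->tics_left = states[s].tics;
    if (states[s].action != NULL)
        states[s].action(a);
}

/* Advance one tic, moving to the next frame when the timer runs out. */
void tick(Actor *a)
{
    if (--a->tics_left <= 0)
        set_state(a, states[a->state].next);
}
```

Calling `set_state(&imp, ST_ATK1)` and then `tick(&imp)` 22 times walks through all three attack frames and lands back in the 'see' frame, having faced the target twice and attacked once.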
And when you look at the code for TroopAttack, it highlights why there isn't a separate state for melee versus ranged attacks: the Imp will check if it's in melee range and then commit to that, or it will simply launch a missile at the player. These rules are in place for things such as enemy characters attacking the player, dying or being gibbed after taking damage, and animated sprites in the environment. The fun part, returning to the RAISE state mentioned earlier, is that any enemy that can be resurrected by the Archvile has a set of RAISE action definitions, which use the exact same set of sprite frames used for dying, but played in reverse.
One last thing is that for each of these action definitions, we need a separate frame of animation, but that hides a further complexity, given enemies in DOOM can be rendered from 8 different angles relative to the player's position. So when one of these sprite families is being used, it always pulls the appropriate set of sprites for this character based on their angle relative to the player. In the case of the Imp, there are 21 unique sprite indices, meaning you would need 168 individual sprites to cover all 8 angles. However, one optimisation is that some characters - such as the imp - are symmetrical. So instead of having 8 sets of sprites, there are only 5, and if a sprite set for that orientation is missing, it simply takes the mirror equivalent. Sadly some characters, such as the Cyberdemon, are not symmetrical and so they still had to build each individual sprite.
From Fabien Sanglard's 'Game Engine Black Book: DOOM'
Now that we know the overall core logic for the enemies in DOOM, next up I want to focus on the SEE state, which is when an enemy has found a target and then locks in on it, chasing them around the map and then attacking when they can.
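The mirroring trick for symmetrical characters can be sketched as a tiny lookup. The numbering here (rotation 0 facing the viewer, counting around to 7, with sets 0 to 4 stored) is an assumption made for the example, as are the names.

```c
/* Which stored sprite set to draw, and whether to flip it horizontally. */
typedef struct {
    int stored_index; /* 0..4: one of the five sets actually stored */
    int flipped;      /* 1 if the sprite should be drawn mirrored */
} SpritePick;

/* Map one of 8 viewing rotations (0..7) onto 5 stored sets.
   Rotations 5, 6 and 7 reuse sets 3, 2 and 1 as mirror images. */
SpritePick pick_rotation(int rotation)
{
    SpritePick p;
    if (rotation <= 4) {
        p.stored_index = rotation;
        p.flipped = 0;
    } else {
        p.stored_index = 8 - rotation;
        p.flipped = 1;
    }
    return p;
}

/* Total sprites needed for a character with `frames` unique indices. */
int sprites_needed(int frames, int symmetrical)
{
    return frames * (symmetrical ? 5 : 8);
}
```

For the imp's 21 sprite indices that means 105 stored sprites rather than 168.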
As mentioned already, the SPAWN state has the enemy standing around and waiting for something to trigger them. This can happen in one of two ways: either they see an enemy, or they hear them. In either case, it establishes a target for the NPC to head towards. There are some extra checks here as to whether the target is shootable, meaning they don't become skittish and hunt down anything that makes a noise. But also, there's... actually, no I'll tell you the other secret in a minute.
Once the target is established, it moves into the main SEE state from the diagram. This corresponds to a chase function in the codebase, whereby the enemy will begin the hunt and if it's within range for a melee or ranged attack, it will take a shot.
But wait, let's back up: how do the enemies see or hear you? This is actually a huge chunk of logic for DOOM given it's tied to how the map is designed. Plus, it's also one of the most interesting aspects of how the game is optimised. In previous videos on games such as The Last of Us, Splinter Cell and Alien: Isolation, we talked about how vision cones are used for enemy vision. DOOM predates all that stuff, but also it's designed for early 90s PCs and quite often there are dozens of enemies running sight checks at once. Even if all the enemies are in the SPAWN state, then each of them is calling the Look() function in the codebase 35 times a second. So not only does it need to work effectively, it also needs to be optimised.
DOOM enemies technically have 180 degrees of vision and have no long-distance vision cut off. If the view between yourself and the enemy is not obscured, they will see you. Plus, height differential doesn't matter. That's because DOOM is a 3D render projected from a two-dimensional floor plan. The line of sight is parallel with the floor, hence you'll notice if you're above or below an enemy, if it's facing you then it will always notice you even if sometimes you can't see them in return.
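A 180 degree field of view with no distance limit and no height component can be expressed as a single 2D test. The sketch below uses a dot product for brevity, which is a formulation chosen for this example and not necessarily how the engine does it; the function name and parameters are hypothetical. It only answers "is the target in the front half-plane", so a separate check for walls in the way would still be needed.

```c
/* Returns 1 if the target lies in the half-plane in front of the monster.
   Only x and y are used: height plays no part, and there is no
   maximum distance. (face_dx, face_dy) is the monster's facing direction. */
int in_front_half(int mx, int my, int face_dx, int face_dy, int tx, int ty)
{
    long dot = (long)face_dx * (tx - mx) + (long)face_dy * (ty - my);
    return dot >= 0;
}
```

A monster at the origin facing along +x would pass this test for a target at (100, 50) however far away or however high up it is, and fail it for a target at (-1, 0) directly behind.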
So how does it ensure that an enemy can see you within its field of view and also navigate towards you in a way that's cost effective? The trick here is that DOOM's map is broken up into chunks known as sectors, and those sectors are organised courtesy of a binary space partitioning algorithm or BSP. A BSP allows you to organise objects such that they retain spatial information. This is used so that the renderer always knows what parts of the game map it needs to draw, by only rendering the current sector and any connecting sectors according to the BSP. This optimisation was added by programmer John Carmack to cut down rendering costs, given it saves on performance by ensuring it doesn't try to render the entire level - an issue that modern game engines have now rectified. Now from this, some further optimisations can be made to the AI as well.
The map being broken up into sectors means that, much like the BSP, you can record information about whether it's possible for one character to see another based on their location. Visibility between sectors is actually precomputed in DOOM courtesy of what is known as the REJECT table. This tells us that if a character is in a given sector, whether they could even potentially see another character in another sector. Like in this example here, it's practically impossible for the character in sector A to see anyone in sectors C and D, but it could in theory see you in sector B. So when they're looking for the player, the game quickly does a check against the REJECT table to see whether it should even bother running the sight test.
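One compact way to store a precomputed yes/no answer for every pair of sectors is a bit matrix. The sketch below assumes one bit per (from, to) pair, packed row by row with a set bit meaning "no line of sight is possible"; treat that layout and the function names as assumptions for illustration.

```c
/* Returns 1 if the precomputed table rules out any line of sight
   from sector `from` to sector `to`, so the full test can be skipped. */
int sight_rejected(const unsigned char *reject, int num_sectors,
                   int from, int to)
{
    int pair = from * num_sectors + to;
    return (reject[pair >> 3] >> (pair & 7)) & 1;
}

/* Helper for building a table: mark a pair as never visible. */
void set_rejected(unsigned char *reject, int num_sectors, int from, int to)
{
    int pair = from * num_sectors + to;
    reject[pair >> 3] |= (unsigned char)(1u << (pair & 7));
}
```

With four sectors A to D numbered 0 to 3, marking (A, C) and (A, D) as rejected reproduces the example: a monster in A skips the sight test for anyone in C or D, but still runs it for B. The whole table for four sectors fits in two bytes.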
Once the player has been seen and a target is acquired, then the monster will start moving toward you. Technically, there isn't any actual pathfinding in DOOM. If there is a direct path towards the player, then it simply moves towards you. If there isn't, then it tries to head in that direction and will bounce off walls changing direction as it goes. In the event there's no possible direct path to you, then it will move around randomly instead. In order to manage all of the enemies (and the player) colliding with the level, all of the collision data is precomputed as well. The BLOCKMAP breaks up the map of the level into a grid that allows it to quickly check if that space is navigable.
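The grid idea behind the BLOCKMAP can be sketched as a conversion from a map position to a cell number. The cell size of 128 map units and all of the names below are assumptions for this example; the point is simply that finding the relevant cell is a couple of subtractions and divisions rather than a search over the whole level.

```c
#define BLOCK_SIZE 128 /* assumed cell size in map units */

/* Convert a map position into a grid cell index, or -1 if the
   position lies outside the grid. (origin_x, origin_y) is the
   grid's bottom-left corner; width and height are in cells. */
int block_index(int x, int y, int origin_x, int origin_y,
                int width, int height)
{
    if (x < origin_x || y < origin_y)
        return -1;
    int bx = (x - origin_x) / BLOCK_SIZE;
    int by = (y - origin_y) / BLOCK_SIZE;
    if (bx >= width || by >= height)
        return -1;
    return by * width + bx;
}
```

A collision check then only needs to look at whatever is recorded for that cell and its neighbours.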
So yeah, if an enemy sees you it's going to kick off and the situation will get a little... tense. But what about when you pop into a room, fire off the shotgun and everything roars in anger at you?
Well, here's the funny thing. Yes, the enemies in DOOM can hear you, but not in the way you'd think. I mean, one could argue that the enemies actually SEE sound. I mean it literally sets a sound target in the Look() function after all.
When the player fires their weapon, it sets the player as a 'sound target' in the sector of the map that you're in. If an enemy is in a sector where a sound target is created, it will wake up and hunt the sound target. But that would only work in one sector unless the sound could travel. DOOM has a very simple sound propagation system, whereby a flood fill algorithm spreads the sound around the map provided there are no blockers that prevent the sound from travelling further. A good example of this is in (E2M4) Deimos Lab, given the entrance opens out to a long corridor at the side. As soon as you finish fighting the imps and shotgun soldiers at the entrance, you'll soon be attacked by imps who heard all the commotion hundreds of meters away, given the gunfire propagated through the space.
Doors between level chunks will block the sound from travelling further, but also level designers could place blockers where they wanted to make sure the sound didn't travel into areas where an ambush was being prepared (more on that in a minute).
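Here is a minimal flood fill over sectors to illustrate the idea. It is a deliberate simplification: sectors are reduced to a small adjacency table, and a connection either carries sound or blocks it outright. The type and function names are hypothetical.

```c
#define MAX_SECTORS 16

enum { LINK_NONE = 0, LINK_OPEN = 1, LINK_BLOCKS_SOUND = 2 };

typedef struct {
    int           num_sectors;
    unsigned char link[MAX_SECTORS][MAX_SECTORS]; /* LINK_* per pair */
    int           heard[MAX_SECTORS];             /* 1 if sound reached it */
} SoundMap;

/* Join two sectors with either an open or a sound-blocking connection. */
void connect_sectors(SoundMap *m, int a, int b, int kind)
{
    m->link[a][b] = (unsigned char)kind;
    m->link[b][a] = (unsigned char)kind;
}

/* Recursively spread the sound through every open connection. */
static void flood(SoundMap *m, int sector)
{
    if (m->heard[sector])
        return; /* already visited */
    m->heard[sector] = 1;
    for (int other = 0; other < m->num_sectors; other++) {
        if (m->link[sector][other] == LINK_OPEN)
            flood(m, other);
    }
}

/* A gunshot in `source_sector`: work out which sectors can hear it. */
void make_noise(SoundMap *m, int source_sector)
{
    for (int i = 0; i < m->num_sectors; i++)
        m->heard[i] = 0;
    flood(m, source_sector);
}
```

With sectors 0 and 1 joined openly and a closed door between 1 and 2, a shot fired in sector 0 marks sectors 0 and 1 as having heard it, while anything waiting behind the door in sectors 2 and 3 stays oblivious.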
One of the most memorable aspects of DOOM is that the monsters have a habit of infighting. They'll tear chunks out of each other if the mood strikes them. How on earth does that work? Well, it's actually quite simple: there's logic in the code that states in the event an NPC is attacked and hurt by another character, then it might assign that character as their new target. Even if it's not the player. Naturally, this means that the demons don't attack one another without cause. So you need to lure one enemy into the line of fire of another, and hope it accidentally causes some friendly fire.
There are some exceptions to this, given Barons and Hell Knights can't hurt each other with their attacks. Pain Elementals technically can't get caught in infighting because they hurt other characters by spewing Lost Souls (which will then get targeted instead). And there's a specific edge case coded into the game to stop Archviles being attacked by other NPCs. This means you need to go and smash his skull in yourself... typical.
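The retargeting rule behind infighting, along with the Archvile exception, can be sketched in a few lines. This is a simplified illustration with hypothetical names: it always switches target when hurt by an eligible attacker, whereas the text above only says the monster might do so.

```c
#include <stddef.h>

typedef struct Actor Actor;

struct Actor {
    int    is_archvile; /* 1 for the one type nobody retaliates against */
    Actor *target;      /* who this actor is currently hunting */
};

/* Called when `victim` is hurt; `attacker` may be NULL (e.g. a hazard). */
void on_hurt(Actor *victim, Actor *attacker)
{
    if (attacker == NULL || attacker == victim)
        return; /* nothing sensible to retaliate against */
    if (attacker->is_archvile)
        return; /* special case: never turn on an Archvile */
    victim->target = attacker;
}
```

Because the new target is just whoever did the damage, the same code path handles both the player and another monster, which is why a stray fireball is enough to start a brawl.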
So we've explained the core logic, how individual state actions are defined and the optimisations for pathfinding, visibility and sound checks. But that isn't enough to make DOOM behave the way it works. The game is full of traps, of dead ends and much more that help create these rich and interesting encounters. The trick to this is two critical parts of the game's design.
First of all, the game has a variety of different custom actions that can be synced up with the map. If you've ever cracked open a DOOM map you'll have noticed a lot of lines that hide inside the map. Some of this is the actual geometry, while others are essentially trigger volumes, meaning the game can detect when the player crosses into that volume (it's a very common practice in modern game engines). One trick DOOM heavily employs is that the level editor allows for code snippets to be executed upon entering a volume, and it can also add a tag that references what parts of the map should be affected. Hence you can have a hidden door somewhere around, stick some enemies in there, all of whom are in the SPAWN state and can't hear any noise you make because the secret door prevents the audio propagation, and you're good to go.
However, each trigger can only have one function pointing to one object. So what if you need to have two things happen at once? A great example of this is the blue key trap in the Toxin Refinery (that's E1M3 btw). The trap involves walking in to grab the key, only for the lights to dim and a secret door with imps inside to open. Well, the solution is you just place two triggers right next to each other targeting different objects.
There are over 130 of these functions in the game, ranging from opening and closing doors, raising or lowering the floor, locking a door, changing the light levels, teleporting the player and more. Using this you can simply set up an environment, and once the walls move and the enemies can see the player, or hear a gunshot nearby, then we're good to go! But sometimes, it leads to some even more creative choices.
A really good example of this is in the Military Base, the secret level in the first episode of DOOM. In which the player can trigger a trap by grabbing the rocket launcher sitting atop the pentagram on the ground. When the player crosses over the pentagram, it doesn't change the room the player is in, but rather, it removes a small wall that has been protecting a teleporter in a small room locked off at the side.
The enemies in the room are already active and moving around but they need to wander into the teleporter in order for them to appear on the pentagram in front of the player. The trick is that there is a small and largely imperceptible corridor of level that connects the main room with the pentagram to this tiny room on the side. This is used so that any noise that occurs in that room will propagate through to the hidden room and encourage the monsters to head towards the noise, ultimately walking into the teleporter. This 'sound pipe' is actually a fairly common tactic and crops up on other occasions throughout the WAD files of both DOOM and DOOM II.
But that isn't the only trick that helps make it all come together. The second trick is a little special customisation that happens on a per-demon basis when the levels in DOOM are designed. There is a special Ambush flag that can be set on a given monster: MF_AMBUSH, which changes how the monster responds to being alerted. As mentioned previously, if a demon can see the player or hears a gunshot, then it treats that location as a target and moves towards it. MF_AMBUSH changes this logic so that even if a demon hears a noise that alerts it, it doesn't act on this knowledge until it receives visual confirmation. So the demon is awake and active, but it stays rooted to the spot.
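The effect of the Ambush flag on the wake-up decision can be sketched as follows. The flag name comes from the text above; the bit value, the `Monster` type and the `should_wake` function are assumptions for illustration.

```c
enum { MF_AMBUSH = 1 << 0 }; /* illustrative bit value */

typedef struct {
    unsigned flags;
} Monster;

/* Decide whether a dormant monster should start chasing this tic. */
int should_wake(const Monster *m, int heard_sound, int can_see_target)
{
    if (can_see_target)
        return 1; /* visual confirmation always works */
    if (heard_sound && !(m->flags & MF_AMBUSH))
        return 1; /* ordinary monsters also react to noise alone */
    return 0;     /* ambush monsters hold position until they see you */
}
```

A regular monster that hears a gunshot starts moving straight away, while one flagged for ambush stays put until the player walks into view.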
This means that during an encounter, some demons will still be able to catch the player off guard and - as their name implies - create an ambush. It's a simple yet highly effective trick that combined with the level tags and sound pipes, can lead to all sorts of interesting combinations!
A great example of this is Containment Area: the second mission of The Shores of Hell. While getting into gunfights often results in kiting the odd zombieman or imp toward you, many of the imps in that level are using the Ambush flag, meaning that even when you kill someone right next to them, they still wait until they achieve a line of sight before they try to rip your face off.
It's crazy to think that DOOM is now approaching 30 years old. But it's a game that continues to influence and impact the industry in a myriad of ways. And it keeps getting ported to pretty much every device imaginable.
One of the big reasons that DOOM is able to appear on everything from PCs to consoles, mobile phones, smartwatches and even ATMs is that the source code of DOOM has been publicly available for over 20 years. As we've discussed, the game is highly optimised C code, and id Software put the code online in 1997. If you're keen to learn more, check out the links to the source alongside Fabien Sanglard's Game Engine Black Book on the subject and the DOOM wiki.