An Introduction to AI in Games from Phil Carlisle
January 24th, 2010
This is a guest post from Phil Carlisle, best known for his work on the Worms series. You might remember his post from a couple of weeks ago, here. His last post was very well received, and he agreed to write a follow up! Read on below.
Last time, I talked about a number of potential behavioural artifacts that I would like to see in Overgrowth. While discussing it on IRC, Jeff asked me to write a post about how the things I talked about might actually be created. Here is the result.
The basics
Every AI has a set of basic requirements. In the main, these can be broken down into a system for movement (often called pathfinding and/or locomotion) and a system for controlling agent logic. In general, the AAA part of the games industry has settled on a technique called navigation meshes, which offer a simplified version of the 3D world over which an algorithm called A* search (or some variation thereof) can be run to determine potential movement space. A really nice explanation of WHY we've settled on navigation meshes can be seen in this article on Game/AI.
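To make that concrete, here is a minimal sketch of A* running over a graph of navigation-mesh polygon centres. Everything here (the Vec2 type, the adjacency lists, the costs) is invented for illustration; a real navigation query, in Detour for example, is considerably more involved:

```cpp
// Minimal A* over a graph of navigation-mesh polygon centers.
// The data structures are placeholders, not any real middleware API.
#include <cmath>
#include <cstdio>
#include <functional>
#include <queue>
#include <vector>

struct Vec2 { float x, y; };

float Distance(const Vec2& a, const Vec2& b) {
    return std::sqrt((a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y));
}

// Returns the polygon indices along the cheapest path, or empty if unreachable.
std::vector<int> AStar(const std::vector<Vec2>& centers,
                       const std::vector<std::vector<int>>& neighbors,
                       int start, int goal) {
    const float INF = 1e30f;
    std::vector<float> gCost(centers.size(), INF);
    std::vector<int> parent(centers.size(), -1);
    // Open list ordered by f = g + h, where h is straight-line distance.
    using Entry = std::pair<float, int>;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> open;
    gCost[start] = 0.0f;
    open.push({Distance(centers[start], centers[goal]), start});
    while (!open.empty()) {
        int node = open.top().second;
        open.pop();
        if (node == goal) break;
        for (int next : neighbors[node]) {
            float g = gCost[node] + Distance(centers[node], centers[next]);
            if (g < gCost[next]) {  // found a cheaper way to reach 'next'
                gCost[next] = g;
                parent[next] = node;
                open.push({g + Distance(centers[next], centers[goal]), next});
            }
        }
    }
    std::vector<int> path;
    for (int n = goal; n != -1; n = parent[n]) path.insert(path.begin(), n);
    return (path.empty() || path.front() != start) ? std::vector<int>{} : path;
}

int main() {
    // Four polygons in a row: 0 - 1 - 2 - 3.
    std::vector<Vec2> centers = {{0, 0}, {1, 0}, {2, 0}, {3, 0}};
    std::vector<std::vector<int>> neighbors = {{1}, {0, 2}, {1, 3}, {2}};
    for (int n : AStar(centers, neighbors, 0, 3)) std::printf("%d ", n);
    std::printf("\n");  // prints: 0 1 2 3
}
```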
This is often supplemented with some form of local dynamic obstacle avoidance, the most prominent being "steering behaviors" based on the work of Craig Reynolds, although a technique called velocity obstacles is gaining interest, especially for crowd movement.
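The simplest of Reynolds' behaviors is "seek": steer by the difference between the velocity you want (full speed toward the target) and the velocity you currently have. Here's a toy version; the Agent struct, the constants and the deliberately crude integration are all just for illustration:

```cpp
// A minimal "seek" steering behavior in the spirit of Craig Reynolds' work.
#include <cmath>
#include <cstdio>

struct Vec2 {
    float x, y;
    Vec2 operator+(Vec2 o) const { return {x + o.x, y + o.y}; }
    Vec2 operator-(Vec2 o) const { return {x - o.x, y - o.y}; }
    Vec2 operator*(float s) const { return {x * s, y * s}; }
};

Vec2 Normalized(Vec2 v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y);
    return len > 0.0001f ? Vec2{v.x / len, v.y / len} : Vec2{0, 0};
}

struct Agent {
    Vec2 position, velocity;
    float maxSpeed;
};

// Steering = desired velocity (full speed toward target) minus current velocity.
Vec2 Seek(const Agent& agent, Vec2 target) {
    Vec2 desired = Normalized(target - agent.position) * agent.maxSpeed;
    return desired - agent.velocity;
}

int main() {
    Agent a{{0, 0}, {0, 0}, 2.0f};
    Vec2 target{10, 0};
    for (int step = 0; step < 5; ++step) {
        const float dt = 0.1f;
        a.velocity = a.velocity + Seek(a, target) * dt;  // crude integration
        a.position = a.position + a.velocity * dt;
        std::printf("t=%.1f pos=(%.2f, %.2f)\n", step * dt, a.position.x, a.position.y);
    }
}
```

Flee, arrival, flocking and obstacle avoidance are all variations on this same "desired velocity minus current velocity" idea, blended together with different weights.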
Luckily, these basics are actually the easiest to integrate, thanks to the work of Mikko Mononen, an ex-Crytek AI programmer who has recently released libraries called Recast and Detour. These provide functionality present in much AI middleware, but as "free for commercial use" open source.
With the pathfinding basics provided by Mikko's libraries, the next part to cover is controlling the AI logic. There are three typical approaches currently used in games. Finite state machines are still used within the industry, though these are now mostly hierarchical (HFSMs) and have been superseded in many cases by the next two approaches. The first is the use of planners, pioneered by the "STRIPS-like" planner Jeff Orkin implemented for the Monolith title F.E.A.R. and recently used in Demigod by Gas Powered Games. The second approach to emerge is more reactive: behavior trees. BTs have been used in many recent titles, the Halo franchise being the prime proponent of the approach (see Damian Isla's talks at GDC for more info). A great discussion of BTs can be found on aigamedev.com.
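To give a flavour of the behavior tree approach, here is a bare-bones sketch: a selector tries its children in order until one succeeds, and a sequence runs its children until one fails. This shows the general pattern only, not Halo's or anyone else's shipping implementation (real BTs also need a "running" status, decorators, and so on):

```cpp
// A bare-bones behavior tree: selector = "try until one works",
// sequence = "do all, stop on failure".
#include <cstdio>
#include <functional>
#include <memory>
#include <vector>

enum class Status { Success, Failure };

struct Node {
    virtual ~Node() = default;
    virtual Status Tick() = 0;
};

// Leaf node wrapping an arbitrary action or condition.
struct Leaf : Node {
    std::function<Status()> fn;
    explicit Leaf(std::function<Status()> f) : fn(std::move(f)) {}
    Status Tick() override { return fn(); }
};

struct Selector : Node {
    std::vector<std::unique_ptr<Node>> children;
    Status Tick() override {
        for (auto& c : children)
            if (c->Tick() == Status::Success) return Status::Success;
        return Status::Failure;
    }
};

struct Sequence : Node {
    std::vector<std::unique_ptr<Node>> children;
    Status Tick() override {
        for (auto& c : children)
            if (c->Tick() == Status::Failure) return Status::Failure;
        return Status::Success;
    }
};

int main() {
    bool enemyVisible = false;
    Selector root;
    auto attack = std::make_unique<Sequence>();
    attack->children.push_back(std::make_unique<Leaf>([&] {
        return enemyVisible ? Status::Success : Status::Failure;  // condition
    }));
    attack->children.push_back(std::make_unique<Leaf>([] {
        std::puts("Attack!"); return Status::Success;             // action
    }));
    root.children.push_back(std::move(attack));
    root.children.push_back(std::make_unique<Leaf>([] {
        std::puts("Patrol."); return Status::Success;             // fallback
    }));
    root.Tick();  // prints "Patrol." because no enemy is visible
}
```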
Often, the BT approach is enhanced with the use of a "blackboard", which is essentially a repository of data about what the agent knows and/or feels. This is closely linked to a sensory system that provides the agent with knowledge of its environment, typically via spatial queries (for example, "how many agents are within this range?") or raycast queries (for example, "if I follow this line, can I actually see anyone?"). A lot of processing time is involved in this sensory work, and AI programmers tend to spend a lot of effort ensuring that this system is well optimised.
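As a tiny example of the sensory/blackboard idea, here is a spatial query feeding a blackboard. The field names and the threshold are made up; the squared-distance comparison is the kind of small optimisation these systems are full of:

```cpp
// A spatial query ("how many agents are within this range?") writing
// its result into a blackboard for the logic layer to read.
#include <cstdio>
#include <vector>

struct Vec2 { float x, y; };

struct Blackboard {
    int nearbyAgentCount = 0;
    bool feelsThreatened = false;
};

int CountAgentsInRange(const Vec2& self, const std::vector<Vec2>& others, float range) {
    int count = 0;
    for (const Vec2& p : others) {
        float dx = p.x - self.x, dy = p.y - self.y;
        // Compare squared distances to avoid a sqrt per agent.
        if (dx * dx + dy * dy <= range * range) ++count;
    }
    return count;
}

int main() {
    Vec2 self{0, 0};
    std::vector<Vec2> others = {{1, 0}, {4, 0}, {10, 10}};
    Blackboard bb;
    bb.nearbyAgentCount = CountAgentsInRange(self, others, 5.0f);
    bb.feelsThreatened = bb.nearbyAgentCount >= 2;  // invented rule of thumb
    std::printf("nearby=%d threatened=%d\n", bb.nearbyAgentCount, (int)bb.feelsThreatened);
}
```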
So for the basics, you have the tools for movement and navigation in space, and logic to control action selection and monitoring. With those you could make a game that works, but it definitely wouldn't be a very exciting game. So the next system you need to build is some method of animating the character. In modern games you'll see characters that can vault over small obstacles, jump up to ledges, jump across ravines, and so on. To do this, you need to add an animation system to the basics of movement and logic. The animation is what brings the character to life. Animation typically takes the form of a set of rotational movements of a set of "bones", each of which controls a particular part of the body. You can see this in action with Overgrowth's animation editor. Each bone stores a rotational value, and this is "keyframed" into a number that changes over time. There is another type of animation worth noting, popularised by the Terminator films, specifically the one where the second terminator, made of liquid metal, changed shape. This is known as "morph" animation. I mention it here because morph animation can still be useful for facial animation, which we'll get to shortly.
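Here is keyframed bone animation in miniature: a rotation stored at a few key times and linearly interpolated in between. Real engines use quaternions and spline curves rather than a single angle, so treat this purely as a sketch of the idea:

```cpp
// A bone's rotation stored at a few key times, sampled by linear interpolation.
#include <cstdio>
#include <vector>

struct Keyframe { float time; float angle; };  // angle in degrees

float SampleRotation(const std::vector<Keyframe>& keys, float t) {
    if (t <= keys.front().time) return keys.front().angle;
    if (t >= keys.back().time) return keys.back().angle;
    for (size_t i = 1; i < keys.size(); ++i) {
        if (t <= keys[i].time) {
            // Blend factor between the two surrounding keyframes.
            float u = (t - keys[i - 1].time) / (keys[i].time - keys[i - 1].time);
            return keys[i - 1].angle + u * (keys[i].angle - keys[i - 1].angle);
        }
    }
    return keys.back().angle;
}

int main() {
    // An elbow bone swinging from 0 to 90 degrees and back over one second.
    std::vector<Keyframe> elbow = {{0.0f, 0.0f}, {0.5f, 90.0f}, {1.0f, 0.0f}};
    for (float t = 0.0f; t <= 1.0f; t += 0.25f)
        std::printf("t=%.2f angle=%.1f\n", t, SampleRotation(elbow, t));
}
```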
In order to drive bone-based animation, many AAA developers use what is called motion capture to generate data for the rotations of the bones. You can see a really fine example of this kind of technology here: avatar motion capture video. The advantage of motion capture is that it genuinely captures real human performance (i.e. it's not done by hand by an animator). The downside is that it is incredibly expensive, and the data takes up a lot of memory and disk space. An alternative to motion capture data, and one being shown in the rabbot character in Overgrowth, is the use of mathematical formulas for movement. This is known as procedural animation and has been around since the early days of computer animation. People like Ken Perlin have pioneered many interesting aspects of procedural animation.
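And here is a tiny taste of procedural animation: rather than sampling stored keyframes, an idle "breathing" motion is generated from a sine function every frame. The bone, rate and amplitude are all made up for the example:

```cpp
// Procedural animation in one line of maths: a breathing cycle from a sine wave.
#include <cmath>
#include <cstdio>

// Returns the chest bone's offset for an idle "breathing" cycle.
float BreathingOffset(float timeSeconds, float breathsPerSecond, float amplitude) {
    const float TWO_PI = 6.28318530718f;
    return amplitude * std::sin(TWO_PI * breathsPerSecond * timeSeconds);
}

int main() {
    for (float t = 0.0f; t < 2.0f; t += 0.5f)
        std::printf("t=%.1f chest offset=%+.3f\n",
                    t, BreathingOffset(t, 0.25f, 0.05f));
}
```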
These days, it is not unusual to see elements of both types of animation used together. I suspect that more procedural and physics-based animation will pop up in games as the limitations of capturing and storing motion capture data hit developers (the sheer volume of data needed can be staggering, and if you are a console developer it can be a real problem).
The more advanced elements
So, given the basics of movement and action selection via logic, plus the expression available via animation, we get onto the really new and fun stuff.
Emotion
When you watch a great piece of film, you experience it as a whole: you feel its performance and empathize with its characters. But what makes you feel that way? This is the big question that many game developers are struggling with right now. How can we create characters that are better than before? By actually allowing them to HAVE character!
When we interact with other people, we spend a huge amount of our time using our bodies to send non-verbal signals to whoever we are communicating with. We might nod to let them know that we have understood a point they have made, shake our head to disagree, or slump our shoulders and sigh to signify our exasperation. Similarly, when we move, we literally change the very strides we make as we show signs of fatigue, or we might move in a more confident manner after we have achieved a personal goal. This is very much the modern AI battleground. As AI programmers, we want to create experiences that really immerse the player, engage them with the narrative and characters in the world, or just challenge them in a natural way through other characters. This is almost entirely achieved through changes in posture, gesture, proximity, haptics (touch), gaze, and so on. If you look at most modern AAA games, you will most likely see that they have started to tackle that last one first. Most games these days incorporate gaze (the direction of focus of the eyes) by procedurally animating the direction of the head to look at important things.
Gaze is only a tiny part of our human communication repertoire, yet it serves to convey an awful lot of vital information to other humans. With control of gaze, we can make characters seem bashful, flighty, thoughtful, ignorant, vacant, interested, seductive, and so on. Adding control over eye motion, eyelids and eyebrows, as well as the overall direction of the head, can really bring a character to life. But gaze is just the beginning. Now we are starting to look at the other aspects of human motion and incorporate them into our characters via the animation system. Luckily, there are plenty of psychologists, anthropologists, animators and social scientists who have been investigating these things for decades. People like Adam Kendon, Paul Ekman, Michael Argyle and a host of others have built up a solid body of academic work that we can use to inform improvements in these external expressions of emotion.
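A sketch of how procedural gaze might work, reduced to one dimension: turn the head toward a target of interest, but no faster than a fixed turn rate, and clamp to a maximum neck angle so the character never swivels unnaturally. All the constants are illustrative:

```cpp
// Rate-limited, range-clamped head turn toward a gaze target.
#include <algorithm>
#include <cstdio>

// One update step: returns the new head yaw (radians, relative to the body).
float UpdateGaze(float headYaw, float targetYaw, float maxTurnPerStep, float maxNeckYaw) {
    float delta = targetYaw - headYaw;
    delta = std::clamp(delta, -maxTurnPerStep, maxTurnPerStep);  // limit speed
    return std::clamp(headYaw + delta, -maxNeckYaw, maxNeckYaw); // limit range
}

int main() {
    float headYaw = 0.0f;
    const float target = 1.2f;  // something interesting off to the side
    for (int step = 0; step < 6; ++step) {
        headYaw = UpdateGaze(headYaw, target, 0.3f, 1.0f);
        std::printf("step %d: head yaw %.2f rad\n", step, headYaw);
    }
}
```

Note that the head stops at the 1.0-radian neck limit rather than reaching the target; in a full system the body or eyes would take up the remaining rotation.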
And that brings me to my final point: the actual emotions themselves. This might feel counter-intuitive, because most of us think that artificial intelligence is all about logic. But in real humans, there is not just rational (i.e. logical) thought but also emotion. We have evolved over hundreds of thousands of years to use emotion as a regulatory system that feeds responses from the body back to the brain in order to control its processing. What this means is that we should really think of the brain less as an individual organ and more as part of the body. The whole concept of a human existing as a brain in a jar is wrong, because without the body we simply would not be able to feel anything. Literally, we would no longer be ourselves, because our whole "self" is our body and our brain together. This is important, because a lot of the changes we see externally (shifts in posture, gesture, gaze, etc.) are physical manifestations of the emotional parts of the brain, or are greatly colored by them. I highly recommend reading "Descartes' Error" by Antonio Damasio (a professor of neuroscience) to get a feel for how important emotions are to our daily lives. The main academic work on emotions for computing is by Ortony, Clore and Collins, who offer what is called the OCC model. Most academic work enhances this model with a personality model known as the OCEAN model, or "big five".
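To show roughly how such a model plugs into the systems above, here is a toy appraisal loop, only loosely inspired by OCC-style models: events raise emotion intensities, the intensities decay over time, and the current state could drive posture, gaze or action selection elsewhere. The emotion set, decay rate and event values are all invented:

```cpp
// Toy emotion state: appraised events raise intensities, which decay
// back toward neutral each update.
#include <cstdio>

struct EmotionState {
    float fear = 0.0f;
    float joy = 0.0f;
};

// Decay emotions back toward a neutral baseline each update.
void Decay(EmotionState& e, float dt, float rate) {
    e.fear -= e.fear * rate * dt;
    e.joy -= e.joy * rate * dt;
}

int main() {
    EmotionState mood;
    mood.fear = 1.0f;  // appraisal of a threatening event
    for (int step = 0; step < 5; ++step) {
        Decay(mood, 1.0f, 0.5f);
        std::printf("step %d: fear=%.2f\n", step, mood.fear);
    }
}
```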
If you're interested in this aspect of AI, I recommend checking out my article in Game Programming Gems 8, which should be out at GDC this year (or catch up with me at GDC).
Conclusion
So right now, there are a lot of people looking at this whole notion of emotion and expression, because it offers us a chance to create dramatically interesting human-like characters. Luckily, we've learnt that they need not be humans, or even be "realistic", to be human-like. Disney and others have taught us that these representations can be quite abstract. Personally, I find it fascinating to work in this field, because it offers me the chance to create virtual "actors", and eventually I think we'll figure out an interface that lets anyone create really compelling AI character performances, which will allow us to create new types of games.
Are there any other aspects of AI you'd like to hear about from me?