Build a home, build a community, build HOPE. Hope is a first-person world- and community-building roleplaying game, set in the junkpunk world of EverSky, where people live on flying structures known as "rigs". Your role is to help build the community on a rig called "Hope", using a wide variety of tools. You will build, enhance and maintain the rig, whilst trading, crafting and socializing with the rig's inhabitants.
Working on a new language to implement the character AI for the game.
Posted by zoombapup on May 5th, 2013
While I've been working on trade and tradeship movement this past week or so, I've also started work on the design of a language I'm calling DABL (which stands for Digital Actor Behavior Language). DABL is going to be the tool I use for specifying the behavior of all the AI characters you'll spend time with in the game. The reason I'm working on DABL is to let me experiment with ways to speed up my productivity as an AI character programmer. DABL will be a "structured language", what we call in AI circles a DSL (short for domain-specific language). I'm not new to DSLs for AI, having worked on Storybricks (www.storybricks.com), which was an AI language based around MIT's Scratch language. Scratch is a block-based language, which is great for teaching as it enforces structure. Structured languages are useful in this context because they make it much harder to introduce syntax errors. Essentially you can't get the language wrong because the tools will not let you, hence the term "structured".
There are other examples of AI languages used in games. Apart from Storybricks, there's also been Edith, which was a language used in the earlier Sims games. But the language that influences me most right now is one that Valve created as part of an experiment to create interactive scenes, based on earlier work they did in Team Fortress 2 and Left 4 Dead. I'm also inspired by BML, which I'll get to in a second.
What does DABL do?
DABL is a language designed to control and coordinate expressive digital actors. It specifies the conditions under which a specific interaction can take place, it specifies the interaction itself and finally it specifies the outcome of the interaction, all this while taking into account emotion and expression. It incorporates videogame specific features like proximity, actor animation, posture, gesture and resource availability.
It does this by making things like proximity constraints (how close an actor has to be) a first-class feature of the language. In essence it describes what you might typically see in a cut-scene, but it does it on-the-fly and in an opportunistic manner. The goal is for a designer to be able to use DABL to describe a large number of small "scenes" that can then be played out when the correct circumstances arise.
The key is that scenes can string together to form quite complex chains of interactions, all the time having a simple way to specify expressive behavior that reacts according to the feelings of the actors involved. I'm still working on the GRAMMAR for the language itself, but I see there being a few key factors that the language requires:
Almost all AI in videogames boils down to "if (condition) then (action)" clauses. The conditions for any given behavior can be quite complex. Things like proximity (how close am I to something?), facing (am I facing towards it?) and ownership (is this thing mine?) are relatively simple. But imagine you have to consider many thousands of possible conditions at any point in time for any agent. Things can slow down pretty quickly. There's an interesting feature in CryEngine that Matthew Jack worked on, which was a way to query the world and then apply a "query language" to the results in order to filter out any irrelevant potential choices very quickly. This idea of using query languages to filter results is one of the cornerstones of working with large sets of data and underpins most relational databases, for instance.
One of the interesting aspects of looking at that space is that query optimization happens a lot in these languages. I suspect that I'll be spending a lot of time over the next few months working on query optimization for conditional checks.
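As a rough illustration of the filtering idea (this is not DABL, and not the CryEngine system), here is a minimal Python sketch. The `Actor` and `Thing` types, the 45 degree facing cone and the ordering of the checks are all assumptions made for the example. The point is simply that each condition narrows the candidate list before the next one runs.

```python
import math
from dataclasses import dataclass

@dataclass
class Thing:
    name: str
    x: float
    y: float
    owner: str

@dataclass
class Actor:
    name: str
    x: float
    y: float
    facing_deg: float  # heading in degrees, 0 means looking along +x

def is_owned_by(actor, thing):
    # Ownership: is this thing mine?
    return thing.owner == actor.name

def is_near(actor, thing, max_dist):
    # Proximity: how close am I to something?
    return math.hypot(thing.x - actor.x, thing.y - actor.y) <= max_dist

def is_facing(actor, thing, half_angle_deg=45.0):
    # Facing: is the thing inside a cone around my heading?
    bearing = math.degrees(math.atan2(thing.y - actor.y, thing.x - actor.x))
    diff = (bearing - actor.facing_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= half_angle_deg

def query(actor, things, max_dist):
    # Assumed ordering: run the cheapest check first so the more
    # expensive checks only see the survivors.
    candidates = [t for t in things if is_owned_by(actor, t)]
    candidates = [t for t in candidates if is_near(actor, t, max_dist)]
    return [t for t in candidates if is_facing(actor, t)]

ada = Actor("ada", 0.0, 0.0, 0.0)
world = [
    Thing("wrench", 1.0, 0.0, "ada"),   # mine, near, in front
    Thing("crate", 1.0, 0.0, "bob"),    # not mine
    Thing("lamp", -1.0, 0.0, "ada"),    # mine, near, but behind me
    Thing("anvil", 50.0, 0.0, "ada"),   # mine, in front, but too far
]
usable = query(ada, world, max_dist=5.0)
```

A real query optimizer would choose the ordering from measured cost and selectivity rather than hard-coding it as this sketch does.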
Resources are essentially things that can be manipulated; they effectively represent the world state data. They also have the notion of ownership associated with them. In the simplest case, a resource could be thought of like a variable. That way conditions can be used to check the availability of a resource, with the subsequent behavior performed if that condition is met. I'm thinking of resources more as a set of "things", with the set being zero or more things, such that we can do operations on the set. For instance a resource set called "Everyone" could be split into "Friendly" and "Enemy" subsets. Each resource has a bunch of properties which can be queried and used in conditions. In essence I think of the resource as a typeless data store, which I guess is in line with other typeless languages. It might be that having typeless data is a bad choice in the final language, but for my first iteration I think I'll err more on the side of designer expressiveness over structure.
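To make the "set of things" idea concrete, here is a small Python sketch, assuming each resource is just a plain dictionary of properties (the typeless data store described above). The `ResourceSet` class, the `attitude` property and the character names are hypothetical and exist only for this example.

```python
class ResourceSet:
    """A named set of zero or more typeless resources (plain dicts)."""

    def __init__(self, name, items=()):
        self.name = name
        self.items = list(items)

    def where(self, name, predicate):
        # Derive a named subset from a property query.
        return ResourceSet(name, [r for r in self.items if predicate(r)])

    def is_available(self):
        # A condition can check availability before a behavior runs.
        return len(self.items) > 0

    def names(self):
        return [r.get("name") for r in self.items]

everyone = ResourceSet("Everyone", [
    {"name": "Mara", "attitude": "friendly"},
    {"name": "Dex", "attitude": "hostile"},
    {"name": "Ilo", "attitude": "friendly"},
])

friendly = everyone.where("Friendly", lambda r: r.get("attitude") == "friendly")
enemy = everyone.where("Enemy", lambda r: r.get("attitude") == "hostile")
nobody = everyone.where("Neutral", lambda r: r.get("attitude") == "neutral")
```

Because properties are looked up with `get`, a resource that lacks a property simply fails the condition instead of raising an error, which is one way the designer-friendly, typeless approach might behave.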
Actions are the atomic operations that most AI engines deal with. Things like moving actors to a given position, transferring ownership of resources etc. These are relatively easy to code, but I'm extending the actions using "expressions".
Now all the above are pretty typical in any videogame AI setup. They are the building blocks of the "sense-think-act" cycle. But we need more in the language if it is going to be able to generate expressive reactions for our "actors".
But before I get to that, here's a video I made from a project I started during Ludum Dare 26. I wanted to evaluate Unity in a more useful way, so I decided to do a little bit of tinkering during Ludum Dare.
So back to AI, well, the next bit is one of the more interesting aspects. That of turning "behavior" into "expression".
Goals are expected outcomes, they define what we expect to see happen during a given interaction. Imagine if you were buying something and you handed over some money only to have the vendor short change you. The goal in this case would be to receive the correct change. The interaction between goal and the next category is what makes this system really work.
Appraisal isn't something you'll see mentioned in a lot of videogame AI, but it is really important for expressive and emotional characters. Appraisal is the thing you do when you compare what actually happened with what you thought would happen. It is important because it allows us to react in different ways depending on how we perceive the outcomes. So in the short change case, we might express puzzlement if we appraise the dissatisfaction of the goal "receive correct change" as an error on the vendor's part. We might also incorporate a "trust" appraisal value on the part of the vendor and change our expression to indignation if we thought the short changing was done on purpose. Without goals and the appraisals of them versus the reality of the situation, we could never correctly express ourselves.
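The short change example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Goal` type, a trust value between 0 and 1, and the 0.5 threshold are all invented for the example and are not part of DABL.

```python
from dataclasses import dataclass

@dataclass
class Goal:
    name: str
    expected: float  # what we expect to get back, e.g. the correct change

def appraise(goal, actual, trust):
    """Compare the expected outcome with what happened and pick an expression.

    trust is an assumed 0..1 value describing how much we trust the vendor.
    """
    if actual >= goal.expected:
        return "satisfaction"
    # Goal not met: trust decides whether we read it as a mistake or as intent.
    if trust >= 0.5:
        return "puzzlement"
    return "indignation"

correct_change = Goal("receive correct change", expected=3.0)

honest_mistake = appraise(correct_change, actual=2.0, trust=0.9)
deliberate = appraise(correct_change, actual=2.0, trust=0.1)
all_good = appraise(correct_change, actual=3.0, trust=0.1)
```

The same unmet goal produces two different expressions, and the only thing that changed is the appraisal of the vendor.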
The key to all this is being able to express something in the first place. I mentioned in another feature about using Ari Shapiro's SmartBody system to allow us to drive characters that can do things like look at objects, grasp objects, shift posture, generate gestures etc. Here I'm incorporating elements of BML (behavior markup language) as a subset of expressions. BML is a really powerful low level system (conceptually; in reality it's simply a bunch of XML based tags), but I think that videogames need a higher level construct that allows a number of BML and other commands to be expressed as a whole. So instead of specifying each individual element of an expression, we can collapse that down to a named expression like "surprise". You can think of it like having a function call in any other language. The function can be an arbitrarily complex sequence of computation, so the expression can be a complex set of smaller expressions, even down to individual eyebrow raises.
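The "named expression as a function call" idea might look something like the following Python sketch. The primitive names here are placeholders made up for the example. They are not real BML tags, and nothing here talks to SmartBody.

```python
# Hypothetical low-level commands; stand-ins for individual BML-style elements.
PRIMITIVES = {"raise_eyebrows", "widen_eyes", "open_mouth", "lean_back", "step_back"}

# Named expressions are built from primitives or from other named expressions.
EXPRESSIONS = {
    "startle_face": ["raise_eyebrows", "widen_eyes", "open_mouth"],
    "recoil": ["lean_back", "step_back"],
    "surprise": ["startle_face", "recoil"],
}

def expand(name, library=EXPRESSIONS, _stack=()):
    """Flatten a named expression into its ordered list of primitives."""
    if name in PRIMITIVES:
        return [name]
    if name in _stack:
        raise ValueError(f"cyclic expression: {name}")
    if name not in library:
        raise KeyError(f"unknown expression: {name}")
    commands = []
    for part in library[name]:
        commands.extend(expand(part, library, _stack + (name,)))
    return commands

surprise_commands = expand("surprise")
```

A designer only ever writes "surprise"; the expansion down to individual eyebrow raises happens behind the name, much as a function body hides behind a call.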
Semantics in DABL are a useful way of expressing shorthand for types of objects. They allow "things" to be classified as semantically related. So for instance if a character needs food, we might search for objects nearby and characterize each one by its relationship to food. So if we've specified two objects APPLE and ROCK and given the APPLE definition a semantic relationship to FOOD, then we can react accordingly to the APPLE when appraising the objects. This semantic relationship is important because it lets you write high-level expressions against "types" of objects, such as all FOOD objects. This is useful because you can shorthand a lot of interactions if you can filter objects into semantic subsets.
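Here is a minimal Python sketch of the APPLE and ROCK example, assuming semantics are stored as a simple mapping from object name to a set of tags. The `BREAD` entry and the `eat` / `ignore` reactions are invented for illustration.

```python
# Assumed representation: object name -> set of semantic tags.
SEMANTICS = {
    "APPLE": {"FOOD"},
    "BREAD": {"FOOD"},  # extra illustrative object, not from the text
    "ROCK": set(),
}

def tagged(tag, objects, semantics=SEMANTICS):
    """Return the objects that carry the given semantic tag, keeping order."""
    return [obj for obj in objects if tag in semantics.get(obj, set())]

def react_to(obj, need, semantics=SEMANTICS):
    """Pick a reaction for a character with a need appraising one object."""
    return "eat" if need in semantics.get(obj, set()) else "ignore"

nearby = ["ROCK", "APPLE", "BREAD"]
food_nearby = tagged("FOOD", nearby)
```

The behavior is written once against FOOD, and any object later tagged as FOOD picks it up without new rules.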
One of the key features I envision for speeding up behavior description (which is what as an AI designer I want to be doing) is the idea of relationships. I think that being able to have inheritance-based relationships and semantics-based relationships will make the language far more efficient. A use case here would be the typical scenario from ROMEO and JULIET. In this case, the group "MONTAGUE" has a HATES relationship with the group "CAPULET", so we can have a bunch of reactions defined at the MONTAGUE level. We can then create a new "class" of object called ROMEO that EXTENDS the MONTAGUE class, gaining all its default behavior, and then add a "LOVES" relationship to the class "JULIET" with its own set of behaviors. I have a vision of how this will work, with coloured sets of objects/characters and overlapping colours in a sort of layout graph format. But I might try a basic language version first.
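One possible reading of the ROMEO and JULIET case, sketched in Python. The `Group` class and the "most specific relationship wins" lookup rule are assumptions, as is making JULIET and TYBALT extend CAPULET; the text only specifies MONTAGUE, CAPULET, ROMEO and the two relationships.

```python
class Group:
    """A class of actor that can extend a parent and hold relationships."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.relations = {}  # target group name -> relationship label

    def relate(self, label, target):
        self.relations[target.name] = label

    def lineage(self):
        # Yield this group, then each ancestor, most specific first.
        node = self
        while node is not None:
            yield node
            node = node.parent

    def relation_to(self, other):
        # Assumed rule: the most specific relationship found wins.
        for mine in self.lineage():
            for theirs in other.lineage():
                label = mine.relations.get(theirs.name)
                if label is not None:
                    return label
        return None

montague = Group("MONTAGUE")
capulet = Group("CAPULET")
montague.relate("HATES", capulet)

romeo = Group("ROMEO", parent=montague)   # ROMEO EXTENDS MONTAGUE
juliet = Group("JULIET", parent=capulet)  # assumption for the example
tybalt = Group("TYBALT", parent=capulet)  # assumption for the example
romeo.relate("LOVES", juliet)
```

ROMEO inherits the default HATES reaction towards any CAPULET, while the more specific LOVES relationship overrides it for JULIET alone.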
How does it all work?
One thing I learned when working on Storybricks is that block-based languages are good for beginners, in that they allow for lots of help in constructing the correct syntax, but they are actually too slow for experienced programmers or designers. So I'm working on the notion that an intellisense-style autocompletion is actually a far better fit for a language intended for experienced programmer/designer types. This intellisense approach makes a lot of sense because it still constrains syntax (by offering only allowed keywords/symbols) whilst not breaking the flow of thought. I'd tinkered with the idea of using another block-based language or a flow graph, but in the end I think an example implementation will come together a lot faster given the availability of auto-completion frameworks. I'm also swayed by the experiences of the Valve guys when they say that designers were ok with a text-based "language".
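To show how autocompletion can constrain syntax, here is a toy Python sketch. The keywords and the transition table are entirely hypothetical and are not DABL syntax; the idea is just that the editor only ever offers tokens that are legal after the previous one.

```python
# Hypothetical "what may come next" table for a made-up rule syntax.
NEXT_TOKENS = {
    "<start>": ["scene"],
    "scene": ["<name>"],
    "<name>": ["when"],
    "when": ["near", "facing", "owns"],
    "near": ["and", "do"],
    "facing": ["and", "do"],
    "owns": ["and", "do"],
    "and": ["near", "facing", "owns"],
    "do": ["<expression>"],
}

def suggest(previous, prefix=""):
    """Return the allowed next tokens that match what has been typed so far."""
    return [tok for tok in NEXT_TOKENS.get(previous, []) if tok.startswith(prefix)]

after_when = suggest("when")
after_when_f = suggest("when", "f")
```

Typing stays fluid because the designer keeps writing text, but anything outside the suggestion list can be flagged immediately.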
So what next?
I'm still defining the grammar for the various language parts. This is going to be a BNF grammar, from which I should then be able to create a suitable parser/abstract syntax tree and generate the data for the auto-completion to work on. I'm likely going to start doing this in C# so that I can prototype it quickly, plus I might port it all to use in Unity at some point. Either way, I'll release an experimental version before too long.
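For a sense of the grammar to parser to syntax tree pipeline, here is a small recursive descent sketch. It is written in Python rather than C# purely for brevity, and the two-rule grammar in the comment is a made-up stand-in, since the real DABL grammar is not defined yet.

```python
import re

# Hypothetical grammar (NOT actual DABL syntax):
#   rule      ::= "when" condition { "and" condition } "do" IDENT
#   condition ::= IDENT "(" IDENT ")"
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|[()])")

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.i = 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected or 'a token'}, got {tok!r}")
        self.i += 1
        return tok

    def ident(self):
        tok = self.take()
        if not (tok[0].isalpha() or tok[0] == "_"):
            raise SyntaxError(f"expected an identifier, got {tok!r}")
        return tok

    def condition(self):
        name = self.ident()
        self.take("(")
        arg = self.ident()
        self.take(")")
        return {"cond": name, "arg": arg}

    def rule(self):
        self.take("when")
        conditions = [self.condition()]
        while self.peek() == "and":
            self.take("and")
            conditions.append(self.condition())
        self.take("do")
        action = self.ident()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected trailing token {self.peek()!r}")
        return {"when": conditions, "do": action}

def parse(text):
    """Parse one rule and return its abstract syntax tree as nested dicts."""
    return Parser(tokenize(text)).rule()

tree = parse("when near(vendor) and owns(coin) do greet")
```

The same grammar that drives the parser could also feed the auto-completion, since at each point the parser knows which tokens it would accept next.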