NB: This is a version of an old article I wrote on conversation in interactive fiction. While much of it is still applicable, I have written numerous other posts on the topic since, which expand on the overview provided here and suggest some more particulars about implementation.
Conversation is one of the most challenging things to code in interactive fiction, and also one of the most widely discussed. There are a number of issues: how will the player communicate to the game what he wants to say to the NPC? How will information be represented internally? How will mood and context be represented within the work? To what extent will the NPC control the flow of discussion?
There is no single right answer to these questions. Conversation design depends very much on the author’s intentions for a work.
Questions to Start With
The Purpose of the Non-Player Character
Before you code anything, you should consider what kind of game you are writing, and what purpose you have for the non-player characters (hereafter “NPC”) who will appear in it. A game with a strong emphasis on puzzles, where NPCs are present only to provide another kind of challenge, will have a very different treatment from a linear, story-oriented game where NPC interaction is the chief purpose of the game. A mystery with a lot of knowledge puzzles will again have a different set of requirements from a romance, where emotional interaction is emphasized.
Even within a game, different NPCs can have different roles from a game-play perspective. One of the best discussions of this problem I’ve seen is Jim Fisher’s article on Ask/Tell theory: while it claims to be about a specific kind of conversation system, it addresses some important points about how NPCs can be used in any IF game. Do you want your player to talk extensively with NPCs and have a great deal of flexibility in the outcome? Or would you prefer to have control over how each conversation goes? Are the NPCs there mainly for local color, or do they provide vital exposition? Do they have to accomplish anything in the story?
The answers to these questions will affect the rest of your NPC design. In a game with a highly predetermined plot — that is, one in which you could sit down and write a list of all the scenes that must occur — a simulationist approach to NPC design would probably be the wrong approach. That is, you will probably have less use for variables to track emotion and behavior in general, and you will accomplish more of your effects by scripting them specifically. By contrast, in a game where the play is very broad, and the player can spend as much or little time with the NPCs as he would like, you may find yourself writing a fair amount of generic code to cover issues like NPC attitude, behavior, and goal-seeking.
Input and the Parser
Conversation is probably the most difficult thing to code for an NPC. There are several problems, but one of the most pressing is that we can’t yet express conversation in interactive fiction the way we express it in real life. The most natural thing to do, from the player’s point of view, would be to type exactly what he would like to say, using plain English (or plain Italian, or whatever) to convey both content and tone. The Ideal NPC would understand perfectly and would react to the player’s attitude as well as the factual content of what was being said.
At the moment, however, this kind of parsing is beyond us. Moreover, while it might be an ideal solution from the player’s perspective, it could only be a nightmare for the author. In other aspects of IF design, it’s possible to limit the number of things a player could reasonably do: there are only so many verbs, and only a specific set of objects available, and the scope of action is fairly well understood. An NPC who understood all topics of conversation, and all kinds of tonality and mood, could never be exhaustively programmed; there would always be another quirk unaccounted for. And such a character would almost certainly exceed the boundaries of the intended plot, as well. It would be quite hard to write a story that had any sense of structure or continuity, if the intended plot could be set aside while the player taught the NPC the basic rules of cricket.
Here are some of the forms of input that authors have explored so far:
Yes/No Conversation. This is a one-trick pony: the NPC asks questions and the player is allowed to answer “yes” or “no.” Andrew Plotkin did a brilliant job of handling it in Spider and Web, but it’s not particularly suited to most games. Still, I include it here as an example of the most minimal possible form of conversation.
Talk To. When your player wants to communicate, he types >TALK TO JONES and the conversation takes place without further interference from him. The advantages are, first, that you can put realistic, situation-appropriate dialogue in the mouths of both PC and NPC, and that you don’t have to worry about parsing anything funny at all, just about disabling (if they’re implemented in the language of your choice) ASK and TELL. The disadvantage is that it leaves relatively little power in the hands of the player (whose only choice is whether to have the conversation or not to have it). TALK TO also locks the player into whatever characterization you have chosen for the PC. Stephen Granade’s Common Ground exemplifies both the advantages and the drawbacks very well, I think. Ian Finley’s Kaged, Kathleen Fischer’s Masquerade, and assorted others have also made use of TALK TO.
This is an arena in which the writing skill of the author will make or break the game. If you are a skilled author capable of conveying a fair amount of character and emotion even in these set pieces (as Ian Finley, for instance, does, in my opinion), then you may be able to maintain a sense of immersion.
Menu Conversations. When your player wants to communicate, he types >TALK TO JONES and there appears on his screen a menu of three to six sentences he can say at the moment. Perhaps there is also an option to say nothing. Jones then replies, and the player may be given another menu, and so on, until the conversation ends. Photopia worked on a system like this, and some of the existing libraries for handling menus in Inform are at least partly derivative of the Photopia ones.
The advantage here is that again the author has control over the form of communication. You can hand the player a bunch of clever quips to say, which characterizes the player character as well as the NPC. (For player characterization that takes place to a large extent in the library menus and the PC’s deployment of them, see Rameses, by Stephen Bond.)
The problem is that menu systems are fairly restrictive; sometimes the menu doesn’t contain anything that the player wants to say, and there’s no way to change what’s on the menu, or even the illusory feeling of freedom that comes from typing >ASK JONES ABOUT THEOLOGY, even if no response has been implemented.
Another problem is what Duncan Stevens has referred to as the lawnmower effect. If you give me a series of menus, I don’t have to do any work to get through the conversation, and I can methodically (using undo, for instance) go back and replay different variations, taking now the first and now the second path, until I am sure that I’ve seen the whole thing. The NPC is then finished, with no more thought on my part than I give to methodically mowing a lawn.
From my point of view, this lessens involvement. If you are writing a highly directed game like Photopia or Rameses or Being Andrew Plotkin — preferably something so vividly written that the story or the humor of the narrative will make me want to move forward as rapidly as possible — then this may be right for you. If you’re writing a game based on investigation, allowing the player to shape his own character, or leaving large stretches of the plot in the player’s hands, then you may be better off with something more open-ended.
Ask/Tell. ASK/TELL is the standard built into Inform, and is the most common form of NPC interaction in the Infocom games and some other old-school works. It allows the player to ASK or TELL the NPC about any keyword he chooses, and get a response. The approach is more flexible for the player than a menu conversation, and works better with knowledge-based puzzles where the player may be discovering and ASKing about new information as his understanding improves. On the other hand, it is typically more difficult to code a conversation which appears to have a natural flow with ASK/TELL than with a branching menu, where you can ensure that each remark rationally follows from the conversation path that has led up to it. It can also be a challenge to avoid guess-the-noun problems, where the player is required to think of the specific keywords the author had in mind in order to advance the game.
ASK/TELL conversation can also mean minimal characterization of the PC. With a menu system, the player sees the PC’s dialogue; with ASK/TELL, he may get only responses from the other character, like so:
>ASK FRED ABOUT THE PORSCHE
“It belonged to my uncle,” Fred replies. “Don’t tell me you want to borrow it too.”
There’s no indication here of what the player’s character might have said. We can code around this and fill in what the player says too, if we like:
>ASK FRED ABOUT THE PORSCHE
“Say, that’s a beautiful machine,” you say. “Where did you get it?”
“It belonged to my uncle,” Fred replies. “Don’t tell me you want to borrow it too.”
Nonetheless, the risky part of this is that the player may not have intended ASK ABOUT THE PORSCHE to mean quite what we’ve decided it means. Maybe he wanted to ask about something else — the mileage, how well it drives, whether it’s for sale, why there are blood stains in the back seat. Providing dialogue for the player in this context adds player-character attitude and characterization but at the expense of some of the player’s sense of control.
Conversely, with ASK/TELL it is hard to allow the player to express those more complex ideas if he wants to. Usually the game won’t accept more than one or two words there, so >ASK JONES ABOUT THE TIME OF THE MURDER is likely to fail flamingly. The 2000 Comp game 1-2-3 tries to get around this, but it does so by prompting the player, and it is remarkably inflexible about WHICH long string of words it will accept.
Topic Words. Used in games such as J. D. Berry’s SmoochieComp game Sparrow’s Song and the older comp game She’s Got a Thing for a Spring, this functions very much like ask/tell, except that the verbs themselves are omitted. The player simply types a word he wants to bring up and the conversation proceeds accordingly. This isn’t so much, in my opinion, a new interface as it is a slight streamlining of an existing one: the author does not need to code separate answers for ASK and TELL, and the player does not need to try both verbs.
ASK/TELL with context and special topics. TADS 3 comes with a conversation implementation that adds to the standard ASK/TELL system in some important ways. First of all, it adds an idea of conversational context, so that the game keeps track of whether the player is currently conversing with anyone. If the player tries to speak with someone without saying hi first, the game may generate a greeting; similarly, the NPC may say goodbye when the player walks out on a conversation in progress.
Secondly, TADS 3 introduces the idea of special topics. A special topic usually involves an entire phrase (like “ASK JONES ABOUT THE TIME OF THE MURDER”), and the player is given a hint when such a special topic becomes relevant in the conversation, with a message like “You could ask Jones about the time of the murder or tell him about the smoking gun.” For example, from Eric Eve’s All Hope Abandon:
The blonde woman turned round just as you joined the queue and asked, “So, how are you enjoying the conference?”
(You could say it’s great, or be unenthusiastic.)
> be unenthusiastic
“Well, to be honest, I’m not a great enthusiast for conferences,” you confessed, “and I’m not sure this one has changed my mind so far.”
“That’s a shame!” she laughed, “But maybe our star speaker this afternoon will enthuse you more – are you looking forward to him?”
(You could say yes or no.)
> ask woman about conference
Answering the question she asked struck you as being an elementary courtesy, especially since you wanted to create a good impression.
“Professor Wortschlachter, you mean?” you replied, “Yes, his topic looks very interesting.” That was a lie, of course, but you didn’t want to appear too negative.
“You think so?” she replied, “I’m not so sure – to be honest, I haven’t been all that impressed by his books.”
Special topics do not preclude the use of single-keyword ask and tell, but they introduce some of the specificity of menus to those parts of a conversation that most need them. At the same time, they avoid dropping into an alternative mode of user interaction. Some players find it jarring, when most of their input is in the form of textual commands, to be asked to click on a menu or select a number from a list: it can be an unwelcome reminder that the game is just a game. The special topics system avoids this kind of uncomfortable transition.
Modified Menu/Topic Hybrid. This system combines the freedom of ASK/TELL with a menu system. When you begin a conversation with someone, you see a menu of the possible things to say listed in the status line, and you may say one of them simply by typing the corresponding letter. If, however, you would like to change the subject, you may also type >TOPIC DIAMOND NECKLACE, and a new menu appears. For instance, >TOPIC JONES might bring up a menu
1. Have you seen Jones anywhere?
2. What does Jones do here?
3. How long has Jones been working for the company?
4. What is your opinion of Jones?
This gets rid of the lawnmower problem and forces the player to take some initiative in choosing how the conversation will go. It also means that you can allow the player to ask questions much more complex than are available in an ASK/TELL system, but without completely giving the game away by including questions like >ASK THE QUEEN WHETHER IT IS TRUE THAT SHE STOLE THE PRINCESS’ DIAMONDS into a single main menu with >QUEEN, HELLO and >DO YOU KNOW WHERE I COULD GET MORE OF THIS SCRUMPTIOUS CAVIAR?
One major drawback of this system is that it requires more writing to implement usefully than any other. Simple ASK/TELL usually means that the author has to write two responses per character for each major topic of conversation; if the plot is very complicated, or it’s possible to get the NPC into more than one state of mind, then the author might have to write some variations on these responses, as well. With the menu system, in order to give the impression of a full implementation, the author winds up writing several questions and answers per topic per character. This can rapidly slide into the realm of the ridiculous.
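To make the mechanics concrete, here is a minimal Python sketch of the topic-then-menu idea. Everything in it is an assumption made for illustration: the names QUIPS_BY_TOPIC and menu_for are hypothetical, the necklace question is invented, and a real implementation would live inside an IF language's parser rather than in a standalone function.

```python
# Hypothetical data: each topic keyword maps to the questions on its menu.
QUIPS_BY_TOPIC = {
    "jones": [
        "Have you seen Jones anywhere?",
        "What does Jones do here?",
        "How long has Jones been working for the company?",
        "What is your opinion of Jones?",
    ],
    # Invented placeholder question, not from any actual game.
    "necklace": [
        "Is it true the diamond necklace is missing?",
    ],
}


def menu_for(command):
    """Return the numbered menu for a 'TOPIC <words>' command, or [] if none fits."""
    words = command.lower().split()
    if len(words) < 2 or words[0] != "topic":
        return []
    # Try every word after TOPIC so 'TOPIC DIAMOND NECKLACE' finds 'necklace'.
    for word in words[1:]:
        if word in QUIPS_BY_TOPIC:
            quips = QUIPS_BY_TOPIC[word]
            return [f"{i}. {q}" for i, q in enumerate(quips, start=1)]
    return []
```

Note that the writing burden described above shows up directly in the data: every new topic needs its own list of fully written questions, plus answers for each NPC.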
Chatbot-like ‘natural language’ input. Chatbot programming often involves the recognition of phrase structures, like “Who is *?” or “What is *?”.
Jon Ingold’s game Insight allows the player to type complex natural questions such as BOB, WHO IS YOUR WIFE? or FRED, WHY ARE YOU ANGRY? Since I haven’t seen the source code, I’m not sure exactly how this works, but I assume that it involves picking out the keywords (WIFE, ANGRY) and identifying a type of question (WHO, WHY, etc), and triangulating on an appropriate response from there. To take a relatively spoiler-free selection from early in the game:
>man, who are you?
“My name’s Mackenzie. But I, er, guess you already knew that. What do you want to know? You know it all already, right? I’ve been working – living – in Olympia. I’m a genetic designer.”
>mackenzie, what is your name?
“You already know my name, of course you do,” he replies.
>mackenzie, where is olympia?
“Nice enough place, I guess,” he says. “We have a lot of problems with the windstorms because of the nearby mountains. I’ve been working on solutions for that, using plants.”
>mackenzie, do you come from olympia?
– Please be more specific about what you want to say.
>mackenzie, who else lives in olympia?
“I’m sorry,” Mackenzie replies. “I didn’t quite follow that.”
When it worked, I found the effect pretty fascinating. Where the parser realizes it can’t interpret, it can give a relatively satisfying excuse. It’s the middle case that’s the most disconcerting, where the game mostly understands but misses some critical nuance. The response for “where is Olympia?” doesn’t sound exactly right; it seems to have caught the keyword “Olympia”, but not to have interpreted “where?” properly. And the “Please be more specific” line isn’t helpful at all.
I don’t mean to seem too critical, because this game is attempting something extremely difficult. The problem is that faking a natural language understanding always leaves some notable gaps. One might expect that it would be easier to write a chatbot for an IF game (where there is a small modeled world whose state is described within the program) than one whose domain of conversation is real life. Within the IF world, the chatbot has access to the same facts about the state of the world that the player does. On the other hand, in IF the performance demands are, in a sense, higher: what can be forgiven in a chatbot becomes a bit more serious in an IF game, where the success or failure of an interaction determines whether or not the player will be able to see the rest of the plot and finish the game in a satisfying way.
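The guess made above (identify a question word, pick out a keyword, and look up a response) can be sketched in a few lines of Python. This is not a reconstruction of Insight's code; the response table, the function names, and the fallback messages are all placeholders chosen for illustration.

```python
import re

QUESTION_WORDS = ("who", "what", "where", "why", "how")

# Hypothetical response table keyed by (npc, question type, keyword).
RESPONSES = {
    ("bob", "who", "wife"): "Bob tells you about his wife.",
    ("fred", "why", "angry"): "Fred explains what has upset him.",
}


def interpret(command):
    """Split 'NAME, QUESTION?' into (npc, question type or None, remaining words)."""
    match = re.match(r"\s*(\w+)\s*,\s*(.+?)\??\s*$", command.lower())
    if match is None:
        return None
    npc, question = match.groups()
    words = question.split()
    if words and words[0] in QUESTION_WORDS:
        return npc, words[0], words[1:]
    return npc, None, words


def respond(command):
    """Look for any keyword that has a response for this NPC and question type."""
    parsed = interpret(command)
    if parsed is None:
        # The input was not even in NAME, QUESTION form.
        return "Please be more specific about what you want to say."
    npc, qtype, words = parsed
    for word in words:
        reply = RESPONSES.get((npc, qtype, word))
        if reply is not None:
            return reply
    # In-character excuse when nothing matched.
    return "I'm sorry, I didn't quite follow that."
```

A sketch like this makes the middle case easy to reproduce: if the table were keyed on the keyword alone, “where is Olympia?” and “what is Olympia like?” would collapse into the same reply.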
Meta-conversation Verbs. Adam Cadre’s Varicella uses a form of modified ask/tell that allows for a little more player control of the PC’s behavior. The ask/tell system works the same way as ever, but you are allowed to adopt one of three tones of voice: hostile, cordial, and servile. To take an example from the very beginning of the game:
You adopt a cordial manner.
>ask steward about nails
“How’s the manicure proceeding?” you ask.
“Shouldn’t be much longer, sir,” the steward says.
The steward expertly attends to your fingernails with an emery board.
You adopt a hostile tone.
>ask steward about nails
“How much longer is this going to take, you mediocre manservant?” you bellow.
“Shouldn’t be much longer, sir,” the steward says.
The steward lightly blows on your fingertips.
You adopt a servile posture.
>ask steward about nails
You’re scarcely about to address a common servant in an obsequious tone. For heaven’s sake, where is your self-respect?
Reactions to this system have been mixed. I found it entertaining to go around seeing what interesting variations on the various statements I could get by changing my tone of voice, but I also frequently forgot to set the tone correctly and found myself acting inappropriately. And the more engrossed I was in the game, the more likely I was to forget about the tone system, which meant that I used it more as a toy than to get at the actually interesting variations that I understand are buried there as a result.
Another experiment along similar lines is Forever Always, which permits the player to use adverbs to control the tone of conversation. The player can, for instance, WHISPER HUSKILY, SHOUT ANGRILY, SPEAK POLITELY, etc. A menu of options appears, and its contents depend on what manner of speech you chose. This system, unlike the one in Varicella, lets the player see what he is going to say before he says it, so the effect of the different tones is a bit more obvious. The game is not flawless, but the problems in the later scenes seem to stem more from bugginess and lack of testing than from problems with the system as such, which is fairly interesting.
Both Varicella and Forever Always have systems designed for a game where emotional states and relationships between characters are of primary interest; one’s a palace intrigue, the other a romance-novel parody. The Forever Always system might not be at all successful for a game that centered on information gathering, since the player isn’t allowed to specify keywords, and there’s no potential for following up on, say, the clues of a mystery. On the other hand, I think it actually works better than the Varicella approach for the specific and limited purpose of doing character-emotion-based IF. (Since Varicella is partly about discovering information, the adverbs-only approach wouldn’t have worked there.)
A word of caution, however. It’s important, when expanding a conversation system to include new verbs, not to leave the player with an unmanageable number of options. In Varicella it was possible to keep track of the three tones of voice, but other suggestions I’ve heard (such as a >BE SYMPATHETIC command) or tried to implement myself (a system including COMFORT, INSULT, APOLOGIZE, FLIRT, SEDUCE, SMILE, LAUGH…) suggest perhaps-unmanageable systems.
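A persistent tone setting of the kind described for Varicella amounts to one extra piece of state consulted whenever the player asks about something. The Python sketch below assumes a lookup keyed by (topic, tone); the class and table names are hypothetical, the two lines of dialogue are taken from the excerpt quoted earlier, and the refusal message is a placeholder.

```python
TONES = ("hostile", "cordial", "servile")

# Hypothetical table of PC lines keyed by (topic, tone).
PC_LINES = {
    ("nails", "cordial"): '"How\'s the manicure proceeding?" you ask.',
    ("nails", "hostile"): (
        '"How much longer is this going to take, '
        'you mediocre manservant?" you bellow.'
    ),
}

# Placeholder shown when no line exists for the current tone.
REFUSAL = "You can't bring yourself to say that in your current manner."


class PlayerVoice:
    """Tracks the tone the player last adopted; it persists until changed."""

    def __init__(self, tone="cordial"):
        self.tone = tone

    def adopt(self, tone):
        if tone not in TONES:
            raise ValueError(f"unknown tone: {tone}")
        self.tone = tone

    def ask(self, topic):
        return PC_LINES.get((topic, self.tone), REFUSAL)
```

Because the tone persists between commands, a player who forgets to reset it gets the wrong register, which is the failure described above. Each added tone or verb also multiplies the number of table entries the author must write.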
The Model: Representing Conversational Information Internally
So far, we have looked only at how the player will communicate his instructions to the game, and not at all at how the conversation will be modeled internally. To go further, I also want to define a few pieces of jargon for the purposes of clear discussion:
topic: a subject of conversation, such as “the weather”, “religion”, “employment”, “the Red Sox”, etc.
fact: a proposition about a topic, such as that the Red Sox have lost the ball game, rain is expected tomorrow, etc.
quip: the actual verbatim dialogue used, such as “The Red Sox were let down by their bullpen again this afternoon” or “Seattle can look forward to its thirty-ninth day of rain tomorrow.”
effect: the result of saying a given piece of verbatim dialogue beyond merely expressing information, such as causing the non-player character to become sad, committing the player to do something for the non-player character, etc.
conversational goal: something the player or NPC is trying to achieve through the conversation: to find out a specific piece of information, to get one of the characters into a given mood, etc.
scene: a particular section of the plot; the responsibility for deciding which scene is in progress typically lies with plot-modeling algorithms that consider the whole state of the world and are not part of the conversation system per se.
Even a very simple conversation model usually represents at least one of these elements in code. Complex models may treat several elements at once, or may apply more rules to determine what the player is allowed to say when.
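One way to see how these terms relate is to write them down as data types. The following Python sketch is only one possible arrangement, offered under assumptions: the class layout, the say function, and the state dictionary are all hypothetical, and the example content reuses the Red Sox illustration from the definitions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Topic:
    name: str


@dataclass(frozen=True)
class Fact:
    topic: Topic
    proposition: str


@dataclass
class Quip:
    text: str                                     # the verbatim dialogue
    facts: list = field(default_factory=list)     # facts the quip conveys
    effects: list = field(default_factory=list)   # callables that change the state


def say(quip, state):
    """Deliver a quip: record its facts, apply its effects, return its text."""
    state.setdefault("known_facts", set()).update(quip.facts)
    for effect in quip.effects:
        effect(state)
    return quip.text


def sadden_npc(state):
    """Example effect: the NPC becomes sad."""
    state["npc_mood"] = "sad"


red_sox = Topic("the Red Sox")
lost_game = Fact(red_sox, "the Red Sox have lost the ball game")
bullpen_quip = Quip(
    text="The Red Sox were let down by their bullpen again this afternoon.",
    facts=[lost_game],
    effects=[sadden_npc],
)
```

Conversational goals and scenes are left out here; in this arrangement they would be logic that inspects the state dictionary rather than data attached to a quip.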
Three Traditional Models
Topic Quips. Traditionally underpinning the ASK/TELL system is a model in which any given topic of conversation is tied to a single response. This might be implemented with a table or switch statement that exactly matches content from the player’s command, or it might treat the topics internally as modeled objects, but there is usually a one-to-one match of quip to topic for any given NPC. In this sense, the conversation is like looking words up in the dictionary: the replies will always be the same, and there is no sense of continuity, of conversational context, or of a rapport established between the player character and the NPC. A slight variation on this is to have a few keywords that the NPC will not talk about until first adequately bribed: the NPC is still a dictionary, but a few entries are written in invisible ink.
Quip Tree. Traditionally underpinning the menu system is a model in which dialogue is a branching tree. At the first node, you may pick A, B, or C to say; if you pick A, you’re then confronted with a choice of D, E, or F. Dialogue flows, since the player is never confronted with the option to say anything in any order other than the one specifically anticipated by the author. On the other hand, the player’s freedom is constrained significantly.
Scene Quips. Traditionally underpinning the TALK TO system, this model shifts the burden of conversation context entirely to the plot model: each scene offers the player a single pre-written exchange with a given NPC. This leaves the player with almost no freedom, except inasmuch as he can affect the rest of the world to bring about new scenes in which new dialogue is appropriate.
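The three traditional models are simple enough to sketch side by side. The Python below is illustrative only: the function names, the scene names, and all of the dialogue apart from the Porsche reply are hypothetical placeholders.

```python
from dataclasses import dataclass, field

# 1. Topic quips: one fixed reply per keyword, a few locked until a bribe.
TOPIC_QUIPS = {
    "porsche": "It belonged to my uncle.",
    "safe": "(placeholder reply about the safe)",
}
LOCKED_TOPICS = {"safe"}


def ask(topic, bribed=False):
    if topic not in TOPIC_QUIPS:
        return "(default no-answer reply)"
    if topic in LOCKED_TOPICS and not bribed:
        return "(refusal until bribed)"
    return TOPIC_QUIPS[topic]


# 2. Quip tree: each choice leads to its own set of follow-up choices.
@dataclass
class Node:
    line: str                 # what the player says
    reply: str                # what the NPC answers
    children: list = field(default_factory=list)


def follow(options, choices):
    """Walk the tree along a list of 0-based choices; return the replies heard."""
    replies = []
    for choice in choices:
        node = options[choice]
        replies.append(node.reply)
        options = node.children
    return replies


TREE = [
    Node("A", "reply to A", [Node("D", "reply to D"), Node("E", "reply to E")]),
    Node("B", "reply to B"),
]

# 3. Scene quips: the plot model's current scene selects one whole exchange.
SCENE_EXCHANGES = {
    ("arrival", "jones"): "(pre-written exchange for the arrival scene)",
    ("after_theft", "jones"): "(pre-written exchange after the theft)",
}


def talk_to(npc, scene):
    return SCENE_EXCHANGES.get((scene, npc), "(nothing new to discuss)")
```

None of the three keeps any record of what has been said beyond, at most, the player's position in the tree.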
What the Traditional Models Miss
These models of conversation are legitimately popular, especially in work where NPC interaction is not the most important aspect of game play: they are simple to understand and relatively easy to write and extend. Adding new keywords or dialogue branches does not require much work on any other pieces of the system.
The problem becomes much harder when we want to devise a model that combines player freedom with a sense of developing context. Now we have to be able to keep track of what has already been said in the conversation, model the effects of the exchange on the NPC (and perhaps on the player, for that matter), and determine what can legitimately be said next. We might also want to take into account some external information about the world state: what stage of the plot we are in, what the NPC has seen the player do, and so on. Here are some of the design concerns that arise with one or more of the traditional models:
Avoiding Repetitious Dialogue. One of the least person-like habits of the typical IF NPC is that he always answers the same questions in the exact same words, regardless of how many times the player has asked. This issue need not come up in quite the same way in a menu-based conversation, since you could disable questions that have already been asked, whereas there’s no good way to prevent the player from typing >ASK JONES ABOUT HAT ten times in a row.
At its most basic level, this is just about preventing the NPC from saying the same thing over and over and over again. Real people don’t repeat the same words in the same language a hundred times in a row, and it detracts from the feeling of realism if your NPC does. There are several options for dealing with this: having the parser cut in and say, “You remember that Jones told you…”; having Jones tell you again but in a slightly modified form using some kind of randomization of text (so that over time you would get similar text over and over, but it wouldn’t be identical each time); or describing the conversation without telling you the exact words (“Jones tells you again that…”). Alternatively, if Jones is a feisty sort of person, he can complain if the player asks him the same question multiple times. This is dangerous, though, since if Jones has important information to impart, the player may find himself stuck because he didn’t take notes the first time through the conversation.
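Of the options just listed, the "describe the conversation without the exact words" approach is probably the easiest to sketch. The Python below assumes a per-topic counter; the class name, the hat dialogue, and the summary wording are invented for the example.

```python
class Jones:
    """NPC who gives the full reply once and a summary on every repeat."""

    def __init__(self):
        self.times_asked = {}
        # Invented placeholder dialogue.
        self.full = {"hat": '"It was my father\'s hat," Jones says.'}
        self.summary = {"hat": "Jones tells you again that the hat was his father's."}

    def ask(self, topic):
        count = self.times_asked.get(topic, 0)
        self.times_asked[topic] = count + 1
        if topic not in self.full:
            return "Jones shrugs."
        if count == 0:
            return self.full[topic]
        # The summary still carries the information, so the player is
        # never stuck for not having taken notes the first time.
        return self.summary[topic]
```

The same counter could instead drive randomized rephrasing or a complaint from Jones; the important design choice is that a repeat never withholds the information itself.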
Contextually-based Reactions. In real life if you’re talking to someone and that person starts to read a book, you may take a message from the fact. Likewise, there are spots in conversations where it may be more or less appropriate to react to the other person with advances (>KISS JONES) or violence (>KILL JONES WITH ROCK). If you have a system of conversation that tracks what the current topic of conversation is, and whether anything is actively going on, you can use it to tailor appropriate reactions for KISS, GIVE, SHOW, HIT, et al.
Somewhat more subtly, context in conversation can also be used to interpret the meaning of the player’s keywords. For instance:
[The PC and Inspector Lynley have been discussing murder victims.]
>ask lynley about veronica
“Do you think it could possibly have been Veronica?” you suggest. “I overheard her arguing with the victim last night.”
As opposed to:
[The PC and Inspector Lynley have been chatting about their love lives.]
>ask lynley about veronica
“How well do you know Veronica?” you ask. “I’d like to ask her out, but I’m not sure whether things are really over between her and Marcus.”
This kind of refinement is irrelevant in a menu-based conversation, but for ASK/TELL it can lend a sense of depth. It takes some work, though, to make sure that really important questions never become entirely impossible to ask just because the conversation context is set wrong. If the player desperately wants to accuse Veronica of murder, he’ll be frustrated if the game only permits questions about her love life.
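A minimal way to get this behavior, while guarding against the frustration just described, is to look up the player's line by (context, keyword) and fall back to other contexts when the current one has no reading. The Python sketch below is an assumption-laden illustration: the context names, the fallback order, and the function signature are hypothetical, and the dialogue is shortened from the Lynley example.

```python
# Hypothetical table of PC lines keyed by (conversation context, keyword).
PC_LINES = {
    ("murder", "veronica"):
        '"Do you think it could possibly have been Veronica?" you suggest.',
    ("romance", "veronica"):
        '"How well do you know Veronica?" you ask.',
}


def ask(keyword, context, fallbacks=("murder",)):
    """Prefer the reading for the current context, then try the fallbacks.

    Listing plot-critical contexts as fallbacks keeps important
    questions reachable even when the context is set "wrong".
    """
    for ctx in (context, *fallbacks):
        line = PC_LINES.get((ctx, keyword))
        if line is not None:
            return line
    return "You can't think of anything to say about that."
```

With these assumptions, asking about Veronica while chatting about the weather still produces the accusation, since the murder reading is always available as a last resort.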
Abstract Knowledge. One of the artificial abilities we might like to give our NPCs, aside from the ability to wander around a map intelligently and carry out complex goals, is the ability to understand what they are told: to keep track of what items of knowledge they have so far, use them to change their plans and goals, and even draw logical inferences from what they’ve learned.
Purpose. NPCs give the impression of being much more active and thoughtful if they show signs of having a private agenda of their own — which may include raising new conversational topics, deciding to cut a conversation short, and so on. There’s a trade-off here again: the NPC who takes actions and doesn’t wait for the player may seem more dynamic and alive than an NPC who sits around being questioned at the player’s whim, interspersed with turns of the PC taking inventory, looking under sofa cushions, and unlocking safes. And if you have a specific set of information you need to convey to the player, sometimes it’s useful to have an NPC who will just keep coming back to that topic until it’s been adequately covered.
A Few Alternate Models, and Thoughts on Model Design in General
What follows is a discussion of some alternative conversation models I have tried, my reasons for trying them, and how well I thought they worked. I focus on my own work here because discussing these requires some understanding of the code base; it is not always possible to tell with any certainty how someone else’s game is modeled internally.
Topic Quips with Mood Tracking and Quip-tagging. Galatea has a number of topics. Each of these will produce only one quip of response at any given time: the interface is ASK/TELL, so there are no quip options presented to the player. However, quips have a variety of effects, especially on Galatea’s mood and position, so that the state of the conversation is in constant flux; and the state of conversation in turn affects which of the available quips is used when topics are mentioned. Quips that have been used are tagged as used so that they will never be repeated; some quips can be used only after other quips. There is no systematic tracking of facts, even though certain facts do come to light and have a profound effect on the state of the game (but this is programmed by checking whether any of the relevant quips have been used).
This is a somewhat shaggy system and challenging to extend and maintain, and it does not entirely protect against contextual breaks where the flow of conversation is lost. Moreover, Galatea herself is mostly reactive rather than active. There are a few points where she is specially programmed to make a follow-up comment if the player does not speak about something on the next turn, but for the most part she tends to be silent until spoken to.
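The general shape of such a model (one quip chosen per topic, gated by mood and by previously used quips, and tagged once spoken) can be sketched in Python. This is not Galatea's source: the single integer mood, the field names, and the placeholder quip texts are simplifying assumptions.

```python
from dataclasses import dataclass


@dataclass
class Quip:
    name: str
    topic: str
    text: str
    mood_change: int = 0      # effect on the NPC's mood when spoken
    min_mood: int = -99       # quip is unavailable below this mood
    requires: tuple = ()      # names of quips that must already be used


class NPC:
    def __init__(self, quips):
        self.quips = quips
        self.used = set()
        self.mood = 0

    def ask(self, topic):
        """Return the first eligible unused quip for the topic, applying its effects."""
        for quip in self.quips:
            if quip.topic != topic or quip.name in self.used:
                continue
            if self.mood < quip.min_mood:
                continue
            if not all(name in self.used for name in quip.requires):
                continue
            self.used.add(quip.name)   # tag so it is never repeated
            self.mood += quip.mood_change
            return quip.text
        return "(she says nothing)"


# Placeholder quips, invented for the sketch.
QUIPS = [
    Quip("origin_1", "origin", "(guarded answer)", mood_change=1),
    Quip("origin_2", "origin", "(fuller answer)", min_mood=1, requires=("origin_1",)),
    Quip("fear_1", "fear", "(confides a fear)", min_mood=2),
]
```

As in the description above, a fact "coming to light" is represented here only implicitly, by checking whether a quip's name is in the used set.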
Topics with Multiple Quips. In Pytho’s Mask, an assortment of different quips are associated with different topics. Interaction is handled through a topic-menu system, so when the player asks about a topic he is given a selection of all the currently-relevant quips associated with that topic. Quips are marked when used so that they won’t be repeated unless it is particularly desirable for the player to be able to re-ask a question; sometimes in that case there is one quip used for the first time the question is asked and an alternate form for subsequent askings. Using a quip can also have the effect of changing the topic, as well as producing emotional responses.
This model arose from experiment with the topic-menu interface, used here for the first time. However, it is possible to use a different interface with this kind of model, as demonstrated by Kathleen Fischer’s “Redemption”, and the enhanced ASK/TELL conversation system built into TADS 3; these input systems are probably best for cases where the number of quips per topic is sparse, even if it is not one-to-one.
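A compact Python sketch of this kind of model follows. It assumes that a quip stays on the menu after use only if it has an alternate "asked again" form, and that a quip may move the conversation to a new topic; the class names and fields are hypothetical and are not drawn from the game's source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Quip:
    question: str
    answer: str
    repeat_answer: Optional[str] = None   # alternate form for re-asking; None = ask once
    new_topic: Optional[str] = None       # topic the conversation moves to afterwards
    used: bool = False


class TopicMenu:
    def __init__(self, quips_by_topic, topic):
        self.quips_by_topic = quips_by_topic
        self.topic = topic

    def options(self):
        """Quips for the current topic that are unused or explicitly repeatable."""
        quips = self.quips_by_topic.get(self.topic, [])
        return [q for q in quips if not q.used or q.repeat_answer is not None]

    def choose(self, index):
        """Speak the chosen quip, mark it used, and follow any topic change."""
        quip = self.options()[index]
        reply = quip.repeat_answer if quip.used else quip.answer
        quip.used = True
        if quip.new_topic is not None:
            self.topic = quip.new_topic
        return reply
```

Emotional effects are omitted here; they could be added as a further field on the quip, applied inside choose.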
Topics with Multiple Quips and Abstract Facts. The model in Best of Three is designed to support an NPC whose conversational goal was to discover information from the player by asking questions and drawing logical inferences. Like Pytho’s Mask, the model associates a number of quips with each topic, and uses a topic-menu system to present this to the player. However, it also implements separately a tree of facts; a quip can indicate one or more of these facts.
The tree structure represents the way in which the NPC will draw inferences. He is curious about certain facts, and has the ability to ask questions about them or direct the conversation, since he is allowed to choose and speak a quip of his own after answering any quip offered by the player. When the NPC has learned all the facts underlying one node, he then infers that that node is correct; he may ask the PC a question to verify his conclusion, but essentially the reasoning process is complete.
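The inference step described here can be sketched in a few lines. This is an illustrative Python sketch, not the code of Best of Three: the SUPPORTS table, the fact names, and the functions infer and next_question are all hypothetical, and the tree is stored as a plain mapping from each node to the facts underlying it.

```python
# Hypothetical fact tree: each node lists the facts that jointly establish it.
SUPPORTS = {
    "conclusion": ["premise_a", "premise_b"],
    "premise_b": ["detail_1", "detail_2"],
}

def infer(known, supports=SUPPORTS):
    """Add every node whose underlying facts are all known.

    Mutates the set `known` and returns the newly inferred nodes in order.
    """
    inferred = []
    changed = True
    while changed:
        changed = False
        for node, children in supports.items():
            if node not in known and all(c in known for c in children):
                known.add(node)
                inferred.append(node)
                changed = True
    return inferred

def next_question(known, supports=SUPPORTS):
    """Pick an unknown leaf fact the NPC could ask about next, or None."""
    for children in supports.values():
        for fact in children:
            if fact not in known and fact not in supports:
                return fact
    return None
```

After each player quip, the NPC would add the indicated facts to known, call infer, and then use next_question to choose a quip of his own. Reporting the result of infer to the player is one way to make the reasoning visible.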
In practice, the game turns out to be not very much fun to play. The system of inference is cumbersome, and it is not always obvious to the player that a reasoning process is going on behind the scenes, rather than a prewritten script. Moreover, conversation always tends to climb the factual tree toward the same goals in the end, so despite the dynamic internals of the game, the difference between play-throughs is usually a matter of reaching the same quips in a different order, rather than entirely different lines of discussion. The NPC’s behavior might have been more interesting if he had not driven the conversation so relentlessly (too much NPC autonomy makes the player feel helpless), and if the inference system had caused more complex behavior, making it more obvious how the NPC was responding to changes by the player.
Topics with Cross-indexed Quips. City of Secrets uses an elaboration of the multiple-quip implementation described above, with the difference that a quip can also be associated with multiple topics: asking about any of the topics covered by a quip will make that quip available. Moreover, topics are nested, so that topics about specific items (like a particular character or place) are treated as sub-items of general topics (such as an entire group or region). When the player runs out of quips to say about the current topic, the game explores whether any quips are available for the more general subject, and so on. The aim was to provide a sense of continuity and keep the player constantly prompted with possible things to say, since I wanted to avoid having the topic menu become empty of quips any more often than necessary; the result is only partially successful.
To make matters more complicated, there are a few meta topics representing abstract actions such as >INSULT, COMPLIMENT, and the like: quips are associated with these topics because of their effects rather than because of their content, but otherwise the command >INSULT functions much like >ASK JONES ABOUT INSULT: every insult-related quip that is currently available becomes accessible for the player’s use.
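The cross-indexing and the fallback from specific to general topics can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the City of Secrets implementation: the PARENT table, the topic names, and the function available are hypothetical. A meta topic such as "insult" is treated as just another topic that a quip can be filed under.

```python
from dataclasses import dataclass

@dataclass
class Quip:
    text: str
    topics: frozenset   # a quip may be filed under several topics
    used: bool = False

# Hypothetical nesting: each specific topic points to its more general parent.
PARENT = {"jones": "shopkeepers", "shopkeepers": "city"}

def available(topic, quips, parent=PARENT):
    """Return the unused quips for a topic, widening to more general
    topics whenever the current one has nothing left to offer."""
    while topic is not None:
        found = [q for q in quips if topic in q.topics and not q.used]
        if found:
            return found
        topic = parent.get(topic)
    return []
```

With this arrangement, >INSULT and >ASK JONES ABOUT INSULT can both resolve to available("insult", quips), and the menu for a specific character only goes empty once the general topics above it are exhausted too.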
Contextual determinations are messy and not handled very systematically. There is no representation of facts as such. Quips can be tagged to indicate that they can follow only immediately after other quips (emulating the effect of a dialogue tree in small) or only after other quips have been used (but not necessarily immediately before). Moreover, arbitrary information about the game state is sometimes used to determine whether a quip is available for use. Finally, some quips are associated with specific NPCs while others can be used with any member of a class of NPCs (e.g., any shopkeeper, or any member of a political faction).
NPC conversation goals were coded in a similarly ad-hoc way: the plot was divided into scenes, and during a scene the NPC might have a script of quips to present to the player. The player had some flexibility in that he could delay the script by asking his own questions or (sometimes) changing the subject, but the NPC would revert to the main script when the player did not take action or ran out of available quips.
This is by far the most complicated system I have ever constructed, and it was, frankly, out of hand: hard to program and even harder to maintain. It did provide a certain richness of interaction, since the player had a lot of freedom to change the subject and to give abstract commands as well as concrete ones; nonetheless I believe that similar effects could be achieved better in other ways. One particular failing not only of this model but of the entire game is a lack of focus: because I was insufficiently clear in my mind about how I wanted the player to be able to interact and affect the plot, I tried to implement “everything reasonable.” As a result, unsurprisingly, play is not always very well directed. Throughout the project I struggled to produce enough material, to handle the ramifications of massive combinatorial effects, and to keep pacing problems at bay.
Database Queries (Multi-fact Topics). This model works with a chatbot-like interface, and the player’s input is scanned for keywords and standard sentence forms: input such as SAM, WHERE IS MR GREEN will trigger on “where” and “Mr. Green” and dynamically generate a response from Sam on the whereabouts of Green, the topic. Some five or six categories of fact are provided for each topic; because there is so much to ask about, quips are not all pre-written, but are made up on the fly.
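A keyword-scanning query of this kind might look like the following Python sketch. It is only an illustration of the idea, not the code of the game discussed here: the FACTS and TEMPLATES tables, their contents, and the function answer are invented, and only two categories of fact are shown.

```python
import re

# Hypothetical fact table: topic -> category of fact -> value.
FACTS = {
    "mr green": {"where": "the library", "who": "the family lawyer"},
}

# One response template per category; replies are assembled on the fly.
TEMPLATES = {
    "where": '{speaker} says, "I last saw {topic} in {value}."',
    "who": '{speaker} says, "{topic} is {value}."',
}

def answer(speaker, line):
    """Scan the player's input for a question word and a known topic."""
    words = re.sub(r"[^a-z ]", " ", line.lower()).split()
    normalized = " ".join(words)
    category = next((w for w in words if w in TEMPLATES), None)
    topic = next((t for t in FACTS if t in normalized), None)
    if category is None or topic is None or category not in FACTS[topic]:
        return None   # the caller falls back to a default reply
    return TEMPLATES[category].format(
        speaker=speaker, topic=topic.title(), value=FACTS[topic][category])
```

Here answer("Sam", "SAM, WHERE IS MR GREEN") triggers on "where" and "mr green" and fills the matching template. Generic templates like these are also where the flat, personality-free tone of generated replies comes from.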
This system was used for parts of “Mystery House Possessed” (a game that actually implemented several conversation models for NPCs of differing intelligence — see below as well). The game presented the player with a dynamically generated mystery in which a randomly-selected NPC murdered other NPCs in turn, leaving clues behind; the ability to ask specific questions was intended to assist in investigation. A drawback, of course, was that response quips to the database-style queries tended to be fairly lacking in personality, though I made some attempt to add local color when the speaker might have strong feelings about the topic.
Two-topic Quips. In this model, also used in “Mystery House Possessed,” the player’s input is scanned for topic keywords and the last two distinct topics are used to select a quip from a table. The idea is that the interesting things to say are about relationships between topics (how does Daisy feel about Tom?) rather than in the topics themselves; this seemed appropriate in a context where I wanted to supply considerable amounts of information about how people reacted to one another.
This system provides a very limited kind of context, as well. The player might say >ASK SAM ABOUT WORK in one sentence and in the next >ASK SAM ABOUT JOE, in which case the second answer will trigger on WORK and JOE: Sam will give a reply about Joe’s employment, rather than some other aspect such as Joe’s love-life, attitude to another character, health, etc. A selection of concrete topics was provided (mostly the characters in the game, plus a few who were missing) as well as a few abstract topics (work, love, etc.) that might be thought to have some bearing on people’s motivations. It was also possible to say >ASK SAM ABOUT JOE AND TOM explicitly in order to query a relationship in a single move.
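The two-topic lookup can be sketched like this. It is an illustrative Python sketch rather than the game's actual code: the TOPICS set, the PAIR_QUIPS table, the sample replies, and the class TwoTopicContext are assumptions. The pair is stored as a frozenset because order does not matter in this model.

```python
import re
from collections import deque

TOPICS = {"work", "love", "joe", "tom"}

# Hypothetical table of quips keyed by an unordered pair of topics.
PAIR_QUIPS = {
    frozenset({"work", "joe"}): "Sam shrugs. \"Joe hasn't worked in months.\"",
    frozenset({"joe", "tom"}): "\"Joe and Tom? They don't speak.\"",
}

class TwoTopicContext:
    def __init__(self):
        self.recent = deque(maxlen=2)   # the last two distinct topics

    def hear(self, line):
        """Update the topic memory from the input, then look up a quip."""
        for word in re.findall(r"[a-z]+", line.lower()):
            if word in TOPICS:
                if word in self.recent:
                    self.recent.remove(word)   # re-mentioning moves it to the end
                self.recent.append(word)
        if len(self.recent) < 2:
            return None
        return PAIR_QUIPS.get(frozenset(self.recent))
```

Asking about WORK and then about JOE leaves both in memory, so the second question selects the work-and-Joe quip; naming two topics in one command fills the memory in a single move.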
The results were interesting, but, as always, it is dangerous to permit the player to use a chatbot-style interface in IF; it’s all too easy for him to get completely nonsensical reactions out of the system. Moreover, while the system offered good continuity over a couple of moves, it was not so good at producing the sense of an evolving conversation over a long time. This was less a liability in “Mystery House Possessed” than it might have been in some other games, because MHP involved a number of mentally unstable people who might be expected not to converse rationally, and also because the pacing of the game precluded very extended conversations with anyone: other NPCs kept being killed, discoveries made, characters wandering from room to room, and so on, so that the number of turns that could be spent drawing out a single conversation was limited. The flaws of the system would have shown more severely had that not been the case.
Topics with Interstitial Quips. In this model, used for “Glass”, quips are associated not with specific topics but with the transition from one topic to another. Order of topics is important here, where it was not important in the two-topic quips model above. When the player or an NPC mentions a new topic, the model looks up the current topic and the last topic, finds a quip associated with that specific conversational transition, and prints the result. The quip used is then erased from the table in order to avoid repetition.
The conversation is divided into a series of scenes, and in each scene the NPCs pursue a conversational goal, namely, to move the conversation to a given topic. The player can interfere by changing the topic himself. This model allows for a simple AI implementation to handle the other characters’ conversational initiative: conversation topics are defined to be related to one another (much as one room on a map leads to another), and the NPCs use a pathfinding algorithm to discover the current best path from the current topic to their goal topic.
Facts, on the other hand, are not explicitly modeled. Major state changes in the conversation, where the characters can be assumed to be using a new set of information and moving towards a new endpoint, are modeled as scene breaks; so each scene might be understood as a contextual domain in which certain facts are known and certain attitudes are at work.
The result is a game that moves quickly and fluidly and does a good job of preserving contextual flow: because quips are tied to a before and after, it is hard to reach one that does not follow from what went before. NPC AI is easy to write for this, as well, because the conversational goal of the NPC can be easily expressed in terms of pathfinding through relations inherent in the model.
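As a rough illustration of the pathfinding idea, here is a Python sketch that treats topics as nodes in a graph and uses breadth-first search to find the NPC's next conversational step. It is not the code of “Glass”: the LINKS and TRANSITIONS tables, the topic names, and both functions are hypothetical, and breadth-first search is simply one reasonable choice of pathfinding algorithm.

```python
from collections import deque

# Hypothetical topic map: which topics lead naturally to which others.
LINKS = {
    "weather": ["garden", "travel"],
    "garden": ["roses"],
    "travel": ["wedding"],
    "roses": ["wedding"],
    "wedding": [],
}

# Quips belong to transitions (last topic, new topic), not to topics.
TRANSITIONS = {
    ("weather", "travel"): "\"Fine weather for a journey,\" she remarks.",
    ("travel", "wedding"): "\"And will you travel to the wedding?\"",
}

def next_topic(current, goal, links=LINKS):
    """Breadth-first search: return the first step on a shortest path
    from the current topic to the goal topic, or None if there is none."""
    if current == goal:
        return None
    queue = deque([(current, None)])   # (topic, first step taken to reach it)
    seen = {current}
    while queue:
        topic, first = queue.popleft()
        for nxt in links.get(topic, []):
            if nxt in seen:
                continue
            step = first or nxt
            if nxt == goal:
                return step
            seen.add(nxt)
            queue.append((nxt, step))
    return None

def change_topic(last, new, transitions=TRANSITIONS):
    """Look up the quip for this transition and erase it to avoid repetition."""
    return transitions.pop((last, new), None)
```

If the player drags the conversation somewhere else, the NPC simply calls next_topic again from the new current topic, which is what keeps the behavior cheap to write.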
On the other hand, player freedom is somewhat diminished (as in Best of Three) because the NPCs direct so much of the conversation. Play-throughs are markedly different only if the player succeeds in intervening at an important point, rather than being as free-form as in Galatea.
Topic Quips with Tight Scene Correlation. Used in “When in Rome 1,” this model uses a traditional ASK/TELL interface and provides one quip per topic of conversation per scene, but changes scenes very frequently. Thus it is difficult for the conversational context to change very much within the space of a scene. This essentially shifts the burden of context-tracking so that it becomes part of the plot model rather than part of the conversation model. This works, but is probably put to best effect in games with a focused, fast-moving narrative.
Quip Menus with Missions, Abstract Facts, and Trust Effects. This model, for an unreleased work in progress, is designed to present a dialogue menu interface for a CRPG. Since there is no ASK/TELL input, no topics are modeled as such. Instead, we model missions that the player is currently working on or being requested to undertake, and this largely determines the context in which different quips become available: some things can be said only when the relevant mission is in progress.
NPCs pursue conversational goals by offering missions to the PC, inquiring after missions already in progress, or asking questions about the PC’s intentions and loyalties. The player pursues his conversational goals by asking for information that will help him solve missions; his ability to find information is partly determined by how much he is trusted by his interlocutor. Quips that have been used once are occasionally discarded if they can’t be repeated, but more often they are instead moved to the base of the dialogue tree, so that the player can ask again any question he has already reached and asked once. This means that after he has gone to the trouble of finding something out through careful investigation, he is not required to repeat the entire process.
Quips may also be associated with facts. Not every quip has a related fact; facts are modeled, sparsely, where they are of particular importance. Some facts are said to exclude other facts — that is, if one is true, the other must be false. When an NPC asks a question of the player, the player may respond with quips indicating facts, or with evasive quips that put the NPC off (“none of your business”, “I don’t remember”, etc.). The NPC next considers what he has heard from the player so far and determines whether it is internally consistent. If he has heard a consistent fact from the player, his trust rises. If he has heard an evasion, his trust remains the same. If he has heard a fact that excludes or contradicts an earlier fact, his trust drops, he indicates that he has noticed an inconsistency, and he asks the player to clarify his position.
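The trust rule can be sketched as follows. This Python sketch is an illustration under stated assumptions, not the work in progress itself: the EXCLUDES table, the fact names, the class Interlocutor, and the step size of one trust point are invented. It also assumes that a contradictory fact is not recorded until the player clarifies, and that repeating a known fact earns no further trust.

```python
# Hypothetical exclusion table: if one fact of a pair is true, the other is false.
EXCLUDES = {("was_at_docks", "was_at_home")}

def excludes(a, b):
    return (a, b) in EXCLUDES or (b, a) in EXCLUDES

class Interlocutor:
    def __init__(self, trust=0):
        self.trust = trust
        self.heard = set()   # facts the player has asserted so far

    def hear(self, fact):
        """Process one player reply; `fact` is None for an evasive quip.

        Returns a label the caller can use to choose the NPC's next quip.
        """
        if fact is None:
            return "evasion"        # trust stays the same
        if fact in self.heard:
            return "repeat"         # assumption: no extra trust for repeats
        if any(excludes(fact, old) for old in self.heard):
            self.trust -= 1
            return "challenge"      # NPC notes the inconsistency and asks to clarify
        self.heard.add(fact)
        self.trust += 1
        return "accept"
```

The label returned by hear decides the NPC's follow-up, and the trust value can then gate which information quips he is willing to offer.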
This model (though in an unfinished work) seems to be satisfactory so far in producing dynamic menu dialogue, and providing situations in which the player must acquire and share information strategically. Revealing information can help the player in the short term by providing a trust boost, at the expense of diminishing his later options to lie successfully. Dialogue repetition in this genre is considered a less serious flaw than it generally is in textual interactive fiction, and therefore we are not concerned about rephrasing things that have already been said once.
The point of this overview has not been to recommend any particular system of conversational modeling, but rather to suggest a few ideas.
First, while some models best support some interfaces, the correlation is not simple, and it is worth thinking about the model explicitly rather than throwing it together ad hoc.
Second, a “conversational model” can treat facts, topics, or quips as the basic units of conversation; or it can model combinations of these; or it can work with other base units. I have considered, but never attempted, a conversation model whose primary elements and actions would be emotive rather than verbal, so that, for instance, a quip might represent movement from an angry state to a happy state, rather than from one topic to another topic; NPC goal seeking might also involve seeking a series of quips that would lead to the desired emotional outcome. In short, there are numerous possibilities. Keep an open mind; if you find yourself designing a system that requires a great deal of special-casing (as I did with “City of Secrets”), stop and ask yourself what these special cases are usually accounting for, and whether you can model that aspect of the world systematically instead.
Third, it is important to focus on the kind of interaction you expect the player to do, and (if you intend some degree of AI for your NPCs) the kinds of conversational goal-seeking you expect of your NPCs, as well. It is easy to be seduced into adding all sorts of things to the conversation model that are not strictly necessary and that produce horrible complexity. For one thing, adding new features to the model usually means writing much more content later on (and possibly spending more time debugging, too).
Output: Sharing Information with the Player
When we build a conversation model, we need to think not just about levels of implementation and our choice of conversational input; we also need to consider how we are going to represent the model’s output to the player, to make the most out of our simulated world and give the player enough information to make meaningful choices.
Offering Recaps. If past conversation affects the current state of the game, or if there are substantial amounts of information that can only be gained from conversation, it is often wise to provide the player with some way to review what has already been said without taking extensive notes. To this end, some games offer commands like RECAP, REMEMBER, or THINK ABOUT, allowing the player to recall what was said about specific topics or to review what topics have been discussed in general.
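A recap command needs little more than a log indexed by topic. The following Python sketch shows one way to do it; the class ConversationLog, its method names, and the wording of the messages are assumptions made for illustration, not the behavior of any particular game.

```python
from collections import defaultdict

class ConversationLog:
    def __init__(self):
        self._by_topic = defaultdict(list)   # topic -> list of (speaker, summary)

    def record(self, topic, speaker, summary):
        """Call this whenever a quip is spoken."""
        self._by_topic[topic].append((speaker, summary))

    def recap(self, topic=None):
        """RECAP lists the topics discussed; RECAP <topic> lists what was said."""
        if topic is None:
            if not self._by_topic:
                return "Nothing has been discussed yet."
            return "Topics discussed: " + ", ".join(sorted(self._by_topic))
        if topic not in self._by_topic:
            return "Nothing has been said about " + topic + "."
        return "\n".join(f"{who}: {what}" for who, what in self._by_topic[topic])
```

Storing a short summary rather than the full quip text keeps the recap readable when a conversation has run long.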
Exposing the Mechanism (partly). If we’re modeling a character’s mood as something separate from the conversation exchanged, we want to let the player know in subtle ways when the mood has changed; we may also want to let the player get a sense of the character’s attitude by examining him. Gestures, facial expressions, and tone of voice can all be described as part of the flow of dialogue but (if necessary) implemented separately. Galatea, for instance, has conversational replies with blank spots that essentially mean “insert an appropriate gesture here”: she might use the same set of words but a different movement or tone of voice depending on the overall atmosphere of the conversation.
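The "insert an appropriate gesture here" idea amounts to a reply template with a slot that is filled from the current mood. A minimal Python sketch, with invented gesture text and a hypothetical three-way mood scale (this is not Galatea's actual text or code):

```python
# Hypothetical gestures, one per band of mood.
GESTURES = {
    "warm": "She turns a little toward you.",
    "neutral": "She does not move.",
    "cold": "She keeps her back to you.",
}

def mood_band(mood):
    """Collapse a numeric mood into one of three bands."""
    if mood > 0:
        return "warm"
    if mood < 0:
        return "cold"
    return "neutral"

def render(template, mood):
    """Fill the {gesture} slot in a reply according to the current mood."""
    return template.format(gesture=GESTURES[mood_band(mood)])
```

The same words, such as '"Yes." {gesture}', then read differently depending on the atmosphere, which tells the player the mood has shifted without stating it outright.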
Some games with a partially graphical component have tried using an image of the NPC to convey current mood or expression: Chris Crawford’s Erasmatron dynamically created facial expressions, and Façade uses body language and faces in addition to dialogue to convey the moods of the characters. Multimedia IF has not done too much with this possibility so far; but, on the other hand, moods and emotional reactions can be expressed textually as well.
Similarly, if the NPC is using some kind of logical model to draw conclusions or pursue goals, it may be worth making that fact explicit as well: when the character realizes something, tell the player what he realized and why.
This may seem terribly unsubtle; indeed, it goes against an accepted wisdom that one is trying to build a mechanism that doesn’t appear mechanical, and that the ideal end result will be a conversation that feels real but in which the player is never conscious of how it works. That goal is an interesting one to pursue, but in my opinion (and there are other opinions out there), most interactive fiction is better served by a model that gives the player some clues about what is being modeled and how he can interact productively. If you show the player that the characters are drawing logical deductions from his statements (say), then he will realize that choosing what facts to reveal is an important part of the game, and pursue that angle rather than others. If you can focus your players on the kinds of interaction that you’ve anticipated and written for, they’re more likely to enjoy the work and less likely to run into the boundaries and weaknesses of your system.