This is one of several design articles about the new interactive narrative platform Versu, which Richard Evans and I have been building with a team at Linden Lab.
Any platform focused on social interaction needs strong conversation handling. The following article goes into a certain amount of technical detail about what the system does and how it works, and Richard kindly agreed to write about the sections on which he had the most design influence.
Richard and I each had some examples in our previous work to draw from, but they needed substantial revision to work for Versu. Moving away from a turn-based, parser-driven model meant that we could dispense with the contextual parsing challenges that take up so much room in the Threaded Conversation system for Inform; on the other hand, it introduced a whole other dynamic. In a multi-agent story, conversation flow has to be explicitly controlled: we have to track whose turn it is to talk next (if anyone’s), enforce rules of topicality, give characters the option to interrupt others, and provide appropriate responses if they do so.
In addition, conversation in Versu scenes needs to be interleaved with other character behavior. Characters might be talking while dancing, or eating dinner, or during a fight, so we need to make conversation flow around the other social activities that are occurring at the same time.
Much of the design work on this aspect of the system was done by Richard Evans, based on Harvey Sacks’ studies of conversational interaction.
Richard describes the design here:
Membership Categorization Devices
Versu models multiple concurrent social practices. For example, during a conversation at dinner, there is a social practice modeling the dinner itself (providing affordances to eat and drink), and another practice modeling the state of the conversation (whose turn it is to talk, what topics are currently salient). Each practice assigns a character a role. So when there are multiple concurrent practices, each character is simultaneously assigned multiple roles. He is playing many roles at once. This makes the social interaction more like real life, adding depth to each character.
For each role, we can ask: how well is he performing that role?
For example, Mr Darcy is:
- A loyal friend to Bingley
- A kind brother to Georgiana
- A lousy participant at the ball
For each role evaluation, we also store reasons:
- Darcy is a loyal friend to Bingley because he tries to look after his best interests and give him advice, even if it is unwanted
- Darcy is a kind brother to Georgiana because he is very supportive
- Darcy is a lousy participant at the ball because he refuses to dance
Evaluations, and the reasons which justify the evaluations, can be communicated from one character to another. For example: in an early play-test, I was wondering why the doctor was being so rude to me when I had never spoken to him before. I found out, after much debugging, that the reason for his rudeness was that I had been mean to the butler, and the butler had been gossiping about me in the kitchen.
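As a rough illustration of role evaluations that carry their reasons and can be passed along as gossip, here is a minimal Python sketch. It is not Versu's implementation; the names `Evaluation`, `Character`, `evaluate`, and `gossip_to`, and the "guest" role, are all invented for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Evaluation:
    """One character's judgement of how well someone performs a role."""
    subject: str  # who is being judged
    role: str     # the role being judged, e.g. "guest"
    verdict: str  # e.g. "loyal", "kind", "mean"
    reason: str   # the justification stored alongside the verdict


@dataclass
class Character:
    name: str
    # Evaluations this character holds, keyed by (subject, role).
    beliefs: dict = field(default_factory=dict)

    def evaluate(self, subject, role, verdict, reason):
        self.beliefs[(subject, role)] = Evaluation(subject, role, verdict, reason)

    def gossip_to(self, listener, subject, role):
        """Pass an evaluation, together with its reason, to another character."""
        ev = self.beliefs.get((subject, role))
        if ev is not None:
            listener.beliefs[(subject, role)] = ev
        return ev


# Hypothetical replay of the play-test anecdote: the butler forms an opinion
# and shares it, so the doctor holds it without ever meeting the player.
butler = Character("butler")
doctor = Character("doctor")
butler.evaluate("player", "guest", "mean", "was mean to the butler")
butler.gossip_to(doctor, "player", "guest")
print(doctor.beliefs[("player", "guest")].reason)
```

Keeping the reason with the verdict is what would let a character explain, or a developer debug, why the doctor is being rude.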
Choosing how to Respond
When someone says something, the others (both player characters and NPCs) have a choice how to interpret it; replies to a particular action are not automatic.
In Versu, a response is just a normal type of action. A character uses the same utility-planner to choose a reaction as he does to choose any other type of action. For example, at dinner, Miss Bates makes a sycophantic remark about the wine. Brown can choose to interpret this remark as:
- betraying low breeding
- being polite
- or he can ignore the remark altogether
Each of these interpretations involves updating one of the role-evaluations described earlier:
- If he decides she was betraying her low breeding, then he will evaluate her harshly as a member of the gentry
- If he decides she was being polite, he will evaluate her positively as a polite guest
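One way to picture a response as an ordinary, utility-scored action is the sketch below. It is an assumption-laden toy, not the actual planner: the table `INTERPRETATIONS`, the numeric scores, and the preference weights for Brown are all made up for illustration.

```python
# Each interpretation names the role it re-evaluates and the change it applies.
# A role of None means the remark is ignored and nothing is updated.
INTERPRETATIONS = {
    "low_breeding": ("member of the gentry", -1),
    "polite": ("polite guest", +1),
    "ignore": (None, 0),
}


def choose_interpretation(preferences):
    """Pick the interpretation with the highest utility for this character."""
    return max(INTERPRETATIONS, key=lambda name: preferences.get(name, 0.0))


def apply_interpretation(name, evaluations, subject):
    """Update the listener's role evaluations according to the chosen reading."""
    role, delta = INTERPRETATIONS[name]
    if role is not None:
        key = (subject, role)
        evaluations[key] = evaluations.get(key, 0) + delta
    return evaluations


# Hypothetical weights: this Brown is inclined to be snobbish.
brown_preferences = {"low_breeding": 0.8, "polite": 0.3, "ignore": 0.1}
choice = choose_interpretation(brown_preferences)
brown_evaluations = apply_interpretation(choice, {}, "Miss Bates")
print(choice, brown_evaluations)
```

A player-controlled Brown would simply be offered the same three options instead of having `choose_interpretation` pick for him.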
Your Response is Itself a Public Act, Witness-able by Others, itself Susceptible to Further Responses
When a character decides on a particular way of interpreting another’s actions, this decision is itself publicly witness-able, and susceptible to further evaluation. If, for example, you (playing Brown) decide that Miss Bates’ sycophantic remark betrays her low breeding, then others might interpret your displeasure as:
- properly critical (displaying good judgement)
Because evaluations are themselves subject to further evaluations, a trivial initial remark can snowball into something dramatic, with consequences piling on consequences.
Respecting and Violating Norms
Every social practice comes with a set of norms: things the participants are expected to do.
For example: at dinner, you are expected to finish your food. During a conversation, you are expected to respond when asked a direct question.
Now we want the NPCs to (usually) respect these norms, but this has to be handled delicately.
We don’t want the norm-violating actions to simply be unavailable. If they were, the player would never be able to break out of the social straitjacket. Some players like to experiment with the boundaries of the system, and if a player wants to interrupt a conversation, he should be free to do so. The NPCs should notice, and disapprove accordingly.
The requirements for norm-following, then, are these:
- an NPC should respect the norms
- but any character can violate the norms at any moment – it’s just that NPCs should not typically want to do so
- if a character does violate a norm, it should be noticed. The others should understand that a norm violation occurred, and respond accordingly
These requirements hold in particular for the social practice of conversation.
There are a number of subtle norms involved in conversational turn-taking which we all know implicitly – even if we might find it hard to spell them out in detail.
NPCs tend to respect these norms of conversational turn-taking, because the simulator assigns a lot of weight to norm-compliance when determining what an AI agent wishes to do. (It is possible for an author to remove these norms for individual NPCs, creating someone who talks apparently at random or interrupts, but the effect is typically confusing rather than reading as plausible characterization.)
Meanwhile, the player must be free to violate the norms if he so wishes – but if he does so, it should be noticed.
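The three requirements can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the constant `NORM_WEIGHT`, the functions `score`, `npc_choose`, and `perform`, and the numbers used are hypothetical, not taken from the simulator.

```python
NORM_WEIGHT = 10.0  # assumed: large compared with ordinary desires


def score(action, desires, violated_norms, norm_weight=NORM_WEIGHT):
    """Utility of an action: how much it is wanted, minus a norm penalty."""
    penalty = norm_weight * len(violated_norms.get(action, []))
    return desires.get(action, 0.0) - penalty


def npc_choose(actions, desires, violated_norms, norm_weight=NORM_WEIGHT):
    """Every action stays available; violations are just heavily penalized."""
    return max(actions, key=lambda a: score(a, desires, violated_norms, norm_weight))


def perform(actor, action, violated_norms, witnesses):
    """Anyone may perform any action; each witness records any violations."""
    noticed = {}
    for norm in violated_norms.get(action, []):
        for witness in witnesses:
            noticed.setdefault(witness, []).append((actor, norm))
    return noticed


actions = ["answer the question", "change the subject"]
desires = {"answer the question": 1.0, "change the subject": 3.0}
violated_norms = {"change the subject": ["respond to a direct question"]}

npc_pick = npc_choose(actions, desires, violated_norms)
rude_npc_pick = npc_choose(actions, desires, violated_norms, norm_weight=0.0)
noticed = perform("player", "change the subject", violated_norms, ["doctor", "butler"])
print(npc_pick, rude_npc_pick, noticed)
```

Setting `norm_weight` to zero stands in for an author removing the norms for one NPC; the player path never consults the weight at all, only the witnesses' records.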
Modeling Conversational Context
The norms of conversational turn-taking are spelled out in remarkable detail in Harvey Sacks’ lectures on conversation, and in the seminal paper “A Simplest Systematics for the Organization of Turn-Taking for Conversation”.
We implemented a simplified version of this model in Versu. A conversation has various fields:
- a selected speaker: the person who should speak next
- a set of selected topics: the next person to talk should speak about one of these topics
- a selected speech-act: the next person to speak should use this particular type of speech-act
When a specific character is asked a direct question, or addressed in some other direct way, she is made the selected speaker. She is expected to speak next. Other people can speak out of turn, but it will be noted as an example of interruption, with various consequences. If the selected speaker fails to speak when expected to do so, this also constitutes a (minor) norm violation. People will notice.
Some speech-acts do not set a selected speaker at all. If one character makes a general remark about the weather, for example, anybody can respond. There is no expectation that a particular person will speak next. In these cases, the selected speaker field is cleared. But there are still other norms in play: people should not talk about any old thing – they should continue talking about the weather. The conversation has a set of selected topics describing what may be talked about next. The player is free to talk about something else, but it will be noted as a (minor) norm violation.
In an earlier prototype, each conversation had a unique selected topic. This meant that the player’s choice in conversations was highly restricted. Conversations tended to exhaust one topic before moving mechanically to another. An important modification that Emily made was to expand the conversation context to include a set of salient topics, rather than an individual one.
Each quip can be “about” multiple topics at once, and a conversation stores all of them. This encourages flow and fluidity between topics, while maintaining conversational salience.
Another feature of conversation is that a significant pause in the exchange “wipes out” any existing conversation information: after several turns, that information is cleared, allowing characters to introduce new dialogue.
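To make the conversational context concrete, here is a small Python sketch of the three fields plus the pause rule. It is a simplification written for this article, not the engine's code: the `Conversation` class, its method names, the violation labels, and the `pause_limit` of three turns are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Conversation:
    selected_speaker: Optional[str] = None
    selected_topics: set = field(default_factory=set)
    selected_speech_act: Optional[str] = None
    quiet_turns: int = 0
    pause_limit: int = 3  # assumed number of silent turns that counts as a pause

    def speak(self, speaker, act, topics, next_speaker=None, expects=None):
        """Record an utterance and return the norm violations it commits."""
        violations = []
        if self.selected_speaker and speaker != self.selected_speaker:
            violations.append("interruption")
        if self.selected_topics and not (set(topics) & self.selected_topics):
            violations.append("off-topic")
        if self.selected_speech_act and act != self.selected_speech_act:
            violations.append("unexpected speech-act")
        # The utterance sets the context for whoever talks next.
        self.selected_speaker = next_speaker  # None: anybody may respond
        self.selected_topics = set(topics)    # every topic the quip is about
        self.selected_speech_act = expects
        self.quiet_turns = 0
        return violations

    def pass_turn(self):
        """A turn in which nobody speaks."""
        violations = ["failed to respond"] if self.selected_speaker else []
        self.quiet_turns += 1
        if self.quiet_turns >= self.pause_limit:
            # A significant pause wipes the conversational context.
            self.selected_speaker = None
            self.selected_topics = set()
            self.selected_speech_act = None
        return violations


conv = Conversation()
v1 = conv.speak("Emma", "remark", {"weather"})
v2 = conv.speak("Brown", "question", {"weather", "travel"},
                next_speaker="Miss Bates", expects="answer")
v3 = conv.speak("Emma", "remark", {"wine"})
print(v1, v2, v3)
```

Because `selected_topics` is a set, Brown's question about weather and travel counts as on-topic after a remark about the weather, which is the fluidity the multi-topic change was meant to buy.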
(returning to Emily)
- pieces of dialogue are represented as quips
- a quip may be unique to a single speaker, or may be speakable by any character
- a quip can be said to “directly follow” or “indirectly follow” another quip, which allows some control of dialogue flow where one utterance only makes sense in context of another
- a quip may convey factual information, and be speakable only if the character speaking it believes that factual information to be true
- a quip may be said to be about one or several conversation topics; characters will prefer to stay on topic if possible, unless enough narrative time has passed that that topic can be considered dropped
- a quip may be tagged as “introductory”, which means that it is able to introduce a new conversation topic; other quips are appropriate only if that topic has already been raised
And also nuclear options that let you do pretty much anything, at the cost of making you type more:
- quips may have arbitrary prerequisites for being spoken
- quips may trigger arbitrary functions to follow up
Those arbitrary prerequisites make it possible to attach more character- or genre-specific details: for instance, an employee who doesn’t like to talk about certain topics in front of his boss, or a confidential conversation topic which can only be spoken if everyone else in the room is friends with the speaker.
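The quip features listed above could be sketched roughly as follows. This is a hedged Python illustration, not Versu's authoring format: the `Quip` fields, the `speakable` check, the `scene` dictionary layout, and the `boss_absent` prerequisite are all hypothetical, and "indirectly follows" is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Quip:
    text: str
    speaker: Optional[str] = None           # None: speakable by any character
    topics: frozenset = frozenset()         # a quip may be about several topics
    introductory: bool = False              # may raise a topic not yet salient
    directly_follows: Optional[str] = None  # text of the quip it must follow
    fact: Optional[str] = None              # speaker must believe this to say it
    prerequisites: list = field(default_factory=list)  # arbitrary extra checks


def speakable(quip, speaker, scene):
    """Return True if this speaker may say this quip in the current scene."""
    if quip.speaker is not None and quip.speaker != speaker:
        return False
    if quip.fact is not None and quip.fact not in scene["beliefs"].get(speaker, set()):
        return False
    if quip.directly_follows is not None and scene["last_quip"] != quip.directly_follows:
        return False
    if quip.topics and not quip.introductory:
        if not (quip.topics & scene["salient_topics"]):
            return False
    return all(check(speaker, scene) for check in quip.prerequisites)


def boss_absent(speaker, scene):
    """Example prerequisite: do not say this in front of your boss."""
    return scene["boss_of"].get(speaker) not in scene["present"]


complaint = Quip(text="The overtime policy is absurd.",
                 topics=frozenset({"work"}), introductory=True,
                 prerequisites=[boss_absent])
scene = {"beliefs": {}, "salient_topics": set(), "last_quip": None,
         "present": {"employee", "boss"}, "boss_of": {"employee": "boss"}}

with_boss = speakable(complaint, "employee", scene)
scene["present"].remove("boss")
without_boss = speakable(complaint, "employee", scene)
print(with_boss, without_boss)
```

Treating prerequisites as plain callables is what makes them a "nuclear option": anything the scene state exposes can gate a line, at the cost of writing the check by hand.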
In addition, because Versu’s model concerns itself with character moods (happy? sad?) and the types of evaluations described above (Ryan is not very bright!), mood and evaluation information can also be conveyed in a quip.
This means, for instance, that we might have a quip where the dialogue is
“So Ryan failed another math exam. Guess we can stop saving for college. Just what I always wanted, a kid living in the basement working at Burger King when he’s 40.”
This quip would be tagged to show that it encodes factual data (Ryan failed his exam), evaluative data (Ryan is not a good student; possibly also Ryan is not a good son), and emotive data (I am annoyed).
All of these types of output then offer reaction options to other characters. This same piece of dialogue might cause:
- Ryan to get angry because he’s heard himself being evaluated negatively. Maybe he makes a rude gesture or storms out of the room; maybe he starts a fight with his father and causes permanent harm to their relationship state.
- Ryan to wonder whether his father might be right about his study skills, and change his self-image to be more self-critical.
- Ryan’s girlfriend to get angry on Ryan’s behalf, because she doesn’t like to hear someone say negative things about her romantic partner.
- Ryan’s brother, who is on bad terms with Ryan, to smirk rudely, because he enjoys hearing negative evaluations of people he dislikes.
- Ryan’s brother to speak a custom line of follow-up dialogue: “Don’t get your hopes up. I hear you have to be able to make change at Burger King.”
- Ryan’s teacher to offer a factual contradiction (“Actually, Ryan didn’t fail.”).
- The speaker’s wife to comfort the speaker (either with dialogue or a physical gesture like touching his shoulder) because he’s demonstrating a negative emotion.
- The speaker’s sister, who has poor self-esteem about her own intelligence, to remark that not everyone can be great at these challenges.
- The speaker’s brother to say that there are traits that matter more than intelligence, because expressing a strong evaluation about a role implicitly asserts that that role is an important one (and then other characters can dispute that implication).
- Ryan’s fellow student, who has previously evaluated Ryan positively for intelligence, to reply with a positive evaluation he’s remembered from that occasion (“Ryan understood the dialogue in that Mandarin movie better than anyone else in Chinese class.”).
- Another kid’s parent to look uncomfortable at hearing things that shouldn’t be aired in front of strangers.
…and so on. There are even a few other possibilities that this example doesn’t cover: for instance, dialogue that is meant to be a joke can be tagged as that sort of speech act, inviting listeners to laugh and also to evaluate the speaker as funny, for future reference. And characters can respond to a factual statement not only by agreeing or disagreeing, but in some cases by drawing additional inferences, if one belief has been specified to “imply” another belief.
A listener’s choice of response will depend on that character’s preferences and traits, relationships, and current moods, and the scene as a whole will play out very differently depending on who is in the room at the time, and how people react to one another’s reactions. In addition, other characters may accept the speaker’s factual or evaluative statements and incorporate them into their own world view, so that from then on other characters may also view Ryan as an inferior student who failed his exam.
This complexity comes from a juxtaposition of fine-grained specific features (such as the pre-written follow-up quip) with several general systems. There’s a general system for allowing people to react to things said about themselves, another general system for allowing people to respond to others’ expressions of emotion, and so forth. Drawing from all of these systems means that any given quip of conversation can have a unique combination of effects that would not have been reproduced by saying something else.
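The idea of several general systems each contributing reaction options to the same quip can be sketched like this. The Python below is a toy under stated assumptions: the `Payload` class, the three `react_to_*` functions, the `world` dictionary, and the relationship labels are invented for illustration and cover only a few of the reactions described above.

```python
from dataclasses import dataclass, field


@dataclass
class Payload:
    """What a quip conveys beyond its text."""
    facts: list = field(default_factory=list)        # claims asserted
    evaluations: list = field(default_factory=list)  # (subject, role, verdict)
    emotions: list = field(default_factory=list)     # (who, mood)


def react_to_evaluations(payload, listener, world):
    options = []
    for subject, _role, verdict in payload.evaluations:
        if verdict != "bad":
            continue
        feeling = world["relations"].get((listener, subject))
        if subject == listener:
            options += ["get angry", "become self-critical"]
        elif feeling == "likes":
            options.append(f"defend {subject}")
        elif feeling == "dislikes":
            options.append("smirk")
    return options


def react_to_facts(payload, listener, world):
    disputed = world["disputed"].get(listener, set())
    return [f"contradict: {claim}" for claim in payload.facts if claim in disputed]


def react_to_emotions(payload, listener, world):
    return [f"comfort {who}" for who, mood in payload.emotions
            if mood == "annoyed" and world["relations"].get((listener, who)) == "likes"]


GENERAL_SYSTEMS = [react_to_evaluations, react_to_facts, react_to_emotions]


def reaction_options(payload, listener, world):
    """Collect options from every general system; a planner would pick one."""
    return [opt for system in GENERAL_SYSTEMS for opt in system(payload, listener, world)]


quip = Payload(facts=["Ryan failed his exam"],
               evaluations=[("Ryan", "student", "bad")],
               emotions=[("father", "annoyed")])
world = {
    "relations": {("girlfriend", "Ryan"): "likes", ("brother", "Ryan"): "dislikes",
                  ("wife", "father"): "likes"},
    "disputed": {"teacher": {"Ryan failed his exam"}},
}
for listener in ["Ryan", "girlfriend", "brother", "teacher", "wife"]:
    print(listener, reaction_options(quip, listener, world))
```

Hand-written follow-up quips would sit alongside these generated options, and each listener's traits and moods would then weight the final choice.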
Authors can give individual characters unique response styles and strategies: for instance, in the existing stories, there is a character who prefers to make the most negative evaluations possible about his own son, so he will select his dialogue and responses accordingly. Characters can also come into a story with pre-existing relationships and judgments of other characters, self-evaluations, and beliefs about the importance of particular traits (“intelligence is not so important”) that will shade their responses. When those beliefs change, the behavior can change as well.