Game AI


El-Nasr04: Magy Seif El-Nasr and Ian Horswill. Automating Lighting Design for Interactive Entertainment. ACM Computers in Entertainment, Vol. 2, Issue 2, April/June 2004.
Recent advances in computer graphics, particularly in real-time rendering, have caused major improvements in 3D graphics and rendering techniques used in interactive entertainment. In this paper, we focus on the scene lighting process, which we define as the process of configuring the number of lights used in a scene, their properties (e.g. range and attenuation), positions, angles, and colors. Lighting design is well known among designers, directors, and visual artists for its vital role in influencing viewers’ perception by evoking moods, directing their gaze to important areas (i.e. providing visual focus), and conveying visual tension. It is, however, difficult to set positions, angles, or colors for lights within interactive scenes to accommodate these design goals, because an interactive scene’s spatial and dramatic configuration, including mood, dramatic intensity, and the relative importance of different characters, changes unpredictably in real-time. There are several techniques developed by the game industry that establish spectacular real-time lighting effects within 3-D interactive environments. These techniques are often time and labor intensive. In addition, they are not easily used to dynamically mold the visual design to convey communicative, dramatic, and aesthetic functions as addressed in creative disciplines, such as art, film, and theatre. In this paper, we present a new real-time lighting design model based on cinematic and theatric lighting design theory. The proposed model is designed to automatically, and in real-time, adjust lighting in an interactive scene accommodating the dramatic, aesthetic, and communicative functions described by traditional lighting design theories while accommodating artistic constraints concerning style, visual continuity, and aesthetic function.
Wiggins98: Papadopoulos, G. and Wiggins, G. (1998). A Genetic Algorithm for the Generation of Jazz Melodies. Proceedings of STeP'98.
Christianson96: Christianson, D., Anderson, S., He, L., Salesin, D., Weld, D., and Cohen M. (1996). Declarative Camera Control for Automatic Cinematography. Proceedings of AAAI-96.
Animations generated by interactive 3D computer graphics applications are typically portrayed either from a particular character's point of view or from a small set of strategically-placed viewpoints. By ignoring camera placement, such applications fail to realize important storytelling capabilities that have been explored by cinematographers for many years. In this paper, we describe several of the principles of cinematography and show how they can be formalized into a declarative language, called the Declarative Camera Control Language (DCCL). We describe the application of DCCL within the context of a simple interactive video game and argue that DCCL represents cinematic knowledge at the same level of abstraction as expert directors by encoding 16 idioms from a film textbook. These idioms produce compelling animations, as demonstrated on the accompanying videotape.
He96: He, L., Cohen, M.F., and Salesin, D.H. (1996). The Virtual Cinematographer: A Paradigm for Automatic Real-Time Camera Control and Directing. Proceedings of SIGGRAPH '96, in Computer Graphics Proceedings, Annual Conference Series.
This paper presents a paradigm for automatically generating complete camera specifications for capturing events in virtual 3D environments in real-time. We describe a fully implemented system, called the Virtual Cinematographer, and demonstrate its application in a virtual "party" setting. Cinematographic expertise, in the form of film idioms, is encoded as a set of small hierarchically organized finite state machines. Each idiom is responsible for capturing a particular type of scene, such as three virtual actors conversing or one actor moving across the environment. The idiom selects shot types and the timing of transitions between shots to best communicate events as they unfold. A set of camera modules, shared by the idioms, is responsible for the low-level geometric placement of specific cameras for each shot. The camera modules are also responsible for making subtle changes in the virtual actors' positions to best frame each shot. In this paper, we discuss some basic heuristics of filmmaking and show how these ideas are encoded in the Virtual Cinematographer.
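As a rough illustration of the idea of encoding a film idiom as a finite state machine, the following Python sketch models a two-person conversation idiom. It is not the Virtual Cinematographer's implementation: the class name, state names, shot names, and timing thresholds are all assumptions made for this example, and the machine is flat rather than hierarchical.

```python
# Hypothetical sketch of a film idiom as a small finite state machine.
# State names, shot names, and timing thresholds are illustrative
# assumptions, not taken from the Virtual Cinematographer itself.

class TwoTalkIdiom:
    """Chooses a shot for a two-actor conversation, one tick at a time."""

    MIN_SHOT_TICKS = 3    # assumed: hold each shot at least this long
    MAX_SHOT_TICKS = 10   # assumed: cut away from an overlong close shot

    def __init__(self):
        self.state = "establishing"
        self.ticks_in_state = 0

    def step(self, speaker):
        """Advance one tick. `speaker` is "A", "B", or None (silence)."""
        self.ticks_in_state += 1
        next_state = self.state
        if self.ticks_in_state >= self.MIN_SHOT_TICKS:
            if speaker == "A" and self.state != "over_shoulder_A":
                next_state = "over_shoulder_A"
            elif speaker == "B" and self.state != "over_shoulder_B":
                next_state = "over_shoulder_B"
            elif (self.ticks_in_state >= self.MAX_SHOT_TICKS
                  and self.state != "establishing"):
                next_state = "establishing"
        if next_state != self.state:
            # A state change corresponds to a cut to a new shot.
            self.state = next_state
            self.ticks_in_state = 0
        return self.state


def run_idiom(speakers):
    """Feed a sequence of speakers to a fresh idiom and collect the shots."""
    idiom = TwoTalkIdiom()
    return [idiom.step(s) for s in speakers]
```

Calling `run_idiom` with a list of per-tick speakers returns the shot chosen at each tick. In a fuller design, each state would hand its shot name to a separate camera module responsible for the geometric placement, which is the division of labour the abstract describes.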
Thomlinson00: Tomlinson, B., Blumberg, B., and Nain, D. (2000). Expressive Autonomous Cinematography for Interactive Virtual Environments. Proceedings of the Fourth International Conference on Autonomous Agents.
We have created an automatic cinematography system for interactive virtual environments. This system controls a virtual camera and lights in a three-dimensional virtual world inhabited by a group of autonomous and user-controlled characters. By dynamically changing the camera and the lights, our system facilitates the interaction of human participants with this world and displays the emotional content of the digital scene. Building on the tradition of cinema, modern video games, and autonomous behavior systems, we have constructed this cinematography system with an ethologically-inspired structure of sensors, emotions, motivations, and action-selection mechanisms. Our system breaks shots into elements, such as which actors the camera should focus on or the angle it should use to watch them. Hierarchically arranged cross-exclusion groups mediate between the various options, arriving at the best shot at each moment in time. Our cinematography system uses the same approach that we use for our virtual actors. This eases the cross-over of information between them, and ultimately leads to a richer and more unified installation. As digital visualizations grow more complex, cinematography must keep pace with the new breeds of characters and scenarios. A behavior-based autonomous cinematography system is an effective tool in the creation of interesting virtual worlds. Our work takes first steps toward a future of interactive, emotional cinematography.
Bares99: Bares, W.H. and Lester, J.C. (1999). Intelligent Multi-shot Visualization Interfaces for Dynamic 3D Worlds. Proceedings of the 1999 International Conference on Intelligent User Interfaces.
In next-generation virtual 3D simulation, training, and entertainment environments, intelligent visualization interfaces must respond to user-specified viewing requests so users can follow salient points of the action and monitor the relative locations of objects. Users should be able to indicate which object(s) to view, how each should be viewed, cinematic style and pace, and how to respond when a single satisfactory view is not possible. When constraints fail, weak constraints can be relaxed or multi-shot solutions can be displayed in sequence or as composite shots with simultaneous viewports. To address these issues, we have developed CONSTRAINTCAM, a real-time camera visualization interface for dynamic 3D worlds. It has been studied in an interactive testbed in which users can issue viewing goals to monitor multiple autonomous characters navigating through a virtual cityscape. CONSTRAINTCAM's real-time performance in this testbed is encouraging.
Jhala06: Jhala, A.H. and Young, R.M. (2006). Representational Requirements for a Plan Based Approach to Automated Camera Control. Proceedings of the 2006 Conference on Artificial Intelligence for Interactive Digital Entertainment (AIIDE).
Automated camera control has been an active area of research for a number of years. The problem has been addressed in the Graphics, AI and Game communities from different perspectives. The main focus of the research in the Graphics community has been frame composition and coherence. The AI community has focused on intelligent shot selection, and the Games community strives for realtime cinematic camera control. While the proposed solutions in each of these fields are promising, there has not been much effort spent on listing out the requirements of an intelligent camera control system and how these can be satisfied through a combination of approaches taken from these different fields. This paper attempts to list out the representational requirements with a view of finding a unifying representation for combining these disparate approaches. We show how a plan based approach can capture some of these requirements and it can be connected to a geometric constraint solver for camera placement.
Elson07: Elson, D.K. and Riedl, M.O. (2007). A Lightweight Intelligent Virtual Cinematography System for Machinima Production. Proceedings of the 3rd Annual Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE).
Machinima is a low-cost alternative to full production filmmaking. However, creating quality cinematic visualizations with existing machinima techniques still requires a high degree of talent and effort. We introduce a lightweight artificial intelligence system, Cambot, that can be used to assist in machinima production. Cambot takes a script as input and produces a cinematic visualization. Unlike other virtual cinematography systems, Cambot favors an offline algorithm coupled with an extensible library of specific modular and reusable facets of cinematic knowledge. One of the advantages of this approach to virtual cinematography is a tight coordination between the positions and movements of the camera and the actors.
Doyle97: Doyle, P. and Hayes-Roth, B. (1997). Agents in Annotated Worlds (Technical Report KSL 97-09). Stanford University, Knowledge Systems Laboratory.
Virtual worlds offer great potential as environments for education, entertainment, and collaborative work. Agents that function effectively in heterogeneous virtual spaces must have the ability to acquire new behaviors and useful semantic information from those contexts. The human-computer interaction literature discusses how to construct spaces and objects that provide "knowledge in the world" that aids human beings to perform these tasks. In this paper, we describe how to build comparable annotated environments containing explanations of the purpose and uses of spaces and activities that allow agents quickly to become intelligent actors in those spaces. Examples are provided from our application domain, believable agents acting as inhabitants and guides in a children's exploratory world.
Perlin96: Perlin, K. and Goldberg, A. (1996). Improv: A System for Scripting Interactive Actors in Virtual Worlds. Proceedings of the 23rd Annual Conference on Computer Graphics.
Improv is a system for the creation of real-time behavior-based animated actors. There have been several recent efforts to build network distributed autonomous agents. But in general these efforts do not focus on the author's view. To create rich interactive worlds inhabited by believable animated actors, authors need the proper tools. Improv provides tools to create actors that respond to users and to each other in real-time, with personalities and moods consistent with the author's goals and intentions. Improv consists of two subsystems. The first subsystem is an Animation Engine that uses procedural techniques to enable authors to create layered, continuous, non-repetitive motions and smooth transitions between them. The second subsystem is a Behavior Engine that enables authors to create sophisticated rules governing how actors communicate, change, and make decisions. The combined system provides an integrated set of tools for authoring the "minds" and "bodies" of interactive actors. The system uses an English-style scripting language so that creative experts who are not primarily programmers can create powerful interactive applications.
Blumberg95: Blumberg, B. and Galyean, T.A. (1995). Multi-Level Direction of Autonomous Creatures for Real-Time Virtual Environments. Proceedings of SIGGRAPH'95.
There have been several recent efforts to build behavior-based autonomous creatures. While competent autonomous action is highly desirable, there is an important need to integrate autonomy with “directability”. In this paper we discuss the problem of building autonomous animated creatures for interactive virtual environments which are also capable of being directed at multiple levels. We present an approach to control which allows an external entity to “direct” an autonomous creature at the motivational level, the task level, and the direct motor level. We also detail a layered architecture and a general behavioral model for perception and action-selection which incorporates explicit support for multi-level direction. These ideas have been implemented and used to develop several autonomous animated creatures.
Mateas97: Mateas, M. (1997). An Oz-centric review of interactive drama and believable agents. Technical Report CMU-CS-97-156, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA.
Believable agents are autonomous agents that exhibit rich personalities. Interactive dramas take place in virtual worlds inhabited by characters (believable agents) with whom an audience interacts. In the course of this interaction, the audience experiences a story (lives a plot arc). This report presents the research philosophy behind the Oz Project, a research group at CMU that has spent the last ten years studying believable agents and interactive drama. The report then surveys current work from an Oz perspective.
Nelson05: Nelson, M. and Mateas, M. (2005). Search-based drama management in the interactive fiction Anchorhead. Proceedings of the First Conference on Artificial Intelligence and Interactive Digital Entertainment.
Drama managers guide a user through a story experience by modifying the experience in reaction to the user's actions. Search-based drama management (SBDM) casts the drama-management problem as a search problem: Given a set of plot points, a set of actions the drama manager can take, and an evaluation of story quality, search can be used to optimize the user's experience. SBDM was first investigated by Peter Weyhrauch in 1997, but little explored since. We return to SBDM to investigate algorithmic and authorship issues, including the extension of SBDM to different kinds of stories, especially stories with subplots and multiple endings, and issues of scalability. In this paper we report on experiments applying SBDM to an abstract story search space based on the text-based interactive fiction Anchorhead. We describe the features employed by the story evaluation function, investigate design issues in the selection of a set of drama management actions, and report results for drama managed versus unmanaged stories for a simulated random user.
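The search formulation in this abstract (plot points, drama-manager actions, an evaluation function, and a simulated random user) can be sketched in a few lines of Python. This is a toy under stated assumptions, not the authors' system: the three plot points, the single "deny" action, the evaluation function, and all function names are invented for illustration, and the search is an exhaustive expectimax that would not scale to a story the size of Anchorhead.

```python
# Toy search-based drama management. Everything named here (plot points,
# the "deny" action, the evaluation function) is a hypothetical example.

def clue_before_reveal(story):
    """Assumed evaluation: a story is good if the clue precedes the reveal."""
    return 1.0 if story.index("clue") < story.index("reveal") else 0.0


def unmanaged_quality(history, remaining, evaluate):
    """Expected quality when a random user picks plot points uniformly."""
    if not remaining:
        return evaluate(history)
    total = sum(
        unmanaged_quality(history + (p,), remaining - {p}, evaluate)
        for p in sorted(remaining)
    )
    return total / len(remaining)


def best_action(history, remaining, evaluate):
    """Return (expected quality, DM action) with the DM acting optimally.

    Before each user move the DM may do nothing (None) or deny one plot
    point for that move; the user then picks uniformly from what is left.
    """
    if not remaining:
        return evaluate(history), None
    candidates = [None]
    if len(remaining) > 1:  # never deny the user's only option
        candidates += [("deny", p) for p in sorted(remaining)]
    best = None
    for action in candidates:
        allowed = sorted(remaining - {action[1]}) if action else sorted(remaining)
        value = sum(
            best_action(history + (p,), remaining - {p}, evaluate)[0]
            for p in allowed
        ) / len(allowed)
        if best is None or value > best[0]:
            best = (value, action)
    return best


PLOT_POINTS = frozenset({"clue", "reveal", "escape"})
```

Comparing `unmanaged_quality((), PLOT_POINTS, clue_before_reveal)` with `best_action((), PLOT_POINTS, clue_before_reveal)` gives a managed-versus-unmanaged comparison of the kind the abstract reports, here on a three-point story where the best first move is to withhold the reveal.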
Cavazza02: Cavazza, M., Charles, F., and Mead, S. (2002). Planning Characters' Behaviour in Interactive Storytelling. Journal of Visualization and Computer Animation, 13, 121-131.
In this paper, we describe a method for implementing the behaviour of artificial actors in the context of interactive storytelling. We have developed a fully implemented prototype based on the Unreal Tournament game engine, and carried out experiments with a simple sitcom-like scenario. We discuss the central role of artificial actors in interactive storytelling and how real-time generation of their behaviour participates in the creation of a dynamic storyline. We follow previous work describing the behaviour of artificial actors through AI planning formalisms, and adapt it to the context of narrative representation. In this context, the narrative equivalent of a character's behaviour consists in its role. The set of possible roles for a given actor is represented as a Hierarchical Task Network (HTN). The system uses HTN planning to dynamically generate the character roles, by interleaving planning and execution, which supports dynamic interaction between actors, as well as user intervention in the unfolding plot. Finally, we present several examples of short plots and situations generated by the system from the dynamic interaction of artificial actors.
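To make the HTN idea concrete, here is a minimal Python sketch of depth-first task decomposition for a character role. The domain (task names, preconditions, and effects) is entirely hypothetical and is not taken from the paper's sitcom scenario. The sketch also plans the whole role up front, whereas the system described above interleaves planning with execution.

```python
# Minimal HTN-style decomposition. The tasks, preconditions, and effects
# below are a made-up domain for illustration only.

# Primitive actions: name -> (preconditions, facts added to the state)
PRIMITIVES = {
    "phone_friend": (set(), {"knows_likes"}),
    "read_diary": ({"alone"}, {"knows_likes"}),
    "buy_gift": ({"knows_likes"}, {"has_gift"}),
    "ask_out": ({"has_gift"}, {"asked_out"}),
}

# Compound tasks: name -> alternative methods, tried in order
METHODS = {
    "win_affection": [["learn_likes", "buy_gift", "ask_out"]],
    "learn_likes": [["read_diary"], ["phone_friend"]],
}


def plan(tasks, state):
    """Return a list of primitive actions achieving `tasks`, or None.

    `tasks` is a list of task names; `state` is a frozenset of facts.
    Compound tasks are expanded depth-first with backtracking over
    their alternative methods.
    """
    if not tasks:
        return []
    head, rest = tasks[0], tasks[1:]
    if head in PRIMITIVES:
        preconditions, added = PRIMITIVES[head]
        if not preconditions <= state:
            return None  # action not applicable in this state
        tail = plan(rest, state | added)
        return None if tail is None else [head] + tail
    for method in METHODS.get(head, []):
        result = plan(method + rest, state)
        if result is not None:
            return result
    return None
```

With an empty initial state the character cannot read the diary, so the planner backtracks to the second method for `learn_likes`; when the state contains `"alone"` the first method succeeds. Re-running `plan` from the current state after each executed action is one simple way to approximate the interleaving of planning and execution that lets other actors or the user change the outcome.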
Mateas02: Mateas, M. and Stern, A. (2002). Architecture, authorial idioms and early observations of the interactive drama Facade. Technical report CMU-CS-02-198, School of Computer Science, Carnegie Mellon University.
Facade is an artificial intelligence-based art/research experiment in electronic narrative - an attempt to move beyond traditional branching or hyper-linked narrative to create a fully-realized, one-act interactive drama. Integrating an interdisciplinary set of artistic practices and artificial intelligence technologies, we are completing a three year collaboration to engineer a novel architecture for supporting emotional, interactive character behavior and drama-managed plot. Within this architecture we are building a dramatically interesting, real-time 3D virtual world inhabited by computer-controlled characters, in which the user experiences a story from a first-person perspective. Facade will be publicly released as a free download in 2003.
Magerko07: Magerko, B. (2007). Evaluating Preemptive Story Direction in the Interactive Drama Architecture. Journal of Game Development, 2(3).
In an author-centric interactive drama, both the player's decisions and the author's desires should coherently influence the player's individual story experience. Different player interactions with the system should yield different stories, just as different authored content would. By defining experiences covered by the authored content (the set of which I call a story space), the author is creating an artistic vision for the player to take part in. As opposed to having explicit choices for the player to choose from and constraining those choices, interactive drama attempts to offer the player a fluid, continuous dramatic experience, akin to taking part in an improvisational play where the player is the protagonist in the story (Kelso et al. 1993; Laurel 1986).
Riedl03: Riedl, M.O., Saretto, C.J., and Young, R.M. (2003). Managing interaction between users and agents in a multi-agent storytelling environment. In Proceedings of the Second International Conference on Autonomous Agents and Multiagent Systems.
This paper describes an approach for managing the interaction of human users with computer-controlled agents in an interactive narrative-oriented virtual environment. In these kinds of systems, the freedom of the user to perform whatever action she desires must be balanced with the preservation of the storyline used to control the system's characters. We describe a technique, narrative mediation, that exploits a plan-based model of narrative structure to manage and respond to users' actions inside a virtual world. We define two general classes of response to situations where users execute actions that interfere with story structure: accommodation and intervention. Finally, we specify an architecture that uses these definitions to monitor and automatically characterize user actions, and to compute and implement responses to unanticipated activity. The approach effectively integrates user action and system response into the unfolding narrative, providing for the balance between a user's sense of control within the story world and the user's sense of coherence of the overall narrative.
Meehan81: Meehan, J. (1981). Tale-Spin. In R.C. Schank and C.K. Riesbeck (Eds). Inside Computer Understanding (pp. 197-226). Lawrence Erlbaum Associates.
TALE-SPIN is a program that writes simple stories. It is easily distinguished from any of the "mechanical" devices one can use for writing stories, such as filling in slots in a canned frame. The goal behind the writing of TALE-SPIN was to find out what kinds of knowledge were needed in story generation. The writing of TALE-SPIN embodied the traditional AI cycle of research. Step 1 was to define a theory. Step 2 was to write a program modeling that theory and to add it to the existing system. Step 3 was to run the system and to observe where the model was incorrect or inadequate, thereby identifying the need for some more theory.
Lebowitz84: Lebowitz, M. (1984). Creating characters in a story-telling universe. Poetics, 13, 171-194.
Extended story generation, i.e., the creation of continuing serials, presents difficult and interesting problems for Artificial Intelligence. We present here the first phase of the development of a program, UNIVERSE, that will ultimately tell extended stories. In particular, after describing our overall model of story telling, we present a method for creating universes of characters appropriate for extended story generation. This method concentrates on the need to keep story-telling universes consistent and coherent. We also describe the information that must be maintained for characters and interpersonal relationships, and the use of stereotypical information about people to help motivate trait values. The use of historical events for motivation is also described. Finally, we present an example of a character generated by UNIVERSE.
Lebowitz85: Lebowitz, M. (1985). Story-telling as planning and learning. Poetics, 14, 483-502.
The generation of extended plots for melodramatic fiction is an interesting task for Artificial Intelligence research, one that requires the application of generalization techniques to carry out fully. UNIVERSE is a story-telling program that uses plan-like units, 'plot fragments', to generate plot outlines. By using a rich library of plot fragments and a well-developed set of characters, UNIVERSE can create a wide range of plot outlines. In this paper we illustrate how UNIVERSE's plot fragment library is used to create plot outlines and how it might be automatically extended using explanation-based generalization methods. Our methods are based on analysis of a television melodrama, including comparisons of similar stories.
Perez01: Perez y Perez, R. and Sharples, M. (2001). MEXICA: a computer model of a cognitive account of creative writing. Journal of Experimental and Theoretical Artificial Intelligence, 13, 119-139.
MEXICA is a computer model that produces frameworks for short stories based on the engagement-reflection cognitive account of writing. During engagement MEXICA generates material guided by content and rhetorical constraints, avoiding the use of explicit goals or story-structure information. During reflection the system breaks impasses, evaluates the novelty and interestingness of the story in progress and verifies that coherence requirements are satisfied. In this way, MEXICA complements and extends those models of computerised story-telling based on traditional problem-solving techniques where explicit goals drive the generation of stories. This paper describes the engagement-reflection account of writing, the general characteristics of MEXICA and reports an evaluation of the program.
Riedl04: Mark O. Riedl and R. Michael Young (2004). An Intent-Driven Planner for Multi-Agent Story Generation. Proceedings of the 3rd International Joint Conference on Autonomous Agents and Multi Agent Systems.
The ability to generate narrative is of importance to computer systems that wish to use story effectively for entertainment, training, or education. We identify two properties of story – plot coherence and character believability – which play a role in the success of a story. Plot coherence is the perception by audience members that character actions have relevance to the outcome of the story. Character believability is the perception that character actions are motivated by agents' internal beliefs and desires. Unlike conventional planning in which plan goals represent an agent's intended world state, multiagent story planning involves goals that represent the outcome of a story. In order for the plans' actions to appear believable, multi-agent story planners must determine not only how agents' actions achieve a story's goal state, but must also ensure that each agent appears to be acting intentionally. We present a narrative generation planning system for multi-agent stories that is capable of generating narratives with both strong plot coherence and strong character believability. The planning algorithm uses causal reasoning and a simulated intention recognition process to drive plan creation.
Picard97: Picard, R. (1997). Affective Computing. Cambridge: MIT Press.
This part of the book addresses technical issues involved in creating affective computers, specifically, how to build systems with the ability to recognize, express, and "have" emotions. This chapter and the two that follow will propose several building blocks that can be used to start filling in the framework of affective computing. I will also show how several examples from the literature can be woven into this new framework.
Gratch04: Gratch, J. and Marsella, S. (2004). A Domain-independent Framework for Modeling Emotion. Journal of Cognitive Systems Research, 5(4), 269-306.
In this article, we show how psychological theories of emotion shed light on the interaction between emotion and cognition, and thus can inform the design of human-like autonomous agents that must convey these core aspects of human behavior. We lay out a general computational framework of appraisal and coping as a central organizing principle for such systems. We then discuss a detailed domain-independent model based on this framework, illustrating how it has been applied to the problem of generating behavior for a significant social training application. The model is useful not only for deriving emotional state, but also for informing a number of the behaviors that must be modeled by virtual humans such as facial expressions, dialogue management, planning, reacting, and social understanding. Thus, the work is of potential interest to models of strategic decision-making, action selection, facial animation, and social intelligence.
El-Nasr00: Seif El-Nasr, M., Yen, J., and Ioerger, T.R. (2000). FLAME -- Fuzzy Logic Adaptive Model of Emotions. Autonomous Agents and Multi-Agent Systems, 3.
Emotions are an important aspect of human intelligence and have been shown to play a significant role in the human decision-making process. Researchers in areas such as cognitive science, philosophy, and artificial intelligence have proposed a variety of models of emotions. Most of the previous models focus on an agent's reactive behavior, for which they often generate emotions according to static rules or pre-determined domain knowledge. However, throughout the history of research on emotions, memory and experience have been emphasized to have a major influence on the emotional process. In this paper, we propose a new computational model of emotions that can be incorporated into intelligent agents and other complex, interactive programs. The model uses a fuzzy-logic representation to map events and observations to emotional states. The model also includes several inductive learning algorithms for learning patterns of events, associations among objects, and expectations. We demonstrate empirically through a computer simulation of a pet that the adaptive components of the model are crucial to users' assessments of the believability of the agent's interactions.
Havasi07: Catherine Havasi, Rob Speer and Jason Alonso (2007). ConceptNet 3: a Flexible, Multilingual Semantic Network for Common Sense Knowledge. Proceedings of Recent Advances in Natural Language Processing.
The Open Mind Common Sense project has been collecting common-sense knowledge from volunteers on the Internet since 2000. This knowledge is represented in a machine-interpretable semantic network called ConceptNet. We present ConceptNet 3, which improves the acquisition of new knowledge in ConceptNet and facilitates turning edges of the network back into natural language. We show how its modular design helps it adapt to different data sets and languages. Finally, we evaluate the content of ConceptNet 3, showing that the information it contains is comparable with WordNet and the Brandeis Semantic Ontology.
Mueller07: Mueller, E.T. (2007). Modelling Space and Time in Narratives about Restaurants. Literary and Linguistic Computing, Vol. 22, No. 1.
This study investigated the automatic modelling of space and time in narratives involving dining in a restaurant. We built a program that (1) uses information extraction techniques to convert narrative texts into templates containing key information about the dining episodes discussed in the narratives, (2) constructs commonsense reasoning problems from the templates, (3) uses commonsense reasoning and a commonsense knowledge base to build models of the dining episodes, and (4) generates and answers questions by consulting the models. We describe the program and present the results of running it on a corpus of web texts and American literature.
Marsella05: Marsella, Stacy C. and Pynadath, David V. Modeling influence and theory of mind. Artificial Intelligence and the Simulation of Behavior, 2005.
Agent-based modeling of human social behavior is an increasingly important research area. For example, such modeling is critical in the design of virtual humans, human-like autonomous agents that interact with people in virtual worlds. A key factor in human social interaction is our beliefs about others, in particular a theory of mind. Whether we believe a message depends not only on its content but also on our model of the communicator. The actions we take are influenced by how we believe others will react. In this paper, we present PsychSim, an implemented multiagent-based simulation tool for modeling interactions and influence among groups or individuals. Each agent has its own decision-theoretic model of the world, including beliefs about its environment and recursive models of other agents. Having thus given the agents a theory of mind, PsychSim also provides them with a psychologically motivated mechanism for updating their beliefs in response to actions and messages of others. We discuss PsychSim's architecture and its application to a school violence scenario.
Isbell06: Isbell, C., Kearns, M., Singh, S., Shelton, C., Stone, P., and Kormann, D. (2006). Cobot in LambdaMOO: An adaptive social statistics agent. Autonomous Agents and Multi-Agent Systems, 13, 327-354.
We describe our development of Cobot, a novel software agent who lives in LambdaMOO, a popular virtual world frequented by hundreds of users. Cobot’s goal was to become an actual part of that community. Here, we present a detailed discussion of the functionality that made him one of the objects most frequently interacted with in LambdaMOO, human or artificial. Cobot’s fundamental power is that he has the ability to collect social statistics summarizing the quantity and quality of interpersonal interactions. Initially, Cobot acted as little more than a reporter of this information; however, as he collected more and more data, he was able to use these statistics as models that allowed him to modify his own behavior. In particular, Cobot is able to use this data to “self-program,” learning the proper way to respond to the actions of individual users, by observing how others interact with one another. Further, Cobot uses reinforcement learning to proactively take action in this complex social environment, and adapts his behavior based on multiple sources of human reward. Cobot represents a unique experiment in building adaptive agents who must live in and navigate social spaces.
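As a rough illustration of what "collecting social statistics" could look like, the following is a minimal sketch and not Cobot's actual implementation. The class name SocialStats, its methods, and the example user and verb names are all hypothetical; the sketch only tallies who directs which action at whom and answers a simple frequency query.

```python
from collections import Counter, defaultdict


class SocialStats:
    """Hypothetical tally of (verb, target) interactions per actor."""

    def __init__(self):
        # actor -> Counter mapping (verb, target) to the number of occurrences
        self.by_actor = defaultdict(Counter)

    def record(self, actor, verb, target):
        """Record one observed interaction, e.g. record("alice", "hug", "bob")."""
        self.by_actor[actor][(verb, target)] += 1

    def most_frequent_target(self, actor, verb):
        """Return the user that `actor` most often directs `verb` at, or None."""
        counts = Counter()
        for (v, target), n in self.by_actor[actor].items():
            if v == verb:
                counts[target] += n
        return counts.most_common(1)[0][0] if counts else None


stats = SocialStats()
stats.record("alice", "hug", "bob")
stats.record("alice", "hug", "bob")
stats.record("alice", "hug", "carol")
print(stats.most_frequent_target("alice", "hug"))
```

Statistics of this kind could then serve as a simple behavioral model, in the spirit of the abstract's remark that the collected data later informed the agent's own responses.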
Lau99: Lau, T. and Weld, D. (1999). Programming by Demonstration: An Inductive Learning Formulation. Proceedings of the 1999 International Conference on Intelligent User Interfaces (IUI).
Although Programming by Demonstration (PBD) has the potential to improve the productivity of unsophisticated users, previous PBD systems have used brittle, heuristic, domain-specific approaches to execution-trace generalization. In this paper we define two application-independent methods for performing generalization that are based on well-understood machine learning technology. TGENVS uses version-space generalization, and TGENFOIL is based on the FOIL inductive logic programming algorithm. We analyze each method both theoretically and empirically, arguing that TGENVS has lower sample complexity, but TGENFOIL can learn a much more interesting class of programs.
Thomaz08: Thomaz, A. and Breazeal, C. (2008). Teachable robots: Understanding human teaching behavior to build more effective robot learners. Artificial Intelligence, 172, 716-737.
While Reinforcement Learning (RL) is not traditionally designed for interactive supervisory input from a human teacher, several works in both robot and software agents have adapted it for human input by letting a human trainer control the reward signal. In this work, we experimentally examine the assumption underlying these works, namely that the human-given reward is compatible with the traditional RL reward signal. We describe an experimental platform with a simulated RL robot and present an analysis of real-time human teaching behavior found in a study in which untrained subjects taught the robot to perform a new task. We report three main observations on how people administer feedback when teaching a Reinforcement Learning agent: (a) they use the reward channel not only for feedback, but also for future-directed guidance; (b) they have a positive bias to their feedback, possibly using the signal as a motivational channel; and (c) they change their behavior as they develop a mental model of the robotic learner. Given this, we made specific modifications to the simulated RL robot, and analyzed and evaluated its learning behavior in four follow-up experiments with human trainers. We report significant improvements on several learning measures. This work demonstrates the importance of understanding the human-teacher/robot-learner partnership in order to design algorithms that support how people want to teach and simultaneously improve the robot's learning behavior.
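The setup the abstract examines, in which a human trainer controls the reward signal, can be sketched with ordinary tabular Q-learning. This is a generic illustration under stated assumptions, not the authors' platform or their modified learner: the class name HumanRewardQLearner, the learning rate and discount values, and the state and action names are all hypothetical.

```python
from collections import defaultdict


class HumanRewardQLearner:
    """Tabular Q-learning where the reward is supplied by a human trainer."""

    def __init__(self, actions, alpha=0.5, gamma=0.9):
        self.actions = list(actions)
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        self.q = defaultdict(float)  # (state, action) -> estimated value

    def best_action(self, state):
        """Greedy action under the current Q estimates."""
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, human_reward, next_state):
        """Standard Q-learning update, with the human's feedback as the reward."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = human_reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


learner = HumanRewardQLearner(["stir", "pour"])
learner.update("bowl", "stir", human_reward=1.0, next_state="mixed")
learner.update("bowl", "pour", human_reward=-1.0, next_state="spilled")
print(learner.best_action("bowl"))
```

The update rule treats the human's signal exactly like an environment reward, which is the assumption the paper puts to the test; the observations it reports (guidance, positive bias, changing behavior) are not modelled here.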
Maybury92: Maybury, M. (1992). Communicative acts for explanation generation. Int. J. Man-Machine Studies, 37, 135-172.
Knowledge-based systems that interact with humans often need to define their terminology, elucidate their behavior or support their recommendations or conclusions. In general, they need to explain themselves. Unfortunately, current computer systems, if they can explain themselves at all, often generate explanations that are unnatural, ill-connected or simply incoherent. They typically have only one method of explanation which does not allow them to recover from failed communication. At a minimum, this can irritate an end-user and potentially decrease their productivity. More dangerous, poorly conveyed information may result in misconceptions on the part of the user which can lead to bad decisions or invalid conclusions, which may have costly or even dangerous implications.

To address this problem, we analyse human-produced explanations with the aim of transferring explanation expertise to machines. Guided by this analysis, we present a classification of explanatory utterances based on their content and communicative function. We then use these utterance classes and additional text analysis to construct a taxonomy of text types. This text taxonomy characterizes multisentence explanations according to the content they convey, the communicative acts they perform, and their intended effect on the addressee’s knowledge, beliefs, goals and plans. We then argue that the act of explanation presentation is an action-based endeavor and introduce and define an integrated theory of communicative acts (rhetorical, illocutionary, and locutionary acts). To illustrate this theory we formalize several of these communicative acts as plan operators and then show their use by a hierarchical text planner (TEXPLAN-Textual Explanation PLANner) that composes natural language explanations. Finally, we classify a range of reactions readers may have to explanations and illustrate how a system can respond to these given a plan-based approach. Our research thus contributes (1) a domain-independent taxonomy of abstract explanatory utterances, (2) a taxonomy of multisentence explanations based on these utterance classes and (3) a classification of reactions readers may have to explanations as well as (4) an illustration of how these classifications can be applied computationally.
Mateas04: Mateas, M. and Stern, A. (2004). Natural Language Understanding in Facade: Surface-text Processing. Proceedings of the 2004 International Conference on Technologies for Interactive Digital Storytelling and Entertainment (TIDSE).
Facade is a real-time, first-person dramatic world in which the player, visiting the married couple Grace and Trip at their apartment, quickly becomes entangled in the high-conflict dissolution of their marriage. The Facade interactive drama integrates real-time, autonomous believable agents, drama management for coordinating plot-level interactivity, and broad, shallow support for natural language understanding and discourse management. In previous papers, we have described the motivation for Facade's interaction design and architecture [13, 14], described ABL, our believable agent language [9, 12], and presented overviews of the entire architecture [10, 11]. In this paper we focus on Facade's natural language processing (NLP) system, specifically the understanding (NLU) portion that extracts discourse acts from player-typed surface text.
Albrecht98: Albrecht, D., Zukerman, I., Nichols, A. (1998). Bayesian Models for Keyhole Plan Recognition in an Adventure Game. User Modeling and User-Adapted Interaction, 45.
We present an approach to keyhole plan recognition which uses a dynamic belief (Bayesian) network to represent features of the domain that are needed to identify users’ plans and goals. The application domain is a MultiUser Dungeon adventure game with thousands of possible actions and locations. We propose several network structures which represent the relations in the domain to varying extents, and compare their predictive power for predicting a user’s current goal, next action and next location. The conditional probability distributions for each network are learned during a training phase, which dynamically builds these probabilities from observations of user behaviour. This approach allows the use of incomplete, sparse and noisy data during both training and testing. We then apply simple abstraction and learning techniques in order to speed up the performance of the most promising dynamic belief networks without a significant change in the accuracy of goal predictions. Our experimental results in the application domain show a high degree of predictive accuracy. This indicates that dynamic belief networks in general show promise for predicting a variety of behaviours in domains which have similar features to those of our domain, while reduced models, obtained by means of learning and abstraction, show promise for efficient goal prediction in such domains.
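The idea of learning conditional probabilities from observed behaviour and using them to predict a user's goal can be illustrated with a much simpler model than the paper's dynamic belief networks. The sketch below swaps in a naive Bayesian filter: it estimates P(action | goal) from training traces with Laplace smoothing and updates a belief over goals after each observed action. The function names, the smoothing constant, and the toy goals and actions are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def train(traces, alpha=1.0):
    """Estimate P(action | goal) and the goal prior from (goal, actions) traces.

    alpha is a Laplace smoothing constant (an assumed value).
    """
    action_counts = defaultdict(Counter)
    goal_counts = Counter()
    vocab = set()
    for goal, actions in traces:
        goal_counts[goal] += 1
        for action in actions:
            action_counts[goal][action] += 1
            vocab.add(action)
    likelihood = {}
    for goal in goal_counts:
        total = sum(action_counts[goal].values()) + alpha * len(vocab)
        likelihood[goal] = {
            a: (action_counts[goal][a] + alpha) / total for a in vocab
        }
    n = sum(goal_counts.values())
    prior = {g: c / n for g, c in goal_counts.items()}
    return likelihood, prior


def update_belief(belief, action, likelihood):
    """One Bayesian update: belief(g) is multiplied by P(action | g), then normalized."""
    unnorm = {g: p * likelihood[g].get(action, 0.0) for g, p in belief.items()}
    z = sum(unnorm.values())
    if z == 0:
        return dict(belief)  # unseen action: leave the belief unchanged
    return {g: v / z for g, v in unnorm.items()}


traces = [
    ("find_sword", ["go_north", "open_chest"]),
    ("find_sword", ["go_north", "take_sword"]),
    ("escape", ["go_south", "open_door"]),
]
likelihood, prior = train(traces)
belief = update_belief(prior, "go_north", likelihood)
print(max(belief, key=belief.get))
```

Unlike the networks compared in the paper, this filter ignores locations and the dependence of each action on the previous one; it only shows how count-based probabilities learned during training can drive goal prediction from sparse observations.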
Carberry88: Carberry, S. (1988). Modeling the User's Plans and Goals. Computational Linguistics, 14(3), 23-37.
This work is an ongoing research effort aimed both at developing techniques for inferring and constructing a user model from an information-seeking dialog and at identifying strategies for applying this model to enhance robust communication. One of the most important components of a user model is a representation of the system's beliefs about the underlying task-related plan motivating an information-seeker's queries. These beliefs can be used to interpret subsequent utterances and produce useful responses. This paper describes the IREPS system, emphasizing its dynamic construction of the task-related plan motivating the information-seeker's queries and the application of this component of a user model to handling utterances that violate the pragmatic rules of the system's world model. By reasoning on a model of the user's plans and goals, the system often can deduce the intended meaning of faulty utterances and allow the dialogue to continue without interruption. Some limitations of current plan inference systems are discussed. It is suggested that the problem of detecting and recovering from discrepancies between the system's model of the user's plan and the actual plan under construction by the user requires an enriched model that differentiates among its components on the basis of the support the system accords each component as a correct and intended part of the user's plan.
Lesh98: Lesh, N., Rich, C., and Sidner, C. (1998). Using Plan Recognition in Human-Computer Collaboration (Technical Report MERL-TR-98-23). Mitsubishi Electric Research Laboratory.
Human-computer collaboration provides a practical and useful application for plan recognition techniques. We describe a plan recognition algorithm which is tractable by virtue of exploiting properties of the collaborative setting, namely: the focus of attention, the use of partially elaborated hierarchical plans, and the possibility of asking for clarification. We demonstrate how the addition of our plan recognition algorithm to an implemented collaborative system reduces the amount of communication required from the user.