Research Interests

Scalable Solutions to MDPs

Markov Decision Processes (MDPs) provide an expressive and powerful model for formalizing a variety of sequential decision problems. Solving large and complex MDPs, however, remains a challenge. We aim to find scalable solutions to MDPs. Toward this goal we tackle a variety of problems, including bootstrap efficiency, stable function approximation, automatic problem decomposition and reformulation (which overlaps with our concept discovery work), and ways of leveraging human help, particularly interactive human help.
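
For context, here is a minimal sketch of tabular value iteration, the textbook way of solving a small MDP exactly. It is not our method; it only illustrates what "solving" means and why scale is a problem: every sweep touches every state and every action, so the cost grows with the size of the state space. The dictionary encoding of the transitions, the two-state toy problem, and the names value_iteration, gamma, and tol are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding (an assumption for this sketch):
# transitions[state][action] = list of (probability, next_state, reward)
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]


def value_iteration(transitions: Transitions, gamma: float = 0.9,
                    tol: float = 1e-8) -> Dict[str, float]:
    """Return state values by sweeping Bellman backups until they stop changing."""
    values = {state: 0.0 for state in transitions}
    while True:
        delta = 0.0
        for state, actions in transitions.items():
            if not actions:
                # A state with no actions is treated as terminal; its value stays 0.
                continue
            # Bellman optimality backup: best expected one-step return.
            best = max(
                sum(p * (r + gamma * values[nxt]) for p, nxt, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            return values


# Toy problem: from "start", either move to the terminal "goal" for reward 1,
# or stay put for reward 0.
toy = {
    "start": {"go": [(1.0, "goal", 1.0)], "stay": [(1.0, "start", 0.0)]},
    "goal": {},
}
values = value_iteration(toy)
```

In the toy problem the value of "start" is the reward for going straight to the goal. Function approximation and problem decomposition are ways of avoiding the full sweep over states that this sketch relies on.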

Concept Discovery

Reasoning approaches can solve complex and difficult problems, but there is no free lunch; the cost lies in engineering the right representation and domain knowledge. Concept discovery is the task of finding those representations and domain concepts in order to enable a decision-making system. Not only do correct concepts improve performance, but they also enable transfer of learned skills to new tasks and new domains.

We are particularly interested in concept discovery in the reinforcement learning (RL) context, an interest that has led us to work on learning hierarchical structure. We focus on single-agent Markov domains such as flight simulators, casual games such as Fort, and real-time strategy games.

Ensemble Learning

Ensemble techniques are interesting because they allow us to leverage a variety of different learners together. This implies the ability to leverage different models (i.e., different parametric models as well as nonparametric techniques), different sets of domain knowledge, and different machine learning techniques. Furthermore, ensemble techniques are powerful. Recent empirical results show that ensemble techniques consistently outperform any single learner. In our research, we extend Boosting, a well-known, high-performance ensemble learning technique, to handle multiple weak learners using the Boosting framework itself while controlling for weak learner overfitting. This extension enables us to boost over different models and leverage different sets of domain knowledge, all in a regularized fashion. In MBoost, we find that boosting different models together is a viable and often better-performing alternative to model selection (e.g., via cross-validation).
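
To make the idea of boosting over several kinds of weak learner concrete, here is a rough sketch. It is plain AdaBoost in which each round fits one candidate from every learner family and keeps whichever has the lowest weighted error. It is not MBoost: in particular it does nothing to control weak learner overfitting, and the two families (fit_stump, fit_centroid), the function names, and the toy data are assumptions chosen only for illustration.

```python
import math


def fit_stump(X, y, w):
    """Family 1: the single-feature threshold with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for sign in (1, -1):
                err = sum(wi for wi, x, yi in zip(w, X, y)
                          if (sign if x[j] >= t else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    _, j, t, sign = best
    return lambda x: sign if x[j] >= t else -sign


def fit_centroid(X, y, w):
    """Family 2: predict the class whose weighted centroid is nearer."""
    def centroid(label):
        total = sum(wi for wi, yi in zip(w, y) if yi == label)
        return [sum(wi * x[j] for wi, x, yi in zip(w, X, y) if yi == label) / total
                for j in range(len(X[0]))]

    pos, neg = centroid(1), centroid(-1)

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return lambda x: 1 if sq_dist(x, pos) <= sq_dist(x, neg) else -1


def boost(X, y, families, rounds=10):
    """AdaBoost over several learner families; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # Fit one candidate per family and keep the lowest weighted error.
        candidates = []
        for fit in families:
            h = fit(X, y, w)
            err = sum(wi for wi, x, yi in zip(w, X, y) if h(x) != yi)
            candidates.append((err, h))
        err, h = min(candidates, key=lambda c: c[0])
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        if err >= 0.5:
            break  # no candidate beats chance on the current weights
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, h))
        # Up-weight the examples the chosen learner got wrong, then renormalize.
        w = [wi * math.exp(-alpha * yi * h(x)) for wi, x, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, x):
    """Sign of the weighted vote of all chosen weak learners."""
    return 1 if sum(alpha * h(x) for alpha, h in ensemble) >= 0 else -1


X = [[0, 0], [1, 0], [0, 1], [3, 3], [4, 3], [3, 4]]
y = [-1, -1, -1, 1, 1, 1]
ensemble = boost(X, y, [fit_stump, fit_centroid])
predictions = [predict(ensemble, x) for x in X]
```

Because the choice among families is made inside each boosting round, no separate model selection step is needed to decide which family to use.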

Interactive Learning

Interactive learning and related work in active learning are evergreen research interests of mine. Whenever examples are scarce, you must make the most of those you can get. Active learning is about choosing which examples to obtain so as to maximize learning. I am particularly interested in the case where the source of examples is the user, which makes it an interactive learning problem. This fits well with my overall philosophy that AIs are, in the end, built for humans, and as such human-AI interfacing and interaction is key.
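
One common way of choosing "the right examples" is uncertainty sampling: ask for the label of the example the current model is least sure about. The sketch below illustrates that loop on a one-dimensional threshold problem. The oracle function stands in for the user, and the learner, the names (fit_threshold, predict_proba, active_learn), and the data are all assumptions made for illustration rather than a description of any particular system.

```python
import math


def predict_proba(x, threshold, scale=1.0):
    """Probability of the positive class under a logistic model around threshold."""
    return 1.0 / (1.0 + math.exp(-(x - threshold) / scale))


def fit_threshold(labeled):
    """Midpoint between the largest negative and the smallest positive example."""
    neg = [x for x, label in labeled if label == 0]
    pos = [x for x, label in labeled if label == 1]
    return (max(neg) + min(pos)) / 2.0


def active_learn(pool, labeled, oracle, budget):
    """Query the oracle for the most uncertain pool example, up to budget times."""
    pool = list(pool)
    labeled = list(labeled)
    for _ in range(budget):
        if not pool:
            break
        threshold = fit_threshold(labeled)
        # Most uncertain example: predicted probability closest to 0.5.
        query = min(pool, key=lambda x: abs(predict_proba(x, threshold) - 0.5))
        pool.remove(query)
        labeled.append((query, oracle(query)))
    return fit_threshold(labeled), labeled


def oracle(x):
    """Stand-in for the user: the true boundary lies at 6.5."""
    return 1 if x >= 6.5 else 0


# Start with one labeled example per class and nine unlabeled candidates.
threshold, labeled = active_learn(
    pool=range(1, 10), labeled=[(0, 0), (10, 1)], oracle=oracle, budget=4
)
```

Here four questions out of nine candidates are enough to place the boundary, because each query is spent near the current decision boundary rather than on examples the model already classifies confidently.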

Previous Research

Online Behavior Adaptation

"AI" agents in games are often no more than a set of static scripts (behaviors). While behavior generation and adaptation are equivalent to automatic programming and undecidable in general, limited local search for "fixing" behaviors is possible. We explore a transformational planning approach to behavior adaptation and show how it can be applied to behavior sets as a means of achieving online agent adaptation. Preliminary experiments show the system can be effective in improving agent behavior in situations where the original static behaviors fail.

Social Network Analysis

We analyzed social groups and their interactions, particularly to identify opinion leaders, communication carriers, and blockers. To aid in this analysis, we also pursued named entity extraction and, in particular, extraction of the relationships among entities.

Text Mining

  • While there are now significant bodies of research on text mining, relatively little has been dedicated to automated comparison of multiple document collections. This is an extremely useful function, allowing one, for example, to compare and contrast different news sources on their coverage of some event. An automated comparator should be able to pick out themes common to two collections of documents, as well as biases that one collection has relative to the other. Our approach casts the collections as a mixture of language models, making the compare/contrast process a clustering problem of finding the original language models.

  • Papers submitted to many academic conferences must be stripped of authors' names for double-blind peer review. All too often, names are left on, and these instances must be detected and flagged for manual removal. This process is complicated by the common use of anonymous or fake names such as "John Doe" in place of stripping the names. Our approach combines entity extraction techniques, which identify names, with name disambiguation using Google's search mechanisms.

Network Intrusion Detection

Intrusion detection systems (IDS) can be significantly improved if we consider temporal features, since attacks are temporal in nature. We investigate a temporal IDS, MADAMID, developed by Wenke Lee. We build temporal features for MADAMID and explore its effectiveness and performance as an integrated part of an overall intrusion detection system.
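
As an illustration of what a temporal feature can look like, the sketch below counts, for each connection, how many earlier connections went to the same destination host within a short sliding time window. This is a generic example and is not claimed to be MADAMID's feature set; the record format (timestamp, destination host), the two-second window, and the function name same_host_counts are assumptions.

```python
from collections import deque


def same_host_counts(connections, window=2.0):
    """For each (timestamp, dst_host) record, sorted by timestamp, return the
    number of earlier records to the same host within the last `window` seconds."""
    recent = deque()   # records still inside the window, oldest first
    per_host = {}      # host -> number of its records currently in `recent`
    counts = []
    for ts, host in connections:
        # Drop records that have fallen out of the window.
        while recent and recent[0][0] < ts - window:
            _, old_host = recent.popleft()
            per_host[old_host] -= 1
        counts.append(per_host.get(host, 0))
        recent.append((ts, host))
        per_host[host] = per_host.get(host, 0) + 1
    return counts


connections = [(0.0, "a"), (0.5, "a"), (1.0, "b"), (1.5, "a"), (4.0, "a")]
counts = same_host_counts(connections)
```

A burst of connections to one host produces a rising count, which a per-connection feature alone cannot show; the final record in the example falls outside the window of the earlier ones, so its count drops back to zero.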