I am an Assistant Professor in the School of Interactive Computing within the College of Computing at Georgia Tech. I direct the Georgia Tech Robot Learning Lab, which is affiliated with both the Center for Machine Learning and the Institute for Robotics and Intelligent Machines.

My group performs fundamental and applied research in machine learning, artificial intelligence, and robotics with a focus on developing theory and systems that tightly integrate perception, learning, and control. Our work touches on a range of problems including computer vision, system identification, forecasting, simultaneous localization & mapping, motion planning, and optimal control. The algorithms that we develop often use and extend theory from nonparametric statistics, neural networks, non-convex optimization, matrix algebra, and dynamic programming. (See Google Scholar and Publications below).

Prior to joining Georgia Tech, I was a post-doc in the Robotics and State Estimation Lab directed by Dieter Fox in the Computer Science and Engineering Department at the University of Washington. I received my Ph.D. from the Machine Learning Department in the School of Computer Science at Carnegie Mellon University where I was a member of the Sense, Learn, Act (SELECT) Lab co-directed by Carlos Guestrin and my advisor Geoff Gordon.


Workshops & Tutorials

I have co-organized the following workshops and tutorials:

  • The ICRA Workshop on Sensorimotor Learning, May 26-30, 2015, Seattle, Washington.
  • The ICML Workshop on Method of Moments and Spectral Learning, June 25th-26th, 2014, Beijing, China.
  • The NIPS Workshop on Spectral Learning, December 10th, 2013, Lake Tahoe, Nevada.
  • The AAAI Workshop on Intelligent Robotic Systems, July 14th-15th, 2013, Bellevue, Washington.
  • The R:SS Workshop on Inverse Optimal Control & Robot Learning from Demonstration, June 27th, 2013, Berlin, Germany.
  • The ICML Workshop on Spectral Learning, June 20th-21st, 2013, Atlanta, Georgia.
  • The AAAI Spring Symposium Designing Intelligent Robots: Reintegrating AI II, March 25-27, 2013, at Stanford University.
  • The ICML Tutorial on Spectral Approaches to Learning Latent Variable Models, June 26th, 2012, in Edinburgh, Scotland.
  • The AAAI Spring Symposium Designing Intelligent Robots: Reintegrating AI, March 26-28, 2012, at Stanford University.

Refereed Conference & Journal Publications

    C. Cheng & B. Boots. Incremental Learning for Variational Sparse Gaussian Process Regression. (22% Acceptance Rate) 2016 Proceedings of Advances in Neural Information Processing Systems 30 (NIPS-2016)
    Abstract: Recent work on scaling up Gaussian process regression (GPR) to large datasets has primarily focused on sparse GPR, which leverages a small set of basis functions to approximate the full Gaussian process during inference. However, the majority of these approaches are batch methods that operate on the entire training dataset at once, precluding the use of datasets that are streaming or too large to fit into memory. Although previous work has considered incrementally solving variational sparse GPR, most algorithms fail to update the basis functions and therefore perform suboptimally. We propose a novel incremental learning algorithm for variational sparse GPR based on stochastic mirror ascent of probability densities in reproducing kernel Hilbert space. This new formulation allows our algorithm to update basis functions online in accordance with the manifold structure of probability densities for fast convergence. We conduct several experiments and show that our proposed approach achieves better empirical performance in terms of prediction error than the recent state-of-the-art incremental solutions to sparse GPR.
    J. Tan, Z. Xie, B. Boots, & K. Liu. Simulation-Based Design of Dynamic Controllers for Humanoid Balancing. (Selected for oral presentation) (48% Acceptance Rate) 2016 Proceedings of the 2016 International Conference on Intelligent Robots and Systems (IROS-2016)
    Abstract: Model-based trajectory optimization often fails to find a reference trajectory for under-actuated bipedal robots performing highly-dynamic, contact-rich tasks in the real world due to inaccurate physical models. In this paper, we propose a complete system that automatically designs a reference trajectory that succeeds on tasks in the real world with a very small number of real world experiments. We adopt existing system identification techniques and show that, with appropriate model parameterization and control optimization, an iterative system identification framework can be effective for designing reference trajectories. We focus on a set of tasks that leverage the momentum transfer strategy to rapidly change the whole-body from an initial configuration to a target configuration by generating large accelerations at the center of mass and switching contacts.
    BibTeX:
    @inproceedings{Tan-IROS-16,
      Author = "Jie Tan and Zhaoming Xie and Byron Boots and C. Karen Liu",
      Booktitle = {Proceedings of the IEEE Conference on Intelligent Robots and Systems (IROS-2016)},
      Title = "Simulation-Based Design of Dynamic Controllers for Humanoid Balancing",
      Year = {2016}
    }
    W. Sun, R. Capobianco, G. J. Gordon, J. A. Bagnell, & B. Boots. Learning to Smooth with Bidirectional Predictive State Inference Machines. (31% Acceptance Rate) 2016 Proceedings of the 32nd Conference on Uncertainty in Artificial Intelligence (UAI-2016)
    Abstract: We present the Smoothing Machine (SMACH, pronounced "smash"), a dynamical system learning algorithm based on chain Conditional Random Fields (CRFs) with latent states. Unlike previous methods, SMACH is designed to optimize prediction performance when we have information from both past and future observations. By leveraging Predictive State Representations (PSRs), we model beliefs about latent states through predictive states--an alternative but equivalent representation that depends directly on observable quantities. Predictive states enable the use of well-developed supervised learning approaches in place of local-optimum-prone methods like EM: we learn regressors or classifiers that can approximate message passing and marginalization in the space of predictive states. We provide theoretical guarantees on smoothing performance and we empirically verify the efficacy of SMACH on several dynamical system benchmarks.
    BibTeX:
    @inproceedings{Sun-UAI-16,
      Author = "Wen Sun and Roberto Capobianco and Geoffrey J. Gordon and J. Andrew Bagnell and Byron Boots",
      Booktitle = {Proceedings of the International Conference on Uncertainty in Artificial Intelligence (UAI-2016)},
      Title = "Learning to Smooth with Bidirectional Predictive State Inference Machines",
      Year = {2016}
    }
    J. Dong, M. Mukadam, F. Dellaert, & B. Boots. Motion Planning as Probabilistic Inference using Gaussian Processes and Factor Graphs.
    (20% Acceptance Rate)
    2016 Proceedings of Robotics: Science and Systems XII (RSS-2016)
    Abstract: With the increased use of high degree-of-freedom robots that must perform tasks in real-time, there is a need for fast algorithms for motion planning. In this work, we view motion planning from a probabilistic perspective. We consider smooth continuous-time trajectories as samples from a Gaussian process (GP) and formulate the planning problem as probabilistic inference. We use factor graphs and numerical optimization to perform inference quickly, and we show how GP interpolation can further increase the speed of the algorithm. Our framework also allows us to incrementally update the solution of the planning problem to contend with changing conditions. We benchmark our algorithm against several recent trajectory optimization algorithms on planning problems in multiple environments. Our evaluation reveals that our approach is several times faster than previous algorithms while retaining robustness. Finally, we demonstrate the incremental version of our algorithm on replanning problems, and show that it often can find successful solutions in a fraction of the time required to replan from scratch.
    BibTeX:
    @inproceedings{Dong-RSS-16,
      Author = "Jing Dong and Mustafa Mukadam and Frank Dellaert and Byron Boots",
      Booktitle = {Proceedings of Robotics: Science and Systems (RSS-2016)},
      Title = "Motion Planning as Probabilistic Inference using Gaussian Processes and Factor Graphs",
      Year = {2016}
    }
    Z. Marinho, B. Boots, A. Dragan, A. Byravan, G. J. Gordon, & S. Srinivasa. Functional Gradient Motion Planning in Reproducing Kernel Hilbert Spaces. (20% Acceptance Rate) 2016 Proceedings of Robotics: Science and Systems XII (RSS-2016)
    Abstract: We introduce a functional gradient descent trajectory optimization algorithm for robot motion planning in Reproducing Kernel Hilbert Spaces (RKHSs). Functional gradient algorithms are a popular choice for motion planning in complex many-degree-of-freedom robots, since they (in theory) work by directly optimizing within a space of continuous trajectories to avoid obstacles while maintaining geometric properties such as smoothness. However, in practice, implementations such as CHOMP and TrajOpt typically commit to a fixed, finite parametrization of trajectories, often as a sequence of waypoints. Such a parameterization can lose much of the benefit of reasoning in a continuous trajectory space: e.g., it can require taking an inconveniently small step size and large number of iterations to maintain smoothness. Our work generalizes functional gradient trajectory optimization by formulating it as minimization of a cost functional in an RKHS. This generalization lets us represent trajectories as linear combinations of kernel functions, without any need for waypoints. As a result, we are able to take larger steps and achieve a locally optimal trajectory in just a few iterations. Depending on the selection of kernel, we can directly optimize in spaces of trajectories that are inherently smooth in velocity, jerk, curvature, etc., and that have a low-dimensional, adaptively chosen parameterization. Our experiments illustrate the effectiveness of the planner for different kernels, including Gaussian RBFs, Laplacian RBFs, and B-splines, as compared to the standard discretized waypoint representation.
    BibTeX:
    @inproceedings{Marinho-RSS-16,
      Author = "Zita Marinho and Anca Dragan and Arunkumar Byravan and Byron Boots and Geoffrey J. Gordon and Siddhartha Srinivasa",
      Booktitle = {Proceedings of Robotics: Science and Systems (RSS-2016)},
      Title = "Functional Gradient Motion Planning in Reproducing Kernel Hilbert Spaces",
      Year = {2016}
    }
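
A toy illustration of the waypoint-free idea behind the functional-gradient planner above: represent the trajectory as a weighted sum of kernel functions of time and descend an obstacle-plus-smoothness cost with respect to the kernel weights. The 2-D environment, RBF kernel, hinge obstacle cost, and step sizes below are illustrative choices, not the setup or update rule used in the paper.

import numpy as np

# Toy functional-gradient planner in a kernel parametrization (illustrative only):
# a 2-D trajectory xi(t) = straight line + sum_i w_i * k(t, t_i), pushed away from one circular obstacle.
def rbf(t, c, ell=0.1):
    return np.exp(-(t - c) ** 2 / (2 * ell ** 2))

start, goal = np.array([0.0, 0.0]), np.array([1.0, 0.0])
centers = np.linspace(0.1, 0.9, 9)              # kernel centers in time
W = np.zeros((len(centers), 2))                 # kernel weights = trajectory parameters
obs_c, obs_r, margin = np.array([0.5, 0.05]), 0.2, 0.1   # circular obstacle and safety margin

ts = np.linspace(0.0, 1.0, 100)
K = rbf(ts[:, None], centers[None, :])          # (100, 9) kernel features of time

def traj(W):
    # endpoints are left unconstrained in this toy
    return start + ts[:, None] * (goal - start) + K @ W

for _ in range(200):
    X = traj(W)
    d = X - obs_c
    dist = np.linalg.norm(d, axis=1, keepdims=True)
    viol = np.maximum(0.0, obs_r + margin - dist)          # hinge penalty inside the safety margin
    grad_pts = -viol * d / np.maximum(dist, 1e-9)          # gradient of 0.5*viol^2 w.r.t. each point
    grad_W = K.T @ grad_pts / len(ts) + 1e-2 * W           # chain rule + small RKHS-norm regularizer
    W -= 0.5 * grad_W

print("final clearance from obstacle:",
      round(float(np.min(np.linalg.norm(traj(W) - obs_c, axis=1)) - obs_r), 3))
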
    A. Venkatraman, W. Sun, M. Hebert, B. Boots, & J. A. Bagnell. Inference Machines for Nonparametric Filter Learning.
    (24% Acceptance Rate)
    2016 Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI-2016)
    Abstract: Data-driven approaches for learning dynamic models for Bayesian filtering often try to maximize the data likelihood given parametric forms for the transition and observation models. However, this objective is usually nonconvex in the parametrization and can only be locally optimized. Furthermore, learning algorithms typically do not provide performance guarantees on the desired Bayesian filtering task. In this work, we propose using inference machines to directly optimize the filtering performance. Our procedure is capable of learning partially-observable systems when the state space is either unknown or known in advance. To accomplish this, we adapt Predictive State Inference Machines (PSIMS) by introducing the concept of hints, which incorporate prior knowledge of the state space to accompany the predictive state representation. This allows PSIM to be applied to the larger class of filtering problems which require prediction of a specific parameter or partial component of state. Our PSIM+HINTS adaptation enjoys theoretical advantages similar to the original PSIM algorithm, and we showcase its performance on a variety of robotics filtering problems.
    BibTeX:
    @inproceedings{Venkatraman-IJCAI-16,
      Author = "Arun Venkatraman and Wen Sun and Martial Hebert and Byron Boots and J. Andrew Bagnell",
      Booktitle = {Proceedings of the 2016 International Joint Conference on Artificial Intelligence (IJCAI-2016)},
      Title = "Inference Machines for Nonparametric Filter Learning",
      Year = {2016}
    }
    W. Sun, A. Venkatraman, B. Boots, & J. A. Bagnell. Learning to Filter With Predictive State Inference Machines. (24% Acceptance Rate) 2016 Proceedings of the 33rd International Conference on Machine Learning (ICML-2016)
    Abstract: Latent state space models are a fundamental and widely used tool for modeling dynamical systems. However, they are difficult to learn from data and learned models often lack performance guarantees on inference tasks such as filtering and prediction. In this work, we present the Predictive State Inference Machine (PSIM), a data-driven method that considers the inference procedure on a dynamical system as a composition of predictors. The key idea is that rather than first learning a latent state space model, and then using the learned model for inference, PSIM directly learns predictors for inference in predictive state space. We provide theoretical guarantees for inference, in both realizable and agnostic settings, and showcase practical performance on a variety of simulated and real world robotics benchmarks.
    BibTeX:
    @inproceedings{Sun-ICML-16,
      Author = "Wen Sun and Arun Venkatraman and Byron Boots and J. Andrew Bagnell",
      Booktitle = {Proceedings of the 2016 International Conference on Machine Learning (ICML-2016)},
      Title = "Learning to Filter with Predictive State Inference Machines",
      Year = {2016}
    }
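
The predictive-state idea above can be prototyped with ordinary ridge regression: treat the next few observations as the state, and learn a linear map from (state, observation) to the next state. The sketch below does this in a single supervised pass on a simulated scalar system; it is a toy rendering of the idea and omits the DAgger-style iterations and the guarantees of the actual PSIM algorithm (all constants are illustrative).

import numpy as np
rng = np.random.default_rng(0)

# Simulate a partially observed scalar linear system: x_{t+1} = 0.9 x_t + noise, o_t = x_t + noise.
T, k = 2000, 3                                   # sequence length, length of the predictive-state window
x = np.zeros(T)
for t in range(T - 1):
    x[t + 1] = 0.9 * x[t] + 0.1 * rng.standard_normal()
o = x + 0.1 * rng.standard_normal(T)

# "Predictive state" at time t = the next k observations (taken from data to form training targets).
S = np.stack([o[t:t + k] for t in range(T - k)])            # S[t] = (o_t, ..., o_{t+k-1})

# Train a linear filter F: (state_t, o_t) -> state_{t+1} by ridge regression (single supervised pass).
A = np.hstack([S[:-1], o[:T - k - 1, None]])
B = S[1:]
F = np.linalg.solve(A.T @ A + 1e-3 * np.eye(A.shape[1]), A.T @ B)

# Filter: roll the learned predictor forward on its own state estimates.
s = S[0].copy()
sq_err = []
for t in range(1, T - k):
    s = np.hstack([s, o[t - 1]]) @ F             # update the predictive state with the newest observation
    sq_err.append((s[0] - o[t]) ** 2)            # first entry of the state is the prediction of o_t
print("mean squared one-step prediction error:", round(float(np.mean(sq_err)), 4))
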
    M. Mukadam, X. Yan, & B. Boots. Gaussian Process Motion Planning.
    (34% Acceptance Rate)
    2016 Proceedings of the 2016 IEEE Conference on Robotics and Automation (ICRA-2016)
    Abstract: Motion planning is a fundamental tool in robotics, used to generate collision-free, smooth, trajectories, while satisfying task-dependent constraints. In this paper, we present a novel approach to motion planning using Gaussian processes. In contrast to most existing trajectory optimization algorithms, which rely on a discrete waypoint parameterization in practice, we represent the continuous-time trajectory as a sample from a Gaussian process (GP) generated by a linear time varying stochastic differential equation. We then provide a gradient-based optimization technique that optimizes continuous-time trajectories with respect to a cost functional. By exploiting GP interpolation, we develop the Gaussian Process Motion Planner (GPMP), that finds optimal trajectories parameterized by a small number of waypoints. We benchmark our algorithm against recent trajectory optimization algorithms by solving 7-DOF robotic arm planning problems in simulation and validate our approach on a real 7-DOF WAM arm.
    BibTeX:
    @inproceedings{Mukadam-ICRA-16,
      Author = "Mustafa Mukadam and Xinyan Yan and Byron Boots",
      Booktitle = {Proceedings of the 2016 IEEE Conference on Robotics and Automation (ICRA-2016)},
      Title = "Gaussian Process Motion Planning",
      Year = {2016}
    }
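
One ingredient of the continuous-time view used by the planner above: a trajectory modeled as a Gaussian process can be queried at any time of interest by conditioning on a small set of support states. The snippet below is a minimal sketch of that interpolation step, using a generic squared-exponential kernel in place of the structured LTV-SDE prior from the paper (all constants are illustrative).

import numpy as np

def k(a, b, ell=0.3):
    # squared-exponential kernel over time (illustrative; GPMP uses a structured LTV-SDE prior)
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * ell ** 2))

# A 1-D trajectory parameterized by only five support states.
t_support = np.linspace(0.0, 1.0, 5)
q_support = np.array([0.0, 0.4, 1.0, 0.4, 0.0])

# Query the continuous-time trajectory at arbitrary times via the GP posterior mean.
t_query = np.linspace(0.0, 1.0, 50)
K_ss = k(t_support, t_support) + 1e-8 * np.eye(len(t_support))
q_query = k(t_query, t_support) @ np.linalg.solve(K_ss, q_support)

print("interpolated configurations:", np.round(q_query[::10], 2))
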
    Y. Nishiyama, A. Afsharinejad, S. Naruse, B. Boots, & L. Song. The Nonparametric Kernel Bayes Smoother.
    (30% Acceptance Rate)
    2016 Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS-2016)
    Abstract: Recently, significant progress has been made on developing kernel mean expressions for Bayesian inference. An important success in this domain is the nonparametric kernel Bayes filter (nKB-filter), which can be used for sequential inference in state space models. We expand upon this work by introducing a smoothing algorithm, the nonparametric kernel Bayes smoother (nKB-smoother), which relies on kernel Bayesian inference through the kernel sum rule and kernel Bayes rule. We derive the smoothing equations, analyze the computational cost, and show smoothing consistency. We summarize the algorithm, which is simple to implement, requiring only matrix multiplications and the output of nKB-filter. Finally, we report experimental results that compare the nKBsmoother to previous parametric and nonparametric approaches to Bayesian filtering and smoothing. In the supplementary materials, we show that the combination of nKB-filter and nKB-smoother allows marginal kernel mean computation, which gives an alternative to the kernel belief propagation.
    BibTeX:
    @inproceedings{Nishiyama-AISTATS-16,
      Author = "Yu Nishiyama and Amir Hossein Afsharinejad and Shunsuke Naruse and Byron Boots and Le Song",
      Booktitle = {Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS-2016)},
      Title = "The Nonparametric Kernel {B}ayes Smoother",
      Year = {2016}
    }
    A. Venkatraman, W. Sun, M. Hebert, J. A. Bagnell, & B. Boots. Online Instrumental Variable Regression with Applications to Online Linear System Identification.
    (26% Acceptance Rate)
    2016 Proceedings of the 30th Conference on Artificial Intelligence (AAAI-2016)
    Abstract: Instrumental variable regression (IVR) is a statistical technique utilized for recovering unbiased estimators when there are errors in the independent variables. Estimator bias in learned time series models can yield poor performance in applications such as long-term prediction and filtering where the recursive use of the model results in the accumulation of propagated error. However, prior work addressed the IVR objective in the batch setting, where it is necessary to store the entire dataset in memory - an infeasible requirement in large dataset scenarios. In this work, we develop Online Instrumental Variable Regression (OIVR), an algorithm that is capable of updating the learned estimator with streaming data. We show that the online adaptation of IVR enjoys a no-regret performance guarantee with respect the original batch setting by taking advantage of any no-regret online learning algorithm inside OIVR for the underlying update steps. We experimentally demonstrate the efficacy of our algorithm in combination with popular no-regret online algorithms for the task of learning predictive dynamical system models and on a prototypical econometrics instrumental variable regression problem.
    BibTeX:
    @inproceedings{Venkatraman-AAAI-16,
      Author = "Arun Venkatraman and Wen Sun and Martial Hebert and J. Andrew Bagnell and Byron Boots",
      Booktitle = {Proceedings of the Conference on Artificial Intelligence (AAAI-2016)},
      Title = "Online Instrumental Variable Regression with Applications to Online Linear System Identification",
      Year = {2016}
    }
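
A streaming flavor of instrumental-variable regression can be illustrated with running moment matrices: each (instrument, input, output) triple updates two accumulators, and the current estimate is a linear solve. The toy below contrasts the IV estimate with ordinary least squares under correlated measurement noise; it uses simple moment accumulation rather than the no-regret online learners analyzed in the paper, and the data-generating process is invented for illustration.

import numpy as np
rng = np.random.default_rng(1)

d = 3
beta_true = np.array([1.0, -2.0, 0.5])
M_zx = np.zeros((d, d)); m_zy = np.zeros(d)    # running moments for the IV estimate
M_xx = np.zeros((d, d)); m_xy = np.zeros(d)    # running moments for ordinary least squares (for contrast)

for t in range(20000):
    z = rng.standard_normal(d)                 # instrument: independent of the output noise
    noise = rng.standard_normal()
    x = z + 0.5 * noise * np.ones(d)           # regressor contaminated by noise correlated with the output
    y = x @ beta_true + noise
    M_zx += np.outer(z, x); m_zy += z * y
    M_xx += np.outer(x, x); m_xy += x * y

print("IV  estimate:", np.round(np.linalg.solve(M_zx, m_zy), 2))   # close to beta_true
print("OLS estimate:", np.round(np.linalg.solve(M_xx, m_xy), 2))   # visibly biased
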
    X. Yan, V. Indelman, & B. Boots. Incremental Sparse GP Regression for Continuous-time Trajectory Estimation & Mapping.
    2015 Proceedings of the 17th International Symposium on Robotics Research (ISRR-2015)
    Abstract: Recent work on simultaneous trajectory estimation and mapping (STEAM) for mobile robots has found success by representing the trajectory as a Gaussian process. Gaussian processes can represent a continuous-time trajectory, elegantly handle asynchronous and sparse measurements, and allow the robot to query the trajectory to recover its estimated position at any time of interest. A major drawback of this approach is that STEAM is formulated as a batch estimation problem. In this paper we provide the critical extensions necessary to transform the existing batch algorithm into an extremely efficient incremental algorithm. In particular, we are able to vastly speed up the solution time through efficient variable reordering and incremental sparse updates, which we believe will greatly increase the practicality of Gaussian process methods for robot mapping and localization. Finally, we demonstrate the approach and its advantages on both synthetic and real datasets.
    BibTeX:
    @inproceedings{Yan-ISRR-15,
      Author = "Xinyan Yan and Vadim Indelman and Byron Boots",
      Booktitle = {Proceedings of the International Symposium on Robotics Research (ISRR-2015)},
      Title = "Incremental Sparse {GP} Regression for Continuous-time Trajectory Estimation \& Mapping",
      Year = {2015}
    }
    A. Shaban, M. Farajtabar, B. Xie, L. Song, & B. Boots. Learning Latent Variable Models by Improving Spectral Solutions with Exterior Point Methods.
    (34% Acceptance Rate)
    2015 Proceedings of the 31st Conference on Uncertainty in Artificial Intelligence (UAI-2015) 
    Abstract: Probabilistic latent-variable models are a fundamental tool in statistics and machine learning. Despite their widespread use, identifying the parameters of basic latent variable models continues to be an extremely challenging problem. Traditional maximum likelihood-based learning algorithms find valid parameters, but suffer from high computational cost, slow convergence, and local optima. In contrast, recently developed method of moments-based algorithms are computationally efficient and provide strong statistical guarantees, but are not guaranteed to find valid parameters. In this work, we introduce a two-stage learning algorithm for latent variable models. We first use method of moments to find a solution that is close to the optimal solution but not necessarily in the valid set of model parameters. We then incrementally refine the solution via exterior point optimization until a local optima that is arbitrarily near the valid set of parameters is found. We perform several experiments on synthetic and real-world data and show that our approach is more accurate than previous work, especially when training data is limited.
    BibTeX:
    @inproceedings{Shaban-UAI-15,
      Author = "Amirreza Shaban and Mehrdad Farajtabar and Bo Xie and Le Song and Byron Boots",
      Booktitle = {Proceedings of the International Conference on Uncertainty in Artificial Intelligence (UAI-2015)},
      Title = "Learning Latent Variable Models by Improving Spectral Solutions with Exterior Point Methods",
      Year = {2015}
    }
    A. Byravan, M. Monfort, B. Ziebart, B. Boots, & D. Fox. Graph-based Inverse Optimal Control for Robot Manipulation. (28% Acceptance Rate) 2015 Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI-2015)
    Abstract: Inverse optimal control (IOC) is a powerful approach for learning robotic controllers from demonstration that estimates a cost function which rationalizes demonstrated control trajectories. Unfortunately, its applicability is difficult in settings where optimal control can only be solved approximately. While local IOC approaches have been shown to successfully learn cost functions in such settings, they rely on the availability of good reference trajectories, which might not be available at test time. We address the problem of using IOC in these computationally challenging control tasks by using a graph-based discretization of the trajectory space. Our approach projects continuous demonstrations onto this discrete graph, where a cost function can be tractably learned via IOC. Discrete control trajectories from the graph are then projected back to the original space and locally optimized using the learned cost function. We demonstrate the effectiveness of the approach with experiments conducted on two 7-degree of freedom robotic arms.
    BibTeX:
    @inproceedings{Byravan-IJCAI-15,
      Author = "Arunkumar Byravan and Matthew Monfort and Brian Ziebart and Byron Boots and Dieter Fox",
      Booktitle = {Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI-2015)},
      Title = "Graph-based Inverse Optimal Control for Robot Manipulation",
      Year = {2015}
    }
    B. Boots, A. Byravan, & D. Fox. Learning Predictive Models of a Depth Camera & Manipulator from Raw Execution Traces.
    (48% Acceptance Rate)
    2014 Proceedings of the 2014 IEEE Conference on Robotics and Automation (ICRA-2014) 
    Abstract: We attack the problem of learning a predictive model of a depth camera and manipulator directly from raw execution traces. While the problem of learning manipulator models from visual and proprioceptive data has been addressed before, existing techniques often rely on assumptions about the structure of the robot or tracked features in observation space. We make no such assumptions. Instead, we formulate the problem as that of learning a high-dimensional controlled stochastic process. We leverage recent work on nonparametric predictive state representations to learn a generative model of the depth camera and robotic arm from sequences of uninterpreted actions and observations. We perform several experiments in which we demonstrate that our learned model can accurately predict future observations in response to sequences of motor commands.
    BibTeX:
    @inproceedings{Boots-Depthcam-Manipulator,
      Author = "Byron Boots and Arunkumar Byravan and Dieter Fox",
      Booktitle = {Proceedings of the International Conference on Robotics and Automation (ICRA-2014)},
      Title = "Learning Predictive Models of a Depth Camera \& Manipulator from Raw Execution Traces",
      Year = {2014}
    }
    A. Byravan, B. Boots, S. Srinivasa, & D. Fox. Space-Time Functional Gradient Optimization for Motion Planning. (48% Acceptance Rate) 2014 Proceedings of the 2014 IEEE Conference on Robotics and Automation (ICRA-2014) 
    Abstract: Functional gradient algorithms (e.g. CHOMP) have recently shown great promise for producing optimal motion for complex many degree-of-freedom robots. A key limitation of such algorithms, is the difficulty in incorporating constraints and cost functions that explicitly depend on time. We present T-CHOMP, a functional gradient algorithm that overcomes this limitation by directly optimizing in space-time. We outline a framework for joint space-time optimization, derive an efficient trajectory-wide update for maintaining time monotonicity, and demonstrate the significance of T-CHOMP over CHOMP in several scenarios. By manipulating time, T-CHOMP produces lower-cost trajectories leading to behavior that is meaningfully different from CHOMP.
    BibTeX:
    @inproceedings{Byravan-Space-Time,
      Author = "Arunkumar Byravan and Byron Boots and Siddhartha Srinivasa and Dieter Fox",
      Booktitle = {Proceedings of the International Conference on Robotics and Automation (ICRA-2014)},
      Title = "Space-Time Functional Gradient Optimization for Motion Planning",
      Year = {2014}
    }
    B. Boots, A. Gretton, & G. J. Gordon. Hilbert Space Embeddings of Predictive State Representations.
    (Selected for plenary presentation: 11% Acceptance Rate)
    2013 Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence
    (UAI-2013) 
    Abstract: Predictive State Representations (PSRs) are an expressive class of models for controlled stochastic processes. PSRs represent state as a set of predictions of future observable events. Because PSRs are defined entirely in terms of observable data, statistically consistent estimates of PSR parameters can be learned efficiently by manipulating moments of observed training data. Most learning algorithms for PSRs have assumed that actions and observations are finite with low cardinality. In this paper, we generalize PSRs to infinite sets of observations and actions, using the recent concept of Hilbert space embeddings of distributions. The essence is to represent the state as a nonparametric conditional embedding operator in a Reproducing Kernel Hilbert Space (RKHS) and leverage recent work in kernel methods to estimate, predict, and update the representation. We show that these Hilbert space embeddings of PSRs are able to gracefully handle continuous actions and observations, and that our learned models outperform competing system identification algorithms on several prediction benchmarks.
    BibTeX:
    @inproceedings{Boots-HSE-PSRs,
      Author = "Byron Boots and Arthur Gretton and Geoffrey J. Gordon",
      Booktitle = {Proceedings of the 29th International Conference on Uncertainty in Artificial Intelligence (UAI-2013)},
      Title = "Hilbert Space Embeddings of Predictive State Representations",
      Year = {2013}
    }
    B. Boots & G. J. Gordon. A Spectral Learning Approach to Range-Only SLAM.
    (24% Acceptance Rate)
    2013 Proceedings of the 30th International Conference on Machine Learning
    (ICML-2013) 
    Abstract: We present a novel spectral learning algorithm for simultaneous localization and mapping (SLAM) from range data with known correspondences. This algorithm is an instance of a general spectral system identification framework, from which it inherits several desirable properties, including statistical consistency and no local optima. Compared with popular batch optimization or multiple-hypothesis tracking (MHT) methods for range-only SLAM, our spectral approach offers guaranteed low computational requirements and good tracking performance. Compared with popular extended Kalman filter (EKF) or extended information filter (EIF) approaches, and many MHT ones, our approach does not need to linearize a transition or measurement model; such linearizations can cause severe errors in EKFs and EIFs, and to a lesser extent MHT, particularly for the highly non-Gaussian posteriors encountered in range-only SLAM. We provide a theoretical analysis of our method, including finite-sample error bounds. Finally, we demonstrate on a real-world robotic SLAM problem that our algorithm is not only theoretically justified, but works well in practice: in a comparison of multiple methods, the lowest errors come from a combination of our algorithm with batch optimization, but our method alone produces nearly as good a result at far lower computational cost.
    BibTeX:
    @inproceedings{Boots-spectralROSLAM,
      Author = "Byron Boots and Geoffrey Gordon ",
      Booktitle = "Proceedings of the 30th International Conference on Machine Learning (ICML-2013)",
      Title = "A Spectral Learning Approach to Range-Only {SLAM}",
      Year = {2013}
    }
    B. Boots & G. J. Gordon. Two-Manifold Problems with Applications to Nonlinear System Identification. (27% Acceptance Rate) 2012 Proceedings of the 29th International Conference on Machine Learning
    (ICML-2012) 
    Abstract: Recently, there has been much interest in spectral approaches to learning manifolds---so-called kernel eigenmap methods. These methods have had some successes, but their applicability is limited because they are not robust to noise. To address this limitation, we look at two-manifold problems, in which we simultaneously reconstruct two related manifolds, each representing a different view of the same data. By solving these interconnected learning problems together, two-manifold algorithms are able to succeed where a non-integrated approach would fail: each view allows us to suppress noise in the other, reducing bias. We propose a class of algorithms for two-manifold problems, based on spectral decomposition of cross-covariance operators in Hilbert space, and discuss when two-manifold problems are useful. Finally, we demonstrate that solving a two-manifold problem can aid in learning a nonlinear dynamical system from limited data.
    BibTeX:
    @inproceedings{Boots-2-Manifold,
      Author = "Byron Boots and Geoffrey Gordon ",
      Booktitle = "Proceedings of the 29th International Conference on Machine Learning (ICML-2012)",
      Title = "Two-Manifold Problems with Applications to Nonlinear System Identification",
      Year = {2012}
    }
    B. Boots & G. J. Gordon. An Online Spectral Learning Algorithm for Partially Observable Nonlinear Dynamical Systems.
    (25% Acceptance Rate)
    2011 Proceedings of the 25th Conference on Artificial Intelligence
    (AAAI-2011)
    Abstract: Recently, a number of researchers have proposed spectral algorithms for learning models of dynamical systems---for example, Hidden Markov Models (HMMs), Partially Observable Markov Decision Processes (POMDPs), and Transformed Predictive State Representations (TPSRs). These algorithms are attractive since they are statistically consistent and not subject to local optima. However, they are batch methods: they need to store their entire training data set in memory at once and operate on it as a large matrix, and so they cannot scale to extremely large data sets (either many examples or many features per example). In turn, this restriction limits their ability to learn accurate models of complex systems. To overcome these limitations, we propose a new online spectral algorithm, which uses tricks such as incremental SVD updates and random projections to scale to much larger data sets and more complex systems than previous methods. We demonstrate the new method on a high-bandwidth video mapping task, and illustrate desirable behaviors such as "closing the loop,'' where the latent state representation changes suddenly as the learner recognizes that it has returned to a previously known place.
    BibTeX:
    @inproceedings{Boots-online-psr,
      Author = "Byron Boots and Geoffrey Gordon ",
      Booktitle = "Proceedings of the 25th National Conference on Artificial Intelligence (AAAI-2011)",
      Title = "An Online Spectral Learning Algorithm for Partially Observable Nonlinear Dynamical Systems ",
      Year = "2011"
    }
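
One of the building blocks mentioned in the abstract, updating a thin SVD as new data columns arrive, fits in a few lines. The function below is a standard rank-one SVD update, shown without the truncation and random projections that the paper combines it with; the check against a batch SVD is only a small sanity test.

import numpy as np

def svd_append_column(U, S, Vt, c):
    """Update a thin SVD U @ diag(S) @ Vt when a new column c is appended (no truncation)."""
    p = U.T @ c                      # component of c inside the current column space
    r = c - U @ p                    # residual orthogonal to the current column space
    rn = np.linalg.norm(r)
    q = r / rn if rn > 1e-12 else np.zeros_like(r)
    # Small core matrix whose SVD gives the update.
    K = np.block([[np.diag(S), p[:, None]],
                  [np.zeros((1, len(S))), np.array([[rn]])]])
    Uk, Sk, Vkt = np.linalg.svd(K)
    U_new = np.hstack([U, q[:, None]]) @ Uk
    V_new = np.block([[Vt.T, np.zeros((Vt.shape[1], 1))],
                      [np.zeros((1, Vt.shape[0])), np.ones((1, 1))]]) @ Vkt.T
    return U_new, Sk, V_new.T

# Sanity check against a batch SVD on random data.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
U, S, Vt = np.linalg.svd(A[:, :3], full_matrices=False)
U, S, Vt = svd_append_column(U, S, Vt, A[:, 3])
print("reconstruction matches batch SVD:", np.allclose(U @ np.diag(S) @ Vt, A))
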
    B. Boots, S. M. Siddiqi & G. J. Gordon. Closing the Learning-Planning Loop with Predictive State Representations. (Invited Journal Paper) 2011 The International Journal of Robotics Research (IJRR)
    Abstract: A central problem in artificial intelligence is to choose actions to maximize reward in a partially observable, uncertain environment. To do so, we must learn an accurate model of our environment, and then plan to maximize reward. Unfortunately, learning algorithms often recover a model which is too inaccurate to support planning or too large and complex for planning to be feasible; or, they require large amounts of prior domain knowledge or fail to provide important guarantees such as statistical consistency. To begin to fill this gap, we propose a novel algorithm which provably learns a compact, accurate model directly from sequences of action-observation pairs. To evaluate the learned model, we then close the loop from observations to actions: we plan in the learned model and recover a policy which is near-optimal in the original environment (not the model). In more detail, we present a spectral algorithm for learning a Predictive State Representation (PSR). We demonstrate the algorithm by learning a model of a simulated high-dimensional, vision-based mobile robot planning task, and then performing approximate point-based planning in the learned PSR. This experiment shows that the algorithm learns a state space which captures the essential features of the environment, allows accurate prediction with a small number of parameters, and enables successful and efficient planning. Our algorithm has several benefits which have not appeared together in any previous PSR learner: it is computationally efficient and statistically consistent; it handles high-dimensional observations and long time horizons by working from real-valued features of observation sequences; and finally, our close-the-loop experiments provide an end-to-end practical test.
    BibTeX:
    @article{Boots-2011b,
      Author = "Byron Boots and Sajid Siddiqi and Geoffrey Gordon",
      Title = "Closing the Learning-Planning Loop with Predictive State Representations",
      Journal = "International Journal of Robotics Research (IJRR)",
      Volume = "30",
      Year = "2011",
      Pages = "954-956"
    }
    B. Boots & G. J. Gordon. Predictive State Temporal Difference Learning.
    (24% Acceptance Rate)
    2011 Proceedings of Advances in Neural Information Processing Systems 24 (NIPS-2010)    
    Abstract: We propose a new approach to value function approximation which combines linear temporal difference reinforcement learning with subspace identification. In practical applications, reinforcement learning (RL) is complicated by the fact that state is either high-dimensional or partially observable. Therefore, RL methods are designed to work with features of state rather than state itself, and the success or failure of learning is often determined by the suitability of the selected features. By comparison, subspace identification (SSID) methods are designed to select a feature set which preserves as much information as possible about state. In this paper we connect the two approaches, looking at the problem of reinforcement learning with a large set of features, each of which may only be marginally useful for value function approximation. We introduce a new algorithm for this situation, called Predictive State Temporal Difference (PSTD) learning. As in SSID for predictive state representations, PSTD finds a linear compression operator that projects a large set of features down to a small set that preserves the maximum amount of predictive information. As in RL, PSTD then uses a Bellman recursion to estimate a value function. We discuss the connection between PSTD and prior approaches in RL and SSID. We prove that PSTD is statistically consistent, perform several experiments that illustrate its properties, and demonstrate its potential on a difficult optimal stopping problem.
    BibTeX:
    @inproceedings{Boots11a,	
      Author = "Byron Boots and Geoffrey J. Gordon ",
      Booktitle = "Proceedings of Advances in Neural Information Processing Systems 24 (NIPS-2011)",
      Title = "Predictive State Temporal Difference Learning",
      Year = {2011}
    }
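
A toy rendering of the two-step recipe in PSTD: compress a large feature set with the top singular vectors of a covariance between current and near-future features, then run LSTD on the compressed features. The synthetic chain, feature construction, and constants below are illustrative; this is not the consistent estimator analyzed in the paper.

import numpy as np
rng = np.random.default_rng(0)

# Toy chain with a hidden scalar state, high-dimensional noisy features, and a reward.
T, d, k, gamma = 5000, 20, 2, 0.9
x = np.zeros(T)
for t in range(T - 1):
    x[t + 1] = 0.95 * x[t] + 0.3 * rng.standard_normal()
W_feat = rng.standard_normal((d, 1))
phi = x[:, None] * W_feat.T + 0.5 * rng.standard_normal((T, d))   # large, mostly redundant features
r = x + 0.1 * rng.standard_normal(T)                              # reward

# Step 1 (subspace identification flavor): compress features with the top singular vectors
# of the covariance between current features and features of the near future.
future = np.hstack([phi[1:-1], phi[2:]])
C = phi[:-2].T @ future / (T - 2)
U = np.linalg.svd(C)[0]
B = U[:, :k]                                            # compression operator: d -> k
s = phi @ B                                             # compressed predictive features

# Step 2 (RL flavor): LSTD value-function estimation on the compressed features.
S, S_next, R = s[:-1], s[1:], r[:-1]
A = S.T @ (S - gamma * S_next)
w = np.linalg.solve(A + 1e-6 * np.eye(k), S.T @ R)
print("value estimates at a few steps:", np.round(s[:5] @ w, 2))
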
    L. Song, B. Boots, S. M. Siddiqi, G. J. Gordon & A. J. Smola. Hilbert Space Embeddings of Hidden Markov Models. 2010 Proceedings of the 27th International Conference on Machine Learning
    (ICML-2010) 
    Abstract: Hidden Markov Models (HMMs) are important tools for modeling sequence data. However, they are restricted to discrete latent states, and are largely restricted to Gaussian and discrete observations. And, learning algorithms for HMMs have predominantly relied on local search heuristics, with the exception of spectral methods such as those described below. We propose a Hilbert space embedding of HMMs that extends traditional HMMs to structured and non-Gaussian continuous distributions. Furthermore, we derive a local-minimum-free kernel spectral algorithm for learning these HMMs. We apply our method to robot vision data, slot car inertial sensor data and audio event classification data, and show that in these applications, embedded HMMs exceed the previous state-of-the-art performance.
    BibTeX:
    @inproceedings{Song:2010fk,	
      Author = "L. Song and B. Boots and S. M. Siddiqi and G. J. Gordon and A. J. Smola",
      Booktitle = "Proceedings of the 27th International Conference on Machine Learning (ICML-210)",
      Title = "Hilbert Space Embeddings of Hidden {M}arkov Models",
      Year = {2010}
    }
    B. Boots, S. M. Siddiqi & G. J. Gordon. Closing the Learning-Planning Loop with Predictive State Representations.
    (Selected for plenary presentation: 8% acceptance rate)
    2010 Proceedings of Robotics: Science and Systems VI
    (RSS-2010)
    Abstract: A central problem in artificial intelligence is to choose actions to maximize reward in a partially observable, uncertain environment. To do so, we must learn an accurate model of our environment, and then plan to maximize reward. Unfortunately, learning algorithms often recover a model which is too inaccurate to support planning or too large and complex for planning to be feasible; or, they require large amounts of prior domain knowledge or fail to provide important guarantees such as statistical consistency. To begin to fill this gap, we propose a novel algorithm which provably learns a compact, accurate model directly from sequences of action-observation pairs. To evaluate the learned model, we then close the loop from observations to actions: we plan in the learned model and recover a policy which is near-optimal in the original environment (not the model). In more detail, we present a spectral algorithm for learning a Predictive State Representation (PSR). We demonstrate the algorithm by learning a model of a simulated high-dimensional, vision-based mobile robot planning task, and then performing approximate point-based planning in the learned PSR. This experiment shows that the algorithm learns a state space which captures the essential features of the environment, allows accurate prediction with a small number of parameters, and enables successful and efficient planning. Our algorithm has several benefits which have not appeared together in any previous PSR learner: it is computationally efficient and statistically consistent; it handles high-dimensional observations and long time horizons by working from real-valued features of observation sequences; and finally, our close-the-loop experiments provide an end-to-end practical test.
    BibTeX:
    @inproceedings{Boots-RSS-10,
      Author = "B. Boots and S. Siddiqi and G. Gordon",
      Title = "Closing the Learning-Planning Loop with Predictive State Representations",
      Booktitle = "Proceedings of Robotics: Science and Systems VI (RSS-2010)",
      Year = "2010",
      Address = "Zaragoza, Spain",
      Month = "June"
    }
                  
    S. M. Siddiqi, B. Boots & G. J. Gordon. Reduced-Rank Hidden Markov Models.
    (Selected for plenary presentation: 8% acceptance rate)
    2010 Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS-2010)  
    Abstract: Hsu et al. (2009) recently proposed an efficient, accurate spectral learning algorithm for Hidden Markov Models (HMMs). In this paper we relax their assumptions and prove a tighter finite-sample error bound for the case of Reduced-Rank HMMs, i.e., HMMs with low-rank transition matrices. Since rank-k RR-HMMs are a larger class of models than k-state HMMs while being equally efficient to work with, this relaxation greatly increases the learning algorithm's scope. In addition, we generalize the algorithm and bounds to models where multiple observations are needed to disambiguate state, and to models that emit multivariate real-valued observations. Finally we prove consistency for learning Predictive State Representations, an even larger class of models. Experiments on synthetic data and a toy video, as well as on difficult robot vision data, yield accurate models that compare favorably with alternatives in simulation quality and prediction accuracy.
    BibTeX:
    @inproceedings{Siddiqi10a,
      author = "Sajid Siddiqi and Byron Boots and Geoffrey J. Gordon",
      title = "Reduced-Rank Hidden {Markov} Models",
      booktitle = "Proceedings of the Thirteenth International Conference 
      on Artificial Intelligence and Statistics (AISTATS-2010)",
      year = "2010"
    }
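
The spectral HMM algorithm of Hsu et al. (2009) that this paper relaxes can itself be sketched compactly: estimate low-order moments of consecutive observations, take an SVD of the bigram matrix, and form one observable operator per symbol. The snippet below is a bare-bones version of that baseline on synthetic discrete data; it omits the reduced-rank and real-valued-observation extensions contributed by the paper, and the small HMM is invented for illustration.

import numpy as np
rng = np.random.default_rng(0)

# A small true HMM: 2 hidden states, 3 discrete observations.
T_mat = np.array([[0.9, 0.2],
                  [0.1, 0.8]])                 # T_mat[s', s] = Pr(next state s' | state s)
O_mat = np.array([[0.8, 0.1],
                  [0.1, 0.6],
                  [0.1, 0.3]])                 # O_mat[o, s] = Pr(obs o | state s)
pi = np.array([0.5, 0.5])
n_obs, k = 3, 2

def sample_triple():
    s = rng.choice(2, p=pi)
    out = []
    for _ in range(3):
        out.append(rng.choice(n_obs, p=O_mat[:, s]))
        s = rng.choice(2, p=T_mat[:, s])
    return out

# Empirical low-order moments over the first three observations.
N = 100000
P1 = np.zeros(n_obs); P21 = np.zeros((n_obs, n_obs)); P3x1 = np.zeros((n_obs, n_obs, n_obs))
for _ in range(N):
    a, b, c = sample_triple()
    P1[a] += 1; P21[b, a] += 1; P3x1[b, c, a] += 1
P1 /= N; P21 /= N; P3x1 /= N

# Spectral learning (Hsu et al., 2009): SVD of the bigram matrix, then one operator per symbol.
U = np.linalg.svd(P21)[0][:, :k]
b1 = U.T @ P1
binf = np.linalg.pinv(P21.T @ U) @ P1
B = [U.T @ P3x1[x] @ np.linalg.pinv(U.T @ P21) for x in range(n_obs)]

def spectral_prob(seq):
    b = b1.copy()
    for x in seq:
        b = B[x] @ b
    return float(binf @ b)

def true_prob(seq):                            # forward algorithm on the true HMM
    alpha = pi.copy()
    for x in seq:
        alpha = T_mat @ (O_mat[x] * alpha)
    return float(alpha.sum())

seq = [0, 1, 2]
# the two numbers should approximately agree
print("spectral:", round(spectral_prob(seq), 4), " true:", round(true_prob(seq), 4))
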
    B. Boots, S. M. Siddiqi & G. J. Gordon. Closing the Learning-Planning Loop with Predictive State Representations. (Short paper: for longer version see paper accepted at RSS above) 2010 Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems
    (AAMAS-2010)
    Abstract: A central problem in artificial intelligence is to plan to maximize future reward under uncertainty in a partially observable environment. Models of such environments include Partially Observable Markov Decision Processes (POMDPs) as well as their generalizations, Predictive State Representations (PSRs) and Observable Operator Models (OOMs). POMDPs model the state of the world as a latent variable; in contrast, PSRs and OOMs represent state by tracking occurrence probabilities of a set of future events (called tests or characteristic events) conditioned on past events (called histories or indicative events). Unfortunately, exact planning algorithms such as value iteration are intractable for most realistic POMDPs due to the curse of history and the curse of dimensionality. However, PSRs and OOMs hold the promise of mitigating both of these curses: first, many successful approximate planning techniques designed to address these problems in POMDPs can easily be adapted to PSRs and OOMs. Second, PSRs and OOMs are often more compact than their corresponding POMDPs (i.e., need fewer state dimensions), mitigating the curse of dimensionality. Finally, since tests and histories are observable quantities, it has been suggested that PSRs and OOMs should be easier to learn than POMDPs; with a successful learning algorithm, we can look for a model which ignores all but the most important components of state, reducing dimensionality still further.
    BibTeX:
    @inproceedings{Boots10a,
      author = "Byron Boots and Sajid Siddiqi and Geoffrey J. Gordon",
      title = "Closing the Learning-Planning Loop with Predictive State Representations",
      booktitle = "Proceedings of the 9th International Conference on Autonomous 
      Agents and Multiagent Systems (AAMAS-2010)",
      year = "2010"
    }
    S. M. Siddiqi, B. Boots & G. J. Gordon. A Constraint Generation Approach to Learning Stable Linear Dynamical Systems.
    (Selected for plenary presentation: 2.5% acceptance rate)
    2008 Proceedings of Advances in Neural Information Processing Systems 21 (NIPS-2007)  
    Abstract: Stability is a desirable characteristic for linear dynamical systems, but it is often ignored by algorithms that learn these systems from data. We propose a novel method for learning stable linear dynamical systems: we formulate an approximation of the problem as a convex program, start with a solution to a relaxed version of the program, and incrementally add constraints to improve stability. Rather than continuing to generate constraints until we reach a feasible solution, we test stability at each step; because the convex program is only an approximation of the desired problem, this early stopping rule can yield a higher-quality solution. We apply our algorithm to the task of learning dynamic textures from image sequences as well as to modeling biosurveillance drug-sales data. The constraint generation approach leads to noticeable improvement in the quality of simulated sequences. We compare our method to those of Lacy and Bernstein, with positive results in terms of accuracy, quality of simulated sequences, and efficiency.
    BibTeX:
    @inproceedings{Siddiqi07b,
      author = "Sajid Siddiqi and Byron Boots and Geoffrey J. Gordon",
      title = "A Constraint Generation Approach to Learning Stable Linear Dynamical Systems",
      booktitle = "Proceedings of Advances in Neural Information Processing Systems 20 (NIPS-07)",
      year = "2007"
    }
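
A minimal version of the problem setup: fit the transition matrix of a linear dynamical system by least squares, check the spectral radius, and repair the estimate if it is unstable. The repair below simply clips the singular values, which is far cruder than the constraint-generation convex program in the paper, but it shows the check-and-fix loop; the system and constants are illustrative.

import numpy as np
rng = np.random.default_rng(0)

# Simulate a stable 2-D linear system and fit the transition matrix A by least squares.
A_true = np.array([[0.99, 0.3], [0.0, 0.95]])
X = [rng.standard_normal(2)]
for _ in range(200):
    X.append(A_true @ X[-1] + 0.5 * rng.standard_normal(2))
X = np.array(X)
X0, X1 = X[:-1], X[1:]
A_ls = np.linalg.lstsq(X0, X1, rcond=None)[0].T     # least-squares estimate of A

def spectral_radius(A):
    return np.max(np.abs(np.linalg.eigvals(A)))

# Crude stability repair: if the estimate is unstable, clip singular values below one
# (not the paper's constraint generation, which preserves accuracy much better).
A = A_ls.copy()
while spectral_radius(A) >= 1.0:
    U, s, Vt = np.linalg.svd(A)
    A = U @ np.diag(np.minimum(s, 1.0 - 1e-3)) @ Vt

print("spectral radius: least squares =", round(spectral_radius(A_ls), 3),
      " repaired =", round(spectral_radius(A), 3))
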
    S. M. Siddiqi, B. Boots, G. J. Gordon, & A. W. Dubrawski. Learning Stable Multivariate Baseline Models for Outbreak Detection. 2007 Advances in Disease Surveillance
    Abstract: We propose a novel technique for building generative models of real-valued multivariate time series data streams. Such models are of considerable utility as baseline simulators in anomaly detection systems. The proposed algorithm, based on Linear Dynamical Systems (LDS), learns stable parameters efficiently while yielding more accurate results than previously known methods. The resulting model can be used to generate infinitely long sequences of realistic baselines using small samples of training data.
    BibTeX:
    @article{Siddiqi07c,
      author = "Sajid Siddiqi and Byron Boots and Geoffrey J. Gordon and Artur W. Dubrawski",
      title = "Learning Stable Multivariate Baseline Models for Outbreak Detection",
      journal = "Advances in Disease Surveillance",
      year = "2007",
      volume = {4},
      pages = {266}
    }
    
    B. Boots, S. Nundy & D. Purves. Evolution of Visually Guided Behavior in Artificial Agents. 2007 Network: Computation in Neural Systems  
    Abstract: Recent work on brightness, color, and form has suggested that human visual percepts represent the probable sources of retinal images rather than stimulus features as such. Here we investigate the plausibility of this empirical concept of vision by allowing autonomous agents to evolve in virtual environments based solely on the relative success of their behavior. The responses of evolved agents to visual stimuli indicate that fitness improves as the neural network control systems gradually incorporate the statistical relationship between projected images and behavior appropriate to the sources of the inherently ambiguous images. These results: (1) demonstrate the merits of a wholly empirical strategy of animal vision as a means of contending with the inverse optics problem; (2) argue that the information incorporated into biological visual processing circuitry is the relationship between images and their probable sources; and (3) suggest why human percepts do not map neatly onto physical reality.
    BibTeX:
    @article{Boots2007a,
      author = "Byron Boots and Surajit Nundy and Dale Purves",
      title = "Evolution of Visually Guided Behavior in Artificial Agents",
      journal = "Network: Computation in Neural Systems",
      year = "2007",
      volume = {18(1)},
      pages = {11--34}
    }
    S. Majercik & B. Boots. DC-SSAT: A Divide-and-Conquer Approach to Solving Stochastic Satisfiability Problems Efficiently.
    (18% acceptance rate)
    2005 Proceedings of the 20th National Conference on Artificial Intelligence
    (AAAI-2005)
    Abstract: We present DC-SSAT, a sound and complete divide-and-conquer algorithm for solving stochastic satisfiability (SSAT) problems that outperforms the best existing algorithm for solving such problems (ZANDER) by several orders of magnitude with respect to both time and space. DC-SSAT achieves this performance by dividing the SSAT problem into subproblems based on the structure of the original instance, caching the viable partial assignments (VPAs) generated by solving these subproblems, and using these VPAs to construct the solution to the original problem. DC-SSAT does not save redundant VPAs and each VPA saved is necessary to construct the solution. Furthermore, DC-SSAT builds a solution that is already human-comprehensible, allowing it to avoid the costly solution rebuilding phase in ZANDER. As a result, DC-SSAT is able to solve problems using, typically, 1-2 orders of magnitude less space than ZANDER, allowing DC-SSAT to solve problems ZANDER cannot solve due to space constraints. And, in spite of its more parsimonious use of space, DC-SSAT is typically 1-2 orders of magnitude faster than ZANDER. We describe the DC-SSAT algorithm and present empirical results comparing its performance to that of ZANDER on a set of SSAT problems.
    BibTeX:
    @inproceedings{Majercik2005,
      author = "Stephen M. Majercik and Byron Boots",
      title = "DC-SSAT: A Divide-and-Conquer Approach to Solving 
      Stochastic Satisfiability Problems Efficiently",
      booktitle = "Proceedings of the Twentieth National Conference 
      on Artificial Intelligence (AAAI-05)",
      year = "2005",
      note = "AAAI-05"
    }

Refereed Abstracts & Workshop Publications

    Y. Pan, X. Yan, E. Theodorou, & B. Boots. Scalable Reinforcement Learning via Trajectory Optimization and Approximate Gaussian Process Regression. 2015 NIPS Workshop on Advances in Approximate Bayesian Inference
    Abstract: In order to design an efficient RL algorithm, we combine the attractive characteristics of two approaches: local trajectory optimization and random feature approximations. Local trajectory optimization methods, such as Differential Dynamic Programming (DDP), are a class of approaches for solving nonlinear optimal control problems. Compared to global approaches, DDP shows superior computational efficiency and scalability to high-dimensional problems. The principal limitation of DDP is that it relies on accurate and explicit representation of the dynamics. In this work we take a nonparametric approach to learn the dynamics based on Gaussian processes (GPs). GPs have demonstrated encouraging performance in modeling dynamical systems, but are also computationally expensive and do not scale to moderate or large datasets. While a number of approximation methods exist, sparse spectrum Gaussian process regression (SSGPR) stands out with a superior combination of efficiency and accuracy. By combining the benefits of both DDP and SSGPR, we show that our approach is able to scale to high-dimensional dynamical systems and large datasets.
    BibTeX:
    @article{PanNIPSWS15,
      author = "Yunpeng Pan and Xinyan Yan and Evangelos Theodorou and Byron Boots",
      title = "Scalable Reinforcement Learning via Trajectory Optimization and Approximate {G}aussian Process Regression",
      journal = "NIPS Workshop on Advances in Approximate Bayesian Inference",
      year = "2015"
    }
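
Sparse spectrum GPR, the approximation leveraged above, amounts to Bayesian linear regression on trigonometric random features, which keeps training cost linear in the number of data points. Below is a compact sketch of that approximation for an RBF kernel with unit lengthscale; the dataset, constants, and the ridge term standing in for the weight prior are illustrative choices.

import numpy as np
rng = np.random.default_rng(0)

# Sparse spectrum GPR ~ Bayesian linear regression on random trigonometric features.
n, m, noise = 500, 50, 0.1                       # data points, spectral points, observation noise std
X = rng.uniform(-3, 3, size=(n, 1))
y = np.sin(X[:, 0]) + noise * rng.standard_normal(n)

W = rng.standard_normal((m, 1))                  # spectral frequencies for an RBF kernel, unit lengthscale
def features(X):
    Z = X @ W.T                                  # projections onto the sampled frequencies
    return np.hstack([np.cos(Z), np.sin(Z)]) / np.sqrt(m)

Phi = features(X)                                # (n, 2m) -- cost is linear in n, unlike exact GPR
A = Phi.T @ Phi + noise ** 2 * np.eye(2 * m)     # ridge term standing in for the weight prior
w_mean = np.linalg.solve(A, Phi.T @ y)           # posterior-mean weights

X_test = np.linspace(-3, 3, 5)[:, None]
print("prediction:", np.round(features(X_test) @ w_mean, 2))
print("truth     :", np.round(np.sin(X_test[:, 0]), 2))
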
    
    A. Shaban, M. Farajtabar, B. Xie, L. Song, & B. Boots. Learning Latent Variable Models by Improving Spectral Solutions with Exterior Point Methods. 2015 NIPS Workshop on Non-convex Optimization for Machine Learning: Theory and Practice
    Abstract: Probabilistic latent-variable models are a fundamental tool in statistics and machine learning. Despite their widespread use, identifying the parameters of basic latent variable models continues to be an extremely challenging problem. Traditional maximum likelihood-based learning algorithms find valid parameters, but suffer from high computational cost, slow convergence, and local optima. In contrast, recently developed method of moments-based algorithms are computationally efficient and provide strong statistical guarantees, but are not guaranteed to find valid parameters. In this work, we introduce a two-stage learning algorithm for latent variable models. We first use method of moments to find a solution that is close to the optimal solution but not necessarily in the valid set of model parameters. We then incrementally refine the solution via exterior point optimization until a local optima that is arbitrarily near the valid set of parameters is found. We perform several experiments on synthetic and real-world data and show that our approach is more accurate than previous work, especially when training data is limited.
    BibTeX:
    @article{ShabanNIPSWS15,
      author = "Amirreza Shaban and  Mehrdad Farajtabar and Bo Xie and Le Song and Byron Boots",
      title = "Learning Latent Variable Models by Improving Spectral Solutions with Exterior Point Methods",
      journal = "NIPS Workshop on Non-convex Optimization for Machine Learning: Theory and Practice",
      year = "2015"
    }
    
    X. Yan, V. Indelman, & B. Boots. Incremental Sparse GP Regression for Continuous-time Trajectory Estimation & Mapping. 2015 RSS Workshop on the Problem of Mobile Sensors: Setting future goals and indicators of progress for SLAM.
    Abstract: Recent work on simultaneous trajectory estimation and mapping (STEAM) for mobile robots has found success by representing the trajectory as a Gaussian process. Gaussian processes can represent a continuous-time trajectory, elegantly handle asynchronous and sparse measurements, and allow the robot to query the trajectory to recover its estimated position at any time of interest. A major drawback of this approach is that STEAM is formulated as a batch estimation problem. In this paper we provide the critical extensions necessary to transform the existing batch algorithm into an extremely efficient incremental algorithm. In particular, we are able to vastly speed up the solution time through efficient variable reordering and incremental sparse updates, which we believe will greatly increase the practicality of Gaussian process methods for robot mapping and localization. Finally, we demonstrate the approach and its advantages on both synthetic and real datasets.
    BibTeX:
    @article{YanRSSWS15,
      author = "Xinyan Yan and Vadim Indelman and Byron Boots",
      title = "Incremental Sparse {GP} Regression for Continuous-time Trajectory Estimation \& Mapping",
      journal = "The Problem of Mobile Sensors: Setting future goals and indicators of progress for SLAM",
      year = "2015"
    }
    
    X. Yan, B. Xie, L. Song, & B. Boots. Large-Scale Gaussian Process Regression via Doubly Stochastic Gradient Descent. 2015 The ICML Workshop on Large-Scale Kernel Learning: Challenges and New Opportunities
    Abstract: Gaussian process regression (GPR) is a popular tool for nonlinear function approximation. Unfortunately, GPR can be difficult to use in practice due to the O(n^2) memory and O(n^3) processing requirements for n training data points. We propose a novel approach to scaling up GPR to handle large datasets using the recent concept of doubly stochastic functional gradients. Our approach relies on the fact that GPR can be expressed as a convex optimization problem that can be solved by making two unbiased stochastic approximations to the functional gradient, one using random training points and another using random features, and then descending using this noisy functional gradient. The effectiveness of the resulting algorithm is evaluated on the well-known problem of learning the inverse dynamics of a robot manipulator.
    BibTeX:
    @article{YanICMLWS15,
      author = "Xinyan Yan and Bo Xie and Le Song and Byron Boots",
      title = "Large-Scale Gaussian Process Regression via Doubly Stochastic Gradient Descent",
      journal = "The ICML Workshop on Large-Scale Kernel Learning",
      year = "2015"
    }
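    
    A minimal Python sketch of the doubly stochastic idea summarized in the abstract above, included for illustration only (it is not the authors' implementation). It assumes an RBF kernel, a squared loss, and toy 1-D data; at each iteration it samples one random training point and one random Fourier feature and takes a noisy functional-gradient step.
    Python sketch:
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy 1-D regression data (a stand-in for the inverse-dynamics task above).
    n = 500
    X = rng.uniform(-3, 3, size=(n, 1))
    y = np.sin(2 * X[:, 0]) + 0.1 * rng.standard_normal(n)

    sigma, lam, T = 0.5, 1e-3, 1000   # RBF bandwidth, regularizer, iterations
    omegas = np.zeros((T, 1))         # one random feature frequency per iteration
    phases = np.zeros(T)
    alphas = np.zeros(T)              # coefficient attached to each sampled feature

    def feature(x, w, b):
        # Random Fourier feature for the RBF kernel: sqrt(2) * cos(w . x + b).
        return np.sqrt(2.0) * np.cos(x @ w + b)

    for t in range(T):
        i = rng.integers(n)                 # stochastic approximation 1: random data point
        w = rng.standard_normal(1) / sigma  # stochastic approximation 2: random feature
        b = rng.uniform(0, 2 * np.pi)
        f_xi = sum(alphas[s] * feature(X[i], omegas[s], phases[s]) for s in range(t))
        gamma = 1.0 / (1.0 + t)             # decaying step size
        alphas[:t] *= 1.0 - gamma * lam     # shrink old coefficients (regularization)
        omegas[t], phases[t] = w, b
        alphas[t] = -gamma * (f_xi - y[i]) * feature(X[i], w, b)

    # Predict at a test input using all features sampled during training.
    x_test = np.array([1.0])
    f_test = sum(alphas[s] * feature(x_test, omegas[s], phases[s]) for s in range(T))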
    
    X. Yan, V. Indelman, & B. Boots. Incremental Sparse Gaussian Process Regression for Continuous-time Trajectory Estimation & Mapping. 2014 NIPS Workshop on Autonomously Learning Robots
    Abstract: Recent work has investigated the problem of continuous-time trajectory estimation and mapping for mobile robots by formulating the problem as sparse Gaussian process regression. Gaussian processes provide a continuous-time representation of the robot trajectory, which elegantly handles asynchronous and sparse measurements, and allows the robot to query the trajectory to recover its estimated position at any time of interest. One of the major drawbacks of this approach is that Gaussian process regression formulates continuous-time trajectory estimation as a batch estimation problem. In this work, we provide the critical extensions necessary to transform this existing batch approach into an extremely efficient incremental approach. In particular, we are able to vastly speed up the solution time through efficient variable reordering and incremental sparse updates, which we believe will greatly increase the practicality of Gaussian process methods for robot mapping and localization. Finally, we demonstrate the approach and its advantages on both synthetic and real datasets.
    BibTeX:
    @article{BootsNIPSWS14,
      author = "Xinyan Yan and Vadim Indelman and Byron Boots",
      title = "Incremental Sparse GP Regression for Continuous-time Trajectory Estimation & Mapping",
      journal = "NIPS Workshop on Autonomously Learning Robots",
      year = "2014"
    }
    
    Z. Marinho, A. Dragan, A. Byravan, S. Srinivasa, G. Gordon, & B. Boots. Functional Gradient Motion Planning in Reproducing Kernel Hilbert Spaces. 2014 NIPS Workshop on Autonomously Learning Robots
    Abstract: We introduce a functional gradient descent based trajectory optimization algorithm for robot motion planning in arbitrary Reproducing Kernel Hilbert Spaces (RKHSs). Functional gradient algorithms are a popular choice for motion planning in complex, many-degree-of-freedom robots. In theory, these algorithms work by directly optimizing continuous trajectories to avoid obstacles while maintaining smoothness. However, in practice, functional gradient algorithms commit to a finite parametrization of the trajectories, often as a finite set of waypoints. Such a parametrization limits expressiveness, and can fail to produce smooth trajectories despite the inclusion of smoothness in the objective. As a result, we often observe practical problems such as slow convergence and the requirement to choose an inconveniently small step size. Our work generalizes the waypoint parametrization to arbitrary RKHSs by formulating trajectory optimization as minimization of a cost functional. We derive a gradient update method that is able to take larger steps and achieve a locally optimal trajectory in just a few iterations. Depending on the selection of a kernel, we can directly optimize in spaces of continuous trajectories that are inherently smooth, and that have a low-dimensional, adaptively chosen parametrization. Our experiments illustrate the effectiveness of the planner for two different kernels, RBFs and B-splines, as compared to the standard discretized waypoint representation.
    BibTeX:
    @article{Marinho14,
      author = "Zita Marinho and Anca Dragan and Arun Byravan and
      Siddhartha Srinivasa and Geoffrey J. Gordon and Byron Boots",
      title = "Functional Gradient Motion Planning in Reproducing Kernel Hilbert Spaces",
      journal = "NIPS Workshop on Autonomously Learning Robots",
      year = "2014"
    }
    
    A. Venkatraman, B. Boots, M. Hebert, & J. A. Bagnell. Data as Demonstrator with Applications to System Identification. 2014 NIPS Workshop on Autonomously Learning Robots
    Abstract: Machine learning techniques for system identification and time series modeling often phrase the problem as the optimization of a loss function over a single timestep prediction. However, in many applications, the learned model is recursively applied in order to make a multiple-step prediction, resulting in compounding prediction errors. We present DATA AS DEMONSTRATOR, an approach that reuses training data to make a no-regret learner robust to errors made during multistep prediction. We present results on the task of linear system identification applied to a simulated system and to a real-world dataset.
    BibTeX:
    @article{Venkatraman14,
      author = "Arun Venkatraman and Byron Boots and Martial Hebert and James Bagnell",
      title = "Data as Demonstrator with Applications to System Identification",
      journal = "NIPS Workshop on Autonomously Learning Robots",
      year = "2014"
    }
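    
    A minimal Python sketch of the DATA AS DEMONSTRATOR loop described in the abstract above, for illustration only: a least-squares one-step predictor is repeatedly rolled out on a toy linear system (the toy dynamics and the number of rounds are assumptions, not from the paper), and each (predicted state, true next state) pair visited during the rollout is added back to the training set before refitting.
    Python sketch:
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy ground-truth trajectory from a stable 2-D linear system.
    A_true = np.array([[0.9, 0.1], [-0.1, 0.9]])
    T = 50
    traj = np.zeros((T, 2))
    traj[0] = [1.0, 0.0]
    for t in range(T - 1):
        traj[t + 1] = A_true @ traj[t] + 0.01 * rng.standard_normal(2)

    def fit(inputs, targets):
        # One-step predictor x_{t+1} ~ A x_t, fit by least squares.
        W, *_ = np.linalg.lstsq(inputs, targets, rcond=None)
        return W.T

    # Start from the standard single-step training pairs.
    inputs, targets = traj[:-1].copy(), traj[1:].copy()
    A_hat = fit(inputs, targets)

    for _ in range(5):                      # a few data-aggregation rounds
        pred = traj[0].copy()
        new_inputs, new_targets = [], []
        for t in range(T - 1):
            # The rollout has drifted to `pred`; the training data "demonstrates"
            # that the correct next state is the ground-truth traj[t + 1].
            new_inputs.append(pred.copy())
            new_targets.append(traj[t + 1])
            pred = A_hat @ pred             # continue the rollout from the prediction
        inputs = np.vstack([inputs, new_inputs])
        targets = np.vstack([targets, new_targets])
        A_hat = fit(inputs, targets)        # refit on the aggregated dataset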
    
    A. Byravan, M. Montfort, B. Ziebart, B. Boots, & D. Fox. Layered Hybrid Inverse Optimal Control for Learning Robot Manipulation from Demonstration. 2014 NIPS Workshop on Autonomously Learning Robots
    Abstract: Inverse optimal control (IOC) is a powerful approach for learning robotic controllers from demonstration that estimates a cost function which rationalizes demonstrated control trajectories. Unfortunately, it is difficult to apply in settings where optimal control can only be solved approximately. Local IOC approaches rationalize demonstrated trajectories based on a linear-quadratic approximation around a good reference trajectory (i.e., the demonstrated trajectory itself). Without this same reference trajectory, however, dissimilar control results. We address the complementary problem of using IOC to find appropriate reference trajectories in these computationally challenging control tasks. After discussing the inherent difficulties of learning within this setting, we present a projection technique from the original trajectory space to a discrete and tractable trajectory space and perform IOC on this space. Control trajectories are projected back to the original space and locally optimized. We demonstrate the effectiveness of the approach with experiments conducted on a 7-degree of freedom robotic arm.
    BibTeX:
    @article{Byravan14,
      author = "Arunkumar Byravan and Matthew Montfort and Brian Ziebart and Byron Boots and Dieter Fox",
      title = "Layered Hybrid Inverse Optimal Control for Learning Robot Manipulation from Demonstration",
      journal = "NIPS Workshop on Autonomously Learning Robots",
      year = "2014"
    }
    
    B. Boots, A. Byravan, & D. Fox. Learning Predictive Models of a Depth Camera & Manipulator from Raw Execution Traces. 2013 NIPS Workshop on Advances in Machine Learning for Sensorimotor Control
    Abstract: We attack the problem of learning a predictive model of a depth camera and manipulator directly from raw execution traces. While the problem of learning manipulator models from visual and proprioceptive data has been addressed before, existing techniques often rely on assumptions about the structure of the robot or tracked features in observation space. We make no such assumptions. Instead, we formulate the problem as that of learning a high-dimensional controlled stochastic process. We leverage recent work on nonparametric predictive state representations to learn a generative model of the depth camera and robotic arm from sequences of uninterpreted actions and observations. We perform several experiments in which we demonstrate that our learned model can accurately predict future observations in response to sequences of motor commands.
    BibTeX:
    @article{Boots13NIPSa,
      author = "Byron Boots and Arunkumar Byravan and Dieter Fox",
      title = "Learning Predictive Models of a Depth Camera & Manipulator from Raw Execution Traces",
      journal = "NIPS Workshop on Advances in Machine Learning for Sensorimotor Control",
      year = "2013"
    }
    
    B. Boots & D. Fox. Learning Dynamic Policies from Demonstration. 2013 NIPS Workshop on Advances in Machine Learning for Sensorimotor Control
    Abstract: We address the problem of learning a policy directly from expert demonstrations. Typically, this problem is solved with a supervised learning approach such as regression or classification resulting in a reactive policy. Unfortunately, reactive policies cannot model long-range dependencies, and this omission can result in suboptimal performance. We take a different approach. We observe that policies and dynamical systems are mathematical duals, and we use this fact to leverage the rich literature on system identification to learn policies from demonstration. System identification algorithms often have desirable properties like the ability to model long-range dependencies, statistical consistency, and efficient off-the-shelf implementations. We show that by applying system identification to learning from demonstration problems, all of these properties can be carried over to the learning from demonstration domain resulting in improved practical performance.
    BibTeX:
    @article{Boots13NIPSb,
      author = "Byron Boots and Dieter Fox",
      title = "Learning Dynamic Policies from Demonstration",
      journal = "NIPS Workshop on Advances in Machine Learning for Sensorimotor Control",
      year = "2013"
    }
    
    B. Boots, A. Gretton, & G. J. Gordon. Hilbert Space Embeddings of PSRs. 2013 ICML Workshop on Machine Learning and System Identification
    Abstract: We fully generalize PSRs to continuous observations and actions using a recent concept called Hilbert space embeddings of distributions. The essence of our method is to represent distributions of tests, histories, observations, and actions, as points in (possibly) infinite-dimensional Reproducing Kernel Hilbert Spaces (RKHSs). During filtering we update these distributions using a kernel version of Bayes' rule. To improve computational tractability, we develop a spectral system identification method to learn a succinct parameterization of the target system.
    BibTeX:
    @article{Boots13ICML,
      author = "Byron Boots, Arthur Gretton, and Geoffrey J. Gordon",
      title = "Hilbert Space Embeddings of PSRs",
      journal = "ICML workshop on Machine Learning and System Identification (MLSYSID)",
      year = "2013"
    }
    
    B. Boots, A. Gretton, & G. J. Gordon. Hilbert Space Embeddings of PSRs. 2012 NIPS Workshop on Spectral Algorithms for Latent Variable Models
    Abstract: We fully generalize PSRs to continuous observations and actions using a recent concept called Hilbert space embeddings of distributions. The essence of our method is to represent distributions of tests, histories, observations, and actions, as points in (possibly) infinite-dimensional Reproducing Kernel Hilbert Spaces (RKHSs). During filtering we update these distributions using a kernel version of Bayes' rule. To improve computational tractability, we develop a spectral system identification method to learn a succinct parameterization of the target system.
    BibTeX:
    @article{Boots12NIPSb,
      author = "Byron Boots, Arthur Gretton, and Geoffrey J. Gordon",
      title = "Hilbert Space Embeddings of PSRs",
      journal = "NIPS Workshop on Spectral Algorithms for Latent Variable Models",
      year = "2012"
    }
    
    B. Boots, & G. J. Gordon. A Spectral Learning Approach to Range-Only SLAM. 2012 NIPS Workshop on Spectral Algorithms for Latent Variable Models
    Abstract: We present a novel spectral learning algorithm for simultaneous localization and mapping (SLAM) from range data with known correspondences. This algorithm is an instance of a general spectral system identification framework, from which it inherits several desirable properties, including statistical consistency and no local optima. Compared with popular batch optimization or multiple-hypothesis tracking methods for range-only SLAM, our spectral approach offers guaranteed low computational requirements and good tracking performance.
    BibTeX:
    @article{Boots12NIPSa,
      author = "Byron Boots and Geoffrey J. Gordon",
      title = "A Spectral Learning Approach to Range-Only SLAM",
      journal = "NIPS Workshop on Spectral Algorithms for Latent Variable Models",
      year = "2012"
    }
    
    B. Boots & G. J. Gordon. Online Spectral Identification of Dynamical Systems. 2011 NIPS Workshop on Sparse Representation and Low-rank Approximation
    Abstract: Recently, a number of researchers have proposed spectral algorithms for learning models of dynamical systems---for example, Hidden Markov Models (HMMs), Partially Observable Markov Decision Processes (POMDPs), and Transformed Predictive State Representations (TPSRs). These algorithms are attractive since they are statistically consistent and not subject to local optima. However, they are batch methods: they need to store their entire training data set in memory at once and operate on it as a large matrix, and so they cannot scale to extremely large data sets (either many examples or many features per example). In turn, this restriction limits their ability to learn accurate models of complex systems. To overcome these limitations, we propose a new online spectral algorithm, which uses tricks such as incremental SVD updates and random projections to scale to much larger data sets and more complex systems than previous methods. We demonstrate the new method on a high-bandwidth video mapping task, and illustrate desirable behaviors such as "closing the loop," where the latent state representation changes suddenly as the learner recognizes that it has returned to a previously known place.
    BibTeX:
    @article{Boots11nips,
      author = "Byron Boots and Geoffrey J. Gordon",
      title = "Online Spectral Identification of Dynamical Systems",
      journal = "NIPS Workshop on Sparse Representation and Low-rank Approximation",
      year = "2011"
    }
    
    B. Boots, S. M. Siddiqi & G. J. Gordon. Closing the Learning-Planning Loop with Predictive State Representations. 2009 NIPS Workshop on Probabilistic Approaches for Robotics and Control
    Abstract: We propose a principled and provably statistically consistent model-learning algorithm, and demonstrate positive results on a challenging high-dimensional problem with continuous observations. In particular, we propose a novel, consistent spectral algorithm for learning a variant of PSRs called Transformed PSRs (TPSRs) directly from execution traces.
    BibTeX:
    @article{Boots09nips,
      author = "Byron Boots and Sajid M. Siddiqi and Geoffrey J. Gordon",
      title = "Closing the Learning-Planning Loop with Predictive State Representations",
      journal = "NIPS Workshop on Probabilistic Approaches for Robotics and Control",
      year = "2009"
    }
    
    D. Purves & B. Boots. Evolution of Visually Guided Behavior in Artificial Agents. 2006 Vision Science Society Annual Meeting/Journal of Vision
    Abstract: Recent work on brightness, color and form has suggested that human visual percepts represent the probable sources of retinal images rather than stimulus features as such. We have investigated this empirical concept of vision by asking whether agents using neural network control systems evolve successful visually guided behavior based solely on the statistical relationship of images on their sensor arrays and the probable sources of the images in a simulated environment. A virtual environment was created with OpenGL consisting of an arena with a central obstacle, similar to arenas used in evolutionary robotics experiments. The neural control system for each agent comprised a single-layer, feed-forward network that connected all 256 inputs from a sensor array to two output nodes that encoded rotation and translation responses. Each agent's behavioral actions in the environment were evaluated, and the fittest individuals selected to produce a new population according to a standard genetic algorithm. This process was repeated until the average fitness of subsequent generations reached a plateau. Analysis of the actions of evolved agents in response to visual input showed their neural network control systems had incorporated the statistical relationship between projected images and their possible sources, and that this information was used to produce increasingly successful visually guided behavior. The simplicity of this paradigm notwithstanding, these results support the idea that biological vision has evolved to solve the inverse problem on a wholly empirical basis, and provide a novel way of exploring visual processing.
    BibTeX:
    @article{Purvesn2006,
      author = "Dale Purves and Byron Boots",
      title = "Evolution of Visually Guided Behavior in Artificial Agents",
      journal = "Journal of Vision",
      year = {2006},
      volume = {6(6)},
      pages = {356a}
    }
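    
    A minimal Python sketch of the evolutionary setup described in the abstract above, with the population size, mutation scale, and fitness stub chosen purely for illustration: a population of single-layer, feed-forward controllers mapping 256 sensor inputs to 2 motor outputs is evolved with a simple genetic algorithm. In the paper, fitness comes from the agent's visually guided behavior in a simulated OpenGL arena; here it is a placeholder.
    Python sketch:
    import numpy as np

    rng = np.random.default_rng(0)

    N_SENSORS, N_MOTORS = 256, 2
    POP_SIZE, N_GENERATIONS = 50, 100
    MUTATION_STD = 0.05

    def fitness(weights):
        # Placeholder: the original work scores each agent's obstacle-avoidance
        # behavior in a simulated arena; any scalar score works here.
        return -np.sum(weights ** 2)

    # Each individual is a weight matrix for a single-layer, feed-forward controller.
    population = 0.1 * rng.standard_normal((POP_SIZE, N_MOTORS, N_SENSORS))

    for gen in range(N_GENERATIONS):
        scores = np.array([fitness(w) for w in population])
        parents = population[np.argsort(scores)[-POP_SIZE // 2:]]    # keep the fittest half
        children = parents[rng.integers(len(parents), size=POP_SIZE - len(parents))]
        children = children + MUTATION_STD * rng.standard_normal(children.shape)
        population = np.concatenate([parents, children])             # next generation

    best = population[np.argmax([fitness(w) for w in population])]
    # The evolved controller maps a sensor array to rotation/translation commands.
    motor_command = best @ rng.random(N_SENSORS)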
    

    Technical Reports & Book Chapters
    Authors Title Year Tech. Report/Book
    Y. Pan, X. Yan, E. Theodorou, & B. Boots. Adaptive Probabilistic Trajectory Optimization via Efficient Approximate Inference. 2016 Technical Report arXiv:1608.06235
    Abstract: Robotic systems must be able to quickly and robustly make decisions when operating in uncertain and dynamic environments. While Reinforcement Learning (RL) can be used to compute optimal policies with little prior knowledge about the environment, it suffers from slow convergence. An alternative approach is Model Predictive Control (MPC), which optimizes policies quickly, but also requires accurate models of the system dynamics and environment. In this paper we propose a new approach, adaptive probabilistic trajectory optimization, that combines the benefits of RL and MPC. Our method uses scalable approximate inference to learn and update probabilistic models in an online incremental fashion while also computing optimal control policies via successive local approximations. We present two variations of our algorithm based on the Sparse Spectrum Gaussian Process (SSGP) model, and we test our algorithm on three learning tasks, demonstrating the effectiveness and efficiency of our approach.
    BibTeX:
    @article{Pan16arxiv,
      author    = {Yunpeng Pan and Xinyan Yan and Evangelos Theodorou and Byron Boots},
      title     = {Adaptive Probabilistic Trajectory Optimization via Efficient Approximate Inference},
      journal   = {CoRR},
      volume    = {abs/1608.06235},
      year      = {2016},
      url       = {http://arxiv.org/abs/1608.06235}
    }
    
    B. Dai, N. He, Y. Pan, B. Boots, & L. Song. Learning from Conditional Distributions via Dual Kernel Embeddings. 2016 Technical Report arXiv:1607.04579
    Abstract: In many machine learning problems, such as policy evaluation in reinforcement learning and learning with invariance, each data point x itself is a conditional distribution p(z|x), and we want to learn a function f which links these conditional distributions to target values y. The learning problem becomes very challenging when we only have limited samples or in the extreme case only one sample from each conditional distribution p(z|x). Commonly used approaches either assume that z is independent of x, or require an overwhelmingly large sample size from each conditional distribution. To address these challenges, we propose a novel approach which reformulates the original problem into a min-max optimization problem. In the new view, we only need to deal with the kernel embedding of the joint distribution p(z,x) which is easy to estimate. Furthermore, we design an efficient learning algorithm based on mirror descent stochastic approximation, and establish the sample complexity for learning from conditional distributions. Finally, numerical experiments in both synthetic and real data show that our method can significantly improve over the previous state of the art.
    BibTeX:
    @article{Dai16arxiv,
      author    = {Bo Dai and Niao He and Yunpeng Pan and Byron Boots and Le Song},
      title     = {Learning from Conditional Distributions via Dual Kernel Embeddings},
      journal   = {CoRR},
      volume    = {abs/1607.04579},
      year      = {2016},
      url       = {http://arxiv.org/abs/1607.04579}
    }
    
    Z. Marinho, A. Dragan, A. Byravan, B. Boots, S. Srinivasa, & G. J. Gordon. Functional Gradient Motion Planning in Reproducing Kernel Hilbert Space. 2016 Technical Report arXiv:1601.03648
    Abstract: We introduce a functional gradient descent trajectory optimization algorithm for robot motion planning in Reproducing Kernel Hilbert Spaces (RKHSs). Functional gradient algorithms are a popular choice for motion planning in complex many-degree-of-freedom robots, since they (in theory) work by directly optimizing within a space of continuous trajectories to avoid obstacles while maintaining geometric properties such as smoothness. However, in practice, functional gradient algorithms typically commit to a fixed, finite parameterization of trajectories, often as a list of waypoints. Such a parameterization can lose much of the benefit of reasoning in a continuous trajectory space: e.g., it can require taking an inconveniently small step size and large number of iterations to maintain smoothness. Our work generalizes functional gradient trajectory optimization by formulating it as minimization of a cost functional in an RKHS. This generalization lets us represent trajectories as linear combinations of kernel functions, without any need for waypoints. As a result, we are able to take larger steps and achieve a locally optimal trajectory in just a few iterations. Depending on the selection of kernel, we can directly optimize in spaces of trajectories that are inherently smooth in velocity, jerk, curvature, etc., and that have a low-dimensional, adaptively chosen parameterization. Our experiments illustrate the effectiveness of the planner for different kernels, including Gaussian RBFs, Laplacian RBFs, and B-splines, as compared to the standard discretized waypoint representation.
    BibTeX:
    @article{Marinho16arxiv,
      author    = {Zita Marinho and Anca Dragan and Arun Byravan and Byron Boots and Siddhartha Srinivasa and Geoffrey J. Gordon},
      title     = {Functional Gradient Motion Planning in Reproducing Kernel Hilbert Space},
      journal   = {CoRR},
      volume    = {abs/1601.03648},
      year      = {2016},
      url       = {http://arxiv.org/abs/1601.03648}
    }
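    
    A minimal Python sketch of the idea summarized in the abstract above, under assumptions that are not in the report (a 2-D point robot, an RBF kernel, a single circular obstacle, plain gradient descent): the trajectory is a straight-line term plus a linear combination of kernel functions, and each update trades an obstacle cost against an RKHS-norm (smoothness) penalty, with no waypoint discretization of the trajectory itself.
    Python sketch:
    import numpy as np

    q_start, q_goal = np.array([0.0, 0.0]), np.array([1.0, 1.0])
    obstacle, radius = np.array([0.5, 0.5]), 0.2

    centers = np.linspace(0.1, 0.9, 8)      # kernel centers along t in [0, 1]
    coeffs = np.zeros((len(centers), 2))    # one 2-D coefficient per kernel function
    times = np.linspace(0.0, 1.0, 50)       # where the obstacle cost is sampled

    def k(t, c, width=0.1):
        return np.exp(-0.5 * ((t - c) / width) ** 2)

    def trajectory(ts):
        # Straight line from start to goal plus the kernel expansion.
        base = np.outer(1 - ts, q_start) + np.outer(ts, q_goal)
        return base + k(ts[:, None], centers[None, :]) @ coeffs

    def obstacle_grad(q, margin=0.1):
        # Gradient of a hinge-style cost max(0, radius + margin - dist(q, obstacle)).
        d = q - obstacle
        dist = np.linalg.norm(d)
        if dist == 0.0 or dist >= radius + margin:
            return np.zeros(2)
        return -d / dist                    # cost decreases as q moves away

    step, lam = 0.05, 0.1
    K = k(times[:, None], centers[None, :])                   # (50, 8) kernel matrix
    for _ in range(100):
        grads = np.array([obstacle_grad(q) for q in trajectory(times)])
        functional_grad = K.T @ grads / len(times)            # chain rule onto coefficients
        coeffs -= step * (functional_grad + lam * coeffs)     # obstacle term + RKHS-norm term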
    
    W. Sun, A. Venkatraman, B. Boots, & J. A. Bagnell. Learning to Filter with Predictive State Inference Machines. 2015 Technical Report arXiv:1512.08836
    Abstract: Latent state space models are one of the most fundamental and widely used tools for modeling dynamical systems. Traditional Maximum Likelihood Estimation (MLE) based approaches aim to maximize the likelihood objective, which is non-convex due to latent states. While non-convex optimization methods like EM can learn models that locally optimize the likelihood objective, using the locally optimal model for an inference task such as Bayesian filtering usually does not have performance guarantees. In this work, we propose a method that considers the inference procedure on the dynamical system as a composition of predictors. Instead of optimizing a given parametrization of latent states, we learn predictors for inference in predictive belief space, where we can use sufficient features of observations for supervision of our learning algorithm. We further show that our algorithm, the Predictive State Inference Machine, has theoretical performance guarantees on the inference task. Empirical verification across several dynamical system benchmarks, ranging from a simulated helicopter to recorded telemetry traces from a robot, showcases the benefits of training Inference Machines.
    BibTeX:
    @article{Sun15arxiv,
      author    = {Wen Sun and Arun Venkatraman and Byron Boots and J. Andrew Bagnell},
      title     = {Learning to Filter with Predictive State Inference Machines},
      journal   = {CoRR},
      volume    = {abs/1512.08836},
      year      = {2015},
      url       = {http://arxiv.org/abs/1512.08836}
    }
    
    X. Yan, V. Indelman, & B. Boots. Incremental Sparse GP Regression for Continuous-time Trajectory Estimation & Mapping. 2015 Technical Report arXiv:1504.02696
    Abstract: Recent work on simultaneous trajectory estimation and mapping (STEAM) for mobile robots has found success by representing the trajectory as a Gaussian process. Gaussian processes can represent a continuous-time trajectory, elegantly handle asynchronous and sparse measurements, and allow the robot to query the trajectory to recover its estimated position at any time of interest. A major drawback of this approach is that STEAM is formulated as a batch estimation problem. In this paper we provide the critical extensions necessary to transform the existing batch algorithm into an extremely efficient incremental algorithm. In particular, we are able to vastly speed up the solution time through efficient variable reordering and incremental sparse updates, which we believe will greatly increase the practicality of Gaussian process methods for robot mapping and localization. Finally, we demonstrate the approach and its advantages on both synthetic and real datasets.
    BibTeX:
    @article{Yan15arxiv,
      author    = {Xinyan Yan and Vadim Indelman and Byron Boots},
      title     = {Incremental Sparse {GP} Regression for Continuous-time Trajectory Estimation {\&} Mapping},
      journal   = {CoRR},
      volume    = {abs/1504.02696},
      year      = {2015},
      url       = {http://arxiv.org/abs/1504.02696}
    }
    
    B. Boots & G. J. Gordon. A Spectral Learning Approach to Range-Only SLAM. 2012 Technical Report arXiv:1207.2491
    Abstract: We present a novel spectral learning algorithm for simultaneous localization and mapping (SLAM) from range data with known correspondences. This algorithm is an instance of a general spectral system identification framework, from which it inherits several desirable properties, including statistical consistency and no local optima. Compared with popular batch optimization or multiple-hypothesis tracking (MHT) methods for range-only SLAM, our spectral approach offers guaranteed low computational requirements and good tracking performance. Compared with popular extended Kalman filter (EKF) or extended information filter (EIF) approaches, and many MHT ones, our approach does not need to linearize a transition or measurement model; such linearizations can cause severe errors in EKFs and EIFs, and to a lesser extent MHT, particularly for the highly non-Gaussian posteriors encountered in range-only SLAM. We provide a theoretical analysis of our method, including finite-sample error bounds. Finally, we demonstrate on a real-world robotic SLAM problem that our algorithm is not only theoretically justified, but works well in practice: in a comparison of multiple methods, the lowest errors come from a combination of our algorithm with batch optimization, but our method alone produces nearly as good a result at far lower computational cost.
    BibTeX:
    @article{Boots12arxiv,
      author = "Byron Boots and Geoffrey J. Gordon",
      year = {2012},
      title = "A Spectral Learning Approach to Range-Only SLAM",
      journal = {http://arxiv.org/abs/1207.2491}
    }
    
    B. Boots & G. J. Gordon. Two-Manifold Problems. 2011 Technical Report arXiv:1112.6399
    Abstract: Recently, there has been much interest in spectral approaches to learning manifolds---so-called kernel eigenmap methods. These methods have had some successes, but their applicability is limited because they are not robust to noise. To address this limitation, we look at two-manifold problems, in which we simultaneously reconstruct two related manifolds, each representing a different view of the same data. By solving these interconnected learning problems together and allowing information to flow between them, two-manifold algorithms are able to succeed where a non-integrated approach would fail: each view allows us to suppress noise in the other, reducing bias in the same way that an instrumental variable allows us to remove bias in a linear dimensionality reduction problem. We propose a class of algorithms for two-manifold problems, based on spectral decomposition of cross-covariance operators in Hilbert space. Finally, we discuss situations where two-manifold problems are useful, and demonstrate that solving a two-manifold problem can aid in learning a nonlinear dynamical system from limited data.
    BibTeX:
    @article{Boots11arxiv,
      author = "Byron Boot and Geoffrey J. Gordon",
      year = {2011},
      title = "Two-Manifold Problems",
      journal = {http://arxiv.org/abs/1112.6399}
    }
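    
    A minimal Python sketch of the instrumental-variable intuition mentioned in the abstract above, using a linear latent model rather than the paper's manifold setting (the dimensions and noise level are arbitrary): because the noise is independent across the two views, it largely cancels in the empirical cross-covariance, so the SVD of that matrix recovers the shared directions.
    Python sketch:
    import numpy as np

    rng = np.random.default_rng(0)

    n, d = 2000, 10
    z = rng.standard_normal(n)              # shared 1-D latent variable

    Ux = rng.standard_normal(d)
    Ux /= np.linalg.norm(Ux)                # true direction underlying view X
    Uy = rng.standard_normal(d)
    Uy /= np.linalg.norm(Uy)                # true direction underlying view Y

    # Two views of the same latent variable, each corrupted by independent noise.
    X = np.outer(z, Ux) + 0.5 * rng.standard_normal((n, d))
    Y = np.outer(z, Uy) + 0.5 * rng.standard_normal((n, d))

    # Empirical cross-covariance between the centered views.
    C = (X - X.mean(0)).T @ (Y - Y.mean(0)) / n
    U, S, Vt = np.linalg.svd(C)

    # The top left/right singular vectors estimate the shared directions.
    print(abs(U[:, 0] @ Ux), abs(Vt[0] @ Uy))   # both should be close to 1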
    
    B. Boots & G. J. Gordon. Predictive State Temporal Difference Learning. 2010 Technical Report arXiv:1011.0041
    Abstract: We propose a new approach to value function approximation which combines linear temporal difference reinforcement learning with subspace identification. In practical applications, reinforcement learning (RL) is complicated by the fact that state is either high-dimensional or partially observable. Therefore, RL methods are designed to work with features of state rather than state itself, and the success or failure of learning is often determined by the suitability of the selected features. By comparison, subspace identification (SSID) methods are designed to select a feature set which preserves as much information as possible about state. In this paper we connect the two approaches, looking at the problem of reinforcement learning with a large set of features, each of which may only be marginally useful for value function approximation. We introduce a new algorithm for this situation, called Predictive State Temporal Difference (PSTD) learning. As in SSID for predictive state representations, PSTD finds a linear compression operator that projects a large set of features down to a small set that preserves the maximum amount of predictive information. As in RL, PSTD then uses a Bellman recursion to estimate a value function. We discuss the connection between PSTD and prior approaches in RL and SSID. We prove that PSTD is statistically consistent, perform several experiments that illustrate its properties, and demonstrate its potential on a difficult optimal stopping problem.
    BibTeX:
    @article{Boots10arxiv,
      author = "Byron Boot and Geoffrey J. Gordon",
      year = {2010},
      title = "Predictive State Temporal Difference Learning",
      journal = {http://arxiv.org/abs/1011.0041}
    }
    
    B. Boots, S. M. Siddiqi & G. J. Gordon. Closing the Learning-Planning Loop with Predictive State Representations. (Updated: 2010) 2009 Technical Report arXiv:0912.2385
    Abstract: A central problem in artificial intelligence is that of planning to maximize future reward under uncertainty in a partially observable environment. In this paper we propose and demonstrate a novel algorithm which accurately learns a model of such an environment directly from sequences of action-observation pairs. We then close the loop from observations to actions by planning in the learned model and recovering a policy which is near-optimal in the original environment. Specifically, we present an efficient and statistically consistent spectral algorithm for learning the parameters of a Predictive State Representation (PSR). We demonstrate the algorithm by learning a model of a simulated high-dimensional, vision-based mobile robot planning task, and then perform approximate point-based planning in the learned PSR. Analysis of our results shows that the algorithm learns a state space which efficiently captures the essential features of the environment. This representation allows accurate prediction with a small number of parameters, and enables successful and efficient planning.
    BibTeX:
    @article{Boots09arxiv,
      author = "Byron Boots and Sajid M. Siddiqi and Geoffrey J. Gordon",
      year = {2009},
      title = "Closing the Learning-Planning Loop with Predictive State Representations",
      journal = {http://arxiv.org/abs/0912.2385}
    }
    
    S. M. Siddiqi, B. Boots & G. J. Gordon. Reduced-Rank Hidden Markov Models. 2009 Technical Report arXiv:0910.0902
    Abstract: We introduce the Reduced-Rank Hidden Markov Model (RR-HMM), a generalization of HMMs that can model smooth state evolution as in Linear Dynamical Systems (LDSs) as well as non-log-concave predictive distributions as in continuous-observation HMMs. RR-HMMs assume an m-dimensional latent state and n discrete observations, with a transition matrix of rank k <= m. This implies the dynamics evolve in a k-dimensional subspace, while the shape of the set of predictive distributions is determined by m. Latent state belief is represented with a k-dimensional state vector and inference is carried out entirely in R^k, making RR-HMMs as computationally efficient as k-state HMMs yet more expressive. To learn RR-HMMs, we relax the assumptions of a recently proposed spectral learning algorithm for HMMs and apply it to learn k-dimensional observable representations of rank-k RR-HMMs. The algorithm is consistent and free of local optima, and we extend its performance guarantees to cover the RR-HMM case. We show how this algorithm can be used in conjunction with a kernel density estimator to efficiently model high-dimensional multivariate continuous data. We also relax the assumption that single observations are sufficient to disambiguate state, and extend the algorithm accordingly. Experiments on synthetic data and a toy video, as well as on a difficult robot vision modeling problem, yield accurate models that compare favorably with standard alternatives in simulation quality and prediction capability.
    BibTeX:
    @article{siddiqi09arxiv,
      author = "Sajid M. Siddiqi and Byron Boots and Geoffrey J. Gordon",
      year = "2009",
      title = "Reduced-Rank Hidden {M}arkov Models",
      journal = "http://arxiv.org/abs/0910.0902" 
    }
    S. M. Siddiqi, B. Boots & G. J. Gordon. A Constraint Generation Approach to Learning Stable Linear Dynamical Systems. 2008 Technical Report
    CMU-ML-08-101  
    Abstract: Stability is a desirable characteristic for linear dynamical systems, but it is often ignored by algorithms that learn these systems from data. We propose a novel method for learning stable linear dynamical systems: we formulate an approximation of the problem as a convex program, start with a solution to a relaxed version of the program, and incrementally add constraints to improve stability. Rather than continuing to generate constraints until we reach a feasible solution, we test stability at each step; because the convex program is only an approximation of the desired problem, this early stopping rule can yield a higher-quality solution. We apply our algorithm to the task of learning dynamic textures from image sequences as well as to modeling biosurveillance drug-sales data. The constraint generation approach leads to noticeable improvement in the quality of simulated sequences. We compare our method to those of Lacy and Bernstein, with positive results in terms of accuracy, quality of simulated sequences, and efficiency.
    BibTeX:
    @techreport{Siddiqi08,
      author = "Sajid Siddiqi and Byron Boots and Geoffrey J. Gordon",
      title = "A Constraint Generation Approach to Learning Stable Linear Dynamical Systems",
      institution = {Carnegie Mellon University},
      number = {CMU-ML-08-101},
      year = {2008}
    }
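    
    A minimal Python sketch of the loop structure described in the abstract above: fit the dynamics matrix by ordinary least squares, test stability (spectral radius) at each step, and refine until the estimate is stable. The refinement used here, clipping the largest singular value below one, is a simple stand-in for exposition and is not the report's convex constraint-generation step.
    Python sketch:
    import numpy as np

    rng = np.random.default_rng(0)

    # Simulated state sequence from a stable (but nearly unstable) 3-D linear system.
    A_true = np.array([[0.99, 0.2, 0.0], [0.0, 0.95, 0.1], [0.0, 0.0, 0.9]])
    T = 60
    x = np.zeros((T, 3))
    x[0] = [1.0, 1.0, 1.0]
    for t in range(T - 1):
        x[t + 1] = A_true @ x[t] + 0.05 * rng.standard_normal(3)

    # Unconstrained least-squares estimate; with limited data it may be unstable.
    A_hat = np.linalg.lstsq(x[:-1], x[1:], rcond=None)[0].T

    def spectral_radius(A):
        return np.max(np.abs(np.linalg.eigvals(A)))

    # Test stability at each step and stop as soon as the estimate is stable.
    while spectral_radius(A_hat) >= 1.0:
        U, s, Vt = np.linalg.svd(A_hat)
        s[0] = min(s[0], 1.0 - 1e-6)        # stand-in refinement step
        A_hat = U @ np.diag(s) @ Vt

    print("spectral radius:", spectral_radius(A_hat))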
    
    E. Chown & B. Boots. Learning in Cognitive Maps: Finding Useful Structure in an Uncertain World. 2008 Robot and Cognitive Approaches to Spatial Mapping
    Abstract: In this chapter we will describe the central mechanisms that influence how people learn about large-scale space. We will focus particularly on how these mechanisms enable people to effectively cope with both the uncertainty inherent in a constantly changing world and also with the high information content of natural environments. The major lessons are that humans get by with a “less is more” approach to building structure, and that they are able to quickly adapt to environmental changes thanks to a range of general purpose mechanisms. By looking at abstract principles, instead of concrete implementation details, it is shown that the study of human learning can provide valuable lessons for robotics. Finally, these issues are discussed in the context of an implementation on a mobile robot.
    BibTeX:
    @incollection{Chown2008,
      author = "Eric Chown and Byron Boots",
      title = "Learning Cognitive Maps: Finding Useful Structure in an Uncertain World",
      booktitle = "Robot and Cognitive Approaches to Spatial Mapping",
      editor = "Margaret E. Jefferies and Wai-Kiang Yeap",
      publisher = {Springer Verlag},
      year = {2008},
      chapter = {10},  
      pages ={215--236}
    }
    

    Theses
    Author Title Year Degree & Institution
    B. Boots. Spectral Approaches to Learning Predictive Representations. 2012 Doctoral Thesis
    Carnegie Mellon University  
    Abstract: A central problem in artificial intelligence is to choose actions to maximize reward in a partially observable, uncertain environment. To do so, we must obtain an accurate environment model, and then plan to maximize reward. However, for complex domains, specifying a model by hand can be a time consuming process. This motivates an alternative approach: learning a model directly from observations. Unfortunately, learning algorithms often recover a model that is too inaccurate to support planning or too large and complex for planning to succeed; or, they require excessive prior domain knowledge or fail to provide guarantees such as statistical consistency. To address this gap, we propose spectral subspace identification algorithms which provably learn compact, accurate, predictive models of partially observable dynamical systems directly from sequences of action-observation pairs. Our research agenda includes several variations of this general approach: spectral methods for classical models like Kalman filters and hidden Markov models, batch algorithms and online algorithms, and kernel-based algorithms for learning models in high- and infinite-dimensional feature spaces. All of these approaches share a common framework: the model's belief space is represented as predictions of observable quantities and spectral algorithms are applied to learn the model parameters. Unlike the popular EM algorithm, spectral learning algorithms are statistically consistent, computationally efficient, and easy to implement using established matrix-algebra techniques. We evaluate our learning algorithms on a series of prediction and planning tasks involving simulated data and real robotic systems.
    BibTeX:
    @phdthesis{BootsThesis2012,
      author = "Byron Boots",
      title = "Spectral Approaches to Learning Predictive Representations",
      school = "Carnegie Mellon University",
      year = {2012},
      month = {December},
    }
    
    B. Boots. Spectral Approaches to Learning Predictive Representations: Thesis Proposal. 2011 Thesis Proposal
    Carnegie Mellon University  
    Abstract: A central problem in artificial intelligence is to choose actions to maximize reward in a partially observable, uncertain environment. To do so, we must obtain an accurate environment model, and then plan to maximize reward. However, for complex domains, specifying a model by hand can be a time consuming process. This motivates an alternative approach: learning a model directly from observations. Unfortunately, learning algorithms often recover a model that is too inaccurate to support planning or too large and complex for planning to succeed; or, they require excessive prior domain knowledge or fail to provide guarantees such as statistical consistency.

    To address this gap, we propose spectral subspace identification algorithms which provably learn compact, accurate, predictive models of partially observable dynamical systems directly from sequences of action-observation pairs. Our research agenda includes several variations of this general approach: batch algorithms and online algorithms, kernel-based algorithms for learning models in high- and infinite-dimensional feature spaces, and manifold-based identification algorithms. All of these approaches share a common framework: they are statistically consistent, computationally efficient, and easy to implement using established matrix-algebra techniques. Additionally, we show that our framework generalizes a variety of successful spectral learning algorithms in diverse areas, including the identification of Hidden Markov Models, recovering structure from motion, and discovering manifold embeddings. We will evaluate our learning algorithms on a series of prediction and planning tasks involving simulated data and real robotic systems.

    We anticipate several difficulties while moving from smaller problems and synthetic problems to larger practical applications. The first is the challenge of scaling learning algorithms up to the higher-dimensional state spaces that more complex tasks require. The second is the problem of integrating expert knowledge into the learning procedure. The third is the problem of properly accounting for actions and exploration in controlled systems. We believe that overcoming these remaining difficulties will allow our models to capture the essential features of an environment, predict future observations well, and enable successful planning.
    
    B. Boots. Learning Stable Linear Dynamical Systems. 2009 M.S. Machine Learning
    Carnegie Mellon University  
    Abstract: Stability is a desirable characteristic for linear dynamical systems, but it is often ignored by algorithms that learn these systems from data. We propose a novel method for learning stable linear dynamical systems: we formulate an approximation of the problem as a convex program, start with a solution to a relaxed version of the program, and incrementally add constraints to improve stability. Rather than continuing to generate constraints until we reach a feasible solution, we test stability at each step; because the convex program is only an approximation of the desired problem, this early stopping rule can yield a higher-quality solution. We apply both maximum likelihood and subspace identification methods to the problem of learning dynamical systems with exogenous inputs directly from data. Our algorithm is applied to a variety of problems including the tasks of learning dynamic textures from image sequences, learning a model of laser and vision sensor data from a mobile robot, learning stable baseline models for drug-sales data in the biosurveillance domain, and learning a model to predict sunspot data over time. We compare the constraint generation approach to learning stable dynamical systems to the best previous stable algorithms (Lacy and Bernstein, 2002, 2003), with positive results in terms of prediction accuracy, quality of simulated sequences, and computational efficiency.
    BibTeX:
    @MastersThesis{Boots:Thesis:2009,
      author = "Byron Boots",
      title = "Learning Stable Linear Dynamical Systems",
      school = "Carnegie Mellon University",
      year = {2009},
      month = {May},
    }
    
    B. Boots. Robot Localization and Abstract Mapping in Dynamic Environments. 2003 B.A. Computer Science
    Bowdoin College  
    BibTeX:
    @techreport{Boots2003b,
      author = "Byron Boots",
      title = "Robot Localization and Abstract Mapping in Dynamic Environments",
      institution = {Bowdoin College},
      year = {2003}
    }
    
    B. Boots. Chunking: A Modified Dynamic Programming Approach to Solving Stochastic Satisfiability Problems Efficiently. 2003 B.A. Computer Science
    Bowdoin College  
    Abstract: The best general stochastic satisfiability solvers systematically evaluate all possible variable assignments, using heuristics to prune assignments whenever possible. The idea of chunking differs in that it breaks the solution to stochastic satisfiability problems into pieces, amounting to a modified dynamic programming approach. The benefit of this approach, as compared with the straightforward application of dynamic programming, is that the saved solutions to the problem pieces are partial solutions and thus reusable in multiple situations.
    BibTeX:
    @techreport{Boots2003a,
      author = "Byron Boots",
      title = "Chunking: A Modified Dynamic Programming Approach to 
      Solving Stochastic Satisfiability Problems Efficiently",
      institution = {Bowdoin College},
      year = {2003}
    }