GVU Technical Report
Number: GIT-GVU-04-11
Title:
Robust Generative Subspace Modeling: The Subspace t Distribution
Authors:
Zia Khan,
Frank Dellaert
Abstract:
Linear latent variable models such as statistical factor analysis (SFA)
and probabilistic principal component analysis (PPCA) assume that the data
are distributed according to a multivariate Gaussian. A drawback of this
assumption is that parameter learning in these models is sensitive to
outliers in the training data. Approaches that rely on M-estimation have
been introduced to render principal component analysis (PCA) more robust
to outliers. M-estimation approaches assume the data are distributed
according to a density with heavier tails than a Gaussian. Yet, these
methods are limited in that they fail to define a probability model for
the data. Data cannot be generated from these models, and the normalized
probability of new data cannot be evaluated. To address these limitations, we
describe a generative probability model that accounts for outliers. The
model is a linear latent variable model in which the marginal density over
the data is a multivariate t, a distribution with heavier tails than a
Gaussian. We present a computationally efficient expectation maximization
(EM) algorithm for estimating the model parameters, and compare our
approach with that of PPCA on both synthetic and real data sets.
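
As a rough illustration of what a generative latent variable model with a multivariate t marginal can look like, the sketch below samples data using the standard Gaussian scale-mixture construction of the t distribution. It is not the report's algorithm: the function name `sample_subspace_t`, the PPCA-style isotropic noise, and the symbols `W`, `mu`, `sigma2`, and `nu` are all assumptions made for this example.

```python
import numpy as np

def sample_subspace_t(n, W, mu, sigma2, nu, rng):
    """Draw n points whose marginal is multivariate t with nu degrees of freedom.

    Hypothetical parameterization (not taken from the report):
    W is a (d, q) loading matrix, mu a (d,) mean, sigma2 an isotropic noise variance.
    """
    d, q = W.shape
    # Per-sample scale u ~ Gamma(shape=nu/2, rate=nu/2); NumPy expects scale = 1/rate.
    u = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=n)
    inv_sqrt_u = 1.0 / np.sqrt(u)[:, None]
    # Latent coordinates and noise share the same scale, so that
    # x | u ~ N(mu, (W W^T + sigma2 I) / u), and marginally x is multivariate t.
    z = rng.standard_normal((n, q)) * inv_sqrt_u
    eps = np.sqrt(sigma2) * rng.standard_normal((n, d)) * inv_sqrt_u
    return mu + z @ W.T + eps

rng = np.random.default_rng(0)
W = rng.standard_normal((5, 2))   # 5-dimensional data, 2-dimensional subspace
X = sample_subspace_t(1000, W, np.zeros(5), 0.1, 3.0, rng)
```

A small `nu` makes small values of `u` more likely, which produces occasional points far from the subspace mean; as `nu` grows, `u` concentrates near 1 and the samples approach those of a Gaussian model such as PPCA.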
Keywords:
Robust, principal component analysis, factor analysis, t distribution, latent variable model,
outliers