My research interests span a relatively wide area, ranging from
Signal Processing to Matrix Analysis, and from Algorithms to Numerical
Optimization. In my work, the main focus is on deriving adaptive
algorithms for efficient representations of natural signals (e.g.,
images, speech sound), and on investigating their theoretical
properties (such as computational complexity and accuracy).
- Subgraph-Preconditioned Conjugate Gradient. A central
problem in robot navigation, simultaneous localization and mapping (SLAM), can
be formulated as inference in a factor graph. We solve the linear system underlying the
associated least-squares optimization with preconditioned conjugate gradient, choosing
a subgraph preconditioner guaranteed to improve the conditioning of the problem. Our method
nontrivially extends and adapts recent algorithmic tools from theoretical
computer science, specifically low-stretch trees and ultrasparsifiers. Moreover,
to assess the quality of the preconditioners, we generalize existing
linear algebraic tools developed for subgraph preconditioning (specifically
support theory) to the case of general, non-scalar factor graphs.
- Convergence Behavior of some specialized Cellular Automata.
By recasting the newly proposed Active Mask image segmentation algorithm in a
cellular automata framework, we facilitate both the analysis of this voting-based iterative
algorithm and the derivation of theoretical guarantees on its convergence to fixed-point
(zero-change) states. For examples and proofs, check out our paper. An earlier version
appeared in IEEE ICASSP 2010.
- Spike-based representations. For temporal signals, look here for details.
Currently, I am studying the visual counterpart of this approach; that is, I am
working on the derivation of an adaptive, shiftable-kernel, highly sparse
representation for images. The main computational advantage of this approach lies
in a very efficient implementation of the Matching Pursuit algorithm used for spike
extraction in the inference stage, and in a very fast optimization of the dictionary
in the learning stage.
- Robustness and redundancy in linear representations.
"Robust Coding", or "how to cope with noise in the (linear)
representation by using more coding units". For more on this (and for nice
pictures), check out our paper in IEEE Trans. on
Image Processing. A very intuitive and natural characterization of the optimal code in
the two-dimensional case appeared in a preliminary version
in NIPS 2005.
- Speech Enhancement by Bandwidth Extension. The problem of reconstructing speech with
noisy or missing spectral content arises in digital telephony and in blind source separation
from mono- or multi-channel data. By modeling this problem as statistical inference with
missing data, we can restore altered speech frames based on their similarity to clean ones.
The method owes its generality and speed to using only informative training examples in the
partial matching stage, while its reconstruction quality, both perceptual and quantitative
(by the Itakura-Saito criterion), is at least as good as that of specialized methods.
For more details, look here and here.
- Wavelets, frames, filter banks. See how to increase the
coding efficiency of multiscale/multiresolution methods for a given
ensemble of images by making the representation statistically adaptive.
Here is our paper on Multiresolution ICA (MrICA) in IEEE ICASSP 2009.
Our combined multiresolution adaptive approach performs better than JPEG2000
on several classes of images by learning a better set of features for each
class.
- Algebraic Signal Processing Theory. For more on what
that is, please see this source.
If you want to see how to design alternative polynomial transforms that
asymptotically approach the DTFT (their spectrum converges to the unit
circle) but avoid the periodic signal extension/boundary condition,
see our paper. If you still want to see more,
check out this video of
me at the NIPS Workshop "Algebraic and combinatorial methods in machine learning".
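
As a concrete illustration of the spike-extraction step mentioned under spike-based representations above, here is a minimal sketch of plain Matching Pursuit in Python. It is the textbook greedy version, not the efficient shiftable-kernel implementation described there. The function name `matching_pursuit`, its parameters, and the assumption that all dictionary atoms have unit norm are illustrative choices, not taken from the papers.

```python
def matching_pursuit(signal, dictionary, n_iters):
    """Greedy sparse approximation of `signal` over a list of atoms.

    Hypothetical helper for illustration only. Assumes every atom in
    `dictionary` has unit Euclidean norm. Returns (coeffs, residual).
    """
    residual = [float(x) for x in signal]
    coeffs = [0.0] * len(dictionary)
    for _ in range(n_iters):
        # Correlate the current residual with every atom.
        correlations = [
            sum(r * a for r, a in zip(residual, atom)) for atom in dictionary
        ]
        # Pick the atom that explains the most residual energy.
        best = max(range(len(dictionary)), key=lambda k: abs(correlations[k]))
        c = correlations[best]
        if c == 0.0:
            break  # nothing left that the dictionary can explain
        # Record the coefficient and subtract the atom's contribution.
        coeffs[best] += c
        residual = [r - c * a for r, a in zip(residual, dictionary[best])]
    return coeffs, residual


# Toy example: three orthonormal atoms in R^3 (assumed data).
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
coeffs, residual = matching_pursuit([3.0, 0.0, -2.0], atoms, n_iters=2)
```

With an orthonormal dictionary, as in this toy example, each iteration recovers one nonzero coefficient exactly. With an overcomplete dictionary the loop would typically be stopped by an iteration budget or a threshold on the residual energy.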