Graph Algorithms and Batched Processing

Graphs, which in their simplest forms are vertices connected by edges, are widely used in scientific computing, machine learning and network science. This talk will use recent progresses on two well-studied graph problems, network flows and dynamic matchings, to discuss an approach for designing provably efficient graph algorithms. This approach revolves around the use of batched processing to combine graph theoretic approximations with iterative procedures that control the approximation errors.

I will start by describing how the study of vertex labeling problems and their duals, network flows, led to the study of residual graphs and convergences of successive approximations. I will then discuss a similar phenomenon in dynamic graph data structures, where vertex reductions allow the maintenance of large matchings in graphs undergoing edge insertions/deletions. A key idea at the core of these results is the replacement of natural online path finding schemes by ones that find batches of paths.

Fully Dynamic Spectral Vertex Sparsifiers and Applications

Problems arising from analyzing and understanding graph structures have motivated the development of many powerful tools for storing and compressing graphs and networks. As many networks undergo gradual changes, answering queries on such dynamic networks more efficiently than recomputing from scratch is a well-studied topic in data structures.

We study dynamic graph algorithms for maintaining spectral vertex sparsifiers with respect to a subset of terminal vertices. Such sparsifiers preserve key structures in spectral algorithms, including effective resistances (which can be viewed as a numerical generalization of connectivity), solutions to systems of Laplacian linear equations, and energy of electrical flows between terminal vertices. We give data structures that maintain these sparsifiers in sublinear time under insertions/deletions of edges, as well as terminal additions.

This primitive in turn leads to sublinear time data structures for key primitives in spectral graph algorithms, including ones at the core of the `Laplacian paradigm’ for designing graph optimization algorithms. In particular, we obtain O(m^{3/4}\epsilon^{-4}) time per query/update for effective resistances on unweighted graphs, and O(n^{11/12}\epsilon^{-5}) time per query/update for implicit access to linear system solutions, where \epsilon is the relative approximation accuracy.

The majority of the talk will focus on key components of this data structure: (1) an interpretation of vertex sparsifiers as a sum of random walks, (2) a suitable choice of terminals to keep these walks local, and (3) maintenance of local walks and numerical solutions. Potential avenues in generalizing these techniques to provide new building blocks for dynamic graph algorithms will also be discussed.

High Performance Linear System Solvers with Focus on Graph Laplacians

Over the past two decades there has been tremendous theoretical progress on efficient solvers for Symmetric Diagonally Dominant matrices, or graph Laplacians. The fast runtime guarantees of such solvers, which are now faster than sorting, in turn led to the Laplacian paradigm for designing graph algorithms.

This talk will overview recent and ongoing efforts towards implementing Laplacian solvers that are an order of magnitude faster than what’s currently available, and discuss ongoing efforts towards benchmarking Laplacian solvers. We will discuss ways of isolating the combinatorial, data structural, and numerical components, and observations made from such evaluations.

A Numerical Analysis Approach to Convex Optimization

In current convex optimization literature, there are significant gaps between algorithms that produce high accuracy (1+1/poly(n))-approximate solutions vs. algorithms that produce O(1)-approximate solutions for symmetrized special cases. This gap is reflected in the differences between interior point methods vs. (accelerated) gradient descent for regression problems, and between exact vs. approximate undirected max-flow.

In this talk, I will discuss generalizations of a fundamental building block in numerical analysis, preconditioned iterative methods, to convex functions that include p-norms. This leads to algorithms that converge to high accuracy solutions by crudely solving a sequence of symmetric residual problems. I will then briefly describe several recent and ongoing projects, including p-norm regression using m1/3 linear system solves, p-norm flow in undirected unweighted graphs in almost-linear time, and further improvements to the dependence on p in the runtime.

Almost-Linear-Time Algorithms for Markov Chains and New Spectral Primitives for Directed Graphs

We provide almost-linear time algorithms for computing various fundamental quantities associated with random walks on directed graphs, including the stationary distribution, personalized PageRank vectors, hitting times between vertices, and escape probabilities. Previous algorithms for these problems either invoke a general purpose linear system solver or depend polynomially on condition number related quantities such as mixing time or restart probabilities.

Our result is based up several tools that may provide useful for studies into directed spectral graph theory, including: (1) a reduction from solving linear systems in arbitrary strongly connected directed graphs to Eulerian graphs; (2) spectral approximations of directed Eulerian graphs and algorithms for producing such sparse approximations; (3) approximate squaring / Gaussian elimination on Laplacians of directed graphs.

Merging Continuous and Discrete Algorithms via Graph Laplacians

Algorithmic analyses of large matrices and networks are increasingly reliant on high-level primitives such as eigensolvers and optimization packages. Over the past three decades, the study of efficient solvers for a class of structured matrices, graph Laplacians, led to progresses on fundamental problems in scientific computing, combinatorial optimization, and data structures. In this talk, I will discuss key ideas and structures in hybrid algorithms for graph Laplacians, as well as the increasing reliance on sparsification and preconditioning in more recent developments. I will then describe efforts in bringing these results closer to practice, as well as several new theoretical tools tailored towards hybrid algorithmic frameworks motivated by studying graph Laplacians.

Lp Row Sampling by Lewis Weights

We give a generalization of statistical leverage scores to non-linear settings: Lp-norm Lewis weights. When 0 < p < 2, sampling the rows of an n-by-d matrix A by these weights gives A' with about dlogd rescaled rows of A such that ||Ax||p is close to ||A'x||p for all vectors x.

These weights can be computed iteratively, leading to input-sparsity time algorithms. We also give an elementary proof for the convergence of sampling by L1 Lewis weights that's arguably simpler than proofs of L2 matrix concentration bounds.