My research revolves around the design, analysis, and implementation of provably efficient algorithms and data structures, with a focus on graphs. Graphs represent pairwise relations between vertices using edges, and are widely used in high-performance computing, machine learning, and network science. In many of the more recent applications, steady increases in data sizes make the quadratic or higher running times of many existing algorithms cost-prohibitive. I have designed more efficient graph algorithms and data structures by drawing upon fundamental algorithmic concepts such as batched processing and recursion.
My representative results include:
- New numerical tools for optimization problems on graphs and matrices, leading to some of the current best algorithms for network flows and regression problems.
- New tools for maintaining dynamically changing graphs, and the first data structures for maintaining matchings, network centralities, and edge-connectivities in fully dynamic graphs.
- The current best solvers for graph-structured linear systems, and some of the first sub-quadratic time algorithms for solving more general classes of linear systems.
My primary interests in the near future revolve around questions motivated by practical problems on large graphs, with a focus on better understanding the algorithmic ideas that underlie these fruitful lines of investigation. Below I go into the three lines of investigation outlined above in more detail, while providing pointers to representative publications. A list of my publications in chronological order can be found on the Publications page.
Optimization on Graphs
There is a long history of new algorithmic tools developed through the study of optimization problems on graphs. Examples of such developments include the notion of polynomial time as a benchmark for efficiency, dynamic tree data structures, randomized algorithms, and more recently, the incorporation of numerical tools in combinatorial algorithms. Graphs capture many of the core difficulties in efficient algorithms because label-propagation based path finding is analogous to the propagation of values in local methods such as gradient descent.
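The analogy can be made concrete with Bellman-Ford, where shortest-path labels propagate locally along edges in successive passes, much like values propagate between neighbors in a local iterative method. A minimal sketch:

```python
def bellman_ford(n, edges, source):
    """Single-source shortest paths by repeated local relaxation.

    Each pass propagates distance labels along edges, much like a
    local iterative method propagates values between neighbors.
    """
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0.0
    for _ in range(n - 1):          # n-1 passes suffice on n vertices
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:             # labels have converged early
            break
    return dist

# Triangle with a shortcut: 0 -> 1 -> 2 is cheaper than 0 -> 2 directly.
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 3.0)]
print(bellman_ford(3, edges, 0))  # [0.0, 1.0, 2.0]
```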
I've sought to develop graph algorithms that tightly integrate continuous and discrete tools. This started with incorporating numerical methods into the inner loops of optimization routines, and led to the first nearly-linear time algorithm for approximating undirected max-flows and min-cuts.
- Jonathan A. Kelner, Gary L. Miller, and Richard Peng. Faster Approximate Multicommodity Flow Using Quadratically Coupled Flows. In STOC 2012. (arXiv)
- Gary L. Miller, and Richard Peng. Approximate Maximum Flow on Separable Undirected Graphs. In SODA 2013. (arXiv)
- Richard Peng. Approximate Undirected Maximum Flows in O(m polylog(n)) Time. In SODA 2016. (arXiv).
Trying to improve the convergence rates of these algorithms (which would then also extend to directed graphs) led to the study of minimum L_{p}-norm flows. This formulation captures many well-studied graph problems: the p=1 case is shortest path, the p=∞ case is max-flow/min-cut, and the p=2 case corresponds to spectral algorithms. My collaborators and I gave new numerical methods for L_{p}-norm minimization that are analogs of iterative refinement and preconditioned iterative methods from scientific computing:
- Deeksha Adil, Rasmus J. Kyng, Richard Peng, Sushant Sachdeva. Iterative Refinement for L_{p}-norm Regression. In SODA 2019. (arXiv)
- Rasmus J. Kyng, Richard Peng, Sushant Sachdeva, Di Wang. Flows in Almost Linear Time via Adaptive Preconditioning. In STOC 2019. (arXiv)
- Deeksha Adil, Richard Peng, Sushant Sachdeva. Fast, Provably Convergent IRLS Algorithm for p-norm Linear Regression. In NeurIPS 2019. (arXiv)
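As an illustration of the p=2 case, a minimum L_2-norm (electrical) flow can be computed by solving a single Laplacian linear system. The sketch below uses dense Gaussian elimination for the solve, purely for illustration; the papers above obtain their speedups from much faster structured solvers and iterative refinement:

```python
def solve(A, b):
    """Solve a small dense linear system by Gauss-Jordan elimination with
    partial pivoting (fine for illustration; real Laplacian solvers
    exploit graph structure to run in nearly-linear time)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def electrical_flow(n, edges, s, t):
    """Minimum L_2-norm unit s-t flow on a unit-weight graph.

    Solves the grounded Laplacian system L x = b for vertex potentials,
    then reads the flow off as potential differences across edges."""
    # Build the graph Laplacian, then ground vertex t (drop its row/column).
    L = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1; L[v][v] += 1
        L[u][v] -= 1; L[v][u] -= 1
    keep = [i for i in range(n) if i != t]
    A = [[L[i][j] for j in keep] for i in keep]
    b = [1.0 if i == s else 0.0 for i in keep]
    x = [0.0] * n
    for i, xi in zip(keep, solve(A, b)):
        x[i] = xi
    return {(u, v): x[u] - x[v] for u, v in edges}

# One unit of flow from 0 to 2 on a triangle splits 2/3 direct, 1/3 around:
# that split minimizes the energy (sum of squared edge flows).
flow = electrical_flow(3, [(0, 1), (1, 2), (0, 2)], 0, 2)
```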
Dynamic Graph Data Structures
The gradual accumulation of updates is one of the most common sources of large data. It is therefore natural to ask whether solutions to problems can be maintained dynamically in time sub-linear, or even poly-logarithmic, in the input size. Dynamic graph data structures are notoriously difficult: even maintaining connectivity of graphs undergoing edge insertions and deletions is a problem that has seen more than thirty years of progress. I have actively worked on extending graph-theoretic tools initially developed for the static setting to data structures that succinctly store and efficiently maintain dynamically changing graphs and matrices. My results on dynamic graph data structures include:
- The first sub-linear time data structures for maintaining (1 + ε)-approximate maximum (weighted) matchings: Manoj Gupta, and Richard Peng. Fully Dynamic (1+ε)-Approximate Matchings. In FOCS 2013. (arXiv)
- A derandomized graph partitioning routine that led to the first worst-case subpolynomial (n^{o(1)}) time data structure for maintaining dynamic connectivity and MSTs: Julia Chuzhoy, Yu Gao, Richard Peng, Jason Li, Danupon Nanongkai, Thatchaphol Saranurak. A Deterministic Algorithm for Balanced Cut with Applications to Dynamic Connectivity, Flows, and Beyond. (arXiv).
- The first sublinear-time data structure for maintaining (1+ε) approximations to spectral quantities such as edge centralities: David Durfee, Yu Gao, Gramoz Goranci, and Richard Peng. Fully Dynamic Vertex Spectral Sparsifiers and Applications. In STOC 2019. (arXiv)
- The first nearly-optimal offline data structure for computing c-edge-connectivities: Yang P. Liu, Richard Peng, Mark Sellke. Vertex Sparsifiers for c-Edge Connectivity. (arXiv).
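For contrast with the fully dynamic setting, the insertion-only version of connectivity is easy: a union-find structure handles it in near-constant amortized time per operation. The difficulty in the results above comes from also supporting deletions, which union-find cannot undo cheaply. A baseline sketch:

```python
class IncrementalConnectivity:
    """Union-find (disjoint sets) with path halving and union by size.

    Handles edge *insertions* in near-constant amortized time; the hard
    part of fully dynamic connectivity is supporting *deletions* too,
    since merged components cannot be split apart cheaply."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def insert_edge(self, u, v):
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return
        if self.size[ru] < self.size[rv]:      # union by size
            ru, rv = rv, ru
        self.parent[rv] = ru
        self.size[ru] += self.size[rv]

    def connected(self, u, v):
        return self.find(u) == self.find(v)

dc = IncrementalConnectivity(4)
dc.insert_edge(0, 1)
dc.insert_edge(2, 3)
print(dc.connected(0, 3))  # False
dc.insert_edge(1, 2)
print(dc.connected(0, 3))  # True
```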
A major obstacle facing the design of such dynamic data structures is that errors incurred in dynamic settings tend to be irrecoverable. To address this, many of my results rely on vertex elimination routines that preserve the solutions of flow/cut problems. Such objects allow reducing a problem on the original graph to one on a smaller subset of vertices (known as the terminal vertices), which correspond to the ones involved in the updates and queries. Maintaining this smaller graph under the insertion of terminal vertices in turn leads to an overall sublinear time per update: queries can be answered directly on the smaller graph due to its solution-preservation guarantees, while updates can also be performed on it because the endpoints of affected edges are dynamically included as terminals.
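The vertex elimination idea can be illustrated on Laplacians, where eliminating a non-terminal vertex by one step of Gaussian elimination yields a Schur complement that exactly preserves electrical quantities between the remaining terminals. A small sketch on a unit-weight path, keeping only the endpoints:

```python
def eliminate(M, k):
    """One step of Gaussian elimination: remove row/column k and fold its
    effect into the remaining entries (the Schur complement)."""
    idx = [i for i in range(len(M)) if i != k]
    return [[M[i][j] - M[i][k] * M[k][j] / M[k][k] for j in idx]
            for i in idx]

def schur_onto_terminals(L, terminals):
    """Eliminate all non-terminal vertices from a Laplacian L.

    Eliminating a vertex replaces it by a (weighted) clique on its
    neighbors; the result is a smaller Laplacian that preserves
    electrical quantities, e.g. effective resistances, between the
    surviving terminal vertices."""
    labels = list(range(len(L)))
    M = [row[:] for row in L]
    while True:
        interior = [i for i, v in enumerate(labels) if v not in terminals]
        if not interior:
            return labels, M
        M = eliminate(M, interior[0])
        labels.pop(interior[0])

# Unit-weight path 0-1-2-3; keep only the endpoints {0, 3} as terminals.
L = [[ 1, -1,  0,  0],
     [-1,  2, -1,  0],
     [ 0, -1,  2, -1],
     [ 0,  0, -1,  1]]
labels, S = schur_onto_terminals(L, {0, 3})
# Three unit resistors in series behave like one edge of conductance 1/3,
# so S is the Laplacian of a single edge of weight 1/3 between 0 and 3.
```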
Linear Systems Solvers
The solution of linear systems is a problem of fundamental theoretical importance, and has applications in numerical mathematics, engineering, and science. Despite almost two thousand years of work on solving linear systems, there remain large performance gaps between linear systems solvers and more widely used primitives such as sparse matrix-vector multiplication, balanced search trees, and sorting. This is even the case for highly structured classes of matrices such as graph Laplacians, which are the matrices associated with electrical flows, spectral graph theory, and elliptic problems from scientific computing.
My collaborators and I gave the current best bounds for solving linear systems in undirected Laplacians:
- Ioannis Koutis, Gary L. Miller, and Richard Peng. A Fast Solver for a Class of Linear Systems. In Communications of the ACM, Vol. 55 No. 10, pp. 99-107, 2012.
- Richard Peng, and Daniel A. Spielman. An Efficient Parallel Solver for SDD Linear Systems. In STOC 2014. (arXiv)
- Michael B. Cohen, Rasmus Kyng, Gary L. Miller, Jakub Pachocki, Richard Peng, Anup B. Rao and Shen Chen Xu. Solving SDD Linear Systems in Nearly m log^{1/2} n Time. In STOC 2014.
- Kevin Deweese, John R. Gilbert, Gary L. Miller, Richard Peng, Hao Ran Xu, and Shen Chen Xu. An Empirical Study of Cycle Toggling Based Laplacian Solvers. In CSC 2016. (arXiv)
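Solvers in this line pair a combinatorial preconditioner (such as a low-stretch spanning tree) with a standard iteration. The sketch below shows only the iterative half, plain conjugate gradient on a grounded path Laplacian, with the preconditioner omitted for brevity:

```python
def conjugate_gradient(matvec, b, iters=100, tol=1e-10):
    """Plain conjugate gradient for a symmetric positive-definite system.

    Fast Laplacian solvers pair an iteration like this with a
    combinatorial preconditioner (e.g. built from a low-stretch
    spanning tree); this sketch omits the preconditioner."""
    x = [0.0] * len(b)
    r = b[:]                       # residual b - A x (x = 0 initially)
    p = r[:]
    rs = sum(ri * ri for ri in r)
    for _ in range(iters):
        Ap = matvec(p)
        alpha = rs / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        if rs_new < tol:           # squared residual norm small enough
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

# Laplacian of the unit-weight path 0-1-2 with vertex 2 grounded
# (its row and column removed), pushing one unit of flow from 0 to 2.
A = [[1.0, -1.0], [-1.0, 2.0]]
x = conjugate_gradient(
    lambda v: [sum(a * vi for a, vi in zip(row, v)) for row in A],
    [1.0, 0.0])
# x ≈ [2.0, 1.0]: two unit resistors in series give effective resistance 2.
```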
The resulting better understanding of sparsification was also crucial for extending such solvers beyond undirected graph Laplacians.
- Rasmus J. Kyng, Yin Tat Lee, Richard Peng, Sushant Sachdeva, Daniel A. Spielman. Sparsified Cholesky and Multigrid Solvers for Connection Laplacians. In STOC 2016. (arXiv).
- Michael B. Cohen, Jonathan A. Kelner, John Peebles, Richard Peng, Aaron Sidford, and Adrian Vladu. Faster Algorithms for Computing the Stationary Distribution, Simulating Random Walks, and More. In FOCS 2016. (arXiv).
- Michael B. Cohen, Jonathan A. Kelner, John Peebles, Richard Peng, Anup B. Rao. Aaron Sidford, and Adrian Vladu. Almost-Linear-Time Algorithms for Markov Chains and New Spectral Primitives for Directed Graphs. In STOC 2017. (arXiv).
- Michael B. Cohen, Jonathan A. Kelner, Rasmus J. Kyng, John Peebles, Richard Peng, Anup B. Rao, and Aaron Sidford. Solving Directed Laplacians in Nearly Linear Time through Sparse LU Factorizations. In FOCS 2018. (arXiv)
A recent result by Kyng and Zhang demonstrated that any linear system can be approximated by one with graph-theoretic structure, albeit with multiple labels at each vertex. Motivated in part by such connections, my collaborators and I studied solvers for such structured linear systems with additional geometric properties. Here we focused on the 3D case, which, due to its connection with scientific computing, is arguably one of the most important cases of geometrically embedded linear systems:
- Michael Cohen, Brittany Terese Fasy, Gary L. Miller, Amir Nayyeri, Richard Peng, and Noel J. Walkington. Solving 1-Laplacians of Convex Simplicial Complexes in Nearly Linear Time: Collapsing and Expanding a Topological Ball. In SODA 2015. (conf)
- Rasmus J. Kyng, Richard Peng, Robert Schwieterman, Peng Zhang. Incomplete Nested Dissection. In STOC 2018. (arXiv).
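A core ingredient of nested dissection, which the last paper above builds on, is an elimination ordering that recursively splits the mesh along small separators and orders the separator vertices last; this keeps fill-in low when factoring grid-structured systems. A sketch of such an ordering on a 2D grid (the 3D case is analogous):

```python
def nested_dissection_order(rows, cols):
    """Elimination ordering for a rows x cols grid graph by nested
    dissection: recursively split the grid along a middle row or column
    separator, order both halves first, and the separator last."""
    order = []

    def recurse(r0, r1, c0, c1):
        if r0 > r1 or c0 > c1:
            return
        if r1 - r0 <= 1 and c1 - c0 <= 1:       # tiny block: order directly
            order.extend((r, c) for r in range(r0, r1 + 1)
                                for c in range(c0, c1 + 1))
            return
        if r1 - r0 >= c1 - c0:                  # split along the longer side
            mid = (r0 + r1) // 2
            recurse(r0, mid - 1, c0, c1)
            recurse(mid + 1, r1, c0, c1)
            order.extend((mid, c) for c in range(c0, c1 + 1))
        else:
            mid = (c0 + c1) // 2
            recurse(r0, r1, c0, mid - 1)
            recurse(r0, r1, mid + 1, c1)
            order.extend((r, mid) for r in range(r0, r1 + 1))

    recurse(0, rows - 1, 0, cols - 1)
    return order

order = nested_dissection_order(4, 4)
# The top-level separator (here row 1) is eliminated last.
print(order[-4:])  # [(1, 0), (1, 1), (1, 2), (1, 3)]
```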