My research revolves around the design, analysis, and implementation of provably efficient algorithms and data structures, with a focus on graphs. Graphs represent pairwise relations between vertices using edges, and are widely used in high-performance computing, machine learning, and network science. In many recent applications, steady increases in data sizes make the quadratic or higher running times of many existing algorithms cost-prohibitive. I have designed more efficient graph algorithms and data structures by drawing upon fundamental algorithmic concepts such as batched processing and recursion.

My representative results include:

These results led to further studies of key intermediate structures, including graph sparsification, generalized preconditioning, and wider classes of structured linear systems. They opened new avenues of exploration in scientific computing, parallel algorithms, and network science, and also motivated data-driven studies in collaboration with researchers working in high performance computing and complex networks.

My primary interests in the near future revolve around questions motivated by practical problems on large graphs, with a focus on better understanding the algorithmic ideas that underlie these fruitful lines of investigation. Below I discuss the three lines of investigation outlined above in more detail, with pointers to representative publications. A list of my publications in chronological order can be found on the Publications page.

Optimization on Graphs

There is a long history of new algorithmic tools developed through the study of optimization problems on graphs. Examples of such developments include the notion of polynomial time as a benchmark for efficiency, dynamic tree data structures, randomized algorithms, and more recently, the incorporation of numerical tools into combinatorial algorithms. Graphs capture many of the core difficulties in efficient algorithms because label-propagation based path finding is analogous to the propagation of values in local methods such as gradient descent.

I've sought to develop graph algorithms that tightly integrate continuous and discrete tools. This started with incorporating numerical methods into the inner loops of optimization routines, and led to the first nearly-linear time algorithm for approximating undirected max-flows and min-cuts.

Trying to improve the convergence rates of these algorithms (which would also then extend to directed graphs) led to the study of minimum Lp-norm flows. This formulation captures many well-studied graph problems: the p=1 case is shortest path, the p=∞ case is max-flow/min-cut, while the p=2 case corresponds to spectral algorithms. My collaborators and I gave new numerical methods for Lp-norm minimization that are analogs of iterative refinement and preconditioned iterative methods from scientific computing:
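As a concrete illustration of this formulation, the sketch below computes minimum Lp-norm flows on a small hypothetical graph (two 2-hop paths plus a direct edge, all names and demands illustrative). The p=2 minimizer comes from a Laplacian linear system, while other p values are handed to a generic constrained optimizer; this is a toy demonstration, not any of the algorithms discussed here.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical instance: two 2-hop paths from vertex 0 to vertex 3,
# plus a direct edge (0, 3).  Route one unit of flow from 0 to 3.
edges = [(0, 1), (1, 3), (0, 2), (2, 3), (0, 3)]
n, m = 4, len(edges)

# Edge-vertex incidence matrix: row e has -1 at the tail, +1 at the head.
B = np.zeros((m, n))
for e, (u, v) in enumerate(edges):
    B[e, u], B[e, v] = -1.0, 1.0

d = np.array([-1.0, 0.0, 0.0, 1.0])    # demand: source 0, sink 3

# p = 2: the minimizer solves a Laplacian linear system, f = B L^+ d.
L = B.T @ B                            # graph Laplacian
f2 = B @ np.linalg.pinv(L) @ d

# General p: solve min ||f||_p^p subject to B^T f = d numerically.
def min_p_flow(p):
    cons = {"type": "eq", "fun": lambda f: B.T @ f - d}
    res = minimize(lambda f: np.sum(np.abs(f) ** p), x0=f2,
                   constraints=[cons], method="SLSQP")
    return res.x

f4 = min_p_flow(4)       # larger p pushes toward the min-congestion flow
print(np.round(f2, 3))   # ≈ [0.25 0.25 0.25 0.25 0.5]
```

For this instance the p=2 flow splits a quarter unit across each 2-hop path and half a unit on the direct edge, while larger p spreads the flow more evenly to reduce the maximum congestion.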

From an optimization perspective, these results exhibit a surprising phenomenon: Lp-norm minimization can be solved in ways similar to the p=2 case, which are equivalent to solving systems of linear equations. Combining these new iterative methods with graph embedding and partitioning tools led to almost-linear time algorithms for minimizing Lp-norm flows to high accuracy. Recently they were also used to give a faster algorithm for unit capacity max-flow / bipartite matching (the p=∞ case) that runs in m^{11/8} = m^{1.375} time. The iterative refinement algorithm has also been used to improve the iteration counts of high accuracy Lp-norm regression algorithms.
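The p=2 prototype of iterative refinement can be sketched in a few lines: an inexpensive, inaccurate solver applied repeatedly to the current residual drives the error down geometrically. The Jacobi-style inner "solver" and the random well-conditioned matrix below are illustrative stand-ins, not the preconditioners used in the actual results.

```python
import numpy as np

# Classical iterative refinement for A x = b: correct the current iterate
# using an approximate solve on the residual.  The diagonal (Jacobi-style)
# inner solver is a deliberately crude stand-in for a real preconditioner.
rng = np.random.default_rng(0)
n = 50
A = np.eye(n) * 4 + rng.standard_normal((n, n)) * 0.02  # well-conditioned
A = (A + A.T) / 2                                       # symmetrize
b = rng.standard_normal(n)

approx_solve = lambda r: r / np.diag(A)   # cheap, inaccurate inner solve

x = np.zeros(n)
for _ in range(60):
    r = b - A @ x              # current residual
    x = x + approx_solve(r)    # refine with an approximate correction

print(np.linalg.norm(b - A @ x))  # residual shrinks toward machine precision
```

Each pass multiplies the error by a fixed factor below 1, so a constant-quality inner solver yields high accuracy after logarithmically many refinement steps.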

Dynamic Graph Data Structures

The gradual accumulation of updates is one of the most common sources of large data. Therefore, it is natural to study whether solutions to problems can be maintained dynamically in time sub-linear, or even poly-logarithmic, in the input size. Dynamic graph data structures are notoriously difficult: even maintaining connectivity of graphs undergoing edge insertions and deletions is a problem that has seen more than thirty years of progress. I have actively worked on extending graph theoretic tools initially developed for the static setting to data structures that succinctly store and efficiently maintain dynamically changing graphs and matrices. My results on dynamic graph data structures include:
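To see where the difficulty lies, note that insertions alone are easy: a union-find structure answers connectivity queries in near-constant amortized time. It is supporting deletions as well that has driven the decades of work mentioned above. A minimal insertion-only sketch:

```python
# Incremental (insertion-only) connectivity via union-find with path
# compression.  Deletions would invalidate this representation, which is
# exactly what makes fully dynamic connectivity hard.
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, u):
        # Path halving: point visited nodes closer to the root.
        while self.parent[u] != u:
            self.parent[u] = self.parent[self.parent[u]]
            u = self.parent[u]
        return u

    def union(self, u, v):        # insert edge (u, v)
        self.parent[self.find(u)] = self.find(v)

    def connected(self, u, v):    # connectivity query
        return self.find(u) == self.find(v)

uf = UnionFind(5)
uf.union(0, 1); uf.union(3, 4)
print(uf.connected(0, 1), uf.connected(0, 3))  # True False
uf.union(1, 3)
print(uf.connected(0, 4))                      # True
```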

A major obstacle facing the design of such dynamic data structures is that errors incurred in dynamic settings tend to be irrecoverable. To address this, many of my results rely on vertex elimination routines that preserve the solutions of flow/cut problems. Such objects allow the reduction of a problem on the original graph to one on a smaller subset of vertices (known as the terminal vertices), which correspond to the ones involved in the updates and queries. Maintaining this smaller graph under the insertion of terminal vertices in turn leads to an overall sublinear time per update: queries can be directly answered on this smaller graph due to its solution-preservation guarantees, while updates can also be performed on it due to the dynamic inclusion of the endpoints of the affected edges.
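The standard algebraic form of such vertex elimination is the Schur complement: eliminating the non-terminal block of a graph Laplacian yields a smaller Laplacian on the terminals that exactly preserves electrical flows between them. The tiny illustrative instance below is a unit-weight path 0 - 1 - 2 with terminals {0, 2}.

```python
import numpy as np

# Vertex elimination via the Schur complement.  Illustrative instance:
# path 0 - 1 - 2 with unit-weight edges; eliminate internal vertex 1.
L = np.array([[ 1.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  1.0]])

T = [0, 2]   # terminal vertices (touched by updates and queries)
F = [1]      # internal vertices to eliminate

L_TT = L[np.ix_(T, T)]
L_TF = L[np.ix_(T, F)]
L_FF = L[np.ix_(F, F)]

SC = L_TT - L_TF @ np.linalg.inv(L_FF) @ L_TF.T
print(SC)
# Result: the Laplacian of a single edge {0, 2} of weight 1/2 -- the two
# unit resistors in series combine into effective resistance 2.
```

Queries between terminals can then be answered on the two-vertex graph alone, which is the solution-preservation guarantee described above in its simplest form.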

Linear Systems Solvers

The solution of linear systems is a problem of fundamental theoretical importance, and has applications in numerical mathematics, engineering, and science. Despite almost two thousand years of work on solving linear systems, there remain large performance gaps between linear systems solvers and more widely used primitives such as sparse matrix-vector multiplication, balanced search trees, and sorting. This is even the case for highly structured classes of matrices such as graph Laplacians, which are the matrices associated with electrical flows, spectral graph theory, and elliptic problems from scientific computing.
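For concreteness, a graph Laplacian is L = D - A, the degree matrix minus the adjacency matrix, and solving Lx = b yields the electrical potentials when b specifies injected currents. The sketch below checks this on a unit-weight 4-cycle, where the effective resistance across the cycle is 1 (two 2-edge paths in parallel).

```python
import numpy as np

# Graph Laplacian of a unit-weight 4-cycle 0 - 1 - 2 - 3 - 0.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

# Inject one unit of current at vertex 0, extract it at vertex 2.
b = np.array([1.0, 0.0, -1.0, 0.0])
x = np.linalg.pinv(L) @ b     # potentials (L is singular, so use pinv)

# Potential drop = effective resistance: two paths of resistance 2 in
# parallel give 2 * 2 / (2 + 2) = 1.
print(x[0] - x[2])  # ≈ 1.0
```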

My collaborators and I gave the current best bounds for solving linear systems in undirected Laplacians:

At the core of these algorithms is the use of graph theory to bound the quality of approximation between two graph Laplacian matrices, and the generation of sparser graphs that are numerically close to the original one. Here a variety of tools from combinatorial graph theory, and more importantly, recursion, were used to resolve the chicken-and-egg issue of using solvers to compute appropriate sampling probabilities, and needing such sampling probabilities to generate the sparse approximations used in solvers.
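The sampling probabilities in question are effective resistances (statistical leverage scores): keeping each edge with probability proportional to its weight times its effective resistance, and reweighting, yields a spectrally close sparsifier with high probability. The sketch below computes them on a complete graph K_6, where every edge has resistance exactly 2/n; note that the dense pseudoinverse used here is precisely the expensive computation the recursive scheme avoids.

```python
import numpy as np

# Effective resistances of a complete graph K_6 via the pseudoinverse of
# its Laplacian.  In K_n every edge has resistance 2/n, and for any
# connected graph the leverage scores w_e * R_e sum to exactly n - 1.
n = 6
A = np.ones((n, n)) - np.eye(n)         # complete graph K_6
L = np.diag(A.sum(axis=1)) - A
Lp = np.linalg.pinv(L)

edges = [(u, v) for u in range(n) for v in range(u + 1, n)]
R = np.array([Lp[u, u] + Lp[v, v] - 2 * Lp[u, v] for (u, v) in edges])

print(np.allclose(R, 2 / n))   # True: each edge of K_n has resistance 2/n
print(round(R.sum(), 6))       # 5.0: leverage scores sum to n - 1
```

Since the scores sum to n - 1, sampling O(n log n / eps^2) edges by these probabilities suffices, which is the quantitative heart of spectral sparsification.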

The resulting better understanding of sparsification was also crucial for extending such solvers beyond undirected graph Laplacians.

Such extensions include connection Laplacians and directed Laplacians. The former arises in signal processing, and has unitary matrices instead of scalar weights on edges; the latter is associated with random walks on directed graphs and irreversible Markov chains.

A recent result by Kyng and Zhang demonstrated that any linear system can be approximated by one with graph theoretic structures, albeit with multiple labels at each vertex. Motivated in part by such connections, my collaborators and I studied solvers for such structured linear systems with additional geometric properties. Here we focused on the 3D case, which due to its connection with scientific computing, is arguably one of the most important cases of geometrically embedded linear systems:

The hardness of such systems in the absence of geometric structures means this direction can be viewed both as an attempt towards faster solvers for all linear systems, and as an avenue towards classifying the difficulties of structured numerical problems arising from scientific computing.