I am a third-year Ph.D. student in Computer Science at the Georgia Institute of Technology, working at SSLAB and advised by Taesoo Kim. I am currently working on systems for graph neural networks with Anand Iyer. My research interests include systems for graph processing, big data analytics, and distributed computing.
Georgia Institute of Technology
Aug. 2019-present
The Chinese University of Hong Kong (CUHK)
Aug. 2017-May 2019
Sun Yat-Sen University (SYSU)
*A joint program offered by SYSU and CUHK.
Sep. 2015-Jun. 2017
Existing systems for processing static GNNs either do not support dynamic GNNs or do so inefficiently. In this project, we are building a system that supports dynamic GNNs efficiently. Based on the observation that existing proposals for dynamic GNN architectures combine techniques for structural and temporal information encoding independently, we propose novel techniques that enable cross-optimizations across various tasks such as traffic forecasting, anomaly detection, and epidemiological forecasting.
May 2021-present
Current systems for processing dynamic graphs suffer from several problems: low throughput of updates to the graph structure, high latency before new graph updates are reflected in the algorithmic result, and the large storage space needed for the dynamic graph. To tackle these problems, we proposed Cytom, a cell-based streaming graph processing engine for dynamic graphs, which is built on a subgraph-centric graph representation using a cell-based abstraction. This approach effectively reduces the storage overhead compared with state-of-the-art systems and allows updates to the graph structure to be applied in a highly parallel manner.
Jan. 2020-Jul. 2021
In this project, we modeled the input program as a hierarchical data flow graph (HDFG) in order to perform a set of graph-based operations and transformations for automatic optimization and parallelization. Automatic type inference was enabled based on both static and dynamic analysis. We also extended these techniques to data analytics applications that use the Pandas library. Moreover, we designed and implemented a set of optimization rules that rewrite input code snippets to be more efficient in terms of I/O performance, memory footprint, and computational workload.
Jan. 2020-May 2021
We proposed a system for serving complex ML pipelines within end-to-end latency constraints. My contribution to this project was container-based model development, which provides environment and resource isolation across models.
Sep. 2019-Dec. 2019
We added support for distributed online analytical processing (OLAP) on Husky, a general-purpose distributed computing system developed by the system lab at CUHK. Specifically, we used the Husky platform to implement the "By-layer" cubing algorithm from Apache Kylin to build data cubes in a distributed manner. We also sped up query processing by optimizing the query parsing and execution process. Code repo: https://github.com/husky-team/husky-kylin.
May 2018-Apr. 2019