General Information
I work at the intersection of computer architecture, compilers, and systems, with a focus on heterogeneous computing platforms such as GPUs and emerging accelerator-based systems. My research develops architectures, compiler/runtime techniques, and performance modeling to improve efficiency, programmability, and reliability for modern workloads, including machine learning and data-intensive applications. Recent efforts include open-source GPU hardware/software stacks (e.g., RISC-V–based GPUs), GPU simulation/modeling, and system support for new memory and near-data processing ideas.
I teach and develop courses in computer architecture and systems, including processor design and advanced microarchitecture, and I enjoy mentoring students through hands-on, project-driven learning. I am especially interested in teaching GPU architecture and programming stacks, performance evaluation/modeling, and compiler/runtime techniques for heterogeneous systems, connecting core principles to modern ML and accelerator platforms.
Selected Publications

“Inside VOLT: Designing an Open-Source GPU Compiler,” Compiler Construction (CC), 2026. (10.1145/3771775.3786275)
“Scaling GPU-to-CPU Migration for Efficient Distributed Execution on CPU Clusters,” PPoPP, 2026. (10.1145/3774934.3786435)
“Swift and Trustworthy Large-Scale GPU Simulation with Fine-Grained Error Modeling and Hierarchical Clustering,” MICRO, 2025. (10.1145/3725843.3757107)
“SparseWeaver: Converting Sparse Operations as Dense Operations on GPUs for Graph Workloads,” HPCA, 2025. (10.1109/HPCA61900.2025.00108)
“Understanding Performance Implications of LLM Inference on CPU,” IISWC, 2024. (10.1109/IISWC63097.2024.00024)