Joy
Arulraj

General Information

Email:
arulraj@gatech.edu
Phone:
(404) 385-6362
Location - Building:
KACB
Location - Room:
3324
Roles:
Professor (any rank)
Primary Unit:
College of Computing

Details

Degrees with subject and Postdoc Experience:
Degree Type
Ph.D.
Subject
Computer Science
Year
2018
Institution
Carnegie Mellon University
Location
Pittsburgh, PA, USA
Degree Type
M.S.
Subject
Computer Sciences
Year
2013
Institution
University of Wisconsin, Madison
Location
Madison, WI, USA
Degree Type
B.E.
Subject
Computer Science and Engineering
Year
2011
Institution
College of Engineering, Guindy
Location
Chennai, India
Statement of Research Interests:

My research focuses on designing database systems that let users easily query unstructured data, especially video and large textbooks, and quickly turn it into reliable insights. I lead the development of EvaDB, an AI-relational DBMS for scalable video analytics that brings classic DBMS ideas (cost-based optimization, reuse, and physical design) to ML-heavy pipelines. EvaDB accelerates exploratory analytics by materializing and reusing expensive model outputs and by optimizing deep model execution with model ensembles that balance accuracy and throughput and with chunk-level plan selection, which assigns different models to different video segments while pruning the search space to keep optimization lightweight. More recently, I have been developing TokenSmith, a local-first database system for course materials that lets students query textbooks, lecture slides, and notes and receive fast, cited answers directly on their own machines. TokenSmith treats retrieval-augmented generation (RAG) as a data-management problem, combining local LLMs with indexing, latency-aware retrieval, and semantic caching to optimize the end-to-end query processing pipeline while keeping latency low and preserving privacy. Looking ahead, I aim to build AI-native data systems that make unstructured data as easy to query as relational data, delivering interactive performance, grounded answers, and robust behavior. Code: https://github.com/georgia-tech-db/evadb, https://github.com/georgia-tech-db/tokensmith

Statement of Teaching Interests:

My teaching philosophy is to help students learn systems by building real systems. I design courses that move beyond conceptual familiarity to operational understanding: students should be able to reason about correctness, performance, and tradeoffs, and then validate those ideas in running code. To that end, I am developing a new two-part, programming-centric sequence on building database systems: Database System Implementation (CS 4420/6422) and Advanced Database System Implementation (CS 4423/6423). Across two semesters, students incrementally implement a complete pedagogical DBMS, an approach inspired by hands-on systems courses at institutions such as CMU, TU Munich, and MIT, but broader in scope. A key innovation in this sequence is that all programming assignments are built around BuzzDB (https://github.com/jarulraj/buzzdb), an academic DBMS designed to provide immediate, actionable feedback while reinforcing professional engineering practices. Students develop and test core components such as buffer management, indexing, query execution, logging and recovery, concurrency control, and query optimization. The assignments place strong emphasis on both correctness and performance: BuzzDB integrates automated unit testing and memory-safety debugging tools to help students learn disciplined systems development. I continually refresh course materials to reflect modern practice and to lower barriers for students entering from diverse backgrounds. This commitment to teaching and curriculum development has been recognized through the Class of 1969 Teaching Fellowship.

Selection of recent research, scholarly, and creative activities:

1. G. T. Kakkar, J. Cao, A. Sengupta, J. Arulraj, H. Kim. “Aero: Adaptive Query Processing of ML Queries.” ACM SIGMOD 2025.
2. R. Wu, P. Chunduri, A. Payani, X. Chu, J. Arulraj, K. Rong. “SketchQL: Video Moment Querying with a Visual Query Interface.” ACM SIGMOD 2024.
3. X. Liu, J. Arulraj, A. Orso. “A Framework for Inferring Properties of User-Defined Functions.” ICSE 2024.
4. J. Bang, G. Kakkar, S. Mitra, J. Arulraj. “SEIDEN: Revisiting Query Processing in Video Database Systems.” PVLDB 2023.
5. J. Arulraj, A. Pavlo. “Non-Volatile Memory Database Management Systems.” Synthesis Lectures on Data Management, 2019.