Stephen Mussmann

General Information

Email: mussmann@gatech.edu
Phone: 404-894-3152
Location - Building: KACB
Location - Room: 3320
Roles: Professor (any rank)
Primary Unit: School of Computer Science

Details

Degrees with subject and Postdoc Experience:
Degree Type: Postdoctoral Scholar
Subject: Computer Science & Engineering
Year: 2021-2023
Institution: University of Washington
Location: Seattle, WA

Degree Type: Ph.D.
Subject: Computer Science
Year: 2021
Institution: Stanford University
Location: Stanford, CA

Degree Type: B.S.
Subject: Mathematics, Computer Science, Statistics
Year: 2015
Institution: Purdue University
Location: West Lafayette, IN

Statement of Research Interests:

My research focuses on data-centric machine learning, emphasizing often-overlooked aspects of data such as sourcing, selection, annotation, and validation, which critically affect the reliability and usability of ML systems. I combine theoretical and experimental methods to develop conceptual insights with practical relevance. Within data-centric machine learning, I am especially interested in: (1) Active Learning and Experimental Design (methods for selecting the data on which to collect supervision), and (2) Statistical aspects of data algorithms and data-centric ML (e.g., noise, domains, concept shift).

Statement of Teaching Interests:

My teaching interests are focused on machine learning, including introductory ML courses, theoretical ML courses, and data-centric ML. While I enjoy teaching technical concepts through definitions, results, and examples, I am most interested in providing students with a big-picture understanding, which is challenging to gain from books and online resources. I find that adopting a growth mindset for myself is a necessary step toward effective teaching, and I regularly solicit feedback on my methods in order to maximize student learning and engagement.

Selection of recent research, scholarly, and creative activities:

Myopic Bayesian Decision Theory for Batch Active Learning with Partial Batch Label Sampling
Kangping Hu, Stephen Mussmann
https://arxiv.org/abs/2510.09877

Sum Estimation via Vector Similarity Search
Stephen Mussmann, Mehul Smriti Raje, Kavya Tumkur, Oumayma Messoussi, Cyprien Hachem, Seby Jacob
https://arxiv.org/abs/2601.11765 

VOCALExplore: Pay-as-You-Go Video Data Exploration and Model Building
Maureen Daum, Enhao Zhang, Dong He, Stephen Mussmann, Brandon Haynes, Ranjay Krishna, Magdalena Balazinska
VLDB 2024
https://www.vldb.org/pvldb/vol16/p4188-daum.pdf 

LabelBench: A Comprehensive Framework for Benchmarking Adaptive Label-Efficient Learning 
Jifan Zhang*, Yifang Chen*, Gregory Canal, Arnav Das, Gantavya Bhatt, Stephen Mussmann, Yinglun Zhu, Simon Shaolei Du, Kevin Jamieson, Robert D Nowak
DMLR 2024
https://arxiv.org/pdf/2306.09910 

Constants Matter: The Performance Gains of Active Learning
Stephen Mussmann, Sanjoy Dasgupta
ICML 2022
https://proceedings.mlr.press/v162/mussmann22a.html