General Information
My research lies at the intersection of Natural Language Processing and Machine Learning, with a focus on building language models that can reason effectively and act as reliable agents in the world. I contributed to some of the earliest work on fine-tuning language models with reinforcement learning. My current work aims to make these systems more capable, multilingual, and equitable, so that they serve users well across languages and cultures, while also addressing the emerging privacy risks that arise as language models grow more powerful.
I teach courses on Natural Language Processing and Large Language Models, with an emphasis on combining foundational understanding with modern approaches. My goal is to help students understand not only how these models work, but also how to reason about their capabilities, limitations, and societal impacts as they are deployed in real-world applications.
Publications
Do Vision-Language Models Respect Contextual Integrity in Location Disclosure?
Ruixin Yang, Ethan Mendes, Arthur Wang, James Hays, Sauvik Das, Wei Xu, Alan Ritter
ICLR 2026
Tree-based Dialogue Reinforced Policy Optimization for Red-Teaming Attacks
Ruohao Guo, Afshin Oroojlooy, Roshan Sridhar, Miguel Ballesteros, Alan Ritter, Dan Roth
ICLR 2026
Language Models can Self-Improve at State-Value Estimation for Better Search
Ethan Mendes, Alan Ritter
NeurIPS 2025
[Spotlight]
Probabilistic Reasoning with LLMs for Privacy Risk Estimation
Jonathan Zheng, Alan Ritter, Sauvik Das, Wei Xu
NeurIPS 2025
Balancing the Budget: Understanding Trade-offs Between Supervised and Preference-Based Finetuning
Mohit Raghavendra, Junmo Kang, Alan Ritter
ACL 2025
[SAC Award]
Having Beer after Prayer? Measuring Cultural Bias in Large Language Models
Tarek Naous, Michael J. Ryan, Alan Ritter, Wei Xu
ACL 2024
[Best Social Impact Paper Award]