zk15 | College of Computing

Zsolt Kira

General Information

Email:

zkira@gatech.edu

Phone:

404-894-3152

Location - Building:

Coda

Location - Room:

S1181B

Roles:

Professor (any rank)

Primary Unit:

School of Interactive Computing

Website URL

https://faculty.cc.gatech.edu/~zk15/

Details

Degrees with subject and Postdoc Experience:

Degree Type | Subject | Year | Institution | Location
B.S. | Computer Engineering and Computer Science (dual) | 2002 | University of Miami | Miami, FL
M.S. | Computer Science | 2008 | Georgia Institute of Technology | Atlanta, GA
Ph.D. | Computer Science | 2010 | Georgia Institute of Technology | Atlanta, GA

Statement of Research Interests:

Our work lies at the intersection of machine learning and artificial intelligence for perception and robotics, focusing on generalization and robustness. Recent work includes robust fine-tuning of vision-language models to preserve out-of-distribution generalization capabilities, open-world generalization, long-horizon RL, 3D processing, and fine-tuning of Multi-Modal Foundation Models into Vision-Language-Action models via supervised fine-tuning, reinforcement learning, and post-training.

Statement of Teaching Interests:

I teach the Deep Learning course, both on campus and through OMSCS. I also teach the Vision-Language Foundation Models course, covering fundamentals and the latest research advancements in multi-modal large language models.

Selection of recent research, scholarly, and creative activities:

M. Andrade, J. Cha, B. Ho, V. Srihari, K. Yadav, and Z. Kira
"Let's Think in Two Steps: Mitigating Agreement Bias in MLLMs with Self-Grounded Verification"
International Conference on Learning Representations (ICLR)
Also appeared in NeurIPS Workshop on Multi-Turn Interactions in Large Language Models (MT-LLM), 2025.

S. Halbe, J. Tian, K.J. Joseph, J.S. Smith, K. Stevo, V.N. Balasubramanian, and Z. Kira
"Grounding Descriptions in Images informs Zero-Shot Visual Recognition"
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2026.

G. Gupta, K. Yadav, Z. Kira, Y. Gal, and R. Aljundi
"Memo: Training Memory-Efficient Embodied Agents with Reinforcement Learning"
International Conference on Neural Information Processing Systems (NeurIPS), 2025.

K. Yadav, Y. Ali, G. Gupta, Y. Gal, Z. Kira
"FindingDory: A Benchmark to Evaluate Memory in Embodied Agents"
NeurIPS Workshop on SPACE in Vision, Language, and Embodied AI (SpaVLE), 2025.

G. Chhablani, X. Ye, R. Grover, M.Z. Irshad, and Z. Kira
"EmbodiedSplat: Personalized Real-to-Sim-to-Real Navigation with Gaussian Splats from a Mobile Device"
International Conference on Computer Vision (ICCV), 2025.
Also appeared in CVPR Embodied AI Workshop, 2025.

A. Szot, B. Mazoure, O. Attia, A. Timofeev, H. Agrawal, D. Hjelm, Z. Gan, Z. Kira, and A.T. Toshev
"From Multimodal LLMs to Generalist Embodied Agents: Methods and Lessons"
Conference on Computer Vision and Pattern Recognition (CVPR), 2025.

C. Huang*, B. Maneechotesuwan*, S. Chopra, and Z. Kira
"FRAMES-VQA: Benchmarking Fine-Tuning Robustness across Multi-Modal Shifts in Visual Question Answering"
Conference on Computer Vision and Pattern Recognition (CVPR), 2025.

C. Huang, J. Tian, B. Maneechotesuwan, S. Chopra, and Z. Kira
"Directional Gradient Projection for Robust Fine-tuning of Foundation Models"
International Conference on Learning Representations (ICLR), 2025.
