I'm a fourth-year Ph.D. student at Georgia Tech advised by Prof. Devi Parikh, and I also work closely with Prof. Dhruv Batra. My research interests lie at the intersection of computer vision and natural language processing. My current research focuses on vision-and-language modeling with deep learning.

Before starting my Ph.D., I received an M.S. degree in Computer Science from the University at Buffalo, where I worked with Prof. Jason Corso. I earned my bachelor's degree from Nanjing University of Posts and Telecommunications, Nanjing, China. Here is my full CV.

Email: jiasenlu at gatech dot edu


News

The PyTorch beta version of the code for our ViLBERT paper has been released!
Our paper on Vision and Language Navigation is accepted by ICLR 2019!
Our paper Visual Curiosity is accepted by CoRL 2018 as an oral presentation!
One paper is accepted by ECCV 2018!
One paper is accepted by CVPR 2018!
The PyTorch version of the code for our NIPS 2017 paper has been released!
One paper is accepted by NIPS 2017!
The Torch version of the code for our CVPR 2017 paper has been released!
One paper is accepted by CVPR 2017! Source code on the way!
I will join Facebook AI Research for an internship in Spring 2017!
One paper is accepted by NIPS 2016!
I joined MetaMind for an internship in Summer 2016, working with Dr. Richard Socher and Dr. Caiming Xiong.
The Torch version of the code for our arXiv paper Hierarchical Co-Attention has been released!
Our new work on visual question answering is posted on arXiv, source code on the way!

Publications

ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks
Jiasen Lu, Dhruv Batra, Devi Parikh, Stefan Lee
NeurIPS, 2019
[pdf] [code]
Emergence of Compositional Language with Deep Generational Transmission.
Michael Cogswell, Jiasen Lu, Stefan Lee, Devi Parikh, Dhruv Batra
arXiv:1904.09067, 2019
[pdf]
Self-Monitoring Navigation Agent via Auxiliary Progress Estimation.
Chih-Yao Ma, Jiasen Lu, Zuxuan Wu, Ghassan AlRegib, Zsolt Kira, Richard Socher, Caiming Xiong
International Conference on Learning Representations (ICLR), 2019
[pdf] [code]
Visual Curiosity: Learning to Ask Questions to Learn Visual Recognition.
(Oral Presentation)
Jianwei Yang*, Jiasen Lu*, Stefan Lee, Dhruv Batra, Devi Parikh. (* = equal contribution)
Conference on Robot Learning (CoRL) 2018.
[pdf]
Graph R-CNN for Scene Graph Generation.
Jianwei Yang*, Jiasen Lu*, Stefan Lee, Dhruv Batra, Devi Parikh. (* = equal contribution)
European Conference on Computer Vision (ECCV) 2018.
[pdf] [code] [poster]
Neural Baby Talk
Jiasen Lu*, Jianwei Yang*, Dhruv Batra, Devi Parikh. (* = equal contribution)
(Spotlight Presentation)
Computer Vision and Pattern Recognition (CVPR), 2018
[pdf] [code]
Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model
Jiasen Lu, Anitha Kannan, Jianwei Yang, Devi Parikh, Dhruv Batra
Neural Information Processing Systems (NIPS) 2017.
[pdf] [code]
ParlAI: A Dialog Research Software Platform
Alexander H. Miller, Will Feng, Adam Fisch, Jiasen Lu, Dhruv Batra, Antoine Bordes, Devi Parikh, Jason Weston
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2017
[pdf] [project]
Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning.
(Spotlight Presentation)
Jiasen Lu*, Caiming Xiong*, Devi Parikh, Richard Socher. (* = equal contribution)
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
[pdf] [more visualization demo] [code] [spotlight talk]
VQA: Visual Question Answering.
Aishwarya Agrawal*, Jiasen Lu*, Stanislaw Antol*, Margaret Mitchell, C. Lawrence Zitnick, Devi Parikh, Dhruv Batra.
International Journal of Computer Vision (IJCV)
[pdf] [code] [project page]
Hierarchical Question-Image Co-Attention for Visual Question Answering.
Jiasen Lu, Jianwei Yang, Dhruv Batra, Devi Parikh.
Neural Information Processing Systems (NIPS) 2016.
[pdf] [code]
VQA: Visual Question Answering.
Stanislaw Antol*, Aishwarya Agrawal*, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C. Lawrence Zitnick, Devi Parikh.
International Conference on Computer Vision (ICCV), 2015
[pdf] [code] [project page]
Human Action Segmentation with Hierarchical Supervoxel Consistency.
Jiasen Lu, Ran Xu and Jason J. Corso
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015
[pdf] [code]
Improving Word Representations via Global Visual Context.
Ran Xu, Jiasen Lu, Caiming Xiong, Zhi Yang, Jason J. Corso.
NIPS workshop on Learning Semantics, 2014
[pdf] [code]

Projects [github]

Implement the deeper LSTM and normalized CNN Visual Question Answering model.

Implement (Convolutional) Deep Structured Semantic Model in Torch, with a new Sparse Linear and Sparse Temporal Convolution layer.


Activities

Reviewer for European Conference on Computer Vision (ECCV) 2018
Reviewer for International Conference on Machine Learning (ICML) 2018
Reviewer for Conference on Computer Vision and Pattern Recognition (CVPR) 2017, 2018
Reviewer for International Conference on Learning Representations (ICLR) 2018
Reviewer for International Conference on Computer Vision (ICCV) 2017
Reviewer for Neural Information Processing Systems (NIPS) 2016, 2017, 2018
Reviewer for IEEE Transactions on Pattern Analysis and Machine Intelligence
Reviewer for IEEE Transactions on Multimedia
Organizer for VQA2.0 Challenge Workshop at CVPR 2017
Organizer for VQA Challenge Workshop at CVPR 2016
Student Volunteer for IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015