Jacob Eisenstein

I'm an Assistant Professor in the School of Interactive Computing at Georgia Tech, where I lead the Computational Linguistics Laboratory. I work on machine learning approaches to understanding human language. I'm especially interested in non-standard language, discourse, computational social science, and statistical machine learning.


Some recent publications

Part-of-speech tagging for historical English. Yang and Eisenstein. NAACL 2016.
A comparison of domain adaptation and normalization techniques for part-of-speech tagging on the Penn Parsed Corpus of Historical English.
A latent variable recurrent neural network for discourse relation language models. Ji, Haffari, and Eisenstein. NAACL 2016.
A generative neural model of text and shallow discourse relations, yielding state-of-the-art performance on relation prediction in the PDTB and Switchboard, as well as discourse-informed language models.
Audience-modulated variation in online social media. Pavalanathan and Eisenstein. American Speech, May 2015.
Twitter users modulate their use of non-standard lexical items according to the intended audience, with non-standard language used most in strong local ties. [preprint]
Systematic Patterning in Phonologically-Motivated Orthographic Variation. Eisenstein. Journal of Sociolinguistics, April 2015.
Patterns of phonological variation find their way into writing at a surprisingly deep level, with written analogues of spoken variables reflecting syntactic and phonological conditioning. [preprint]
One Vector is Not Enough: Entity-Augmented Distributed Semantics for Discourse Relations. Ji and Eisenstein. TACL, volume 3 (2015).
A two-pass recursive neural network for identifying implicit discourse relations in the Penn Discourse Treebank. To be presented at EMNLP 2015.
“You're Mr. Lebowski, I'm The Dude”: Inducing address term formality in signed social networks. Krishnan and Eisenstein. NAACL 2015 (best student paper!).
Finding address terms in text, and inducing their formality in a joint probabilistic model of content and social network structure.
Unsupervised multi-domain adaptation with feature embeddings. Yang and Eisenstein. NAACL 2015.
A simple, effective approach to unsupervised domain adaptation in NLP applications. Also, the first algorithm for unsupervised domain adaptation across a space of many multi-attribute domains. [code]

Recent and upcoming presentations

  • Workshop on Data-Driven Approaches to Networks and Language (keynote), Lyon, May 2016.
  • AAAI Spring Symposium on Observational Studies through Social Media and Other Human-Generated Content, Stanford, March 2016.
  • Text as Data Speaker Series, NYU, February 2016.
  • Social Science Data Analytics Series, Michigan State University, January 2016. [slides]
  • LSA Workshop on "Preparing your Corpus for Archival Storage", January 2016.
  • Columbia University, IGERT Distinguished Speaker Series, November 2015.
  • Bloomberg, invited speaker, November 2015.
  • Text as Data Conference, NYU, October 2015.
  • TextLink Workshop: Identification and Annotation of Discourse Relations in Spoken Language (Invited Keynote), Saarbrücken, October 2015.
  • University of Copenhagen, September 2015.
  • Empirical Methods in Natural Language Processing, Lisbon, September 2015. Invited talk at the LSDSem workshop. [slides]
  • Jelinek Memorial Summer Workshop on Language Technology, Seattle, July-August 2015. [slides on discourse modeling]
  • NAACL, Denver, June 2015. Invited keynote at the SocialNLP workshop. [slides]
  • International Conference on Learning Representations, San Diego, May 2015.

Professional service

  • Co-chair: 2016 EMNLP Workshop on NLP and Computational Social Science
  • Co-chair: 2014 ACL Workshop on Language Technologies and Computational Social Science
  • Co-chair: 2013-2015 Atlanta Workshop on Computational Social Science
  • Area chair: NAACL 2016, ACL 2014, EACL 2013
  • Student research workshop, faculty advisor: NAACL 2016
  • Student awards coordinator: ICML 2013
  • Student volunteer chair: NAACL 2013
  • Tutorial co-chair: NAACL 2012
  • Program committees (past 12 months): ACL, CoNLL, EMNLP, ICML, NAACL, NIPS. All these venues are open-access (OA).
  • Editorial boards: Linguistic Issues in Language Technology (OA), Language Variation series at Language Science Press (OA)
  • Journal reviewing: Communications of the ACM, Computational Linguistics (OA), Journal of Machine Learning Research (OA), Journal of Artificial Intelligence Research (OA), Machine Learning Journal, Transactions of the Association for Computational Linguistics (OA), Journal of the American Statistical Association, Language in Society, Proceedings of the National Academy of Sciences, Digital Scholarship in the Humanities, ...
All this reviewing takes a lot of time! Please don't be offended if I decline additional requests, particularly from non-OA venues.


School of Interactive Computing
Georgia Institute of Technology
85 Fifth St. NW
Atlanta, GA 30308
Admin: Cynthia Bryant, 404 894 3807
Most preferred: jacobe (at) gatech (dot) edu