Jacob Eisenstein

I'm an Assistant Professor in the School of Interactive Computing at Georgia Tech. I work on machine learning approaches to understanding human language. I'm especially interested in social media, discourse, non-verbal communication and unsupervised learning.


Some recent publications

Sparse Additive Generative Models of Text. Eisenstein, Ahmed and Xing. Proceedings of ICML 2011.
We present a new generative model of text, based on the principle of sparse deviation from a background word distribution. This approach proves effective in supervised, unsupervised, and latent variable settings. [code] [slides]
Discovering Sociolinguistic Associations with Structured Sparsity. Eisenstein, Smith and Xing. Proceedings of ACL 2011.
The space of potential associations between term frequencies and demographic attributes is very large; we apply structured sparsity to identify key associations, revealing previously unknown relationships between language and demographics. [data] [code]
A Latent Variable Model for Geographic Lexical Variation. Eisenstein, O'Connor, Smith and Xing. Proceedings of EMNLP 2010.
We build a model of how language use varies across the United States, using a new corpus of geo-tagged text from Twitter. We can predict the location of Twitter users within 500 kilometers from text alone. [slides of related presentation at LSA 2011]
Learning Document-Level Semantic Properties from Free-text Annotations. Branavan, Chen, Eisenstein, and Barzilay. Journal of Artificial Intelligence Research 34, 2009.
Informal "keyphrase" annotations can be used to predict document-level semantics by modeling the latent annotation paraphrase structure. The resulting system automatically generates pro/con lists from reviews of products and services.
Bayesian Unsupervised Topic Segmentation. Eisenstein and Barzilay. EMNLP 2008.
A new method to segment text and speech transcripts into topically coherent units, using both lexical cohesion and cue phrases. This is the first paper to learn cue phrases without supervision, by combining them with cohesion in a generative Bayesian framework.
Gestural Cohesion for Topic Segmentation. Eisenstein, Barzilay, and Davis. ACL 2008.
Coherent discourse topics contain internally consistent gestural forms, paralleling a similar phenomenon in the distribution of lexical items. Automatically extracted gesture features improve unsupervised topic segmentation on dialogues.

Recent and upcoming talks

  • Lisbon Machine Learning Summer School, July 2012
  • University of Texas, May 2012
  • University of Maryland CLIP colloquium, May 2012
  • Emory University, May 2012
  • Columbia University, March 2012
  • University of Pittsburgh Linguistics Colloquium, November 2011
  • Johns Hopkins Center for Language and Speech Processing, October 2011

Professional service

  • Tutorial co-chair, NAACL 2012
  • Program committees (past 12 months): ACL, NAACL, EACL, EMNLP, WWW, ICML, NIPS, AISTATS
  • Journal reviewing: Computational Linguistics, Journal of Machine Learning Research, Journal of Artificial Intelligence Research
  • (The above takes a lot of time! Please don't be offended if I decline additional reviewing/PC requests.)

Contact

School of Interactive Computing
Georgia Institute of Technology
85 Fifth St. NW
Atlanta, GA 30308
jacobe g mail