Some Recent Publications
- Sparse Additive Generative Models of Text. Eisenstein, Ahmed and Xing. Proceedings of ICML 2011.
- We present a new generative model of text, based on the principle of sparse deviation from a background word distribution. This approach proves effective in supervised, unsupervised, and latent variable settings. [code][slides]
- Discovering Sociolinguistic Associations with Structured Sparsity. Eisenstein, Smith and Xing. Proceedings of ACL 2011.
- The space of potential associations between term frequencies and demographic attributes is very large; we apply structured sparsity to identify key associations, revealing previously unknown relationships between language and demographics. [data][code]
- A Latent Variable Model for Geographic Lexical Variation. Eisenstein, O'Connor, Smith and Xing. Proceedings of EMNLP 2010.
- We build a model of how language use varies across the United States, using a new corpus of geotagged text from Twitter. We can predict the location of Twitter users to within 500 kilometers from text alone. [slides of related presentation at LSA 2011]
- Learning Document-Level Semantic Properties from Free-text Annotations. Branavan, Chen, Eisenstein, and Barzilay. Journal of Artificial Intelligence Research 34, 2009.
- Informal "keyphrase" annotations can be used to predict document-level semantics by modeling the latent paraphrase structure of the annotations. The resulting system automatically generates pro/con lists from reviews of products and services.
- Bayesian Unsupervised Topic Segmentation. Eisenstein and Barzilay. EMNLP 2008.
- A new method to segment text and speech transcripts into topically coherent units, using both lexical cohesion and cue phrases. This is the first paper to learn cue phrases without supervision, by combining them with cohesion in a generative Bayesian framework.
- Gestural Cohesion for Topic Segmentation. Eisenstein, Barzilay, and Davis. ACL 2008.
- Coherent discourse topics contain internally consistent gestural forms, paralleling a similar phenomenon in the distribution of lexical items. Automatically extracted gesture features improve unsupervised topic segmentation on dialogues.
Contact
School of Interactive Computing
Georgia Institute of Technology
85 Fifth St. NW
Atlanta, GA 30308
jacobe g mail