At Georgia Tech
My main area of current research is word embeddings, exploring (a) the challenges presented to text representation techniques by difficult lexical forms; (b) how sub-word information can be used to create better word representations, as well as predict representations for words unseen during training; and (c) how to incorporate semantic analysis into embeddings so that they exhibit better similarity properties.
2021: Elazar Gershuni and Yuval Pinter. Restoring Hebrew Diacritics Without a Dictionary. Preprint.
2020: Yuval Pinter, Cassandra L. Jacobs, Max Bittker. NYTWIT: A Dataset of Novel Words in the New York Times. COLING. PDF. Data.
2020: Yuval Pinter, Cassandra L. Jacobs, Jacob Eisenstein. Will it Unblend?. Findings of EMNLP. PDF. Video (lay audience).
2019: Nicolas Garneau, Jean-Samuel Leboeuf, Yuval Pinter, Luc Lamontagne. Attending Form and Context to Generate Specialized Out-of-Vocabulary Words Representations. Preprint.
2019: Yuval Pinter, Marc Marone, Jacob Eisenstein. Character Eyes: Seeing Language through Character-Level Taggers. Blackbox NLP Workshop. PDF. Slides. Code.
In June 2019 I gave a talk about this project at CUNY, as well as a (different) talk in December 2019 - February 2020 at Amazon Research, at the Tel Aviv University Machine Learning Seminar, and at AISC (video). Slides from the academic venues available upon request.
2018: Yuval Pinter and Jacob Eisenstein. Predicting Semantic Relations using Global Graph Properties. Proceedings of EMNLP. PDF. Blog post. Talk. Slides. Code.
In December 2018 I gave talks about this project in Israel, at: Technion - IIT, Ben-Gurion University, and Yahoo Research. Slides from the academic venues available upon request.
2017: Yuval Pinter, Robert Guthrie, Jacob Eisenstein. Mimicking Word Embeddings using Subword RNNs. Proceedings of EMNLP. PDF. Blog post. Talk. Slides. Code.
In December 2017 I gave talks about this project in Israel, at: Bar Ilan University, Technion - IIT, Amazon Research, and Google Research. Slides from the academic venues available upon request.
In addition to my main research projects, a critical look at work inspecting the ability of Attention mechanisms to explain model behavior got me into research on model interpretability:
2020: Sarthak Jain, Sarah Wiegreffe, Yuval Pinter, Byron Wallace. Learning to Faithfully Rationalize by Construction. Proceedings of ACL. PDF.
In May 2020 I gave a talk about this project "at" Stony Brook University's NLP seminar. Slides available upon request.
2019: Sarah Wiegreffe*, Yuval Pinter*. Attention is not not Explanation. Proceedings of EMNLP-IJCNLP. PDF. Blog posts: public facing; professional. Code. Slides. Talk (Sarah's).
The December 2019 - May 2020 talks mentioned above cover this work as well.
A replication project done for the graduate seminar in Computational Social Science yielded a paper about language selection based on political stance.
2018: Ian Stewart*, Yuval Pinter*, Jacob Eisenstein. Sí o no, ¿què penses? Catalonian Independence and Linguistic Identity on Social Media. Proceedings of NAACL-HLT. PDF. Code. Slides. Live Tweets.
I later presented this paper at AACL 2018.
Most of my research at Yahoo involved analysis of the web search process, specifically within the domain of Community Question Answering sites (like Yahoo Answers, etc.), from both an Information Retrieval (IR) perspective and a linguistic / NLP-y one.
2016: Yuval Pinter, Roi Reichart, Idan Szpektor. Syntactic Parsing of Web Queries with Question Intent. Proceedings of NAACL-HLT. PDF. Blog post. Talk. Data Release Notes. Data available to researchers via the Webscope program.
2016: Gilad Tsur, Yuval Pinter, Idan Szpektor, David Carmel. Identifying Web Queries with Question Intent. Proceedings of WWW. PDF.
2014: David Carmel, Avihai Mejer, Yuval Pinter, Idan Szpektor. Improving Term Weighting for Community Question Answering Search using Syntactic Analysis. Proceedings of CIKM. PDF. Blog post.
Our team, with collaborators from academia and government, started the TREC LiveQA Challenge, which has so far been run three times with considerable success. In short: participants write a server which has to answer real questions coming into the Yahoo Answers feed, within 1 minute, for one entire day. Details in the overview papers:
2017: Asma Ben Abacha, Eugene Agichtein, Yuval Pinter, Dina Demner-Fushman. Overview of the Medical Question Answering Task at TREC 2017 LiveQA. PDF.
2016: Eugene Agichtein, David Carmel, Dan Pelleg, Yuval Pinter, Donna Harman. Overview of the TREC 2016 LiveQA Track. PDF.
2015: Eugene Agichtein, David Carmel, Donna Harman, Dan Pelleg, Yuval Pinter. Overview of the TREC 2015 LiveQA Track. PDF. Blog post.
Code for creating a server.
Code for running a challenge.
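To give a feel for the one-minute constraint, here is a minimal sketch of a time-boxed answer function. It is not the actual LiveQA server code or protocol: it assumes the limit applies per question, and find_answer, answer_within_budget, and FALLBACK_ANSWER are hypothetical names made up for illustration.

```python
import concurrent.futures
import time

TIME_LIMIT_SECONDS = 60  # the one-minute budget (assumed here to be per question)
FALLBACK_ANSWER = ""     # hypothetical: what to return when time runs out


def find_answer(question):
    """Placeholder for a real QA pipeline (retrieval, ranking, and so on)."""
    time.sleep(0.01)
    return "Placeholder answer to: " + question


def answer_within_budget(question, answer_fn=find_answer, limit=TIME_LIMIT_SECONDS):
    """Run answer_fn in a worker thread and give up once the limit passes."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(answer_fn, question)
    try:
        return future.result(timeout=limit)
    except concurrent.futures.TimeoutError:
        return FALLBACK_ANSWER
    finally:
        # Do not block on a slow worker; it is left to finish in the background.
        pool.shutdown(wait=False)


print(answer_within_budget("How do I descale a kettle?"))
```

A real participant would wrap something like this in an HTTP handler; the point is only that a late answer is as good as no answer, so the fallback has to be cheap.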
In 2015, I pursued a hypothesis regarding Israeli media bias: Is it possible to predict which news outlet a headline is from, based on relatively few examples and using only simple textual features?
The answer, in so many words, is yes. I presented my preliminary results in:
2015: Yuval Pinter, Oren Persico, Shuki Tausig. Exploring Israeli News Website Bias using Simple Textual Analysis. ISCOL. Extended Abstract (PDF). PPTX (Hebrew).
I have not since followed up on this (but I'm still planning to). The code and data are available here.
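For readers wondering what "simple textual features" might look like in practice, here is a minimal sketch using word counts and a Naive Bayes classifier. This is an assumption for illustration, not necessarily the method used in the abstract above; the headlines and outlet names are invented toy data, and train and predict are hypothetical helper names.

```python
import math
from collections import Counter, defaultdict


def tokenize(headline):
    """Lowercase and split on whitespace: about the simplest textual feature."""
    return headline.lower().split()


def train(examples):
    """Count word occurrences per outlet from (headline, outlet) pairs."""
    word_counts = defaultdict(Counter)
    outlet_counts = Counter()
    for headline, outlet in examples:
        outlet_counts[outlet] += 1
        word_counts[outlet].update(tokenize(headline))
    return word_counts, outlet_counts


def predict(headline, word_counts, outlet_counts):
    """Return the outlet with the highest add-one-smoothed log-probability."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total = sum(outlet_counts.values())
    best_outlet, best_score = None, -math.inf
    for outlet, n in outlet_counts.items():
        denom = sum(word_counts[outlet].values()) + len(vocab)
        score = math.log(n / total)
        for word in tokenize(headline):
            score += math.log((word_counts[outlet][word] + 1) / denom)
        if score > best_score:
            best_outlet, best_score = outlet, score
    return best_outlet


# Invented toy headlines and outlet names, for illustration only.
toy_examples = [
    ("markets rally as shekel gains", "outlet_a"),
    ("shekel steady as markets open", "outlet_a"),
    ("coalition talks stall again", "outlet_b"),
    ("opposition slams coalition budget", "outlet_b"),
]
word_counts, outlet_counts = train(toy_examples)
print(predict("markets open higher", word_counts, outlet_counts))
```

With only a handful of examples per outlet, smoothing matters: unseen words would otherwise zero out an outlet's score entirely.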
Since 2001 I've been looking into the various suggestions to reform the Hebrew script. Here's one of my latest installations, and here's a video of a talk I gave about the subject in 2011. All in Hebrew.
I have some other things up my sleeve. Once I get them into presentable form I'll add them here - stay tuned.
Days of Linguistics
A paper I co-authored, about the dynamics of group and authority in Hebrew language communities on online editorial projects, is available online:
2018: Carmel Vaisman, Illan Gonen, and Yuval Pinter. Nonhuman Language Agents in Online Collaborative Communities: Comparing Hebrew Wikipedia and Facebook Translations. Discourse, Context & Media. Link.
This work began in 2011, when I contributed to a chapter on the topic in the book Hebrew Online by Carmel Vaisman and Illan Gonen. It's in Hebrew, and you can buy it here from the publisher.
Dr. Vaisman and I gave a talk about this topic at the 2012 meeting of the Israeli Association for the Study of Language and Society. Video.
My MA thesis from 2014, about the semantics of nonveridical BEFORE, is available here (PDF).
I presented parts of it at IGDAL (the first Israeli ling grad student conf) in 2011 with these slides that go right-to-left in each row. It's not a good summary of the completed thesis, though.
In 2010 I presented a modification for graded modality comparison. Slides. Seminar Paper this was based on.
Much smaller-scale explorations, for class projects and seminars: Hypercorrection in Hebrew conjunctions; Linguistic exploration of Text messages in Hebrew (Hebrew); Comparing usage of the Hebrew few/little (PPT, Hebrew); Can the acceptance rate of Hebrew neologisms be predicted using Optimality Theory? (tl;dr - NO.) (Hebrew).
To sum up, here's my Google Scholar page. My Erdős number is at most 4. I haven't been in any feature films, but the TV-lenient version of the Bacon number puts mine also at most 4 (via ), so my Erdős-Bacon number is at most 8.