At Georgia Tech
My main area of current research is word embeddings, exploring (a) how sub-word information can be used to create better word representations, as well as predict representations for words unseen during training; and (b) how to incorporate semantic analysis into embeddings so that they exhibit better similarity properties.
2017: Yuval Pinter, Robert Guthrie, Jacob Eisenstein. Mimicking Word Embeddings using Subword RNNs. Proceedings of EMNLP. PDF. Slides. Code.
Most of my research at Yahoo involved analysis of the web search process, specifically within the domain of Community Question Answering sites (like Yahoo Answers, etc.), from both an Information Retrieval (IR) perspective and a linguistic / NLP-y one.
2016: Yuval Pinter, Roi Reichart, Idan Szpektor. Syntactic Parsing of Web Queries with Question Intent. Proceedings of NAACL-HLT. PDF. Talk. Data Release Notes. Data available to researchers via the Webscope program.
2016: Gilad Tsur, Yuval Pinter, Idan Szpektor, David Carmel. Identifying Web Queries with Question Intent. Proceedings of WWW. PDF.
2014: David Carmel, Avihai Mejer, Yuval Pinter, Idan Szpektor. Improving Term Weighting for Community Question Answering Search using Syntactic Analysis. Proceedings of CIKM. PDF. Blog post.
Our team, with collaborators from academia and government, started the TREC LiveQA Challenge, which has so far been run three times with considerable success. In short: participants write a server that has to answer real questions coming into the Yahoo Answers feed, within 1 minute, for one entire day. Details in the overview papers:
2016: Eugene Agichtein, David Carmel, Dan Pelleg, Yuval Pinter, Donna Harman. Overview of the TREC 2016 LiveQA Track. PDF.
2015: Eugene Agichtein, David Carmel, Donna Harman, Dan Pelleg, Yuval Pinter. Overview of the TREC 2015 LiveQA Track. PDF. Blog post.
Code for creating a server.
Code for running a challenge.
In 2015, I pursued a hypothesis regarding Israeli media bias: Is it possible to predict which news outlet a headline is from, based on relatively few examples and using only simple textual features?
The answer, in so many words, is yes. I presented my preliminary results in:
2015: Yuval Pinter, Oren Persico, Shuki Tausig. Exploring Israeli News Website Bias using Simple Textual Analysis. ISCOL. Extended Abstract (PDF). PPTX (Hebrew).
I have not followed up on this since (but I'm still planning to). The code and data are available here.
Since 2001 I've been looking into the various suggestions to reform the Hebrew script. Here's one of my latest installations, and here's a video of a talk I gave about the subject in 2011. All in Hebrew.
I have some other things up my sleeve. Once I get them into presentable form I'll add them here - stay tuned.
Days of Linguistics
A paper I co-authored, about the dynamics of group and authority in Hebrew language communities on online editorial projects, is available online:
2018: Carmel Vaisman, Illan Gonen, and Yuval Pinter. Nonhuman Language Agents in Online Collaborative Communities: Comparing Hebrew Wikipedia and Facebook Translations. Discourse, Context & Media. Link.
This work began in 2011 when I contributed to a chapter on the topic in the book Hebrew Online by Carmel Vaisman and Illan Gonen. It's in Hebrew, and you can buy it here from the publisher.
Dr. Vaisman and I gave a talk about this topic at the 2012 meeting of the Israeli Association for the Study of Language and Society. Video.
My MA thesis from 2014, about the semantics of nonveridical BEFORE, is available here (PDF).
I presented parts of it at IGDAL (the first Israeli ling grad student conf) in 2011 with these slides that go right-to-left in each row. It's not a good summary of the completed thesis though.
In 2010 I presented a modification for graded modality comparison. Slides. Seminar Paper this was based on.
Much smaller-scale explorations, for class projects and seminars: Hypercorrection in Hebrew conjunctions; Linguistic exploration of Text messages in Hebrew (Hebrew); Comparing usage of the Hebrew few/little (PPT, Hebrew); Can the acceptance rate of Hebrew neologisms be predicted using Optimality Theory? (tl;dr - NO.) (Hebrew).
To sum up, here's my Google Scholar page. My Erdős number is at most 4.