Learning Evaluation Metrics    NSF    Gatech

Project Information


Evaluation metrics are fundamental for comparing the performance of information retrieval (IR) systems as well as for learning the ranking functions of those systems. This proposal is concerned with a principled approach to learning evaluation metrics from judgment data and user behavior data. We focus on a widely used IR evaluation metric, Discounted Cumulative Gain (DCG). We treat it as a utility function that reflects the degree of user satisfaction, and explore machine learning methods for learning the metric from such data. Specifically, we investigate three main research topics:
  1. Developing optimization methods for learning DCG that can incorporate the degree of difference in pairwise comparisons of ranked lists;
  2. Developing machine learning methods that can learn DCG for the more realistic scenarios where the relevance grades are not readily available; and
  3. Inferring user search goals from user-system interaction data.

Technical Details

We advocate the idea that evaluation metrics for IR systems should be designed in a data-driven fashion so that they better reflect actual user satisfaction with the IR system. As a concrete example of this philosophy, we focus on learning the parameters (gain values and discount factors) used in (n)DCG, a popular IR evaluation metric. Unlike many previous approaches, we propose to learn those parameters from side-by-side comparison data that indicate how strongly users prefer one ranking of a search result set over an alternative ranking of the same result set.
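To make the idea concrete, here is a minimal Python sketch of learning DCG gain values from side-by-side comparisons. It is an illustration only, not the project's actual method. It assumes that the discount factors are held fixed at the common 1/log2(rank + 1) default, that each comparison supplies the relevance grades of the preferred ranking, the grades of the alternative ranking, and a positive degree of preference, and that a simple hinge-loss subgradient update is an acceptable stand-in for the optimization methods under study. All function names (dcg, standard_discounts, grade_features, learn_gains) and the toy data are hypothetical.

```python
import math


def dcg(grades, gains, discounts):
    """DCG of a ranked list: sum over positions k of discounts[k] * gains[grade at k]."""
    return sum(discounts[k] * gains[g] for k, g in enumerate(grades))


def standard_discounts(depth):
    """Common default discount 1 / log2(rank + 1), with ranks starting at 1."""
    return [1.0 / math.log2(k + 2) for k in range(depth)]


def grade_features(grades, discounts, num_grades):
    """With discounts fixed, DCG is linear in the gains: DCG = gains . phi,
    where phi[j] is the total discount at positions holding grade j."""
    phi = [0.0] * num_grades
    for k, g in enumerate(grades):
        phi[g] += discounts[k]
    return phi


def learn_gains(comparisons, discounts, num_grades, lr=0.1, epochs=200):
    """Assumed training rule: subgradient steps on the hinge loss
    max(0, degree - (DCG(preferred) - DCG(other))), so that a stronger
    stated preference demands a larger DCG difference."""
    gains = [float(j) for j in range(num_grades)]  # start from gain = grade
    for _ in range(epochs):
        for preferred, other, degree in comparisons:
            phi_p = grade_features(preferred, discounts, num_grades)
            phi_o = grade_features(other, discounts, num_grades)
            diff = [a - b for a, b in zip(phi_p, phi_o)]
            margin = sum(w * d for w, d in zip(gains, diff))
            if margin < degree:  # preference not yet satisfied strongly enough
                gains = [w + lr * d for w, d in zip(gains, diff)]
    return gains


# Toy comparisons (made up for illustration): grades are 0, 1, or 2,
# and each tuple is (preferred ranking, alternative ranking, degree).
discounts = standard_discounts(3)
comparisons = [
    ([2, 1, 0], [0, 1, 2], 2.0),
    ([1, 0, 0], [0, 0, 1], 0.5),
    ([2, 0, 0], [1, 1, 0], 1.0),
]
gains = learn_gains(comparisons, discounts, num_grades=3)
print("learned gains:", gains)
```

Fixing the discounts keeps the problem linear in the gains. Learning gains and discounts jointly makes the objective bilinear, which calls for a different optimization strategy than this sketch uses.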


Broader Impact

The project has a number of broader impacts. Research results are expected to provide foundations for further research in evaluation metrics. Active collaborations with industry leaders in Web search will enable the resulting methods to improve the performance of search engines and other large IR systems in practice. Improving the quality of search results will have a significant impact on satisfying people's information needs and on their quality of life in general. The research topics lie at the interface between information retrieval and machine learning applications, which provides an ideal setting for training undergraduate and graduate students in the emerging interdisciplinary field of Web science and engineering.

Search engine companies can benefit from tuning their evaluation metrics using the research supported by the grant. This can be done in two ways: 1) the learned metrics can be used as objective functions to tune the ranking functions; and 2) the learned metrics can be used for production decisions.


This material is based upon work supported by the National Science Foundation under Grant No. 1049694.