Learning Evaluation Metrics
Evaluation metrics are fundamental for comparing the performance of information retrieval (IR) systems as well as for learning the ranking functions of those systems. This proposal is concerned with a principled approach to learning evaluation metrics from judgment data and user behavior data.
We focus on a widely used IR evaluation metric, the Discounted Cumulated Gain (DCG). We treat it as a utility function that reflects the degree of user satisfaction, and explore machine learning methods for learning the metric from judgment data and user behavior data. Specifically, we investigate three main research topics:
- Developing optimization methods for learning DCG that can incorporate the degree of difference in pairwise comparison of ranking lists;
- Developing machine learning methods that can learn DCG for the more realistic scenarios where the relevance grades are not readily available; and
- Inferring user search goals from user-system interaction data.
We advocate the idea that evaluation metrics for IR systems should be designed in a data-driven fashion so that they better reflect actual user satisfaction with the IR system. As a concrete example of this philosophy, we focus on learning the parameters (gain values and discount factors) used in (n)DCG, a popular IR evaluation metric. Unlike many previous approaches, we propose to learn those parameters from side-by-side comparison data that indicate how strongly users prefer one ranking of a search result set over an alternative ranking of the same result set.
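To make the role of the two parameter sets concrete, the following is a minimal sketch, not the method of the publications listed here. It assumes the standard form of DCG, in which the score of a ranked list is the sum over positions of the gain for the relevance grade at that position times the discount for that position. The defaults (gain 2^grade - 1, discount 1/log2(position + 1)) are the commonly used hand-picked values that a learned metric would replace. The function `preference_hinge_loss` and its use of the degree of preference as a margin are illustrative assumptions about how side-by-side comparison data might enter a learning objective.

```python
import math

def dcg(grades, gains, discounts):
    """DCG of a ranked list: sum of gains[grade] * discounts[position]."""
    return sum(gains[g] * discounts[i] for i, g in enumerate(grades))

def default_gains(num_grades):
    """Commonly used hand-picked gains: 2^grade - 1."""
    return [2 ** g - 1 for g in range(num_grades)]

def default_discounts(depth):
    """Commonly used hand-picked discounts: 1 / log2(position + 1), positions from 1."""
    return [1.0 / math.log2(i + 2) for i in range(depth)]

def preference_hinge_loss(preferred, other, degree, gains, discounts):
    """Hypothetical loss: zero when the preferred ranking outscores the
    other one by at least `degree`, and grows linearly otherwise."""
    margin = dcg(preferred, gains, discounts) - dcg(other, gains, discounts)
    return max(0.0, degree - margin)

# Two rankings of the same three results, given as relevance grades by position.
ranking_a = [2, 1, 0]   # best result first
ranking_b = [0, 1, 2]   # best result last

gains = default_gains(3)
discounts = default_discounts(3)

print(dcg(ranking_a, gains, discounts))
print(dcg(ranking_b, gains, discounts))
print(preference_hinge_loss(ranking_a, ranking_b, 1.0, gains, discounts))
```

In a learning setting, `gains` and `discounts` would be the unknowns, and a loss of this kind would be summed over many compared pairs and minimized. Because DCG is linear in the gains for fixed discounts, and linear in the discounts for fixed gains, such an objective is convenient to optimize in alternating steps.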
- K. Zhou, Hongyuan Zha, G. Xue, F. Li and X. Li. Learning the Gain Values and Discount Factors in DCG. To appear in IEEE Transactions on Knowledge Discovery and Data Engineering, 2013.
- Z. Lu, X. Yang, W. Lin, Hongyuan Zha and X. Chen. Inferring User Image-Search Goals under the Implicit Guidance of Users. To appear in IEEE Transactions on Circuits and Systems for Video Technology.
- Z. Lu, Hongyuan Zha, X. Yang, W. Lin and Z. Zheng. A New Algorithm for Inferring User Search Goals with Feedback Sessions. IEEE Transactions on Knowledge Discovery and Data Engineering 25(3): 502-513, 2013.
- S.H. Yang, A. Smola, B. Long, Hongyuan Zha and Y. Chang. Friend or Frenemy? Predicting Signed Ties in Social Networks. SIGIR, 2012.
- K. Zhou and Hongyuan Zha. Learning Binary Codes for Collaborative Filtering. SIGKDD, 2012.
- S.H. Yang, S. Crain and Hongyuan Zha. Bridging the Language Gap: Topic-Level Adaptation for Cross-Domain Knowledge Transfer. AISTATS, 2011.
The project has a number of broad impacts. Research results are expected to provide foundations for further research in evaluation metrics. Active collaborations with industry leaders in Web search will enable the resulting methods to have real impact on the performance of search engines and other large IR systems. Improving the quality of search results will have a significant impact on satisfying people's information needs and on their quality of life in general. The set of research topics lies at the interface between information retrieval and machine learning applications, and it provides an ideal setting for training undergraduate and graduate students in the emerging interdisciplinary field of Web science and engineering research.
Search engine companies can benefit from tuning their evaluation metrics using the research supported by the grant. This can be done in two different ways: 1) the learned metrics can be used as objective functions to tune the ranking functions; and 2) the learned metrics can be used for production decisions.
This material is based upon work supported by the National Science Foundation under Grant No. 1049694.