Web Service Discovery:
The past few years have witnessed great strides in the accessibility and manageability of vast amounts of Web data. In particular, the widespread adoption of general purpose search engines like Google and AllTheWeb has added a layer of organization to an otherwise unwieldy medium. Typically, a search engine is optimized to identify a ranked list of Web pages relevant to a user query. This page-centric view of the Web has proven immensely successful.
But with the rise of high-quality, data-intensive web services on the so-called Deep Web (or Hidden Web) and the emergence of the web services paradigm, there is a growing demand for a new class of queries optimized not at the page level but at the more general service level. Rather than requesting the top-ranked Web pages containing a certain keyword, say "autism", a user may be more interested in service-centric queries. For example, a user familiar with PubMed may want to pose queries such as "What data services offer similar content to PubMed?", "What data sources are more specialized (or more general) than PubMed?", or "Find other BLAST data services that complement NCBI BLAST's coverage."
This project focuses on a novel source-biased approach to automatically discovering and ranking relevant data-intensive web services. It supports a service-centric view of the Web through source-biased probing and through source-biased relevance detection and ranking metrics. Concretely, our approach can answer source-centric queries of the form posed above by focusing on the nature and degree of the topical relevance of one service to others. By sending the target a small number of very focused probes, source-biased probing lets us determine in very few interactions whether a target service is relevant to the source. We introduce the biased focus metric to discover highly relevant data services, to measure relevance between services, and to rank target services, giving higher rank to those with higher affinity to the source of bias. We demonstrate how a simple implementation of our biased-probing technique outperforms alternative methods.
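To make the idea of probing and ranking concrete, the following is a minimal sketch in Python, not the project's implementation. It makes several assumptions that the description above does not state: probes are taken to be the most frequent terms in a sample of source documents, each service is modeled as an in-memory list of documents with a keyword search, and the focus score is simply the fraction of probes that return at least one hit. All function names (top_terms, make_search, biased_focus, rank_targets) are hypothetical.

```python
from collections import Counter


def top_terms(docs, k):
    """Pick the k most frequent terms in the source sample (assumed probe choice)."""
    counts = Counter(word for doc in docs for word in doc.lower().split())
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ordered[:k]]


def make_search(docs):
    """Model a service as a keyword search over an in-memory document list."""
    def search(term):
        return [d for d in docs if term in d.lower().split()]
    return search


def biased_focus(source_docs, target_search, k=3):
    """Assumed focus score: share of source-derived probes that hit the target."""
    probes = top_terms(source_docs, k)
    if not probes:
        return 0.0
    hits = sum(1 for term in probes if target_search(term))
    return hits / len(probes)


def rank_targets(source_docs, targets, k=3):
    """Rank target services by descending focus score, then by name."""
    scores = {
        name: biased_focus(source_docs, make_search(docs), k)
        for name, docs in targets.items()
    }
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

In this sketch each target receives only k probes, which mirrors the goal of judging relevance in very few interactions. A real metric would likely weight probe terms and compare result content rather than count hits.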
This research is partially supported by NSF CNS, NSF CCR, NSF ITR, DoE SciDAC, DARPA, CERCS Research Grant, IBM Faculty Award, IBM SUR grant, HP Equipment Grant, and LLNL LDRD.
Any opinions, findings, and conclusions or recommendations expressed in the project material are those of the authors and do not necessarily reflect the views of the sponsors.