RESEARCH PROJECTS
Architectures and Techniques to Improve Availability in Large Scale Storage Systems
(Aug 2005 onwards)
Enterprise storage systems are the foundations of most data centers today and extremely high
availability is expected as a basic requirement from these systems. With rapid and exponential
growth of digital information and the increasing popularity of multi-core architectures, the
demand for large scale storage systems of extremely high availability (moving close to 7 nines)
continues to grow. On the other hand, embedded storage software systems (controllers) are
becoming much more complex and difficult to test, especially given concurrent development and
quality assurance processes and the fact that legacy systems are being adapted to newer hardware.
With software failures and bugs becoming an accepted fact, focusing on recovery and reducing time
to recovery has become essential in many modern storage systems. In current system architectures,
even with redundant controllers, most microcode failures trigger system-wide recovery causing the
system to lose availability for at least a few seconds, and then wait for higher layers to redrive
the operation. This unavailability is visible to customers as a service outage and will only increase
as the platform continues to grow using the legacy architecture. How can we improve the availability
of a highly concurrent storage controller and scale the recovery process without re-architecting legacy code?
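To put the "7 nines" target in perspective, the arithmetic below converts an availability level, expressed as a number of nines, into the downtime it permits per year. This is only an illustrative sketch: the function name and the 365-day year are assumptions, not part of the project.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day year


def allowed_downtime_seconds(nines: int) -> float:
    """Return the yearly downtime budget for an availability of `nines` nines."""
    unavailability = 10.0 ** (-nines)  # e.g. 7 nines -> 1e-7
    return SECONDS_PER_YEAR * unavailability


for n in (3, 5, 7):
    print(f"{n} nines: {allowed_downtime_seconds(n):.2f} seconds per year")
```

Under these assumptions, seven nines leaves roughly three seconds of downtime per year, so a single system-wide recovery lasting a few seconds would consume the entire yearly budget.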
Optimization Techniques for Large Scale Distributed Stream Query Processing Services
(Aug 2005 onwards)
This project addresses the problem of optimizing multiple distributed stream queries that are executing
simultaneously in distributed data stream systems. A static query optimization approach of "plan, then deployment"
is inadequate for handling distributed queries involving multiple streams and node dynamics faced in distributed
data stream systems and applications. Thus, the selection of an optimal execution plan in such dynamic and networked
computing systems must consider operator ordering, reuse, network placement, and search space reduction. How do we
quickly choose efficient plans from the large space of possibilities while taking into consideration both
network and processing costs?
Next, it is observed that stream queries are typically processed by a selection of collaborative nodes and
often share similar stream filters (such as stream selection or stream projection filters). The ability to
reuse existing operators during query deployment, especially for long running queries, is critical to the performance and
scalability of a distributed stream query processing service. Concretely, we argue that by taking advantage of
opportunities to reuse the same distributed operators for multiple and different concurrent queries and intelligently
consolidate operator computation across multiple queries, we can reduce the cost of query deployment and minimize
duplicated in-network processing. The technical challenges of reuse in streaming systems include dealing with large
and time-varying workloads, dynamically exploiting similarities between queries, and applying network knowledge
at runtime. We believe that an effective reuse approach to providing high performance and high scalability for distributed
stream query services should embody both network locality awareness and operator semantic awareness of
stream queries in reuse decisions.
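The idea of combining operator semantic awareness with network locality awareness can be sketched as follows. This is a hypothetical illustration, not the project's actual algorithm: the names Operator, ReuseRegistry, place and max_reuse_distance are assumptions, semantic equivalence is reduced to exact matching of a canonical filter description, and only one running instance per operator is tracked.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Operator:
    kind: str       # e.g. "select" or "project"
    stream: str     # name of the input stream
    predicate: str  # canonical text of the filter (assumed already normalized)


@dataclass
class ReuseRegistry:
    distance: Callable[[str, str], float]  # network cost between two nodes
    max_reuse_distance: float              # locality threshold for reuse
    running: Dict[Operator, str] = field(default_factory=dict)

    def place(self, op: Operator, consumer: str) -> Tuple[str, bool]:
        """Return (host node, reused?) for operator `op` needed at `consumer`."""
        host = self.running.get(op)
        # Reuse only if an identical operator exists AND it is close enough.
        if host is not None and self.distance(host, consumer) <= self.max_reuse_distance:
            return host, True
        # Otherwise deploy a new instance at the consumer and remember it.
        self.running[op] = consumer
        return consumer, False


# Hypothetical three-node network with hop counts as the distance metric.
hops = {("n1", "n2"): 1, ("n1", "n9"): 8}


def hop_distance(a: str, b: str) -> float:
    if a == b:
        return 0
    return hops.get((a, b), hops.get((b, a), float("inf")))


registry = ReuseRegistry(distance=hop_distance, max_reuse_distance=2)
price_filter = Operator("select", "trades", "price > 100")
print(registry.place(price_filter, "n1"))  # first deployment
print(registry.place(price_filter, "n2"))  # nearby query reuses the operator on n1
print(registry.place(price_filter, "n9"))  # distant query gets its own instance
```

The distance threshold stands in for the network cost model; a fuller treatment would weigh the saved processing against the added transfer cost rather than apply a fixed cutoff.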
InfoBeacons: Guiding Users to Internet Information Sources
(Aug 2004 - Aug 2005)
The Internet provides a wealth of useful information in a vast number
of dynamic information sources, but it is difficult to determine which sources
are useful for a given query. Most existing techniques either require explicit
source cooperation (for example, by exporting data summaries), or build a
relatively static source characterization (for example, by assigning a topic to
the source). We present a system called InfoBeacons that takes a different
approach: data and sources are left 'as is', and a peer-to-peer network of
beacons uses past query results to guide queries to sources, which do the actual
query processing. This approach has several advantages, including requiring
minimal changes to sources, tolerance of dynamism and heterogeneity, and the
ability to scale to large numbers of sources. We present the architecture of the
system, and discuss the advantages of our design. We then focus on how a beacon
can choose good sources for a query despite the loose coupling of beacons to
sources. Beacons cache responses to previous queries and adapt the cache to
changes at the source. The cache is then used to select good sources for future
queries. We discuss results from a detailed experimental study using our beacon
prototype, which demonstrate that our loosely coupled approach is effective: a
beacon only has to contact sixty percent or less of the sources contacted by
existing, tightly coupled approaches, while providing results of equivalent or
better relevance to queries.
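A minimal sketch of how a beacon might use cached responses to pick sources is shown below. The Beacon class, its method names and the term-count scoring are assumptions made for illustration and are not the ranking function used by InfoBeacons; adapting the cache to changes at the source is not modeled here.

```python
from collections import defaultdict


class Beacon:
    def __init__(self):
        # source -> term -> number of cached results from that source containing the term
        self.cache = defaultdict(lambda: defaultdict(int))

    def record(self, source, result_text):
        """Cache the terms of one result returned by `source` for an earlier query."""
        for term in set(result_text.lower().split()):
            self.cache[source][term] += 1

    def rank_sources(self, query, top_k):
        """Return up to `top_k` sources whose cached results best match the query terms."""
        terms = query.lower().split()
        scores = {
            source: sum(counts.get(term, 0) for term in terms)
            for source, counts in self.cache.items()
        }
        # Keep only sources with some evidence; break ties by name for determinism.
        ranked = sorted((s for s in scores if scores[s] > 0), key=lambda s: (-scores[s], s))
        return ranked[:top_k]


beacon = Beacon()
beacon.record("weather.example", "rain forecast seattle")
beacon.record("news.example", "seattle election results")
beacon.record("weather.example", "snow forecast denver")
print(beacon.rank_sources("seattle forecast", top_k=1))
```

Because the beacon forwards the query only to the top-ranked sources, it contacts fewer sources than an approach that queries all of them, and the sources themselves need no changes.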
Clustering Models for High-Availability in Scale-out Storage Systems
(May 2005 - Aug 2005)
Worked on the design and analysis of clustering
models for high-availability in scale-out storage systems. Analyzed flat and
hierarchical clustering models focusing on system availability, complexity and
scalability. Clustering is an obvious solution for maximizing availability and
performance and balancing load in storage systems. Most often, high availability
and reliability are achieved through redundancy at software, hardware and network
levels. However, as system components increase only linearly, the system's state
space increases exponentially, thereby making it very difficult to test and
verify the system. Potentially, these systems are more prone to failures arising
from software defects that escape undetected during testing and are also
difficult to reproduce. In this project we investigate the trade-offs between
system complexity and availability from the perspective of clustered storage
systems. We evaluate designs and policies that help us maximize availability
while minimizing the number of system states.
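The exponential growth of the state space can be illustrated with a short calculation. It assumes, purely for illustration, that every component is independent and has the same small number of states (for example, up or failed); real systems have correlated and richer states.

```python
def state_space_size(components: int, states_per_component: int = 2) -> int:
    """Number of joint system states for independent, identical components."""
    return states_per_component ** components


# Adding components linearly multiplies the number of joint states.
for n in (4, 8, 16, 32):
    print(f"{n:2d} components -> {state_space_size(n)} states")
```

Doubling the component count squares the number of joint states under this model, which is why exhaustive testing quickly becomes impractical.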
CubeCache: Semantic Caching in Peer-to-Peer OLAP Networks
(Aug 2004 onwards)
Peer-to-peer systems are an area of much recent
activity due to their cost-effectiveness, scalability and ability to distribute
the overhead of sharing and storing data and performing computations. These
characteristics make such systems ideal for OLAP query processing. In this
project, we describe the framework for a system that utilizes a peer-to-peer
network to efficiently process OLAP queries. Specifically, we consider the
problem of searching for data in such a network and present techniques to locate
data and perform query processing with load balancing.
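One common form of semantic caching for OLAP is to answer a coarse group-by query from a finer-grained aggregate that some peer has already cached. The sketch below illustrates that idea for SUM aggregates. It is an assumption about how such a cache could work, not a description of CubeCache itself; the functions find_peer and roll_up and the peer data are hypothetical, and load balancing is not modeled.

```python
from collections import defaultdict


def find_peer(peers, query_dims):
    """Return the peer holding the smallest cached cuboid that covers query_dims, or None."""
    candidates = [
        (len(rows), name)
        for name, (dims, rows) in peers.items()
        if set(query_dims) <= set(dims)
    ]
    return min(candidates)[1] if candidates else None


def roll_up(cached_dims, cached_rows, query_dims):
    """Aggregate a cached SUM cuboid down to the coarser group-by query_dims."""
    positions = [cached_dims.index(d) for d in query_dims]
    totals = defaultdict(float)
    for key, value in cached_rows.items():
        totals[tuple(key[i] for i in positions)] += value
    return dict(totals)


# Hypothetical peers: name -> (group-by dimensions, {group key: SUM(sales)}).
peers = {
    "peerA": (("region", "year"), {("EU", 2004): 10.0, ("EU", 2005): 15.0, ("US", 2004): 7.0}),
    "peerB": (("product",), {("disk",): 32.0}),
}

query = ("region",)
owner = find_peer(peers, query)
dims, rows = peers[owner]
print(owner, roll_up(dims, rows, query))
```

Choosing the smallest covering cuboid keeps the roll-up cheap; this only works directly for distributive aggregates such as SUM or COUNT.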