SANGEETHA SESHADRI

RESEARCH INTERESTS
Storage systems, distributed middleware overlay systems, distributed data stream
systems, especially techniques and architectures for improving availability,
performance and scalability of these systems.

PhD in Computer Science
(expected graduation: Jan 2009) Fall 2004 onwards, Atlanta, GA
M.Sc (Hons) Mathematics, Birla Institute of Technology and Science (BITS) (GPA: 3.61/4.00)
(graduated: 2002) 1997-2002, Pilani, India
Bachelor of Engineering (Honors) in Computer Science, Birla Institute of Technology and Science (BITS) (GPA: 3.61/4.00)
(graduated: 2002) 1997-2002, Pilani, India

Research Assistant, College of Computing, Georgia Tech
Atlanta, GA
- Architectures and techniques for improving availability of large scale storage systems and services (2005-2008).
- Optimization techniques for large scale distributed stream query processing services (2005-2007).
- Information retrieval based routing techniques in peer-to-peer systems for searching the deep web (2004-2005).

Summer Intern, IBM Almaden Research Center
San Jose, CA
- Architectures for flexible and scalable recovery and state restoration in storage software (Summer 2008).
- Enhancing storage system availability on multi-core platforms through recovery-conscious scheduling (Summer 2007).
- Improving availability and reliability in scale-out storage systems using hierarchical architectures (Summer 2006).
- Analyzed clustering models for high-availability in scale-out storage systems (Summer 2005).

Senior Applications Engineer, Oracle Corporation, IDC
Hyderabad, India
- Member of the Oracle Exchange development team, responsible for the online Catalog, Spot Purchase and XML transactions modules of this B2B product (2002-2004).

Intern, Motorola Inc.
Bangalore, India
- Involved in the design and coding of the Resource parser for ENTITE (ENhanced Tool for Integrated Test Execution) (Jan-Jun 2002).

JOURNAL PUBLICATIONS
[1] Sangeetha Seshadri, Vibhore Kumar, Brian F. Cooper and Ling Liu. A Distributed Stream Query Optimization Framework Through Integrated Planning and Deployment. To appear in the IEEE Trans. On Parallel and Distributed Systems (TPDS).
[2] Sangeetha Seshadri, Ling Liu and Lawrence Chiu. Recovery scopes, recovery groups, and fine-grained recovery in enterprise storage controllers with multi-core processors. To appear in the IBM Systems Journal.
[3] Sangeetha Seshadri, Brian F. Cooper. Routing Queries through a Peer-to-Peer InfoBeacons Network Using Information Retrieval Techniques. In IEEE Trans. On Parallel and Distributed Systems (TPDS). 18(12): 1754-1765, 2007.
CONFERENCE PUBLICATIONS
[4] Sangeetha Seshadri, Lawrence Chiu and Ling Liu.
A Systematic Approach to System State Restoration during Storage Controller
Micro-Recovery. To appear in FAST 2009.
[5] Sangeetha Seshadri, Bhuvan Bamba, Brian F. Cooper, Vibhore Kumar,
Ling Liu, Karsten Schwan and Gong Zhang (in alphabetical order), "Grouping
Distributed Stream Query Services by Operator Similarity and Network Locality",
To Appear In the Proceedings of IEEE Congress on Services 2008 (SCC 2008),
Honolulu, Hawaii, USA, July 2008.
[6] Bhuvan Bamba, Sangeetha Seshadri and Ling Liu, "Scaling Location-based
Services with Dynamically Composed Location Index", To Appear In the Proceedings
of IEEE International Conference on Services Computing (SCC 2008), Honolulu,
Hawaii, USA, July 2008.
[7] Sangeetha Seshadri, Lawrence Chiu, Cornel Constantinescu, Subashini
Balachandran, Clem Dickey and Ling Liu. Enhancing Storage System Availability on
Multi-Core Architectures with Recovery-Conscious Scheduling. To Appear in the
6th USENIX Conference on File and Storage Technologies (FAST'08).
[8] Sangeetha Seshadri, Lawrence Chiu, Karan Gupta, Paul Muench, Ling Liu
and Brian F. Cooper. A Fault-Tolerant Middleware Architecture for
High-Availability Storage Services. IEEE International Conference on Services
Computing (SCC) 2007.
[9] Sangeetha Seshadri, Vibhore Kumar, Brian F. Cooper and Ling Liu.
Optimizing Multiple Queries in Distributed Stream Systems Using Hierarchical
Network Partitions. IEEE International Parallel & Distributed Processing
Symposium (IPDPS) 2007.
TECHNICAL REPORTS, WORKSHOP PAPERS AND DEMOS
[10] Sangeetha Seshadri, Vibhore Kumar and Brian F. Cooper. Optimizing Multiple Queries in Distributed Data Stream Systems. 2nd IEEE International Workshop on Networking Meets Database (NetDB), in conjunction with ICDE 2006.
[11] Vibhore Kumar, Brian F. Cooper, Greg Eisenhauer, Srihari Govindharaj, Chaitanya Karlekar, Mohamed Mansour, Karsten Schwan, Sangeetha Seshadri, Balasubramanian Seshasayee. Policy-Driven Autonomic Management in Enterprise-Scale Information Flows. 4th IEEE International Conference on Autonomic Computing ICAC-2007. (Demo)
[12] Sangeetha Seshadri, Brian F. Cooper, Ling Liu. CubeCache: Efficient and Scalable Processing of OLAP Aggregation Queries in a Peer-to-Peer Network. CERCS Technical Report. GIT-CERCS-07-12, 2005.
[13] Sangeetha Seshadri, Paul Muench, Lawrence Chiu and Karan Gupta. Cluster Models – An analysis of availability, complexity and scalability Trade-offs. Intern Report, IBM Almaden Research Center, August 2005.
UNDER SUBMISSION
[14] Sangeetha Seshadri, Brian Cooper, Vibhore Kumar, Ling Liu and Karsten Schwan. Scaling Distributed Stream Query Services with STREAMREUSE. In preparation.
[15] Gong Zhang, Ling Liu, Sangeetha Seshadri. Scaling Reliable Location-based Overlay Service. Under submission.
[16] Sangeetha Seshadri, Lawrence Chiu, Cornel Constantinescu, Subashini Balachandran, Clem Dickey, Ling Liu and Paul Muench. A Recovery-Conscious Framework for Fault Resilient Storage Systems. In preparation.
PATENTS
[17] "Improving Performance for Read Requests using Race-based Reads, Inverted Read Paths and Adaptive Read Request Routing." V. Hsu, S. Seshadri, L. Chiu. Disclosure submitted June 2008. Patent in progress.
[18] "Log(lock) Architecture for System State Restore during Thread-level Micro-Recovery." S. Seshadri and L. Chiu. Disclosure submitted August 2008. Patent in progress.
RESEARCH PROJECTS
Architectures and Techniques to Improve Availability in Large Scale Storage Systems (Aug 2005 onwards)
Optimization Techniques for Large Scale Distributed Stream Query Processing Services (Aug 2005 onwards)
This project addresses the problem of optimizing multiple distributed stream queries that execute
simultaneously in distributed data stream systems. A static "plan, then deploy" approach to query optimization
is inadequate for distributed queries that involve multiple streams and for the node dynamics found in distributed
data stream systems and applications. Selecting an optimal execution plan in such dynamic, networked
computing systems must therefore consider operator ordering, reuse, network placement, and search space reduction. How do we
quickly choose efficient plans from the large space of possibilities while taking into consideration both
network and processing costs?
We further observe that stream queries are typically processed by a set of collaborating nodes and
often share similar stream filters (such as stream selection or stream projection filters). The ability to
reuse existing operators during query deployment, especially for long-running queries, is critical to the performance and
scalability of a distributed stream query processing service. Concretely, we argue that by reusing
the same distributed operators across multiple concurrent queries and intelligently
consolidating operator computation across those queries, we can reduce the cost of query deployment and minimize
duplicated in-network processing. The technical challenges of reuse in streaming systems include dealing with large
and time-varying workloads, dynamically exploiting similarities between queries, and applying network
knowledge at runtime. We believe that an effective reuse approach to providing high performance and high scalability for distributed
stream query services should make reuse decisions that are aware of both network locality and the operator semantics of
stream queries.
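The reuse idea described above can be illustrated with a minimal sketch. It is not the project's implementation: the `Deployment` class, the string operator signatures, the `hops` distance function and the `max_distance` threshold are all hypothetical names chosen for illustration. The sketch reuses an already running operator only when it is semantically identical (same signature) and close enough in the network to the node where the new query would otherwise place it.

```python
from dataclasses import dataclass, field


@dataclass
class Deployment:
    # (operator signature, node) pairs already running in the network
    running: set = field(default_factory=set)

    def deploy(self, query_ops, preferred_node, distance, max_distance=1):
        """Place each operator, reusing a nearby identical operator when possible.

        Returns a list of (operator, node, reused) tuples.
        """
        placement = []
        for op in query_ops:
            # Candidates run the same operator (semantic match) and lie within
            # max_distance of the preferred node (network locality).
            candidates = [
                node for (sig, node) in self.running
                if sig == op and distance(preferred_node, node) <= max_distance
            ]
            if candidates:
                node = min(candidates, key=lambda n: (distance(preferred_node, n), n))
                placement.append((op, node, True))
            else:
                # No suitable operator exists yet, so deploy a new one.
                self.running.add((op, preferred_node))
                placement.append((op, preferred_node, False))
        return placement


def hops(a, b):
    """Hypothetical distance: nodes are integers on a line."""
    return abs(a - b)


d = Deployment()
first = d.deploy(["filter(price>100)", "project(symbol)"], preferred_node=3, distance=hops)
second = d.deploy(["filter(price>100)", "join(trades,quotes)"], preferred_node=4, distance=hops)
print(second)
```

In this toy run the second query reuses the filter already running one hop away at node 3 and deploys only its join operator. A real optimizer would also weigh operator ordering and processing cost, which the sketch ignores.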
InfoBeacons: Guiding Users to Internet Information Sources
(Aug 2004 - Aug 2005)
The Internet provides a wealth of useful information in a vast number
of dynamic information sources, but it is difficult to determine which sources
are useful for a given query. Most existing techniques either require explicit
source cooperation (for example, by exporting data summaries), or build a
relatively static source characterization (for example, by assigning a topic to
the source). We present a system called InfoBeacons that takes a different
approach: data and sources are left 'as is', and a peer-to-peer network of
beacons uses past query results to guide queries to sources, which do the actual
query processing. This approach has several advantages, including requiring
minimal changes to sources, tolerance of dynamism and heterogeneity, and the
ability to scale to large numbers of sources. We present the architecture of the
system, and discuss the advantages of our design. We then focus on how a beacon
can choose good sources for a query despite the loose coupling of beacons to
sources. Beacons cache responses to previous queries and adapt the cache to
changes at the source. The cache is then used to select good sources for future
queries. We discuss results from a detailed experimental study using our beacon
prototype which demonstrates that our loosely coupled approach is effective; a
beacon only has to contact sixty percent or less of the sources contacted by
existing, tightly coupled approaches, while providing results of equivalent or
better relevance to queries.
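As a rough illustration of how a beacon might use cached responses to pick sources, consider the sketch below. It is an assumption-laden toy, not the InfoBeacons ranking method: the `Beacon` class, its method names and the simple term-count score are hypothetical, and cache adaptation to source changes is omitted.

```python
from collections import Counter, defaultdict


class Beacon:
    """Toy beacon: remembers which terms past results from each source contained."""

    def __init__(self):
        # source name -> Counter of terms seen in that source's past results
        self.cache = defaultdict(Counter)

    def record(self, source, result_text):
        """Cache the terms of a result returned by `source` for an earlier query."""
        self.cache[source].update(result_text.lower().split())

    def rank_sources(self, query, top_k=2):
        """Return up to `top_k` sources whose cached results best match the query."""
        terms = query.lower().split()
        scores = {
            source: sum(counts[t] for t in terms)
            for source, counts in self.cache.items()
        }
        # Highest score first; ties broken by source name for determinism.
        ranked = sorted(scores, key=lambda s: (-scores[s], s))
        return [s for s in ranked if scores[s] > 0][:top_k]


beacon = Beacon()
beacon.record("news-site", "storage controller recovery news")
beacon.record("recipe-site", "pasta recipe with tomato")
beacon.record("systems-blog", "storage systems recovery and availability")
print(beacon.rank_sources("storage recovery"))
```

Only the sources whose cached results mention the query terms are contacted, so the unrelated source is skipped without requiring any summary or cooperation from it.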
Clustering Models for High-Availability in Scale-out Storage Systems (May
2005 - Aug 2005)
Worked on the design and analysis of clustering
models for high-availability in scale-out storage systems. Analyzed flat and
hierarchical clustering models focusing on system availability, complexity and
scalability. Clustering is an obvious solution for maximizing availability and
performance and balancing load in storage systems. Most often, high availability
and reliability are achieved through redundancy at software, hardware and network
levels. However, as system components increase only linearly, the system's state
space increases exponentially, thereby making it very difficult to test and
verify the system. Potentially, these systems are more prone to failures arising
from software defects that escape undetected during testing and are also
difficult to reproduce. In this project we investigate the trade-offs between
system complexity and availability from the perspective of clustered storage
systems. We evaluate designs and policies that help us maximize availability
while minimizing the number of system-states.
CubeCache: Semantic Caching in Peer-to-Peer OLAP Networks (Aug 2004
onwards)
Peer-to-peer systems are an area of much recent
activity due to their cost-effectiveness, scalability and ability to distribute
the overhead of sharing and storing data and performing computations. These
characteristics make such systems ideal for OLAP query processing. In this
project, we describe the framework for a system that utilizes a peer-to-peer
network to efficiently process OLAP queries. Specifically, we consider the
problem of searching for data in such a network and present techniques to locate
data and perform query processing with load balancing.
PROFESSIONAL ACTIVITIES/AWARDS
- IBM Ph.D Scholarship 2007-2008
- Recipient of the K.C.Mahindra scholarship for doctoral studies
- Stood 3rd in the class of M.Sc Mathematics
- Awarded University Scholarship for all semesters of study
- Coordinator, Department of Mathematics, Apogee 2000
- External Reviewer (VLDB 2006; ICDE 2006; SIGMOD 2006, 2007; MobiQuitous 2007; IBM Systems Journal 2007; ACM TOIT
2008; PPNA 2008; ICDCS 2008)
REFERENCES
Available on request.