Enterprise Computing

  1. Work with modern enterprise applications and address their need for high fault tolerance. Set up an enterprise configuration using the Apache/Tomcat/MySQL packages on a blade server platform. Next, implement an online failure prediction model for this configuration, based on an offline algorithm developed in our prior research. Finally, use the model, together with a modern failure injection tool, to better model application performance and reliability and to better direct request scheduling. Contact ztcai@cc for more information.
  2. A common benchmark in enterprise systems is the well-known Trade3 benchmark. However, it currently does not run on version 6 of IBM's WebSphere. WebSphere is IBM's commercial offering for enterprise programming and provides much more extensive functionality than open source counterparts such as JBoss. As part of this project, you will get to know WebSphere and benchmarks like Trade3, and you will carry out performance testing and benchmarking on modern enterprise blade servers. Contact mansour@cc for more information.
  3. Use WebSphere's ESB (Enterprise Service Bus) to build dynamic, policy-driven request routing. Two backend servers implement different versions of the same service. A client transparently sends a request to the bus, and the bus chooses the target server based on a policy and on current metrics for the target service. Develop novel scheduling policies and compare them to other approaches. Work with the Trade3 benchmark or a similar one, and with representative request traces, to evaluate the approach. Contact mansour@cc for more information.
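
As an illustration of the policy-driven routing in project 3, the following minimal Python sketch shows one way a routing decision could be separated from the policy that makes it. It does not use any WebSphere or ESB API; the names BackendMetrics, least_latency, least_loaded, and route, as well as the two metrics, are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class BackendMetrics:
    """Hypothetical per-backend measurements a bus might track."""
    name: str
    avg_latency_ms: float  # recent average response time
    in_flight: int         # requests currently outstanding


# A policy maps the current metrics of all backends to one backend name.
Policy = Callable[[List[BackendMetrics]], str]


def least_latency(backends: List[BackendMetrics]) -> str:
    """Pick the backend with the lowest recent average latency."""
    return min(backends, key=lambda b: b.avg_latency_ms).name


def least_loaded(backends: List[BackendMetrics]) -> str:
    """Pick the backend with the fewest outstanding requests."""
    return min(backends, key=lambda b: b.in_flight).name


# Policies are looked up by name so they can be swapped at run time.
POLICIES: Dict[str, Policy] = {
    "least_latency": least_latency,
    "least_loaded": least_loaded,
}


def route(policy_name: str, backends: List[BackendMetrics]) -> str:
    """Return the backend that should serve the next request."""
    return POLICIES[policy_name](backends)


if __name__ == "__main__":
    snapshot = [
        BackendMetrics("service-v1", avg_latency_ms=40.0, in_flight=12),
        BackendMetrics("service-v2", avg_latency_ms=55.0, in_flight=3),
    ]
    print(route("least_latency", snapshot))  # service-v1
    print(route("least_loaded", snapshot))   # service-v2
```

Keeping each policy as a plain function over a metrics snapshot makes it straightforward to replay the same request trace against several policies and compare the results.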

Virtualization

  1. The idea is to determine a good architecture for tapping into disk access in Xen. Specifically, imagine a monitor that sits between the backend virtual block device (vbd) driver and the actual disk driver. This monitor would see raw disk reads, writes, etc. The project would start with this data flow and build a useful, generic data abstraction on top of it. Ideally, the setup would be flexible enough to work with various filesystems, disk drivers, etc. This fits naturally into the domain of disk-based intrusion detection systems (see the related references below), but it could be used for other monitoring purposes as well (e.g., system management). Criteria for a good architecture include performance, flexibility (as defined above), and code simplicity.

    http://www.cs.wisc.edu/adsl/Publications/sds-per06.html

    http://www.pdl.cmu.edu/PDL-FTP/Secure/CMU-PDL-03-106_abs.html

    Contact: bryan@thepaynes.cc
  2. Performance benefits of guest state hints provided to the hypervisor. In this project, we look at a specific example, such as page table page state. The Xen hypervisor uses shadow page tables to run unmodified guest OSes. The management of these shadow pages, specifically the freeing of shadow pages, is done heuristically. Guest state hints should provide performance benefits. For more information, contact Himanshu Raj (rhim@cc).
  3. Add large page support to Xen. Due to certain restrictions required by writable page tables, current guest OSes cannot enjoy the benefits of large pages. One approach is to do away with large page support in a guest OS altogether, but the performance of certain applications, e.g., databases, is bound to suffer. In the preliminary work, we want to demonstrate the performance hit of not using large pages and add some rudimentary large page support specifically for applications. This project can be continued to support large pages for guest OSes. For more information, contact Himanshu Raj (rhim@cc).
  4. Running a single address space guest OS and its applications in a virtualized system. Different guests are already isolated from each other by the hypervisor, and hence we would like to quantify the performance benefits one can obtain by removing the protection between the guest OS and guest applications and running them in the same address space. For more information, contact Himanshu Raj (rhim@cc).
  5. Building efficient para-virtual guest VMs for new VT-enabled machines from Intel. For more information, contact Himanshu Raj (rhim@cc).
  6. Enhancing the performance of the virtual network interface device driver. This project would include Linux device driver development, network processor programming, and virtualization using Xen. For more details, contact Himanshu Raj (rhim@cc).
  7. Modern multicore platforms with virtualization technology make it possible to run multiple domains (guest OSes) on the same machine at the same time. Explore the possibility of constructing an RPC mechanism for communication between domains using the utilities provided by the Xen hypervisor (e.g., consider building on top of XenBus). Compare your solution with a traditional socket-based solution and, if possible, investigate the related domain scheduling issues. Contact rhim@cc.
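
For the disk-monitoring project (item 1 in the list above), the sketch below suggests what a generic abstraction over the raw request stream might look like. It is plain Python and does not touch Xen; the BlockEvent and DiskMonitor classes, the make_write_watch helper, and the 512-byte sector size are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

SECTOR_SIZE = 512  # bytes; assumed sector size for this sketch


@dataclass(frozen=True)
class BlockEvent:
    """One raw request as seen between the backend and the real disk driver."""
    op: str      # "read" or "write"
    sector: int  # first sector touched
    count: int   # number of sectors

    @property
    def byte_range(self) -> range:
        """Byte offsets on the device covered by this request."""
        start = self.sector * SECTOR_SIZE
        return range(start, start + self.count * SECTOR_SIZE)


Consumer = Callable[[BlockEvent], None]


class DiskMonitor:
    """Fans raw block events out to registered consumers (IDS, management)."""

    def __init__(self) -> None:
        self._consumers: List[Consumer] = []

    def subscribe(self, consumer: Consumer) -> None:
        self._consumers.append(consumer)

    def observe(self, op: str, sector: int, count: int) -> None:
        """Called once per raw request; forwards it to every consumer."""
        event = BlockEvent(op, sector, count)
        for consumer in self._consumers:
            consumer(event)


def make_write_watch(first: int, last: int, alerts: List[BlockEvent]) -> Consumer:
    """Consumer that records writes overlapping sectors first..last."""
    def consumer(event: BlockEvent) -> None:
        end = event.sector + event.count - 1
        if event.op == "write" and event.sector <= last and end >= first:
            alerts.append(event)
    return consumer


if __name__ == "__main__":
    alerts: List[BlockEvent] = []
    monitor = DiskMonitor()
    monitor.subscribe(make_write_watch(100, 199, alerts))
    monitor.observe("read", 150, 8)   # ignored: not a write
    monitor.observe("write", 96, 8)   # sectors 96..103 overlap the region
    print(len(alerts))  # 1
```

Because consumers only see BlockEvent objects, filesystem-specific interpretation (for example, mapping sectors to files) can be layered on top without changing the monitor itself.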

Middleware

Work with a new paradigm in middleware, for enterprise or for high performance systems: publish/subscribe middleware (also called event-based middleware). Improve the metadata services associated with such middleware, specifically, develop an alternative `format server' (i.e., metadata service) protocol using LDAP or DNS-based techniques. Then use it to create a highly survivable/fault tolerant/scalable server implementation. Contact eisen@cc for more detail.

Autonomic Control Engine for Distributed Science.  As highly adaptive, distributed science engines begin to couple multiple high-performance codes and analysis tools together, there is a need to provide some autonomic capability for adapting the scientific workflow to meet changing needs.  The role of the "rule engine" in such systems is to couple the scientifically relevant information specification (is the wire breaking now?  Has the interface moved?) with quality of service measures (is there sufficient bandwidth to sustain x framerate?).  This project aims to augment a pre-existing code base with a more robust rule engine and policy specification language, and to implement the required distributed, autonomic control. Contact mwolf@cc.
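
To make the coupling described above concrete, here is a minimal Python sketch of a rule that fires only when a scientific condition and a quality-of-service condition both hold. It is not the pre-existing code base mentioned above; the Rule class, the evaluate function, the state keys (wire_strain, bandwidth_mbps), and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

State = Dict[str, Any]  # latest monitored values, keyed by (hypothetical) name


@dataclass
class Rule:
    name: str
    science: Callable[[State], bool]  # scientifically relevant condition
    qos: Callable[[State], bool]      # quality-of-service condition
    action: str                       # adaptation to request when both hold


def evaluate(rules: List[Rule], state: State) -> List[str]:
    """Return the actions of every rule whose two conditions both hold."""
    return [r.action for r in rules if r.science(state) and r.qos(state)]


# Two example rules: same scientific trigger, different bandwidth situations.
RULES = [
    Rule("break-with-bandwidth",
         science=lambda s: s["wire_strain"] > 0.9,
         qos=lambda s: s["bandwidth_mbps"] >= 100,
         action="stream_full_resolution"),
    Rule("break-low-bandwidth",
         science=lambda s: s["wire_strain"] > 0.9,
         qos=lambda s: s["bandwidth_mbps"] < 100,
         action="stream_downsampled"),
]

if __name__ == "__main__":
    print(evaluate(RULES, {"wire_strain": 0.95, "bandwidth_mbps": 40}))
    # ['stream_downsampled']
```

A policy specification language would replace the hard-coded lambdas with declarative conditions, but the evaluation step could keep this shape.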

Self-organizing Overlays.  In massively scalable machine environments such as the leadership class machines at Oak Ridge National Laboratory, the configuration of a topology for aggregation trees can be difficult.  Presupposing that the application knows the configuration of the nodes at compile time is not practical, and having 10k+ nodes register with a central configuration service is not practical either.  Using peer-to-peer and graph partitioning techniques, this project aims to build a scalable, resource-efficient publish/subscribe overlay that will self-configure in response to local compute, memory, and network resources.  Implementation will focus on a large computational science code being used at Oak Ridge National Laboratory. Contact mwolf@cc.

Self-discovering Overlays.  As a multi-component, distributed application starts up, there is a discovery problem associated with each new component finding its appropriate sets of input, output, and control connections.  This project aims to couple a distributed directory service (DHT-like) directly into the data communication overlay, thereby eliminating additional services with their potential scalability problems.  Implementation will focus on a large computational science code being used at Oak Ridge National Laboratory. Contact mwolf@cc.
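
One common way to build a DHT-like directory is consistent hashing, where every node can compute locally which peer is responsible for a given name. The Python sketch below simulates this in a single process; it is an assumption about how such a directory might look, not the design of the existing overlay. RingDirectory, its methods, and the node and stream names are hypothetical, and the per-node tables would live on separate nodes in a real deployment.

```python
import bisect
import hashlib
from typing import Dict, List


def _position(key: str) -> int:
    """Map a string to a point on the hash ring (first 8 bytes of SHA-1)."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


class RingDirectory:
    """In-process simulation of a directory spread over overlay nodes."""

    def __init__(self, nodes: List[str]) -> None:
        self._ring = sorted((_position(n), n) for n in nodes)
        self._points = [p for p, _ in self._ring]
        # One table per node; here they simply share a process.
        self._tables: Dict[str, Dict[str, List[str]]] = {n: {} for n in nodes}

    def owner(self, stream: str) -> str:
        """Node responsible for a stream: first node clockwise of its hash."""
        i = bisect.bisect_right(self._points, _position(stream))
        return self._ring[i % len(self._ring)][1]

    def register(self, stream: str, endpoint: str) -> None:
        """A component announces that it produces (or consumes) a stream."""
        table = self._tables[self.owner(stream)]
        table.setdefault(stream, []).append(endpoint)

    def lookup(self, stream: str) -> List[str]:
        """A new component asks where to connect for a stream."""
        return list(self._tables[self.owner(stream)].get(stream, []))


if __name__ == "__main__":
    directory = RingDirectory(["node0", "node1", "node2"])
    directory.register("temperature", "hostA:5000")
    directory.register("temperature", "hostB:5000")
    print(directory.lookup("temperature"))  # ['hostA:5000', 'hostB:5000']
    print(directory.lookup("pressure"))     # []
```

Since ownership is computed from the hash alone, no central registry is consulted during lookup, which is the property the project is after.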

Special Project - SPAM->Source: A Middleware Solution to SPAM Prevention
In Spring 2006, a team of 6210 students developed a new approach to spam prevention, based on the IFLOW middleware developed here at Georgia Tech. The purpose of this project is to experiment with the utility of this approach on a new network emulation infrastructure, termed VINI (see SIGCOMM 2006). The project is relevant to this class not only because middleware is a key component of modern distributed systems, but also because VINI attempts to virtualize some of the key resources used in distributed applications, namely the network and the nodes on which applications run. A good basis for this project is the Nutch/Lucene benchmark developed by the open source community. Contact vibhore@cc for more information.

Power and Resource Management

The high-level objective of the project is to apply DVS/DFS (dynamic voltage/frequency scaling) techniques to conserve energy in a decentralized, event-based MANET system. This is done by detecting slack in event delivery and scaling down the voltage/frequency of the appropriate node(s) to minimize that slack. For this purpose, a mechanism is needed that captures the timing behavior of the event processing modules running at each node: for instance, if the frequency is scaled down by 10%, how much delay will be incurred in event processing? One way to do this is to develop a low-overhead online monitoring mechanism that characterizes each module using the performance monitoring counters available in modern CPUs (which measure instruction count, cache hit rate, etc.) and builds a simple model that allows higher-level decision algorithms to make predictions. The project involves developing such monitoring mechanisms, which must build accurate models of the slack-vs-frequency characteristics of modules while incurring low overhead. Contact bala@cc for more detail.
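
As a rough illustration of the kind of "simple model" meant above, the Python sketch below assumes that a module's processing time splits into a part that scales with 1/frequency (CPU-bound work) and a part that does not (for example, memory stalls), i.e. T(f) = a/f + b. This model form is an assumption, as are the function names; the sketch also fits the model from timing samples rather than from performance counters, which a real monitor would use to estimate the two parts online.

```python
from typing import List, Optional, Tuple

Model = Tuple[float, float]  # (a, b) in T(f) = a / f + b


def fit_time_model(samples: List[Tuple[float, float]]) -> Model:
    """Least-squares fit of T = a * (1/f) + b from (freq_hz, seconds) samples."""
    xs = [1.0 / f for f, _ in samples]
    ys = [t for _, t in samples]
    n = len(samples)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("need samples at two or more distinct frequencies")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def predict_time(model: Model, freq_hz: float) -> float:
    """Predicted processing time of one event at the given frequency."""
    a, b = model
    return a / freq_hz + b


def lowest_frequency_meeting(model: Model, deadline_s: float,
                             available_hz: List[float]) -> Optional[float]:
    """Lowest available frequency whose predicted time still meets the deadline."""
    for f in sorted(available_hz):
        if predict_time(model, f) <= deadline_s:
            return f
    return None  # no setting meets the deadline


if __name__ == "__main__":
    # Hypothetical measurements of one module at two frequency settings.
    model = fit_time_model([(1.0e9, 0.003), (2.0e9, 0.002)])
    base = predict_time(model, 2.0e9)
    scaled = predict_time(model, 1.8e9)  # frequency scaled down by 10%
    print(f"extra delay at -10% frequency: {scaled - base:.6f} s")
    print(lowest_frequency_meeting(model, 0.004, [0.5e9, 0.8e9, 1.0e9, 2.0e9]))
```

A higher-level decision algorithm could call lowest_frequency_meeting with the slack-adjusted deadline of each event stream to pick a frequency setting per node.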

Develop a project that works with future cellphone/portable devices. We are open to a wide range of ideas for using such devices, including localization (using attached GPS) and localized service access, power management, novel services using C- or Java-based middleware, and others. A Unix-based cellphone from Motorola may be available later this semester. Initially, PDA-like Linux-based systems are available for use. Contact rnathuji@cc or bala@cc for more information.

Kernel Programming

This project is for students interested in low-level communication issues. NIST has an implementation of the AODV routing protocol for MANETs, called Kernel-AODV. This is a Linux module that makes heavy use of netfilter/iptables. The project involves porting this software to FreeBSD, making use of its network filtering capabilities via ipf/ipfw/netgraph. Contact bala@cc for more detail.
