CS6210
Student Defined Projects

Guidelines
Proposals
Final Report
Facilities
Suggestions
Ideas

Guidelines

The purpose of a student-defined project is to permit you to do work that will (1) involve you in ongoing research projects, (2) permit you to leverage your unique background in some way, and/or (3) leverage other work in which you are involved. In general, any special project you propose should be of a caliber that can generate results publishable in reviewed outlets like workshops or conferences (typically requiring some additional work beyond the time spent in this class). You can propose your own project or select one of the projects described below.

Proposals

To propose your project for this class, you must submit the following materials:

The first step must include both development (e.g., coding, experimentation) and background work: producing and reading a bibliography of relevant papers, designing suitable algorithms or approaches, and learning or examining the tools you will use for your project, including target platforms.

The second step, typically after the class midterm, should involve having produced much of the software necessary and having debugged it.

The third step should include not only software testing but also performance evaluation, on the platforms you have chosen. Such evaluations may include theoretical results if you chose to develop a novel algorithm, for instance, but your project must include experimental evaluation in addition to algorithm development.

The final deliverable includes not only the actual software but also a report, which is outlined next.

Final Report

The on-line final report regarding your project should have the following parts:

Facilities Available for CS6210 Projects

1. Linux `Hack' cluster: Cluster of up to eight 2-way Linux machines (physically part of the NetLab facility described next):

    * Ability to get superuser privileges to do kernel-level development.
    * Enforcement of real-time constraints possible through ability to have privileges akin to superuser.
    * Ability to run on a dedicated single machine (2-way shared-memory multiprocessor) or on multiple machines, networked via switched 100 Mb/s or Gigabit Ethernet.
    * Ability to use experimental kernel-level real-time facilities, including the DWCS scheduling used in the first project.
    * Contact Sanjay Kumar (ksanjay) or Ivan Ganev (ganev@cc) for details on hardware/software setup.

2. NetLab cluster machine: cluster of over 40 machines, able to emulate arbitrary Internet topologies and able to run multiple operating systems.

3. High-end cluster machines, one comprised of 30 dual Itanium IIs, the other comprised of 16 8-way Pentium IIIs. Contact Matt Wolf (mwolf@cc) for more information and machine access.

5. 'Enterprise' computing clusters: (1) a typical  3-tier setup using HP Itanium-based server systems, and (2) a second setup using an IBM blade server as a backend. Contact Neil Bright (ncb@cc) or Mohammed Mansour (mansour@cc) for more detail and/or for hardware access.

6. Compaq iPAQ handheld computers using wireless links and new XScale-based (Sitsang) machines:

    * Running Linux (multiple configurations of Linux are available and may be installed there). Both Java and C/C++ programming are possible.
    * Ability to use a dedicated base station and/or the campus wireless network, via 11 Mb/s WaveLAN wireless Ethernet cards.
    * Ability to attach and use cameras.
    * Contact Matt Wolf for additional information.

8. Embedded communication subsystems, using Intel's IXP1200 or IXP2400 network routers:
    * Contact Ada Gavrilovska (ada@cc) or Ivan Ganev (ganev@cc) for more information on IXPs.

9. Large-scale visualization media, including an Immersadesk and a video wall (contact Matt Wolf for more information).

Concrete Project Suggestions

Graduate students who would interact with you and your team have proposed the following projects. They have been well defined and 'debugged' to be doable within the time constraints imposed by this semester. However, you must still define the intermediate project deadlines mentioned in the Special Project Guidelines. NOTE: you are not bound by the suggestions. You may interact with the instructor to define other projects.

1. Projects addressing operating system kernels:

* The Sitsang units have various attached devices, including a TI ADS7846N touch-screen controller. This controller also has an analog-to-digital converter (ADC) built into it, which can be used to monitor the power supply. In the Sitsang, it is connected to the regulated power line from the batteries. It can sample at 125 kHz. A project around this potential is: 1) write the necessary infrastructure in the Linux ADS7846 driver to monitor the Sitsang power level at 125 kHz; 2) experiment with various workloads and observe how the power changes; 3) determine if inserting random delays at start-up and shut-down of processes can reduce the power signature overall; 4) adaptively adjust the CPU core power level with the onboard programmable supply to experiment with different mechanisms to exploit slack and save power; and 5) tie all of these features into the scheduler. Contact Josh Fryman (fryman@cc) or Ripal Nathuji (rnathuji@cc) for projects involving the Sitsang boards and online power management.

* There is an open question on the trade-off between communications and computation. If you are riding on MARTA, you have only the resources available in your PDA. When you walk into room 207, however, many devices are around you. Using the Compact Flash (CF) adaptor and an 802.11 device, or the IrDA ports, you can auto-negotiate to offload some computation to desktop or cluster systems, to save power locally. However, in the process of doing so, your network usage goes up and the power can increase. The Sitsang board is remarkable in that every single component can be dynamically turned off and on: SDRAM, CF, IrDA, etc. The project steps would include: 1) devise a simple protocol for auto-negotiating network connections when you walk into a room; 2) measure the power signature when offloading some or all computation to remote systems, while meeting deadlines of applications and transferring finished results back to the PDA as needed (think remote graphics processing or MPEG decoding); 3) find the trade-off between bandwidth and local computation based on transfer data (sizes, latencies, etc.) across a range of benchmarks and needs; and 4) tie this information into the scheduler to permit auto-migration of workloads where appropriate. Contact Josh Fryman (fryman@cc).

* There is considerable industry interest in developing virtualization techniques and support (OS and hardware). The kernel 'plugin' mechanism developed at Georgia Tech implements lightweight methods for kernel extension. This mechanism currently operates on Pentium-class machines. You can do any number of projects involving plugins: 1) port them to other machines, like those based on Intel's new 64-bit architecture or those based on Intel's Itanium 64-bit architecture, 2) write applications that can take advantage of kernel extensions using plugins, such as a kernel-level proxy service currently being developed by Jiantao Kong (jiantao@cc) or kernel-level monitoring services being developed by Sandip Agarwala (sandip@cc), or 3) use plugins to implement virtualized devices like the virtualized USB devices implemented by Sanjay Kumar (ksanjay@cc). Contact Ivan Ganev for more information (ganev@cc).

* Implement dynamic monitoring functionality on IXP boards. More generally, we are looking for someone interested in exploring how to dynamically place code onto available microengines on the IXP processor, where monitoring is an interesting application of this idea. Another interesting application is to place code there that interprets application-level message structures, based on the PBIO binary structure format developed by our group (see Greg Eisenhauer's web page). This work would be based on an available IXP programming infrastructure developed by Ada Gavrilovska and others (contact ada@cc).

* Implement other exciting application-level or OS-relevant (e.g., virtualization) functionality on IXP processors. One specific idea is described next, but there are many other interesting things to try. Group synchronization for cluster servers: distributed applications often rely on multiple processing nodes collaborating and synchronously advancing their applications' states. Many protocols have been defined that target the problems of reliable group synchronization, consistency across multiple cooperating nodes, synchronous checkpointing and recovery mechanisms, etc. Optimistic Virtual Synchrony (OVS) is one such algorithm. This project is a multi-phase evaluation of a group synchronization algorithm, in particular the OVS algorithm. The goal of the project is to implement the algorithm at several different levels, and to compare and understand the trade-offs encountered at each level. Initially, you should implement the OVS algorithm at user level, for a cluster-wide server application. Next, you should move the user-level implementation into the Linux kernel, and evaluate the performance implications of the kernel-level implementation. In the second stage of the project, you will experiment with the programmable network boards, IXP1200s. In particular, you will perform a comparative performance analysis between the OVS implementation on the cluster nodes and on the IXP1200-resident StrongARM processors. You should perform predictive studies with the IXP1200s, and draw conclusions about the performance implications of moving the synchronization mechanisms onto the network boards. For more detail on the project, questions, documentation, and reference material, contact Ada Gavrilovska (ada@cc.gatech.edu).

* Work with a kernel-level, distributed infrastructure for quality and security management, termed Q-fabric. The idea is to extend per-machine functionality built in support of specific applications or guest operating systems  to multiple machines,  involving remote monitoring, remote management, and dynamic quality of service or trust management. An explicit application being built by our group is one that extends Wi-Fi access to remote machines in mobile settings, given that at least one machine is within range of a base station, using Q-fabric facilities and running a simple file exchange program (Mutella) on top (contact ganev@cc).

* Application adaptation using Hardware Performance Monitoring Counters: most modern processors have Performance Monitoring Counters (PMCs), which can be used to measure low-level events like cache misses, TLB misses, CPU cycles executed, etc. The aim of this project is to do dynamic adaptation of distributed application(s) (for better performance, load balancing, etc.) using information provided by PMCs. Students may write their own application(s) or pick up some existing ones. Kernel modules to configure and read PMCs will be provided. Alternatively, develop a kernel-level network monitoring module: the module should be able to keep track of RTT, bandwidth, error rate, and other important statistics of all open TCP connections with minimal overhead. In addition, it should have analysis functions that evaluate communication performance at runtime and then export such information to other parts of the kernel. For more information, contact sandip@cc.
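
The kernel-level network monitoring option above leaves the choice of statistics and analysis functions to you. As a minimal user-level sketch, assuming the smoothing rule that standard TCP uses for its retransmission timer (RFC 6298), per-connection RTT tracking might look like the following. The class name `RttEstimator` and its methods are hypothetical, clock granularity is ignored, and an actual kernel module would be written in C against the kernel's own TCP state rather than in Python.

```python
class RttEstimator:
    """Smoothed RTT and RTT variance for one connection (RFC 6298 rule)."""

    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variance

    def __init__(self):
        self.srtt = None     # smoothed RTT, in seconds
        self.rttvar = None   # smoothed mean deviation, in seconds
        self.samples = 0

    def update(self, rtt):
        """Fold one RTT measurement (in seconds) into the running estimate."""
        if self.srtt is None:
            # The first sample initialises both estimates.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # The variance is updated first, using the old smoothed RTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        self.samples += 1

    def timeout(self, min_rto=1.0):
        """Retransmission timeout suggested by the current estimate."""
        if self.srtt is None:
            return min_rto
        return max(min_rto, self.srtt + 4 * self.rttvar)


est = RttEstimator()
for sample in (0.100, 0.120, 0.080):
    est.update(sample)
print(f"srtt={est.srtt:.4f}s rttvar={est.rttvar:.4f}s")
```

Keeping only two running values per connection is one way to meet the "minimal overhead" requirement; bandwidth and error-rate statistics could be folded in with similar constant-space updates.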

2. Projects at the application level:

* Build a new, efficient version of the LDAP directory service. Specifically, use an existing, open source implementation of LDAP to create better directory services for distributed systems. The idea is to enhance LDAP functionality with `activity', using as a basis a simple, proactive directory service (not implemented with LDAP) created by our group in earlier work. Contact Patrick Widener (pmw@cc) for more detail.

* Experiment/work with Microsoft's new .NET infrastructure. Specifically, first use its SOAP remote invocation functionality, then enhance it to improve SOAP performance. To do so, work with other students (contact bala@cc) to design a new implementation of the SOAP protocol based on an open source release associated with Apache (contact sandip@cc for more information). Enhancements are then possible along two directions: (1) use the new implementation or (2) further extend its ability to deal with larger application data.

* A new application-level service of interest to a large user community is what is termed an m-by-n data exchange, the idea being that m programs on up to m different machines interchange data with n programs on up to n machines. We have built such a service, but are looking for students who are interested in (1) developing applications that take advantage of it and/or (2) working on optimizations that use multicast network-level support. One idea is to further enhance and use an m-by-n GridFTP service, based on the available open source GridFTP. Another idea is to develop a concurrent graphics service, with some graphics functions mapped to graphics boards attached to the machines running m-by-n services. Contact Hasan Abbasi (habbasi@cc) or Matt Wolf for more detail.

* Work with realistic enterprise system infrastructures, such as IBM's WebSphere, open source products like JBoss, or SOAP implementations. Typical work with such infrastructures involves changing their key communication primitives (e.g., making RMI network aware) or developing applications that require their rich functionality (contact mansour@cc).

* Industry developments in middleware involve three different support infrastructures, one being the Java-based approaches like WebSphere, another centered around .NET, and a third focused on the publish/subscribe model of communication. Our research group has been developing a high performance pub/sub infrastructure, termed ECho, which has recently been extended to make it easier to create efficient, very large scale pub/sub applications. The extensions make it easier to dynamically create overlay networks mapped to appropriate machines and network links. Students can perform a wide variety of projects, including (1) automatic generation and management of certain pub/sub functionality (e.g., the operators applied to the events traversing the overlay network) (contact lofstead@cc or vibhore@cc, the former experimenting with XML-based representations of such operators and the latter involved with database operator-like formulations); (2) algorithms to create and update overlay networks, based on network feedback (contact ztcai@cc), on other resource monitoring (contact sandip@cc), or even on formulations of distributed trust (contact ramesh@cc); and (3) new applications that utilize the rich middleware functionality provided by the pub/sub model (contact the instructor).

* Benchmark the newest and niftiest machines out there. Use both standard benchmarks (including web server benchmarks) and realistic application programs (the latter not to be developed by you) to evaluate the newest machines made by manufacturers like Intel and HP. Contact Matt Wolf (mwolf@cc) for more information.
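
To make the m-by-n data exchange project above more concrete: when m senders and n receivers each hold a contiguous block of the same array, the service has to work out which index ranges each sender ships to each receiver. The sketch below computes such a transfer plan for an even block distribution. The function names and the block layout are illustrative assumptions, not the interface of the group's m-by-n service.

```python
def block_bounds(total, parts, rank):
    """Half-open [start, end) range owned by `rank` under an even block split."""
    base, extra = divmod(total, parts)
    start = rank * base + min(rank, extra)
    end = start + base + (1 if rank < extra else 0)
    return start, end


def transfer_plan(total, m, n):
    """List (sender, receiver, start, end) for every overlapping index range."""
    plan = []
    for s in range(m):
        s_lo, s_hi = block_bounds(total, m, s)
        for r in range(n):
            r_lo, r_hi = block_bounds(total, n, r)
            lo, hi = max(s_lo, r_lo), min(s_hi, r_hi)
            if lo < hi:  # the sender's and receiver's blocks overlap
                plan.append((s, r, lo, hi))
    return plan


for sender, receiver, lo, hi in transfer_plan(total=12, m=3, n=2):
    print(f"sender {sender} -> receiver {receiver}: elements [{lo}, {hi})")
```

An optimization project could start from a plan like this and look for ranges that several receivers need at once, which are the natural candidates for multicast.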
 

Ideas

The following constitute project ideas that you need to work out in more detail, including the submission of a detailed project proposal, as specified in the Special Project Guidelines. Projects that we know to be definitely doable within the time constraints imposed by this class are highlighted. Other project suggestions must be defined in more detail, which is part of your task when refining the project with your proposal.

Projects addressing operating system kernels:

* Modification of an existing virtual environment or multiplayer game (e.g., Quake), such that it uses a group communication tool (ECho/KECho) developed by our group. This modified application will then be used as a basis for experiments (e.g., filtering of events).

* Use a reliable UDP protocol to experiment with the effects of different reliability methods for wireless communications. The goal is to develop adaptive protocols to improve performance or other quality of service metrics for wireless communications. Contact Zhongtang Cai (ztcai@cc) for more information. This work could take place at user- and/or at kernel-level.

* Experiment with kernel-level task scheduling, including replacing the current Linux task scheduler and/or improving it. Contact Ripal Nathuji (rnathuji@cc) for more information.

* Port efficient binary communication software to the new XScale architecture. This project includes runtime binary code generation for XScale processors. Contact Greg Eisenhauer (eisen@cc) for more information.

Projects that develop certain applications:

* Stub generation: develop compiler support for alternative layout/organization/marshaling and unmarshaling of binary data (based on XML-level data descriptions), thus affecting performance and predictability of data movement via communication links. Contact Jay Lofstead (lofstead@cc) for more information.

* Develop sample distributed applications. Most interesting are: (1) ubiquitous or embedded applications that involve both real-time sensing and reactions to sensor data and use wearable or portable machines, (2) experimentation with adaptive applications, including video transfers (raw video or MPEG/SPEG encoded) and evaluating their performance (including power consumption) on wireless and embedded devices, (3) innovative distributed applications including video games, remote robot control, and immersive systems, using real-time feeds (e.g., obtained from Atlanta's traffic web pages or from sports actions), or using immersive equipment available in CoC (available equipment includes 3D headsets, video wall, Immersadesk) and large data sets (e.g., earth observational data). Propose to instructors and also contact mwolf@cc for more information.
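
As a minimal sketch of the stub generation idea above, the following derives marshaling and unmarshaling routines from a field description at run time. The description here is a plain Python list standing in for an XML-level one, and the type names, the `make_stubs` function, and the packed little-endian layout are all assumptions made for illustration; they are not the group's binary format.

```python
import struct

# Hypothetical mapping from description-level type names to struct codes.
TYPE_CODES = {"int32": "i", "float64": "d", "uint8": "B"}


def make_stubs(fields, byte_order="<"):
    """Build (marshal, unmarshal) functions for a list of (name, type) fields."""
    names = [name for name, _ in fields]
    layout = struct.Struct(byte_order + "".join(TYPE_CODES[t] for _, t in fields))

    def marshal(record):
        # Fields are written in description order, with no padding.
        return layout.pack(*(record[name] for name in names))

    def unmarshal(payload):
        return dict(zip(names, layout.unpack(payload)))

    return marshal, unmarshal


marshal, unmarshal = make_stubs([("id", "int32"), ("temp", "float64"), ("flag", "uint8")])
wire = marshal({"id": 7, "temp": 21.5, "flag": 1})
print(len(wire), unmarshal(wire))
```

Changing the field order or the byte order in the description changes the wire layout without touching application code, which is the kind of alternative layout whose performance and predictability the project would measure.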