Overview

The next two topics cover my current research as a PhD student and my past research as a Master's student at CMU. I am currently working on the general idea of preparing systems software for the invasion by heterogeneous many-cores! My internships at Intel Labs, HP Labs and IBM Research have been instrumental to my deep involvement in this work, as has the ongoing effort at Georgia Tech under my advisor, Prof. Karsten Schwan. Hybrid Virtual Machines is an umbrella research effort at Georgia Tech, and the components described under it are the portions that I have worked on or am contributing to.

HyVM: Hybrid Virtual Machines

A challenge posed by future computer architectures is the efficient exploitation of their many, and often diverse, computational cores, with examples ranging from graphics processors, to IBM's Cell processor, to I/O-centric accelerators sharing chip space with general purpose computational cores like the Cell's PowerPC core. This challenge is exacerbated by the diverse facilities for data movement and data sharing across the cores resident on such platforms, which range from cache-level methods for data sharing, to non-coherent shared memory with DMA support, to PCI-based connectivity for I/O. The key technical challenge addressed by our work is the potentially serious mismatch between computational bandwidth and memory or I/O bandwidth on future platforms. Toward that end, we propose to pursue a course of study that will:

GViM: GPU-accelerated Virtual Machines

The use of virtualization to abstract underlying hardware can aid in sharing such resources and in efficiently managing their use by high performance applications. Unfortunately, virtualization also prevents efficient access to accelerators, such as Graphics Processing Units (GPUs), that have become critical components in the design and architecture of HPC systems. Supporting General Purpose computing on GPUs (GPGPU) with accelerators from different vendors presents significant challenges due to proprietary programming models, heterogeneity, and the need to share accelerator resources between different Virtual Machines (VMs).

To address this problem, this paper presents GViM, a system designed for virtualizing and managing the resources of a general purpose system accelerated by graphics processors. Using the NVIDIA GPU as an example, we discuss how such accelerators can be virtualized without additional hardware support and describe the basic extensions needed for resource management. Our evaluation with a Xen-based implementation of GViM demonstrates efficiency and flexibility in system usage, coupled with only small performance penalties for the virtualized vs. non-virtualized solutions.
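As a rough illustration of what virtualizing an accelerator in software can look like, the sketch below forwards calls from guest VMs to a single component that owns the device and shares it between them. This is a toy model under stated assumptions, not a description of GViM's actual design: the class names `GuestStub` and `Backend`, the call names, and the one-call-per-VM sharing policy are all hypothetical.

```python
from collections import deque


class Backend:
    """Hypothetical component in the privileged domain that owns the GPU."""

    def __init__(self):
        # vm_id -> queue of pending (call_name, args) requests
        self.queues = {}

    def submit(self, vm_id, call_name, args):
        """Record a forwarded call from a guest VM."""
        self.queues.setdefault(vm_id, deque()).append((call_name, args))

    def run_round(self):
        """Serve at most one pending call per VM so no VM monopolizes the device."""
        served = []
        for vm_id, queue in self.queues.items():
            if queue:
                call_name, args = queue.popleft()
                # A real backend would invoke the vendor runtime here.
                served.append((vm_id, call_name, args))
        return served


class GuestStub:
    """Hypothetical guest-side library that forwards calls instead of touching hardware."""

    def __init__(self, vm_id, backend):
        self.vm_id = vm_id
        self.backend = backend

    def call(self, name, *args):
        self.backend.submit(self.vm_id, name, args)


backend = Backend()
vm1 = GuestStub("vm1", backend)
vm2 = GuestStub("vm2", backend)
vm1.call("copy_to_device", 1024)   # illustrative call names only
vm1.call("launch_kernel", "k0")
vm2.call("launch_kernel", "k1")
first_round = backend.run_round()
second_round = backend.run_round()
```

Here `first_round` holds one call from each VM and `second_round` holds the remaining call from `vm1`, which is the simplest form of sharing a single accelerator between VMs.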

Efforts are currently under way to enhance the GViM infrastructure to handle asynchronous CUDA calls and increase the number of benchmarks that can be supported. At some point in the future, we will integrate with a system like VMGL to provide OpenGL support to applications requiring display capabilities.

Other people who are or have been involved in the effort are Ada Gavrilovska, Harshvardhan Kharche (now at Intel) and Benjamin Heiskell from Georgia Tech; and Niraj Tolia, Vanish Talwar and Partha Ranganathan from HP Labs.

Cellule: Lightweight Virtualization of Accelerators

Initial steps in this research have focused on the efficient use of accelerators, using IBM's Cell BE processor as the key platform. Here, experience with running the Linux operating system on the Power core of the Cell processor has shown that this core is less efficient than general purpose cores at hosting a full-fledged operating system. In part, this is because the Power core was principally designed to be a 'service processor' responsible for coordinating the Cell's SPEs. The first challenge faced by our research, then, has been to make efficient use of this service processor in order to exploit Cell as a remote accelerator utilized by one or more general purpose machines, with hardware configurations like those in the Roadrunner project. More generally, this research investigates the opportunities presented by combining virtualization and accelerators to simplify the Cell execution model and to enable its effective use by applications running on the general purpose machines.

The first technical outcome of the proposed research is the "Cellule" execution environment. This is a virtualized Cell BE-based system that hosts a small, high performance execution environment, called the Special Execution Environment (SEE), on the hypervisor to run SPE applications. To realize this environment, we have ported IBM's research hypervisor (rHype) to the Cell board, created wrappers for libSPE, which is the standard interface used by Cell applications, and built the SEE to run the application. The SEE can be compared to a real-time OS environment that has exactly the elements necessary for any libSPE-based application to run. Initial experimental results have shown that the Cellule environment performs at least as well as Linux, which has encouraged our endeavors in this direction.

Other people involved in the effort are Jimi Xenidis and Dilma Da Silva from IBM Austin Research Lab and IBM TJ Watson Research Center.

Montage: Scheduling and Resource Management in Heterogeneous Many-core Systems

Details to follow.

Other people involved in the effort are Niraj Tolia, Vanish Talwar and Partha Ranganathan from HP Labs; Rob Knauerhase, Paul Brett and Scott Hahn from Intel Labs.

Area Driven Pervasive Computing Applications

The past few years have witnessed exponential growth in the number of handhelds. Along with this steep rise, there has been a tremendous increase in the number of applications and services developed with the constraints and flexibility of mobile devices in mind. A common characteristic of all these applications and services is that the user has to initiate some action in order to use their results. Hardly any applications proactively initiate communication with the user. The applications considered in this thesis diverge from these traditional applications and services in that the environment is made smart. The idea is to allow the application to specify the areas where it wants to track users and to take suitable action when a user is detected in such an area.

While much research has focused on developing service architectures for location-aware systems, less attention has been paid to the fundamental and challenging problem of letting an application define the physical areas in which it wants mobile users tracked, especially in in-building environments. The goal is to determine with high probability when the user is in the area of interest to the application. In this paper we present an infrastructure for area-driven applications that enables them to specify area-based user tracking requirements and achieve the intended purpose. We use the existing infrastructure of wireless access points to determine the area where a user is. We also study how much history information can improve the granularity of an area.
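The idea of mapping visible access points to an application-defined area, and then using history to steady the estimate, can be sketched as follows. This is a minimal illustration under assumptions, not the method used in the thesis: the area names, access point identifiers, overlap-count matching, and majority vote over a sliding window are all hypothetical choices.

```python
from collections import Counter, deque

# Hypothetical mapping from application-defined areas to the access points
# that usually cover them. All names are illustrative.
AREA_TO_APS = {
    "lobby": {"ap-1", "ap-2"},
    "lab": {"ap-3", "ap-4"},
}


def area_from_scan(visible_aps, area_to_aps=AREA_TO_APS):
    """Return the area sharing the most access points with the scan, or None."""
    seen = set(visible_aps)
    best_area, best_overlap = None, 0
    for area, aps in area_to_aps.items():
        overlap = len(aps & seen)
        if overlap > best_overlap:
            best_area, best_overlap = area, overlap
    return best_area


class AreaTracker:
    """Smooths per-scan guesses with a majority vote over recent history."""

    def __init__(self, window=5):
        self.history = deque(maxlen=window)  # most recent per-scan guesses

    def update(self, visible_aps):
        guess = area_from_scan(visible_aps)
        if guess is not None:
            self.history.append(guess)
        if not self.history:
            return None
        # The area seen most often in the window wins.
        return Counter(self.history).most_common(1)[0][0]


tracker = AreaTracker(window=3)
estimates = [tracker.update(scan) for scan in
             (["ap-1"], ["ap-1", "ap-2"], ["ap-3"], ["ap-3"])]
```

With a window of three scans, a single stray reading of `ap-3` does not move the user out of the lobby; the estimate changes only once the newer readings outnumber the older ones. An application would register interest in an area and act when the tracker's estimate matches it.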

Here is the final presentation.