Q-Fabric - System Support for Continuous Online Quality Management
Advanced distributed applications like grid services and
enterprise computing are becoming increasingly complex. In particular,
dynamic services like distributed multimedia and remote collaboration require
support for sophisticated quality management in order to provide
end-users with the quality of service (QoS) they require. This is a challenging
problem, not only because of the scale of distributed applications and
systems, but also because QoS support needs to be specific to the requirements
of individual end-users. Moreover, quality management is a continuous process
that involves (1) monitoring and controlling the use of resources, (2)
reacting to dynamic changes in user requirements and resource availabilities,
or to system anomalies, and (3) utilizing resources efficiently while ensuring
that system-wide requirements are met. Achieving these goals requires that
end systems cooperate in the monitoring and the control of the resources
and that applications and quality management interact to coordinate the
adaptation of both system resource allocations and application behavior.
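One iteration of such a continuous management loop can be sketched as follows. This is only an illustration under stated assumptions: the names `State` and `adapt_step`, the CPU-share units, and the frame-rate fallback are hypothetical and do not come from Q-Fabric itself.

```python
from dataclasses import dataclass


@dataclass
class State:
    cpu_share: float   # fraction of CPU allocated to the application (hypothetical unit)
    target_fps: int    # application-level quality requirement (hypothetical knob)


def adapt_step(state: State, measured_fps: float, free_cpu: float,
               step: float = 0.1) -> State:
    """One monitor/react/adapt iteration; returns the new state."""
    if measured_fps >= state.target_fps:
        # Requirement is met: leave allocation and application behavior unchanged.
        return state
    if free_cpu >= step:
        # System-level adaptation: grant the application more CPU.
        return State(round(state.cpu_share + step, 2), state.target_fps)
    # No spare resources: the application lowers its own demand instead.
    return State(state.cpu_share, max(1, state.target_fps - 5))
```

The point of the sketch is the ordering of the two adaptations: the resource allocation is adjusted first, and application behavior changes only when the system has nothing left to give. A real policy could just as well coordinate them differently.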
Q-Fabric is a collection of communication and computation
services that jointly permit the continuous management of a distributed
system's resources and of the applications using them. Its design is based
on several key principles: (a) a full in-kernel implementation for high
performance and fine-grained and unrestricted access to all system resources;
(b) a flexible and lightweight cross-domain communication mechanism that
links the kernel- with the application-level components being managed;
(c) an event-based kernel-to-kernel communication structure able to dynamically
couple distributed quality management components; and (d) a customization
and extension interface that includes mechanisms to dynamically generate
and relocate quality management components, allowing users to extend or
customize the functionality of Q-Fabric-based quality management to meet
their specific needs. Q-Fabric and sample management components for
multi-machine and multi-resource quality management have been implemented
and evaluated with standard Red Hat Linux distributions on both server and
end systems. The evaluation includes multimedia applications such as
videoconferencing, managed using user-centric and system-level (e.g., energy)
quality metrics.
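The idea behind principle (c), coupling quality management components dynamically through events, can be illustrated with a toy publish/subscribe channel. This is a user-level, single-process sketch only; the class `EventChannel` and its methods are hypothetical and are not the KECho or Q-Fabric API.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

Handler = Callable[[Any], None]


class EventChannel:
    """Toy in-process stand-in for a distributed event channel (hypothetical)."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # Components can join a channel at any time.
        self._handlers[topic].append(handler)

    def unsubscribe(self, topic: str, handler: Handler) -> None:
        # Raises ValueError if the handler was never subscribed.
        self._handlers[topic].remove(handler)

    def publish(self, topic: str, event: Any) -> int:
        """Deliver the event to all current subscribers; return how many got it."""
        handlers = list(self._handlers[topic])  # copy so handlers may (un)subscribe
        for handler in handlers:
            handler(event)
        return len(handlers)
```

Because publishers only name a topic and never a receiver, monitors and controllers can be added or removed at run time without changing the components that are already connected.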
DESIGN AND ARCHITECTURE
Q-Fabric consists of a number of cooperating OS-level components, as
shown in the following figures. The first image shows the components
together with the communication paths among them; the second
image shows the components that can be extended or customized via the Extension
and Customization Interface of Q-Fabric. Customizability is a major concern
of our work, because there is no single optimal solution in quality management
that can be used for all applications on all architectures. For example, a
video conferencing application will require a quality management approach
that ensures that video streams are transmitted at a constant rate with
minimal jitter and best achievable image quality, whereas a clustered server
has to ensure minimal response times to service requests and high throughput
by balancing the requests evenly among all servers. Likewise,
an application executed on a desktop has different constraints than
the same application executed on a mobile device such as a handheld or cell phone; e.g.,
mobile devices often have no disk space available and have to consider
The components of Q-Fabric are:
Resource Monitors: each of these monitors one resource or one attribute
of a resource, e.g., the average run queue length for each processor, the
round-trip times of communication on a socket connection, or the retransmission
rate on a TCP connection.
Q-Monitor: this component collects monitoring information
from the resource monitors, converts it into a form usable by
the other components of the system, and exports it to local and remote applications
and controller mechanisms.
Resource Controllers: each application has one or more application-specific
controller functions, which adapt the resource allocations in a given system.
Q-Controller: the Q-Controller's task is to evaluate the information received
from local and remote Q-Monitors and applications and to execute resource
controllers if required.
Q-API: an application communicates with the system components via the Q-API
of Q-Fabric, which includes extensions to the /proc file system, modified
system calls, signals, locked memory regions, or novel approaches such
as 'aggregated' or 'deferred' system calls.
Q-Communicator: monitoring information from Q-Monitor, control information
from Q-Controller and information from applications are exchanged between
operating system kernels via the Q-Communicator, which is a distributed
event service based on KECho.
Extension and Customization Interface: this interface allows applications
to extend and customize the functionality of the previously described components,
both locally and remotely. This interface allows applications to 'download'
custom code into local and remote operating system kernels, where this
code is translated into architecture-specific binary code and inserted
at the appropriate location in the framework. For example, using this interface,
an application developer can deploy new resource monitors or controllers,
or customize the communication sent across the Q-Communicator.
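How the monitoring and control components listed above fit together can be sketched in a few lines. This is a user-level illustration under stated assumptions, not Q-Fabric's in-kernel implementation: the class and method names mirror the component names but are hypothetical, the probe returns a fixed fake reading, and the "controller" only returns a label for the action it would take.

```python
from typing import Callable, Dict, List, Tuple

Snapshot = Dict[str, float]
Condition = Callable[[Snapshot], bool]
Controller = Callable[[Snapshot], str]


class ResourceMonitor:
    """Samples one attribute of one resource via a caller-supplied probe."""

    def __init__(self, name: str, probe: Callable[[], float]) -> None:
        self.name = name
        self.probe = probe

    def read(self) -> float:
        return self.probe()


class QMonitor:
    """Collects readings from all registered monitors into one snapshot."""

    def __init__(self) -> None:
        self._monitors: List[ResourceMonitor] = []

    def register(self, monitor: ResourceMonitor) -> None:
        self._monitors.append(monitor)

    def snapshot(self) -> Snapshot:
        return {m.name: m.read() for m in self._monitors}


class QController:
    """Evaluates a snapshot and runs the controllers whose condition holds."""

    def __init__(self) -> None:
        self._rules: List[Tuple[Condition, Controller]] = []

    def add_controller(self, condition: Condition, controller: Controller) -> None:
        self._rules.append((condition, controller))

    def evaluate(self, snapshot: Snapshot) -> List[str]:
        return [ctl(snapshot) for cond, ctl in self._rules if cond(snapshot)]


q_monitor = QMonitor()
q_monitor.register(ResourceMonitor("run_queue_len", lambda: 4.0))  # fake reading

q_controller = QController()
q_controller.add_controller(
    lambda s: s["run_queue_len"] > 2.0,           # hypothetical threshold
    lambda s: "lower_background_task_priority",   # stand-in for a real adaptation
)
actions = q_controller.evaluate(q_monitor.snapshot())
```

Registering monitors and controllers as plain callables loosely mirrors the role of the Extension and Customization Interface: new probes or control functions can be plugged in without touching the collecting and evaluating components.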
Christian Poellabauer, Karsten Schwan, Sandip Agarwala,
Ada Gavrilovska, Greg Eisenhauer, Santosh Pande, Calton Pu, and Matthew
Wolf, "Service Morphing: Integrated System- and Application-Level Service
Adaptation in Autonomic Systems", Proceedings of the 5th Annual
International Workshop on Active Middleware Services (AMS 2003), Seattle,
Washington, June 2003.
Sandip Agarwala, Christian Poellabauer, Jiantao
Kong, Karsten Schwan, and Matthew Wolf, "Resource-Aware Stream Management
with the Customizable dproc Distributed Monitoring Mechanisms", Proceedings
of the 12th IEEE International Symposium on High Performance Distributed
Computing (HPDC-12), Seattle, Washington, June 2003.
Karsten Schwan, Christian Poellabauer, Greg Eisenhauer,
Santosh Pande, and Calton Pu, "InfoFabric: Adaptive Services in Distributed
Embedded Systems", Proceedings of the IEEE Workshop on Large Scale
Real-Time and Embedded Systems (in conjunction with RTSS 2002), Austin,
TX, December 2002.
Hasan Abbasi, Christian Poellabauer, Gregory Losik,
Karsten Schwan, and Richard West, "A Quality-of-Service Enhanced Socket
API in GNU/Linux", Proceedings of the 4th Real-Time Linux Workshop,
Boston, Massachusetts, December 2002.
Christian Poellabauer, Hasan Abbasi, and Karsten
Schwan, "Cooperative Run-time Management of Adaptive Applications and
Distributed Resources", Proceedings of the 10th ACM Multimedia Conference,
Juan-les-Pins, France, December 2002.
Christian Poellabauer and Karsten Schwan, "Power-Aware
Video Decoding using Real-Time Event Handlers", Proceedings of the
5th International Workshop on Wireless Mobile Multimedia (WoWMoM),
Atlanta, Georgia, September 2002.
Christian Poellabauer and Karsten Schwan, "Kernel
Support for the Event-based Cooperation of Distributed Resource Managers",
Proceedings of the 8th IEEE Real-Time and Embedded Technology and Applications Symposium
(RTAS 2002), San Jose, California, September 2002.
Jasmina Jancic, Christian Poellabauer, Karsten Schwan,
Matthew Wolf, and Neil Bright, "dproc - Extensible Run-Time Resource
Monitoring for Cluster Applications", Proc. of the Intl. Conference
on Computational Science (ICCS '02), Amsterdam, The Netherlands, April
Christian Poellabauer, Karsten Schwan, Greg Eisenhauer,
and Jiantao Kong, "KECho - Event Communication for Distributed Kernel
Services", Proc. of the Intl. Conference on Architecture of Computing
Systems (ARCS'02), Karlsruhe, Germany, April 2002.
Christian Poellabauer, Karsten Schwan, and Richard
West, "Coordinated CPU and Event Scheduling for Distributed Multimedia
Applications", Proc. of the 9th ACM Multimedia Conference, Ottawa,
Canada, October 2001.
Christian Poellabauer, Karsten Schwan, and Richard
West, "Lightweight Kernel/User Communication for Real-Time and Multimedia
Applications", Proc. of the 11th International Workshop on Network
and Operating Systems Support for Digital Audio and Video (NOSSDAV 2001),
Port Jefferson, NY, June 2001.
Christian Poellabauer, Karsten Schwan, Richard West,
Ivan Ganev, Neil Bright, and Gregory Losik, "Flexible User/Kernel Communication
for Real-Time Applications in ELinux", Proc. of the Workshop on Real
Time Operating Systems and Applications and Second Real Time Linux Workshop
(in conjunction with RTSS 2000), Orlando, FL, November 2000.
Q-Fabric is still under development. However, if you are interested
in trying it out or even extending it, here is the code. Feedback
is always welcome!