milliScope: a Fine-Grained Monitoring Framework for Performance Debugging of n-Tier Web Services
Chien-An Lai, Josh Kimball, Tao Zhu, Qingyang Wang and Calton Pu
Georgia Institute of Technology, Georgia Institute of Technology, Georgia Institute of Technology, Louisiana State University, Georgia Institute of Technology

Modern distributed systems are often treated as black boxes, which greatly limits the ability to understand their behavior at the level of detail needed to diagnose some of the most important types of performance problems. Researchers have recently found that abnormal response-time delays, one to two orders of magnitude longer than the average response time, occur over short periods and cause economic loss for service providers. The very short bottlenecks behind these delays are hard to detect because of their short life spans and the variety of their possible causes. In this paper, we propose milliScope (mScope), the first millisecond-granularity, software-based resource and event monitoring framework for distributed systems that achieves both performance (low overhead at high sampling frequency) and high accuracy (matching that of firmware-based monitoring tools). More specifically, milliScope is a fine-grained monitoring framework that coordinates multiple mScopeMonitors for event and resource monitoring in order to reconstruct the flow of each client request and profile execution performance in a distributed system. We use resource mScopeMonitors for system resource monitoring, and we develop our own event mScopeMonitors to identify execution boundaries in a lightweight, precise, and systematic way. The semantics and syntax of these monitoring logs, which arrive in arbitrary formats, are enriched by our multi-stage data transformation tool, mScopeDataTransformer, which unifies the diverse logs into a dynamic data warehouse, mScopeDB, for advanced analysis. We present several illustrative scenarios in which milliScope successfully diagnoses response-time anomalies caused by very short bottlenecks, using a representative web application benchmark (RUBBoS). In addition, we validate the accuracy of our event mScopeMonitors and demonstrate the availability and flexibility of milliScope through several evaluations.
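To make the idea of request-flow reconstruction concrete, the following is a minimal Python sketch, not the milliScope implementation. It assumes that the event logs have already been unified into records of the form (request ID, tier, arrival time in ms, departure time in ms); this record layout, the tier names web, app, and db, the function names, and the 10x threshold are all hypothetical choices made for illustration. The sketch groups records by request, orders the hops by arrival time, and flags requests whose end-to-end response time is at least an order of magnitude above the average.

```python
from collections import defaultdict
from statistics import mean


def reconstruct_flows(records):
    """Group per-tier event records by request ID, ordered by arrival time."""
    flows = defaultdict(list)
    for request_id, tier, arrival_ms, departure_ms in records:
        flows[request_id].append((arrival_ms, departure_ms, tier))
    for hops in flows.values():
        hops.sort()  # earliest arrival first, so the front tier comes first
    return dict(flows)


def response_times(flows):
    """End-to-end response time: departure minus arrival at the front tier."""
    return {rid: hops[0][1] - hops[0][0] for rid, hops in flows.items()}


def find_anomalies(times, factor=10):
    """Return request IDs at least `factor` times slower than the average."""
    avg = mean(times.values())
    return sorted(rid for rid, t in times.items() if t >= factor * avg)


def sample_records():
    """Hypothetical data: 20 normal requests (10 ms) and one slow one (1000 ms)."""
    records = []
    for i in range(20):
        start = i * 20
        # Deliberately listed back tier first to show that ordering is recovered.
        records += [
            (f"r{i}", "db", start + 4, start + 6),
            (f"r{i}", "app", start + 2, start + 8),
            (f"r{i}", "web", start, start + 10),
        ]
    records += [
        ("slow", "db", 504, 1496),
        ("slow", "app", 502, 1498),
        ("slow", "web", 500, 1500),
    ]
    return records


flows = reconstruct_flows(sample_records())
times = response_times(flows)
anomalous = find_anomalies(times)
```

The average here includes the slow request itself, so a fixed multiple of the mean only works when anomalies are rare; a median or percentile baseline would be a more robust choice in practice.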