| Title |
Network Data Monitoring and Analysis
|
| Speaker |
Qi Zhao
|
| Abstract | |
|
The phenomenal growth of the Internet and its applications has far outpaced the development of techniques for network measurement and monitoring. Building an efficient and scalable traffic measurement framework for high-speed networks has become a very challenging problem that attracts considerable research effort. My dissertation contributes to this challenge along three related themes: network data streaming, network data inference, and hardware support for high-speed massive data analysis. In this talk, I use traffic matrix estimation as an example to demonstrate our research in the first two themes, and I briefly introduce some new hardware support designed for massive data processing. The traffic volume between origin/destination (OD) pairs in a network, known as the traffic matrix, is essential for a number of network management tasks in operational IP networks, such as capacity planning and traffic engineering. In collaboration with AT&T research labs, we design two novel methods to produce better traffic matrix estimates. The first is a new data streaming algorithm that directly measures the traffic volume for each OD pair. Each ingress node and egress node generates traffic digests that are orders of magnitude smaller than the raw traffic stream. By correlating the digests collected at the two nodes of any OD pair, the volume of traffic flowing between that pair can be accurately determined. This method achieves around one order of magnitude higher estimation accuracy than the network tomography approach (including the Tomogravity method) and is multiple times more accurate than sampling schemes (e.g., NetFlow) given the same amount of data generated. The second method combines SNMP link counts with sampled NetFlow records to produce more accurate traffic matrix estimates, even when NetFlow records are available on only a subset of ingress nodes.
In this work we also design methods that, by cross-checking the link counts against the flow records, identify and remove dirty data (measurement errors in SNMP and NetFlow due to hardware, software, or transmission faults). These methods not only improve the accuracy of traffic matrix estimation, but may also benefit a number of other applications that depend on these data. |
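To make the digest-correlation idea concrete, here is a minimal Python sketch of one way it could work. It is an illustration under stated assumptions, not the algorithm presented in the talk: the abstract does not specify the digest format, so the sketch assumes a simple bitmap digest with a linear-counting estimator. It also assumes each packet carries an identifier that does not change between ingress and egress, and it counts packets rather than bytes. The names `digest`, `linear_count`, `estimate_od_volume`, and `BITMAP_SIZE` are hypothetical.

```python
import hashlib
import math

# Hypothetical digest size in bits; far smaller than the raw packet stream.
BITMAP_SIZE = 1 << 16


def digest(packets, size=BITMAP_SIZE):
    """Build a bitmap digest: hash each packet identifier to one bit position.

    Assumption: `packets` is an iterable of bytes objects that identify a
    packet identically at the ingress node and at the egress node.
    """
    bits = bytearray(size)  # one byte per bit position, for readability
    for pkt in packets:
        h = hashlib.sha256(pkt).digest()
        idx = int.from_bytes(h[:8], "big") % size
        bits[idx] = 1
    return bits


def linear_count(bits):
    """Estimate how many distinct packets were hashed into a bitmap.

    Linear counting: if a fraction z of the positions is still zero,
    the estimated number of distinct items is -size * ln(z).
    """
    size = len(bits)
    zeros = size - sum(bits)
    if zeros == 0:
        raise ValueError("bitmap saturated; use a larger digest")
    return -size * math.log(zeros / size)


def estimate_od_volume(ingress_bits, egress_bits):
    """Estimate the packets common to an ingress and an egress digest.

    Inclusion-exclusion on the estimates: |A and B| = |A| + |B| - |A or B|,
    where the digest of the union is the bitwise OR of the two bitmaps.
    """
    union = bytearray(a | b for a, b in zip(ingress_bits, egress_bits))
    return (linear_count(ingress_bits)
            + linear_count(egress_bits)
            - linear_count(union))
```

In this sketch, each node would ship only its bitmap to a collector, which calls `estimate_od_volume(digest(ingress_packets), digest(egress_packets))` for every OD pair of interest. Both nodes must use the same hash function and bitmap size for the bitwise OR to be meaningful, and the estimate degrades as the bitmaps fill up.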
|
|
Bio |
|
|
|
|
