1
Why We STILL Don’t Know How To Simulate Networks
  • Mostafa H. Ammar


  • College of Computing
  • Georgia Institute of Technology
  • Atlanta, GA
2
Disclaimer
  • My Personal Perspective:
    • Networking Researcher and not Simulationist.
    • Have written and used discrete event computer simulations for over 25 years
    • Involved in COMPASS project at GT for the last 7 years


3
The Main Message
  • The use of simulation has been growing in the networking community
  • Current shifts in the networking research landscape have increased the importance of simulation as a tool for evaluation
  • There is a crisis of credibility causing people to question the validity of simulations
  • Why and How to Fix it?
5
Evaluating Networks: A Spectrum
  • A spectrum of approaches


    • Mathematical Analysis
    • Computer Simulation
    • Computer Emulation
    • Prototype Testbed
    • Real network testing/deployment
6
A Brief History of Network Simulation
  • In the beginning: A combination of
    •    Mathematical Analysis
    •    Small-scale prototypes
    •    Simulation
  • However, simulation was primitive and accessible only to people who had computers and knew how to program them.
7
Early Examples of Network Simulation
  • Kleinrock’s thesis (1962) used simulation to validate his Independence assumption.
10
The Rise of Network Simulation
  • As computing became more accessible, more and more people started doing simulations
  • Papers using simulation
    • INFOCOM '85: 10%; '92-'98: ~60%
    • SIGCOMM '89: 4/29; '98: 13/26; '04: 11/30


11
The Main Message
  • The use of simulation has been growing in the networking community
  • Current shifts in the networking research landscape have increased the importance of simulation as a tool for evaluation
  • There is a crisis of credibility causing people to question the validity of simulations
  • Why and How to Fix it?
12
Networking Research Landscape
  • Early efforts dealt with relatively simple phenomena on small-scale networks.
  • Current research deals with complex phenomena on large-scale networks.
  • A long story …
13
Network Research Landscape
  • Systems are
    • Less tractable mathematically
    • Difficult to prototype
    • And yet everyone has access to abundant computing
  • =>  Simulation more viable and often the only evaluation tool available
14
The Main Message
  • The use of simulation has been growing in the networking community
  • Current shifts in the networking research landscape have increased the importance of simulation as a tool for evaluation
  • There is a crisis of credibility causing people to question the validity of simulations
  • Why and How to Fix it?
15
Crisis of Credibility
  • “Some claim that stochastic simulation as a performance evaluation tool of various dynamic systems, including telecommunication networks, is misused, and that the spread of this phenomenon is so wide that one can speak about a deep credibility crisis. It is even claimed that one cannot rely on the majority of the published results of performance evaluation studies of dynamic systems based on stochastic simulation.”


  • From: Pawlikowski, K., Jeong, H.-D. J., Lee, J.-S. R.: On Credibility of Simulation Studies of Telecommunication Networks. IEEE Communications Magazine, Jan. 2002, 132-139.



16
Crisis of Credibility

  • “I favor a stamp: WARNING: COMPUTER SIMULATION – MAY BE ERRONEOUS and UNVERIFIABLE. Like on Cigarettes.”


  •   Michael Crichton in “State of Fear”


18
Crisis of Credibility
  • A Typical Paper Review
  •  “This paper should be rejected because its evaluation section is weak. The simulation (uses questionable models) and/or (simulates too small a network) and/or (does not have a valid statistical analysis of the simulation output) and/or … (your own critique here).”
19
The Main Message
  • The use of simulation has been growing in the networking community
  • Current shifts in the networking research landscape have increased the importance of simulation as a tool for evaluation
  • There is a crisis of credibility causing people to question the validity of simulations
  • Why and How to Fix it?
20
Reasons for the Credibility Crisis
  • Confusion regarding the role of simulation
  • Impossibility of simulating Internet-scale networks
  • Difficulty in building realistic models
  • Lack of standards for validation and repeatability


22
The Roles of Simulation
  • To validate approximate analysis
  • To get/confirm first-order insights into new techniques
  • To understand complex interactions among various entities/procedures
  • To perform relative evaluation among alternatives
  • To answer questions regarding deployability in a real network
23
The Roles of Simulation
  • Different tools may be needed for different roles
  • The burden of accuracy, repeatability and validity is highly dependent on the role
  • The intended role is not always (rarely?) stated up front
24
A Personal Experience
  • Parts and Holes in a Manufacturing Transfer Line
26
A Significant Failure
  • Simulation has not been able to answer wide-scale deployability questions
    • Multicast
    • QoS
    • RED
    • …
  • Perhaps it’s a matter of simulation scale
27
Reasons for the Credibility Crisis
  • Confusion regarding the role of simulation
  • Impossibility of simulating Internet-scale networks
  • Difficulty in building realistic models
  • Lack of standards for validation and repeatability


28
Large-Scale Network Simulation
  • Large-scale network simulation offers a way to
    • Verify validity of simulation results on small networks
    • Examine issues of scale
    • Validate theoretical models for large networks
  • But it has been quite challenging to build large-scale simulations


29
Quantifying Simulator Performance
  • Execution time: T ≈ (NF * PF * HF) / PTS
    • NF = number of flows
    • PF = packets sent per flow
    • HF = average hops per flow
    • PTS = simulator speed (simulated packet transmissions per second)
    • Ignores lost packets, protocol generated packets (e.g., acks)
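
The execution-time estimate above can be tried out with a few lines of Python. This is only a sketch of the slide's formula; the function name and the workload numbers in the example are hypothetical and chosen purely for illustration. As on the slide, lost packets and protocol-generated packets such as acks are ignored.

```python
def estimated_runtime_seconds(num_flows, packets_per_flow, hops_per_flow, pts):
    """Estimate simulator run time as T ≈ (NF * PF * HF) / PTS.

    num_flows        -- NF, number of flows
    packets_per_flow -- PF, packets sent per flow
    hops_per_flow    -- HF, average hops per flow
    pts              -- simulator speed in packet transmissions per second
    """
    if pts <= 0:
        raise ValueError("pts must be positive")
    return num_flows * packets_per_flow * hops_per_flow / pts


# Hypothetical workload (not from the slides): 10,000 flows of 500 packets,
# 8 hops on average, on a simulator sustaining 100,000 PTS.
t = estimated_runtime_seconds(10_000, 500, 8, 100_000)
print(f"estimated run time: {t:.0f} s")
```

Because the estimate is linear in each factor, doubling the number of flows or the path length doubles the run time, while doubling PTS halves it.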
30
Scalability of Packet Level Simulators
31
Approaches to Parallel Network Simulation
  • Build “from scratch” approach:
    • Substantial effort to build & validate new models
    • Users must learn a new simulator
    • SSFNet, Qualnet, Javasim
32
Hardware Platforms
  • Sequential: Sun / Solaris
    • Ultra-80, UltraSPARC-II 450MHz
    • 4GB memory
  • Parallel: Intel / RedHat Linux 7.3
    • 8-way Pentium-III XEON (2MB L2 cache) SMP
    • 550MHz clock speed
    • 4GB memory
    • 17 SMPs (136 CPUs) connected via Gigabit Ethernet
  • Performance measurements are conservative (due to hardware performance)
33
Sequential Performance Comparison (Single Campus Network – ~500 nodes and links)
34
PDNS Performance on Cluster
(Perumalla/Park)
35
Lemieux Supercomputer
36
PDNS Performance on PSC
(Perumalla)
  • 147K PTS on one CPU
  • Campus network topology, FTP traffic (500 packets/flow, TCP)
  • Scale problem size & number of CPUs (up to ~4 million network nodes)
  • Performance up to 106 Million PTS
37
But… Can we build an Internet-scale Simulation?
  • A “back-of-the-envelope” calculation
    • 100 million Internet hosts
    • 1 router for every 100 and each router has 4 links
    • 50% of end-hosts have 56Kbps access and 50% have 10Mbps access
    • Router to router links are as follows: 50% @ 10Mbps, 40% @ 100Mbps, 5% @ 655Mbps and 5% @ 2.4Gbps
    • Utilization is 50% for access links and 10% for network links
    • 1% of hosts have active connections
    • Average packet size =  5000 bits
  • George Riley, Mostafa Ammar, "Simulating Large Networks: How Big is Big Enough?" Proceedings of First International Conference on Grand Challenges for Modeling and Simulation, January 2002.
38
Back of the Envelope Calculation (cont’d)
  •  2.9 x 10^11 events per second
  • Assume we can process 10^6 events per second (~ 500,000 PTS)
  • => 290,000 CPU seconds  (4 days) for every second of Internet time !!!!
  • => need 300 Terabytes of memory in ns – not including routing table space!!!
  • => need 14 Terabytes for event logging  for each second of simulation time!!!
  • Requires 1000 parallel CPUs with 300 GB of main memory and 1.4 TB of disk storage in each!!!
  • Would not speed things up much – simply allows simulation to run
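
The arithmetic on this slide can be checked with a short script. It is a sketch that starts from the slide's own totals (2.9 x 10^11 events per simulated second, 10^6 events per CPU second, 300 TB of memory, 14 TB of log per simulated second, 1000 CPUs) and does not re-derive the event count from the topology assumptions. The variable names and the decimal conversion 1 TB = 1000 GB are assumptions made here. The last figure, how many simulated seconds of log fit on each CPU's disk, is an inference from the slide's numbers and is not stated on the slide.

```python
# Totals quoted on the slide.
EVENTS_PER_SIM_SECOND = 2.9e11   # events per second of Internet time
EVENTS_PER_CPU_SECOND = 1e6      # assumed single-CPU processing rate
TOTAL_MEMORY_TB = 300            # memory needed in ns, excluding routing tables
LOG_TB_PER_SIM_SECOND = 14       # event log volume per simulated second
NUM_CPUS = 1000
DISK_TB_PER_CPU = 1.4

GB_PER_TB = 1000                 # assumption: decimal units


def envelope():
    """Return the derived back-of-the-envelope quantities as a dict."""
    cpu_seconds = EVENTS_PER_SIM_SECOND / EVENTS_PER_CPU_SECOND
    log_gb_per_cpu = LOG_TB_PER_SIM_SECOND * GB_PER_TB / NUM_CPUS
    return {
        "cpu_seconds_per_sim_second": cpu_seconds,
        "cpu_days_per_sim_second": cpu_seconds / 86_400,
        "memory_gb_per_cpu": TOTAL_MEMORY_TB * GB_PER_TB / NUM_CPUS,
        "log_gb_per_cpu_per_sim_second": log_gb_per_cpu,
        # Inferred, not on the slide: simulated seconds of log per local disk.
        "sim_seconds_of_log_per_disk": DISK_TB_PER_CPU * GB_PER_TB / log_gb_per_cpu,
    }


for name, value in envelope().items():
    print(f"{name}: {value:,.2f}")
```

Under these inputs the script gives 290,000 CPU seconds (about 3.4 days, in line with the slide's rounded figure of 4 days) and 300 GB of memory per CPU. Read this way, the 1.4 TB of disk per CPU would hold roughly 100 simulated seconds of event log.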


39
Wait a few years and computing power will catch up
  • Possibly … but the network itself is also growing.
  • Even with Moore’s Law increase in processing power we will need 300x10^6 CPU seconds for every wallclock second (assuming typical Internet growth).
  • Open Question: What is the right simulation size to explore Internet-scale performance issues?
40
Many Challenges Remain
  • Tools & Parallel Simulation Issues
    • Robust performance
    • Making parallel simulation more transparent, “automatic” (BenchMap and AutoPart)
    • Access to HPC platforms
    • Visualization Tools
  • Modeling issues [Floyd/Paxson]
    • Building credible large-scale models and scenarios
    • Verifying and validating large-scale simulations
      • Topology?  Traffic?
    • Methodologies and tools to effectively utilize the simulators
41
Reasons for the Credibility Crisis
  • Confusion regarding the role of simulation
  • Impossibility of simulating Internet-scale networks
  • Difficulty in building realistic models
  • Lack of standards for validation and repeatability
42
Building Realistic Models
  • The Simulation Modeler’s Dilemma:
    • One needs to eliminate “unimportant” details in the simulation in order to speed up simulation (avoid kitchen-sink simulations)
    • But how can one tell if a detail is unimportant?
    • One could simulate with and without the detail and see if there is any difference, but this is considered wasted effort
    • Perhaps we should encourage these kinds of results!
43
Incorporating Packet-Level Details in P2P Simulations
  • Access bandwidth affects throughput significantly
  • Models which do not capture packet-level details do not reveal the difference
44
Building Realistic Models
  • A significant challenge especially for large-scale simulation
  • Significant attention to topology modeling but very little understanding of other important issues
    • Workload Modeling
    • Cross-layer interactions (particularly for wireless networks)
    • Modeling of operations and overheads
45
Cross-layer modeling
  • A perfect instance of the Modeler’s Dilemma
  • Split-stack composition may be helpful
46
Simulation Split Vertically
  • Each simulator simulates a portion of the protocol stack of the entire network
47
Splitting Protocol Stack
  • Protocol stack split between TCP and IP
48
Workload Modeling
  • See our work presented in this conference about generating TCP workloads to match observed network utilization.
  •  Qi He, Constantinos Dovrolis, Mostafa Ammar, "A Methodology for the Optimal Configuration of TCP Traffic in Network Simulation under Link Load Constraints," Proceedings of the 38th Annual Simulation Symposium, San Diego, April 2005.


49
Reasons for the Credibility Crisis
  • Confusion regarding the role of simulation
  • Impossibility of simulating Internet-scale networks
  • Difficulty in building realistic models
  • Lack of standards for validation and repeatability


50
Simulation Validation and Repeatability
  • The issue:
  •    Given that the simulation model is correct, how can one trust the results from the simulation?
  • Two types of problems
    • Technical
    • Social
51
Technical Issues
  • Code Trustworthiness
    • Open Source and Reusable Code is a big improvement
    • Good Experimental Design
    • Random Number Generation
    • Correct Statistical Inference
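
One common way to address the random number generation and statistical inference items above is to run independent replications with explicitly recorded seeds and report a confidence interval instead of a single number. The sketch below does this for a toy M/M/1 queue. It is an illustration, not a method taken from the talk: the queue model, the function names, the seed values, the warm-up length, and the small table of Student-t critical values are all assumptions chosen here. Using distinct seeds for Python's generator is a common practice for obtaining separate streams, not a guarantee of independence.

```python
import random
import statistics

# Two-sided 95% Student-t critical values, keyed by degrees of freedom.
T_CRIT_95 = {4: 2.776, 9: 2.262, 19: 2.093, 29: 2.045}


def mm1_mean_wait(seed, arrival_rate=0.5, service_rate=1.0,
                  customers=20_000, warmup=2_000):
    """Mean waiting time in queue for one M/M/1 replication.

    Uses the Lindley recursion W[n+1] = max(0, W[n] + S[n] - A[n+1]) and
    discards the first `warmup` customers to reduce start-up bias.
    """
    rng = random.Random(seed)          # private, reproducible stream
    wait, total = 0.0, 0.0
    for i in range(customers):
        if i >= warmup:
            total += wait
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        wait = max(0.0, wait + service - interarrival)
    return total / (customers - warmup)


def replicate(seeds):
    """Return (mean, 95% CI half-width) across one replication per seed."""
    samples = [mm1_mean_wait(s) for s in seeds]
    n = len(samples)
    half_width = T_CRIT_95[n - 1] * statistics.stdev(samples) / n ** 0.5
    return statistics.mean(samples), half_width


seeds = range(1, 11)                   # publish these along with the results
mean, hw = replicate(seeds)
print(f"mean wait = {mean:.3f} +/- {hw:.3f} (95% CI, {len(seeds)} replications)")
```

For these parameters queueing theory gives a mean wait of 1.0, so the reported interval can be sanity-checked against a known answer. Rerunning with the same seeds reproduces the same numbers, which is the minimum needed for someone else to repeat the experiment.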
52
Social Issues
  • Publication of enough details to allow repeatability – possibly even code
  • Allowance for Scholarly Credit for repeating experiments
53
Final Thoughts
  • Be open within the community about this issue
  • Provide acceptable guidelines for reporting simulation results:
    • A Checklist
    • Enough details for repeatability
  • Stronger enforcement of guidelines
  • Change reviewing process (perhaps only for journals)
  • Give Scholarly credit for repeating other experiments