|
1
|
- Mostafa H. Ammar
- College of Computing
- Georgia Institute of Technology
- Atlanta, GA
|
|
2
|
- My Personal Perspective:
- Networking researcher, not a simulationist.
- Have written and used discrete event computer simulations for over 25
years
- Involved in COMPASS project at GT for the last 7 years
|
|
3
|
- The use of simulation has been growing in the networking community
- Current shifts in the networking research landscape have increased the
importance of simulation as a tool for evaluation
- There is a crisis of credibility causing people to question the validity
of simulations
- Why and How to Fix it?
|
|
5
|
- A spectrum of approaches
- Mathematical Analysis
- Computer Simulation
- Computer Emulation
- Prototype Testbed
- Real network testing/deployment
|
|
6
|
- In the beginning: A combination of
- Mathematical Analysis
- Small-scale prototypes
- Simulation
- However, simulation was primitive and accessible only to people who had
computers and knew how to program them.
|
|
7
|
- Kleinrock’s thesis (1962) used simulation to validate his Independence
assumption.
|
|
8
|
|
|
9
|
|
|
10
|
- As computing became more accessible, more and more people started doing
simulations
- Papers using simulation
- INFOCOM: 10% in 1985, ~60% in 1992-98
- SIGCOMM: 4/29 in 1989, 13/26 in 1998, 11/30 in 2004
|
|
11
|
- The use of simulation has been growing in the networking community
- Current shifts in the networking research landscape have increased the
importance of simulation as a tool for evaluation
- There is a crisis of credibility causing people to question the validity
of simulations
- Why and How to Fix it?
|
|
12
|
- Early efforts dealt with relatively simple phenomena on small-scale
networks.
- Current research deals with complex phenomena on large-scale networks
- A long story …
|
|
13
|
- Systems are
- Less tractable mathematically
- Difficult to prototype
- And yet everyone has access to abundant computing
- => Simulation is more viable and often the only evaluation tool available
|
|
14
|
- The use of simulation has been growing in the networking community
- Current shifts in the networking research landscape have increased the
importance of simulation as a tool for evaluation
- There is a crisis of credibility causing people to question the validity
of simulations
- Why and How to Fix it?
|
|
15
|
- “Some claim that stochastic simulation as a performance evaluation tool
of various dynamic systems, including telecommunication networks, is
misused, and that the spread of this phenomenon is so wide that one can
speak about a deep credibility crisis. It is even claimed that one
cannot rely on the majority of the published results of performance
evaluation studies of dynamic systems based on stochastic simulation.”
- From: Pawlikowski, K., Jeong, H.-D. J., Lee, J.-S. R.: On Credibility of
Simulation Studies of Telecommunication Networks. IEEE Communications
Magazine, Jan. 2002, pp. 132-139.
|
|
16
|
- “I favor a stamp: WARNING: COMPUTER SIMULATION – MAY BE ERRONEOUS and
UNVERIFIABLE. Like on Cigarettes.”
- Michael Crichton in “State of Fear”
|
|
17
|
|
|
18
|
- A Typical Paper Review
- “This paper should be rejected
because its evaluation section is weak. The simulation (uses
questionable models) and/or (simulates too small a network) and/or (does
not have a valid statistical analysis of the simulation output) and/or …
(your own critique here).”
|
|
19
|
- The use of simulation has been growing in the networking community
- Current shifts in the networking research landscape have increased the
importance of simulation as a tool for evaluation
- There is a crisis of credibility causing people to question the validity
of simulations
- Why and How to Fix it?
|
|
20
|
- Confusion regarding the role of simulation
- Impossibility of simulating Internet-scale networks
- Difficulty in building realistic models
- Lack of standards for validation and repeatability
|
|
22
|
- To validate approximate analysis
- To get/confirm first-order insights into new techniques
- To understand complex interactions among various entities/procedures
- To perform relative evaluation among alternatives
- To answer questions regarding deployability in a real network
|
|
23
|
- Different tools may be needed for different roles
- The burden on accuracy, repeatability and validity is highly dependent
on the role
- It is not always (rarely?) stated up front
|
|
24
|
- Parts and Holes in a Manufacturing Transfer Line
|
|
26
|
- Simulation has not been able to answer wide-scale deployability
questions
- Perhaps it’s a matter of simulation scale
|
|
27
|
- Confusion regarding the role of simulation
- Impossibility of simulating Internet-scale networks
- Difficulty in building realistic models
- Lack of standards for validation and repeatability
|
|
28
|
- Large-scale network simulation offers the ability to
- Verify the validity of simulation results obtained on small networks
- Examine issues of scale
- Validate theoretical models for large networks
- But it has been quite challenging to build large-scale simulations
|
|
29
|
- Execution time: T ≈ (NF * PF * HF) / PTS
- NF = number of flows
- PF = packets sent per flow
- HF = average hops per flow
- PTS = simulator speed (simulated packet transmissions per second)
- Ignores lost packets and protocol-generated packets (e.g., acks)
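The estimate above can be sketched in a few lines of Python. This is only an illustration of the formula on this slide: the function name and the workload numbers (10,000 flows, 8 hops, 100,000 PTS) are hypothetical and not taken from any measurement in this talk.

```python
def estimated_execution_time(num_flows, packets_per_flow, avg_hops_per_flow, pts):
    """Estimate simulator run time in seconds: T ≈ (NF * PF * HF) / PTS.

    Like the slide's formula, this ignores lost packets and
    protocol-generated packets such as acks.
    """
    if pts <= 0:
        raise ValueError("pts (packet transmissions per second) must be positive")
    # Total simulated packet transmissions: every packet of every flow
    # is transmitted once per hop.
    packet_transmissions = num_flows * packets_per_flow * avg_hops_per_flow
    return packet_transmissions / pts


# Hypothetical workload, chosen only to show the arithmetic.
seconds = estimated_execution_time(
    num_flows=10_000,
    packets_per_flow=500,
    avg_hops_per_flow=8,
    pts=100_000,
)
print(f"Estimated execution time: {seconds:.0f} s")
```

Because the estimate is linear in each factor, doubling the number of flows or halving the simulator speed doubles the run time.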
|
|
30
|
|
|
31
|
- Build “from scratch” approach:
- Substantial effort to build & validate new models
- Users must learn a new simulator
- SSFNet, Qualnet, Javasim
|
|
32
|
- Sequential: Sun / Solaris
- Ultra-80, UltraSPARC-II 450MHz
- 4GB memory
- Parallel: Intel / RedHat Linux 7.3
- 8-way Pentium-III XEON (2MB L2 cache) SMP
- 550MHz clock speed
- 4GB memory
- 17 SMPs (136 CPUs) connected via Gigabit Ethernet
- Performance measurements are conservative (due to hardware performance)
|
|
33
|
|
|
34
|
|
|
35
|
|
|
36
|
- 147K PTS on one CPU
- Campus network topology, FTP traffic (500 packets/flow, TCP)
- Scale problem size & number of CPUs (up to ~4 million network nodes)
- Performance up to 106 Million PTS
|
|
37
|
- A “back-of-the-envelope” calculation
- 100 million Internet hosts
- 1 router for every 100 and each router has 4 links
- 50% of end-hosts have 56Kbps access and 50% have 10Mbps access
- Router to router links are as follows: 50% @ 10Mbps, 40% @ 100Mbps, 5%
@ 655Mbps and 5% @ 2.4Gbps
- Utilization is 50% for access links and 10% for network links
- 1% of hosts have active connections
- Average packet size = 5000 bits
- George Riley, Mostafa Ammar, "Simulating Large Networks: How Big is
Big Enough?" Proceedings of First International Conference on Grand
Challenges for Modeling and Simulation, January 2002.
|
|
38
|
- 2.9 x 10^11 events per second
- Assume we can process 10^6 events per second (~ 500,000 PTS)
- => 290,000 CPU seconds (4 days) for every second of Internet time !!!!
- => need 300 Terabytes of memory in ns – not including routing table
space!!!
- => need 14 Terabytes for event logging for each second of simulation time!!!
- Requires 1000 parallel CPUs with 300 GB of main memory and 1.4 TB of
disk storage in each!!!
- Would not speed things up much – simply allows simulation to run
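The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers stated on this slide (event rate, processing rate, total memory, CPU count); the variable names are made up for illustration, and decimal units (1 TB = 1000 GB) are an assumption.

```python
# Figures taken from the slide.
EVENTS_PER_SIM_SECOND = 2.9e11   # events generated per second of Internet time
EVENTS_PER_CPU_SECOND = 1e6      # assumed single-CPU processing rate
TOTAL_MEMORY_TB = 300            # memory needed in ns, excluding routing tables
NUM_CPUS = 1000                  # size of the hypothetical parallel machine

SECONDS_PER_DAY = 86_400

# CPU time needed to simulate one second of Internet time.
cpu_seconds = EVENTS_PER_SIM_SECOND / EVENTS_PER_CPU_SECOND
cpu_days = cpu_seconds / SECONDS_PER_DAY

# Memory each CPU must hold if the model is spread evenly
# (assumes decimal units: 1 TB = 1000 GB).
memory_per_cpu_gb = TOTAL_MEMORY_TB * 1000 / NUM_CPUS

print(f"CPU seconds per simulated second: {cpu_seconds:,.0f}")
print(f"CPU days per simulated second:    {cpu_days:.2f}")
print(f"Memory per CPU:                   {memory_per_cpu_gb:.0f} GB")
```

The computed value is roughly 3.4 days, which the slide rounds to 4 days; the 300 GB per CPU matches the slide's figure.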
|
|
39
|
- Possibly … but the network itself is also growing.
- Even with Moore’s Law increase in processing power we will need 300x10^6
CPU seconds for every wallclock second (assuming typical Internet
growth).
- Open Question: What is the right simulation size to explore
Internet-scale performance issues?
|
|
40
|
- Tools & Parallel Simulation Issues
- Robust performance
- Making parallel simulation more transparent, “automatic” (BenchMap and AutoPart)
- Access to HPC platforms
- Visualization Tools
- Modeling issues [Floyd/Paxson]
- Building credible large-scale models and scenarios
- Verifying and validating large-scale simulations
- Methodologies and tools to effectively utilize the simulators
|
|
41
|
- Confusion regarding the role of simulation
- Impossibility of simulating Internet-scale networks
- Difficulty in building realistic models
- Lack of standards for validation and repeatability
|
|
42
|
- The Simulation Modeler’s Dilemma:
- One needs to eliminate “unimportant” details in the simulation in order
to speed up simulation (avoid kitchen-sink simulations)
- But how can one tell if a detail is unimportant
- Simulate and see if there is any difference, although this is usually
considered wasted effort
- Perhaps we should encourage these kinds of results!
|
|
43
|
- Access bandwidth affects throughput significantly
- Models which do not capture packet-level details do not reveal the
difference
|
|
44
|
- A significant challenge especially for large-scale simulation
- Significant attention to topology modeling but very little understanding
of other important issues
- Workload Modeling
- Cross-layer interactions (particularly for wireless networks)
- Modeling of operations and overheads
|
|
45
|
- A perfect instance of the Modeler’s Dilemma
- Split-stack composition may be helpful
|
|
46
|
- Each simulator simulates a portion of the protocol stack of the entire
network
|
|
47
|
- Protocol stack split between TCP and IP
|
|
48
|
- See our work presented in this conference about generating TCP workloads
to match observed network utilization.
- Qi He, Constantinos Dovrolis, Mostafa Ammar, "A Methodology for the
Optimal Configuration of TCP Traffic in Network Simulation under Link Load
Constraints," Proceedings of the 38th Annual Simulation Symposium, San
Diego, April 2005.
|
|
49
|
- Confusion regarding the role of simulation
- Impossibility of simulating Internet-scale networks
- Difficulty in building realistic models
- Lack of standards for validation and repeatability
|
|
50
|
- The issue:
- Given that the simulation model is correct, how can one trust the results
from the simulation?
- Two types of problems
|
|
51
|
- Code Trustworthiness
- Open Source and Reusable Code is a big improvement
- Good Experimental Design
- Random Number Generation
- Correct Statistical Inference
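One common way to address the last two points is the method of independent replications: run the same model several times with different, recorded seeds and report a confidence interval rather than a single number. The sketch below is only an illustration of that idea, not a method taken from this talk. The toy M/M/1 queue model, the function names, the parameter values, and the choice of 10 replications are all assumptions.

```python
import math
import random
import statistics

# Two-sided 95% Student-t critical value for 9 degrees of freedom,
# i.e. valid only for exactly 10 replications.
T_CRIT_95_DF9 = 2.262


def mean_wait_mm1(seed, arrival_rate=0.8, service_rate=1.0,
                  num_customers=20_000, warmup=2_000):
    """Toy M/M/1 model: mean waiting time in queue for one replication."""
    rng = random.Random(seed)  # private generator, so the run is repeatable
    wait = 0.0
    total = 0.0
    for i in range(num_customers):
        if i >= warmup:  # discard the initial transient
            total += wait
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        # Lindley recursion: waiting time of the next customer.
        wait = max(0.0, wait + service - interarrival)
    return total / (num_customers - warmup)


def replicate(seeds):
    """Return (mean, 95% half-width) over exactly 10 replications."""
    seeds = list(seeds)
    if len(seeds) != 10:
        raise ValueError("T_CRIT_95_DF9 assumes exactly 10 replications")
    # Distinct seeds are treated as independent streams here; that is an
    # assumption, not a guarantee of the underlying generator.
    samples = [mean_wait_mm1(s) for s in seeds]
    mean = statistics.mean(samples)
    half_width = T_CRIT_95_DF9 * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, half_width


mean, half_width = replicate(range(1, 11))
print(f"Mean wait: {mean:.3f} +/- {half_width:.3f} (95% CI, 10 replications)")
```

Reporting the seeds, the warm-up length, and the interval alongside the result gives a reader what they need to repeat the experiment.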
|
|
52
|
- Publication of enough details to allow repeatability – possibly even
code
- Allowance for Scholarly Credit for repeating experiments
|
|
53
|
- Be open within the community about this issue
- Provide acceptable guidelines for reporting simulation results:
- A Checklist
- Enough details for repeatability
- Stronger enforcement of guidelines
- Change reviewing process (perhaps only for journals)
- Give Scholarly credit for repeating other experiments
|