PDNS - Parallel/Distributed NS

Please subscribe to the PDNS mailing list for discussion on PDNS and related announcements.

Status and Changes: pdns-2.27-v1a (March 16, 2004) [ download :: API ]
   1. Compatibility with ns-2.27, the latest libSynk, and Intel's icc/ecc and gcc-3.2 compilers
   2. GTemulator support (Linux only)
   3. Local -> Remote -> Local routing support, other routing issues fixed
   4. Improved compatibility with 64-bit platforms

Overview

The publicly available network simulator ns has become a popular and widely used simulator for research in telecommunications networks. However, the design of ns is such that simulation of very large networks is difficult, if not impossible, due to excessive memory and CPU time requirements. The PADS research group at Georgia Tech has developed extensions and enhancements to the ns simulator to allow a network simulation to be run in a parallel and distributed fashion, on a network of workstations.

Objectives

We set out to provide a means for ns users to distribute their simulation across several (e.g. 8-16) workstations connected either via a Myrinet network or a standard Ethernet network using the TCP/IP protocol stack. By distributing the network model across several machines, the memory requirements on any single system can be substantially smaller than the memory used in a single-workstation simulation. The overall execution time of the simulation should be at least as fast as the original single-workstation simulation, and can be several times faster. In all cases we can support proportionally larger network models by distributing the model across multiple systems.

A key goal was to minimize the number of modifications required to the released ns source code and to ensure that all existing ns simulations would still run properly when used with our modified ns. Minimizing the number of changes to ns allows the parallel simulator to readily take advantage of new, improved versions of ns as they become available. Any new or revised ns syntax is directly related to the parallelization of the simulation, and does not affect ns users who are not using PDNS.

Approach

In order to achieve the goal of limited modifications to the base ns software, we chose a federated simulation approach in which separate instantiations of ns, each modeling a different subnetwork, execute on different processors. PDNS uses a conservative (blocking-based) approach to synchronization: no federate in the parallel simulation will ever process an event that would later have to be undone due to receiving messages in the simulated past. This avoids the need to implement state saving in the existing ns code.

The PADS research group at Georgia Tech has previously developed an extensive library of support software for implementing parallel and distributed simulations (see libSynk and RTIKIT). The software provides global virtual time management, group data communications, and message buffer management. It supports a variety of communication interconnects, including shared memory, Myrinet, and TCP/IP networks, and runs on a variety of platforms. By using this synchronization software for the parallelization of ns, we were able to rapidly modify the main event processing loop of ns to support the distributed time management functions needed to ensure that no unsafe event is ever processed by any federate.

The modifications needed to ns can be broadly classified into two major categories: modifications to the ns event processing infrastructure, and extensions to the ns TCL script syntax for describing simulations. Each of these categories is described in detail below.

Modifications to ns event processing

The standard ns release has several variants of the main event processing loop, which can be selected by the ns user as follows:

$ns use-scheduler Heap

which specifies that the heap-based scheduler should be used. We developed a new event scheduler known as Scheduler/RTI, which is specified by the ns user as follows:

$ns use-scheduler RTI

The Scheduler/RTI uses the time management functions of libSynk/RTIKIT to ensure that local simulation time advances do not allow the processing of unsafe events. The Scheduler/RTI also contains code to process events received by a federate that were generated by another federate, placing those new events in the proper location in the event list.

Also related to event scheduling is the transmission of events from one federate to another in the distributed simulation. This is handled by a newly created ns agent called Agent/RTI. The Agent/RTI is responsible for determining that a received event is destined for a remote federate, preparing a message containing the complete event information, and forwarding that to the remote federate using the RTI MCAST functions. The RTI Agents are automatically inserted on any ns node that has a link to a remote simulator.

Modifications to ns TCL syntax

In addition to the event scheduling modifications mentioned above, the way that a network topology and network data flows are defined in ns needs to be enhanced to allow a federated ns simulation. Consider the simple topology shown below. If this simple eight-node simulation were run on a single instance of ns, nodes R0 and R2 and their connecting link would simply be defined as:
[Figure: a simple eight-node topology]

set r0 [$ns node]
set r2 [$ns node]
$ns duplex-link $r0 $r2 1.5mb 10ms DropTail

But when we decide to run the simulation in a distributed fashion on Simulators A and B as shown below, the definition of the duplex-link is problematic: simulator A has no notion of node r2, and simulator B has no notion of node r0. We solve this problem by extending the ns syntax to include the specification of a remote link, called an rlink. With our extended syntax, a simulated link that spans federates is defined with an rlink command, which specifies only the local endpoint and identifies it with an IP address. The other end of the simulated link is defined in simulator B, and is also assigned an IP address. At runtime, remote links with matching network addresses are logically connected, and simulated packets that leave simulator A on the rlink are delivered to the corresponding rlink in simulator B. Details on how this is done can be seen in the example script linked below.
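
As a minimal sketch of the split topology above (node names and IP addresses here are illustrative; the full rlink syntax is defined in the "PDNS Interface and Syntax" section below):

# --- Script for Simulator A: r2 does not exist here, so r0 gets an rlink ---
set r0 [$ns node]
$r0 rlink 1.5Mb 10ms DropTail 192.168.1.1 255.255.255.0

# --- Script for Simulator B: the matching end of the same simulated link ---
set r2 [$ns node]
$r2 rlink 1.5Mb 10ms DropTail 192.168.1.2 255.255.255.0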

Defining data flows in ns is done similarly, by naming the two endpoints of the flow. The same problem arises when the remote endpoint is located on another federate, and we solve it in a similar fashion by allowing a remote connection, called an ip-connect. With the ip-connect command, the remote endpoint of a data connection is specified by IP address and port number rather than by ns node and agent name.
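
As a sketch of the idea (agent types, node names, and addresses are illustrative; the exact bind and ip-connect syntax is given below):

# --- Simulator B: bind the receiving agent to a port on its node ---
set sink [new Agent/TCP/FullTcp]
$ns attach-agent $h0 $sink
$h0 bind $sink 80

# --- Simulator A: name the remote endpoint by IP address and port ---
set tcp [new Agent/TCP/FullTcp]
$ns attach-agent $r0 $tcp
$ns ip-connect $tcp 10.1.2.1 80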

PDNS Interface and Syntax

Assigning IP Addresses to Links

As shown in the examples, when using pdns you must assign an IP address to any node that will be referenced remotely. In keeping with normal networking practice, the IP addresses are actually assigned to the links, not the nodes. The syntax for assigning an IP address is:

$linkname set-ipaddr address mask

Example:

[$ns link $n1 $n2] set-ipaddr 10.1.1.1 255.255.255.0
[$ns link $n2 $n1] set-ipaddr 10.1.1.2 255.255.255.0

Creating Remote Links

When using pdns, a Remote Link is an ns link where one node endpoint is defined on one instance of pdns and the second node endpoint is on a different instance of pdns. Note that one end of the remote link must have an IP address ending in ".1". For example, if you create a remote link with address 192.168.1.1, the other end of the remote link on a different PDNS instance must be 192.168.1.2 (assuming the netmask is 255.255.255.0). This is how PDNS matches remote links between two different PDNS instances. The syntax to create a remote link is:

$nodename rlink linkspeed linkdelay qtype ipaddr addrmask

Example:

$n2 rlink 100Mb 5ms DropTail 192.168.1.1 255.255.255.0
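
On the peer PDNS instance, the matching end of this remote link would be defined with the corresponding ".2" address (the node name $n5 here is illustrative):

$n5 rlink 100Mb 5ms DropTail 192.168.1.2 255.255.255.0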

Defining Remote Routes

When using pdns, since only a portion of the global topology is known to each simulator, there are times when choosing the correct route between nodes defined on different simulators is problematic. In the sample topology above, router R0 has insufficient information to determine which of the two defined remote links is the shortest path to host H0. We solve this problem by explicitly specifying routes for each remote link, as follows:

$ns add-route $router-node rlink-ip dest-ip dest-mask { src-ip src-mask }

Example:

$ns add-route $n2 192.168.1.1 10.1.2.0 255.255.255.0 10.1.1.0 255.255.255.0
$ns add-route $n3 192.168.1.2 10.1.3.0 255.255.255.0

In the above example, the first line routes data originating from subnet 10.1.1.X through the rlink 192.168.1.1 if the destination IP address is in subnet 10.1.2.X. The second line directs PDNS to route all traffic originating from its instance through the rlink 192.168.1.2 if the destination IP is in subnet 10.1.3.X. The source IP and mask are optional; if they are not specified, src-ip and src-mask both default to 0.0.0.0.

Note: If you have agents that produce traffic destined for other agents within the same PDNS instance, but the traffic must travel through remote routes because no local path exists between the two agents, you must specify this route path explicitly, as sketched below. See the "Sample Scripts" section for a complete script that performs local->remote->local routing.
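
For example, if two agents on this instance can reach each other only through a remote instance, a route for the local destination subnet must still be added on the rlink (addresses illustrative):

# send traffic from local subnet 10.1.1.X to local subnet 10.1.4.X out the rlink
$ns add-route $n2 192.168.1.1 10.1.4.0 255.255.255.0 10.1.1.0 255.255.255.0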

Binding Port Numbers to Agents

As shown in the examples, when using pdns you must assign a port number to any agent that will be referenced remotely. Both ends of a TCP connection must have a port number. The syntax for assigning a port number is:

$nodename bind $agent portnum

Example:

$n1 bind $tcp_agent 80

Defining Remote and Local Connections

In order for PDNS to correctly send data from one end-host to another, the usual ns "connect" command will not suffice. Instead, the following directive allows PDNS to correctly route traffic destined for remote simulators through the correct local border routers. Note that this command can be used for both remote and local connections.

$ns ip-connect $agent dest-ip port

Example:

$ns ip-connect $tcp_agent 192.168.3.4 80

Creating Dynamic TCP Endpoints

Sometimes it is necessary to create TCP connections dynamically at runtime. PDNS now includes a TCP Listener Agent which allocates connections on the fly as the simulator is running, instead of determining all connections at initialization. TCP Listener closely models real TCP connections (e.g. multiple TCP connections per destination port and a TCP connection table). Of course, the TCP Listener Agent can also accept pre-determined connections at initialization time. The underlying TCP protocol behavior is inherited from tcp-full. Once the three-way handshake is complete, TCP Listener promotes the connection to tcp-full status by instantiating tcp-full agents and connecting the source to them automatically. TCP Listener handles all data de-multiplexing. TCP Listeners must be used with agents that have IP addresses associated with them.

set tcpl [new Agent/TCP/Listener]

Example:

set tcpl [new Agent/TCP/Listener]
$ns attach-agent $n2 $tcpl
$n2 bind $tcpl 80
$tcpl set-application MyAgent/WebServer 0 callback 1 1

You can send TCP data to this agent from any PDNS instance via ip-connect. If you need to attach an application (similar to the WebServer above) to the listener, use the set-application command as shown in the example above. Note that the sending tcp-full agent (the initiator of the TCP connection) should set close_on_empty_ to true so that completed connections can be removed from the connection table (active connection teardown):

$tcp_agent set close_on_empty_ true
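
Putting the pieces together, a sender on another PDNS instance might reach the listener above as follows (node names and the destination IP are illustrative; port 80 matches the bind in the example above):

set tcp [new Agent/TCP/FullTcp]
$ns attach-agent $n1 $tcp
$tcp set close_on_empty_ true
$ns ip-connect $tcp 10.1.1.2 80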

Accessing rlink Queues

You may need to access a remote link queue (e.g. for statistics), just as with any link in ns. The method for obtaining the queue object is:

$ns rqueue $node ipaddr

Example:

set qo [$ns rqueue $n2 192.168.1.1]
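
Once you have the queue object, the standard ns queue bindings apply; for example (a sketch assuming the usual DropTail bound variable):

$qo set limit_ 100    ;# set the queue capacity in packets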

PDNS Simulations

The current version of PDNS has been tested on as many as 136 processors simulating a 600,000+ node network topology. We have run on Myrinet interconnected systems, Ethernet TCP/IP network connected systems, and using shared memory on multiprocessor systems. The overall speedup achieved by running a distributed simulation is dependent on many factors, so it cannot accurately be predicted for any given simulation. In our testing environment, the speedup achieved when running on eight systems varies from about eight (near perfect speedup) to less than one.

Download and Install PDNS

All of the modifications to existing ns code only work with the 2.27 release. You will also need the libSynk libraries, which can be compiled for Intel Linux, Intel Solaris, Sparc Solaris, SGI Irix, HP UX, Macintosh OS X, and Microsoft Windows systems. These instructions assume you have loaded the ns source directory at ~/ns-allinone-2.27, and libSynk at ~/libsynk. If either one is located elsewhere, be sure to modify these instructions accordingly. Please do not download the PDNS source with Internet Explorer!

Source Files

Source               Release Date
ns-2.27              January 18, 2004
libSynk              December 29, 2003 or later
pdns-2.27-v1a        March 16, 2004
pdns-gtemu-2.27-v1a  March 16, 2004

Download either pdns-2.27-v1a or pdns-gtemu-2.27-v1a. pdns-2.27-v1a is the most compatible version and can be compiled with Intel's icc/ecc compilers or the gcc-3.2 (or earlier) series. If you need GTemulator capability, download pdns-gtemu-2.27-v1a; however, be aware that this version is compatible only with Linux systems and gcc.

Building libSynk

Decompress libSynk:

cd $HOME
gunzip -c libsynk-current.tar.Z | tar -xvpf -
cd libsynk

If you are using Myrinet, you must modify the appropriate makefiles before compiling. For instance, using Myrinet under Linux:

  1. Edit the file Makefile in ~/libsynk so CFLAGS contains -DGM_AVAILABLE=1 and LDLIBS contains -lgm
  2. Edit the file Makefile in ~/libsynk/fdkcompat so LDLIBS contains -lgm
  3. For improved performance, you may want to specify -O2, -O3, or any other optimization parameters for your architecture.

Now create the libraries:

make
cd fdkcompat
make

Building ns-2 and PDNS

Decompress the baseline ns software:

cd $HOME
gunzip -c ns-allinone-2.27.tar.gz | tar -xvf -

Move the PDNS patch file (pdns_2.27_v1a.gz) to the ~/ns-allinone-2.27/ns-2.27 directory and patch the stock ns:

cd $HOME/ns-allinone-2.27/ns-2.27
gunzip -c pdns_2.27_v1a.gz | patch -p3

Next, edit the ~/ns-allinone-2.27/ns-2.27/Makefile.in file:

  1. Edit the KITHOME macro on line 64 to reflect the correct directory of libSynk.
  2. If you are using Myrinet, you will need to add -lgm on line 103 of the Makefile.in file.
  3. Message Compression is enabled by default. To disable (not recommended), remove the define -DUSE_COMPRESSION on line 68.
  4. Again, you may want to add optimization parameters for your architecture to the Makefile under CCOPT.

Install ns normally:

cd $HOME/ns-allinone-2.27
./install

The resulting binary for Parallel/Distributed ns will be ~/ns-allinone-2.27/ns-2.27/pdns.

Running PDNS

The pdns software uses libSynk to coordinate the startup synchronization of the distributed simulation. Startup uses two environment variables, NODEINFO and (usually, for TCP communications) FMTCP_SESSIONNAME, to coordinate the creation of the inter-process communication channels (sockets, shared memory, etc.). These environment variables are described in detail in the libSynk usage documentation.

Note that you may specify the initial TSO Queue heap size and its increment at run time through the environment variables BRTI_HEAPSIZE and BRTI_HEAPINCR. The default initial heap size is 1000 events and the default increment is 10000 events.
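
For example, to double both defaults before launching pdns (bash-style syntax, as used by the sample run scripts below):

export BRTI_HEAPSIZE=2000
export BRTI_HEAPINCR=20000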

Sample Scripts

test.tcl requires 3 pdns instances. To run it, modify run_test.sh to reflect the correct hostname(s) to be used. Note the use of export for bash-style shells; modify appropriately for csh, tcsh, etc. You do not need 3 machines to run 3 pdns instances for this small script: you can place the same hostname 3 times in the NODEINFO string, and the script will ssh to the same host 3 times to spawn 3 pdns instances accordingly. The file out.0 should show 5 dlink hops, 3 rlink hops, 8 total hops; out.1 should show 5 dlink hops, 2 rlink hops, 7 total hops; and out.2 should show 0 dlink hops, 0 rlink hops, 0 total hops. "dlink hops" are duplex-link hops, or local hops within a pdns instance, while "rlink hops" are remote-link hops, or hops that span two pdns instances. Note that rlink hops invoke libSynk's communication primitives, which use one of, or a combination of, shared memory, TCP/IP, and Myrinet.

lrl.tcl requires only 2 pdns instances. Again, modify run_lrl.sh to suit your environment. The file out.0 should show 10 dlink hops, 5 rlink hops, 15 total hops; out.1 should show 0 dlink hops, 5 rlink hops, 5 total hops.

Network Emulation using Veil

This release of pdns also includes partial emulation support for routing real application traffic through pdns networks (designed and developed by Kalyan Perumalla). This is achieved using the Veil emulation framework, which captures application data at the socket API layer. The data capture portion is included in the Veil overloading library, while the data injection/emission code is included in pdns as the LiveApplication Agent in rti/liveapp.cc. For additional information, see the Veil homepage.

Contact Information

Dr. George Riley (riley@ece.gatech.edu)
Alfred Park (park@cc.gatech.edu)
College of Computing
Georgia Institute of Technology
Atlanta, GA 30332-0280
Last Modified: March 17, 2004
