PDNS - Parallel/Distributed NS

Overview

The publicly available network simulator ns has become a popular and widely used tool for research in telecommunications networks. However, the design of ns makes simulation of very large networks difficult, if not impossible, due to excessive memory and CPU time requirements. The PADS research group at Georgia Tech has developed extensions and enhancements to the ns simulator that allow a network simulation to be run in a parallel and distributed fashion on a network of workstations.

Objectives

We set out to provide a means for ns users to distribute their simulation on several (e.g., 8-16) workstations connected either via a Myrinet network or a standard Ethernet network using the TCP/IP protocol stack. By distributing the network model across several machines, the memory requirement on any single system can be substantially smaller than in a single-workstation simulation. The overall execution time should be at least as fast as that of the original single-workstation simulation, while proportionally larger network models can be supported by spreading the model over multiple systems.

A key goal was to minimize the number of modifications required to the released ns source code and to ensure that all existing ns simulations would still run properly when used with our modified ns. Minimizing the number of changes to ns allows the parallel simulator to readily take advantage of new, improved versions of ns as they become available. Any new or revised ns syntax should be directly related to the parallelization of the simulation and should not affect ns users who are not using PDNS.


Approach

In order to achieve the goal of limited modifications to the base ns software, we chose to use a federated simulation approach in which separate instantiations of ns, modeling different subnetworks, execute on different processors. PDNS uses a conservative (blocking-based) approach to synchronization. No federate in the parallel simulation will ever process an event that would later have to be undone due to receiving messages in the simulated past. This avoids the need to implement state saving in the existing ns code.

The PADS research group at Georgia Tech has previously developed an extensive library of support software for implementing parallel and distributed simulations, known as RTIKIT. The RTIKIT software provides support for global virtual time management, group data communications, and message buffer management. It supports both Myrinet and TCP/IP networks and runs on a variety of platforms. By using the RTIKIT software for the parallelization of ns, we were able to rapidly modify the main event processing loop of ns to support the distributed time management functions needed to ensure that no unsafe event is ever processed by any federate.

The modifications needed to ns can be broadly classified into two major categories: modifications to the ns event processing infrastructure, and extensions to the ns TCL script syntax for describing simulations. Each of these categories is described in detail below.

Modifications to ns event processing

The standard ns release has several variants of the main event processing loop which can be specified by the ns user as follows:

$ns use-scheduler Heap

which specifies that the heap based scheduler should be used. We developed a new event scheduler known as Scheduler/RTI, which is specified by the ns user as follows:

$ns use-scheduler RTI

The Scheduler/RTI uses the time management functions of the RTIKIT to ensure that local simulation time advances do not allow for the processing of unsafe events. The Scheduler/RTI also contains code to process events received by a federate that were generated by another federate, placing those new events in the proper location in the event list.

Also related to event scheduling is the transmission of events from one federate to another in the distributed simulation. This is handled by a newly created ns agent called Agent/RTI. The Agent/RTI is responsible for determining that an event is destined for a remote federate, preparing a message containing the complete event information, and forwarding that message to the remote federate using the RTIKIT MCAST functions. The RTI Agents are automatically inserted on any ns node that has a link to a remote simulator.

Modifications to ns TCL syntax

In addition to the event scheduling modifications mentioned above, the way that a network topology and network data flows are defined in ns needs to be enhanced to allow a federated ns simulation. Consider the simple topology shown below. If this simple eight-node simulation were run on a single instance of ns, then nodes R0 and R2 and their connecting link would simply be defined as:

set r0 [$ns node]
set r2 [$ns node]
$ns duplex-link $r0 $r2 1.5mb 10ms DropTail

But when we decide to run the simulation in a distributed fashion on Simulators A and B as shown below, the definition of the duplex-link is problematic: simulator A has no notion of node r2, and simulator B has no notion of node r0. We solve this problem by extending the ns syntax to include the specification of a remote link, called an rlink. With our extended ns syntax, a simulated link that crosses federates is defined with an rlink command, which specifies only the local endpoint and identifies it with an IP address. The other end of the simulated link is defined in simulator B and is also assigned an IP address. At runtime, remote links with matching network addresses are logically connected, and simulated packets that leave simulator A on the rlink are delivered to the corresponding rlink in simulator B. Details on how this is done can be seen in the example script linked below.

Defining data flows in ns is done similarly, by naming the two endpoints of the data flow. We have the same problem when the remote endpoint is located on another federate, and we solve it in a similar fashion by allowing a remote connection, called an rconnect. With an rconnect command, the remote endpoint of a data connection is specified by IP address and port number, rather than by ns node and agent name.
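
Below is a minimal sketch of how such a flow might be set up on simulator A, assuming its peer agent exists on simulator B. The agent type, IP address, and port number are illustrative only, and the exact argument order of the rconnect command should be checked against the example scripts linked below; the bind command is described later in this document.

set tcp0 [new Agent/TCP]
$ns attach-agent $r0 $tcp0
# give the local agent a well-known port so the remote federate can address it
$r0 bind $tcp0 100
# connect to the remote agent by IP address and port (illustrative values),
# rather than by node and agent name as the usual connect command requires
$ns rconnect $tcp0 10.0.3.2 100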

Assigning IP Addresses to Links

As shown in the examples, when using pdns you must assign an IP address to any node that will be referenced remotely. In keeping with normal networking practice, the IP addresses are actually assigned to the links, not the nodes. The syntax for assigning an IP address is:

$linkname set-ipaddr address mask

Where: linkname is the ns object for the link being assigned.
address is the address being assigned, either in dotted notation or as a 32-bit value.
mask is the corresponding netmask, either in dotted notation or as a 32-bit value. The netmask value is not used in the current pdns implementation, but is included for completeness.
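
For example (a minimal sketch; the address and mask values are illustrative, and [$ns link $r0 $r2] is the standard ns idiom for retrieving the link object between two nodes):

set r0 [$ns node]
set r2 [$ns node]
$ns duplex-link $r0 $r2 1.5mb 10ms DropTail
# assign an address and netmask to the r0 side of the link (illustrative values)
[$ns link $r0 $r2] set-ipaddr 10.0.1.1 255.255.255.0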

Binding Port Numbers to Agents

As shown in the examples, when using pdns you must assign a port number to any agent that will be referenced remotely. Both ends of a TCP connection must have a port number. The syntax for assigning a port number is:

$nodename bind $agent portnum

Where: nodename is the ns object for the node where the agent is being bound.
agent is the ns object for the agent being bound.
portnum is the port number desired. If zero is specified, then the next available port number is assigned.
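
For example (a minimal sketch; the agent type and port number are illustrative):

set tcp0 [new Agent/TCP]
$ns attach-agent $r0 $tcp0
# bind the agent to port 100 on node r0 so that a remote federate can address it
$r0 bind $tcp0 100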

Creating Remote Links

When using pdns, a Remote Link is an ns link where one node endpoint is defined on one instance of pdns and the second node endpoint is defined on a different instance of pdns. The syntax to create a remote link is:

$nodename rlink linkspeed linkdelay qtype ipaddr addrmask

Where: nodename is the ns object for the node where the remote link is being defined.
linkspeed is the speed of the link
linkdelay is the propagation delay for the link
qtype is the associated queueing discipline for the link
ipaddr is the IP address of the local end of the link. The other end of the link must have an IP address with an identical network address (the IP address AND'ed with the network mask). Also, one end of the rlink must have a host value of ".1".
addrmask is the mask value for the network portion of the IP address.
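
As a minimal sketch (the link parameters, addresses, and mask are illustrative), the r0-r2 link from the earlier topology could be split across the two federates as follows:

# in simulator A's script: the local end of the remote link, with host value ".1"
set r0 [$ns node]
$r0 rlink 1.5mb 10ms DropTail 10.0.3.1 255.255.255.0

# in simulator B's script: the matching end, on the same network address 10.0.3.0
set r2 [$ns node]
$r2 rlink 1.5mb 10ms DropTail 10.0.3.2 255.255.255.0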

Accessing rlink queues

A Remote Link has an associated queue, just like any other link in ns. There are two methods for obtaining the queue object:

$ns rqueue $node ipaddr

Where: $ns is the Simulator object
$node is the node containing the remote link
$ipaddr is the IP Address of the remote link

$node get-rqueue ipaddr

Where: $node is the node containing the remote link
$ipaddr is the IP Address of the remote link
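
For example (a minimal sketch; the IP address is illustrative, and limit_ is the standard ns queue length parameter):

# obtain the queue on the local end of the remote link and resize it
set rq [$ns rqueue $r0 10.0.3.1]
$rq set limit_ 50

# equivalent form using the node method
set rq2 [$r0 get-rqueue 10.0.3.1]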

Current Status (November 1999)

We have a working version of PDNS which has been tested on as many as eight systems, with a 2000 node network topology (simulating 250 nodes on each of the eight systems). We have run on both the Myrinet-interconnected systems and the Ethernet TCP/IP network. The overall CPU speedup on the Myrinet systems is fairly good (about a factor of 3 on eight systems), while the speedup on the TCP/IP Ethernet is less than we hoped for (only about a factor of 2 on eight systems). However, we do achieve a linear improvement in the size of network models that can be simulated using ns. We are addressing the performance issues and expect some improvements in later releases.


Download our Code

All of the modifications to existing ns code and all newly added ns code are in ns-added-2.1b5.tar.gz. The modifications apply to release 2.1b5 ONLY! (released 16-Mar-99). Be sure you are starting with this release. You will also need the RTIKIT, version 1.0d. Please email George Riley to get the URL for the most recent release of each of these. (We also like to know who has downloaded and is using our code!) The RTIKIT has precompiled binaries for Sparc Solaris, Intel Solaris, and Intel Linux. You can compile the RTIKIT for other systems if needed. The remainder of these instructions assumes you have loaded the ns source directory at ~/ns-allinone-2.1b5. If it is elsewhere, be sure to modify these instructions accordingly.

First, un-zip and un-tar the RTIKIT software:

gunzip -c ~/fdk1.0d.tar.gz | tar -xvf -

Then you need to un-zip and un-tar the ns-added-2.1b5.tar.gz file. Do this as follows:

cd ~/ns-allinone-2.1b5
gunzip -c ~/ns-added-2.1b5.tar.gz | tar -xvf -

Next you need to run the rtidiffs script, which applies our changes to the released ns code:

cd ns-2.1b5
./rtidiffs

Next, edit ~/ns-allinone-2.1b5/ns-2.1b5/Makefile.in and set the value of the RTIKIT macro to point to the directory where you installed the RTIKIT. Then just configure and install ns normally. The resulting binary for the Parallel/Distributed ns will be ~/ns-allinone-2.1b5/ns-2.1b5/pdns.

The pdns software uses the RTIKIT Session Manager to coordinate the startup synchronization of the distributed simulation. The Session Manager uses two environment variables, NODEINFO and SESSIONNAME, to coordinate the creation of the inter-process communication sockets. These environment variables are described in detail in the RTIKIT usage documentation. Use the sample scripts example1a.tcl and example1b.tcl and the runsimp script to test your installation. The runsimp script will also need to be edited to specify the correct hostnames and paths to the scripts and binaries. The example scripts implement the simple distributed simulation shown in the figure below.

Please email George Riley with any questions or comments.


Contact Information:

riley@cc.gatech.edu
College of Computing
Georgia Institute of Technology
Atlanta, GA 30332-0280

Last Modified: Jul 12, 1999