Please subscribe to the PDNS mailing list for discussion on PDNS and related announcements.
| Status and Changes: pdns-2.27-v1a (March 16, 2004) | |
| 1. Compatibility with ns-2.27, the latest libSynk, and Intel's icc/ecc and gcc-3.2 compilers | |
| 2. GTemulator support (Linux only) | |
| 3. Local -> Remote -> Local routing support, other routing issues fixed | |
| 4. Improved compatibility with 64-bit platforms | |
The publicly available network simulator ns has become a popular and widely used simulator for research in telecommunications networks. However, the design of ns is such that simulation of very large networks is difficult, if not impossible, due to excessive memory and CPU time requirements. The PADS research group at Georgia Tech has developed extensions and enhancements to the ns simulator to allow a network simulation to be run in a parallel and distributed fashion, on a network of workstations.
We set out to provide a means for ns users to distribute their simulation on
several (e.g. 8-16) workstations connected either via a Myrinet network,
or a standard Ethernet network using the TCP/IP protocol stack. By
distributing the network model on several machines, the memory requirements
on any single system can be substantially smaller than the memory
used in a single-workstation simulation. The overall execution time of the
simulation should be at least as fast as the original single-workstation
simulation, and can be several times faster. In all cases we can support
proportionally larger network models by distributing the model on multiple
systems.
A key goal was to minimize the number of modifications required to the
released ns source code and to ensure that all existing ns simulations would
still run properly when used with our modified ns. Minimizing the number of
changes to ns allows the parallel simulator to readily take advantage of new,
improved versions of ns as they become available. Any new or revised ns
syntax should be directly related to the parallelization of the simulation,
and does not affect ns users who are not using PDNS.
In order to achieve the goal of limited modifications to the base ns
software, we chose to use a federated simulation approach where separate
instantiations of ns modeling different subnetworks execute on
different processors. PDNS uses a conservative (blocking based) approach to
synchronization. No federate in the parallel simulation will ever process an event that would later have to be undone due to receiving messages in the
simulated past. This avoids the need to implement state saving in the existing ns code.
The PADS research
group at Georgia Tech has previously developed an extensive library of
support software for implementing parallel and distributed simulations (see
libSynk and
RTIKIT). The
software has support for global virtual time management, group data
communications, and message buffer management. It has
support for a variety of communication interconnects, including
shared memory, Myrinet and TCP/IP networks, and runs on a variety of
platforms. By using this synchronization software for the parallelization
of ns, we
were able to rapidly modify the main event processing loop of ns to support
the distributed time management functions needed to ensure that no unsafe
event is ever processed by any federate.
The modifications needed to ns fall into two major
categories: modifications to the ns event processing infrastructure, and
extensions to the ns TCL script syntax for describing simulations. Each of
these categories is described in detail below.
The standard ns release has several variants of the main event processing loop which can be specified by the ns user as follows:
| $ns use-scheduler Heap |
which specifies that the heap based scheduler should be used. We developed a new event scheduler known as Scheduler/RTI, which is specified by the ns user as follows:
| $ns use-scheduler RTI |
The Scheduler/RTI uses the time management functions of libSynk/RTIKIT
to ensure
that local simulation time advances do not allow for the processing of unsafe
events. The Scheduler/RTI also contains code to process events received by a
federate which were generated by another federate, and places those new events
in the proper location in the event list.
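The behavior described above can be illustrated with a toy scheduler. This is a minimal sketch of conservative event processing, not the Scheduler/RTI code: the class name, method names, and the `safe_time` argument (standing in for whatever bound the time management layer provides) are all hypothetical.

```python
import heapq

class ToyFederateScheduler:
    """Toy event list for one federate (illustrative only)."""

    def __init__(self):
        self._heap = []   # (timestamp, name) tuples ordered by timestamp
        self.now = 0.0    # local simulation time

    def schedule(self, timestamp, name):
        # Locally generated event.
        heapq.heappush(self._heap, (timestamp, name))

    def receive_remote(self, timestamp, name):
        # Event generated by another federate: the heap places it at
        # its proper position in timestamp order.
        heapq.heappush(self._heap, (timestamp, name))

    def advance(self, safe_time):
        # Process only events strictly earlier than safe_time. Later
        # events stay queued because a remote message with an earlier
        # timestamp could still arrive.
        processed = []
        while self._heap and self._heap[0][0] < safe_time:
            timestamp, name = heapq.heappop(self._heap)
            self.now = timestamp
            processed.append((timestamp, name))
        return processed

sched = ToyFederateScheduler()
sched.schedule(1.0, "local-a")
sched.schedule(4.0, "local-b")
sched.receive_remote(2.5, "remote-x")
first = sched.advance(3.0)
print(first)
```

Because no event beyond the safe time is ever processed, nothing has to be rolled back, which is why no state saving is needed.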
Also related to event scheduling is the transmission of events from one
federate to another in the distributed simulation. This is handled by a newly
created ns agent called Agent/RTI. The Agent/RTI is responsible for
determining that a received event is destined for a remote federate,
preparing a message containing the complete event information, and forwarding
that to the remote federate using the RTI MCAST functions. The RTI Agents
are automatically inserted on any ns node that has a link to a remote
simulator.
In addition to the event scheduling modifications mentioned above, the way
that a network topology and network data flows are defined in ns needs to be
extended to allow a federated ns simulation. Consider the simple topology
shown below. If this simple eight node simulation were run on a single ns,
then nodes R0 and R2 and their connecting link are simply defined as:
| set r0 [$ns node] |
| set r2 [$ns node] |
| $ns duplex-link $r0 $r2 1.5mb 10ms DropTail |
But when we decide to run the simulation in a distributed fashion on
Simulators A and B as shown below, the definition of the duplex-link is
problematic. In simulator A there is no notion of node r2, and in simulator B
there is no notion of node r0. We solve this problem by extending the ns
syntax to include the specification of a remote link, called an rlink. With
our extended ns syntax, the simulated links which are across federates are
defined using an rlink command, just specifying the local endpoint, and
identifying it with an IP Address. The other end of the simulated link is
defined in simulator B, and is also assigned an IP address. At runtime,
remote links with matching network addresses are logically connected, and
simulated packets which leave simulator A on the rlink are delivered to the
corresponding rlink in simulator B. Details on how this is done can be seen
in the example script linked below.
Defining data flows in ns is done similarly, defining the two endpoints of
the data flow by name. We have the same problem when the remote endpoint is
located on another federate, and we solve it in a similar fashion,
allowing a remote connection, called an ip-connect. With the
ip-connect command, the remote endpoint of a data connection is
specified by IP Address and port number rather than by ns node and
agent name.
As shown in the examples, when using pdns you must assign an IP address to any node that will be referenced remotely. In keeping with normal networking practice, the IP addresses are actually assigned to the links, not the nodes. The syntax for assigning an IP address is:
| $linkname set-ipaddr address mask |
Example:
| [$ns link $n1 $n2] set-ipaddr 10.1.1.1 255.255.255.0 |
| [$ns link $n2 $n1] set-ipaddr 10.1.1.2 255.255.255.0 |
When using pdns, a Remote Link is an ns link where one node endpoint is defined on one instance of pdns, and the second node endpoint is on a different instance of pdns. Note that one end of the remote link must have an IP address ending in ".1". For example, if you create a remote link 192.168.1.1, the other end of the remote link on a different PDNS instance must be 192.168.1.2 (assuming the netmask is 255.255.255.0). This is how PDNS matches remote links between two different PDNS instances. The syntax to create a remote link is:
| $nodename rlink linkspeed linkdelay qtype ipaddr addrmask |
Example:
| $n2 rlink 100Mb 5ms DropTail 192.168.1.1 255.255.255.0 |
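The matching rule for remote links can be pictured as grouping endpoints by network address. The following is a rough sketch under that assumption, not PDNS code: `match_rlinks` and the simulator labels "A" and "B" are hypothetical, and the sketch does not check the ".1" requirement.

```python
import ipaddress
from collections import defaultdict

def match_rlinks(rlinks):
    """Pair rlink endpoints that share a network address.

    rlinks: list of (simulator, ip, mask) tuples.
    Returns {network: sorted [(simulator, ip), ...]} for every network
    that has exactly two endpoints.
    """
    by_network = defaultdict(list)
    for simulator, ip, mask in rlinks:
        interface = ipaddress.ip_interface(f"{ip}/{mask}")
        by_network[str(interface.network)].append((simulator, ip))
    return {net: sorted(ends)
            for net, ends in by_network.items() if len(ends) == 2}

rlinks = [
    ("A", "192.168.1.1", "255.255.255.0"),
    ("B", "192.168.1.2", "255.255.255.0"),
    ("A", "192.168.2.1", "255.255.255.0"),  # no peer defined yet
]
pairs = match_rlinks(rlinks)
print(pairs)
```

In this sketch the third endpoint stays unmatched because no other simulator defines an rlink in 192.168.2.0/24.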
When using pdns, since only a portion of the global topology is known to each simulator, there are times when choosing the correct route between nodes defined on different simulators is problematic. In the sample topology above, router R0 has insufficient information to determine which of the two defined remote links is the shortest path to host H0. We solve this problem by explicitly specifying routes for each remote link, as follows:
| $ns add-route $router-node rlink-ip dest-ip dest-mask { src-ip src-mask } |
Example:
| $ns add-route $n2 192.168.1.1 10.1.2.0 255.255.255.0 10.1.1.0 255.255.255.0 |
| $ns add-route $n3 192.168.1.2 10.1.3.0 255.255.255.0 |
In the above example, the first line would route data originating from
subnet 10.1.1.X through the rlink 192.168.1.1 if the destination IP
address is in the subnet 10.1.2.X. The second line would direct PDNS to
route all traffic originating from its instance through
the rlink 192.168.1.2 if the destination IP is in the subnet 10.1.3.X.
As you can see, the source IP and mask are optional; if they are not
specified, src-ip and src-mask will both be set to 0.0.0.0.
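The route semantics described above can be expressed as a subnet test on the destination and, optionally, the source. This is a minimal sketch assuming first-match selection; `in_subnet` and `select_rlink` are hypothetical names, and the sketch ignores which router node each route is attached to.

```python
from ipaddress import IPv4Address

def in_subnet(addr, net, mask):
    """True if addr falls inside net/mask (dotted-quad strings)."""
    m = int(IPv4Address(mask))
    return (int(IPv4Address(addr)) & m) == (int(IPv4Address(net)) & m)

def select_rlink(routes, src_ip, dst_ip):
    """Return the rlink IP of the first route matching src and dst."""
    for rlink_ip, dest_ip, dest_mask, route_src, route_src_mask in routes:
        if (in_subnet(dst_ip, dest_ip, dest_mask)
                and in_subnet(src_ip, route_src, route_src_mask)):
            return rlink_ip
    return None

# An omitted source is stored as 0.0.0.0/0.0.0.0, which matches any address.
ANY_SOURCE = ("0.0.0.0", "0.0.0.0")
routes = [
    ("192.168.1.1", "10.1.2.0", "255.255.255.0", "10.1.1.0", "255.255.255.0"),
    ("192.168.1.2", "10.1.3.0", "255.255.255.0") + ANY_SOURCE,
]
print(select_rlink(routes, "10.1.1.5", "10.1.2.9"))
print(select_rlink(routes, "10.9.9.9", "10.1.3.7"))
```

A mask of 0.0.0.0 clears every bit before the comparison, which is why the default source values act as a wildcard.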
Note: If you have agents that produce traffic destined for
other agents within the same PDNS instance but must travel through
remote routes because no local paths exist between the two agents,
you must specify this route path. See the "Sample Scripts"
section below for a simple script that performs local->remote->local
routing.
As shown in the examples, when using pdns you must assign a port number to any agent that will be referenced remotely. Both ends of a TCP connection must have a port number. The syntax for assigning a port number is:
| $nodename bind $agent portnum |
Example:
| $n1 bind $tcp_agent 80 |
In order for PDNS to correctly send data from one end-host to another, the usual ns "connect" command will not suffice. Instead, the following new directive allows PDNS to correctly route traffic destined for remote simulators to the correct local border routers. However, note that the following command can be used for both remote and local connects.
| $ns ip-connect $agent dest-ip port |
Example:
| $ns ip-connect $tcp_agent 192.168.3.4 80 |
Sometimes it is necessary to have dynamic TCP connections created at runtime. PDNS now includes a TCP Listener Agent which dynamically allocates connections on the fly as the simulator is running, instead of determining all connections at initialization. TCP Listener closely models real TCP connections (i.e. multiple TCP connections per destination port, a TCP connection table). Of course, the TCP Listener Agent can also accept pre-determined connections at initialization time. The underlying TCP protocol behavior is inherited from tcp-full. Once the three-way handshake is complete, TCP Listener promotes the connection to tcp-full status by instantiating tcp-full agents and connecting the source to it automatically. TCP Listener handles all data de-multiplexing. TCP Listeners must be used with agents that have IP addresses associated with them.
| set tcpl [new Agent/TCP/Listener] |
Example:
| set tcpl [new Agent/TCP/Listener] |
| $ns attach-agent $n2 $tcpl |
| $n2 bind $tcpl 80 |
| $tcpl set-application MyAgent/WebServer 0 callback 1 1 |
You can send TCP data to this agent from any PDNS instance via ip-connect. If you need to attach an application (similar to the WebServer above) to the listener, use the set-application command below. Note that the sending tcp-full agent (initiator of the TCP connection) should set close_on_empty_ true so completed connections can be removed from the connection table (active connection teardown):
| $tcp_agent set close_on_empty_ true |
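The idea of a connection table with per-connection de-multiplexing can be sketched as follows. This is a conceptual illustration only, not the Agent/TCP/Listener implementation: the class, its methods, and the choice of (source IP, source port) as the table key are assumptions.

```python
class ListenerTable:
    """Toy connection table for one listening port (conceptual only)."""

    def __init__(self, port):
        self.port = port
        self.connections = {}  # (src_ip, src_port) -> received payloads

    def handshake_complete(self, src_ip, src_port):
        # One entry per remote endpoint lets many peers share the port.
        self.connections[(src_ip, src_port)] = []

    def deliver(self, src_ip, src_port, payload):
        # De-multiplex by remote endpoint; unknown senders are rejected.
        key = (src_ip, src_port)
        if key not in self.connections:
            return False
        self.connections[key].append(payload)
        return True

    def close(self, src_ip, src_port):
        # Teardown removes the entry so the table does not grow forever.
        self.connections.pop((src_ip, src_port), None)

table = ListenerTable(80)
table.handshake_complete("10.1.1.1", 1025)
table.handshake_complete("10.1.2.1", 1025)
table.deliver("10.1.1.1", 1025, "GET /a")
table.deliver("10.1.2.1", 1025, "GET /b")
table.close("10.1.1.1", 1025)
print(sorted(table.connections))
```

The `close` step corresponds to the role of close_on_empty_ on the sending side: without it, finished connections would remain in the table.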
Defaults:
Commands:
You may need to access a remote link queue (e.g. for statistics), just as you would for any link in ns. The method for obtaining the queue object is:
| $ns rqueue $node ipaddr |
Example:
| set qo [$ns rqueue $n2 192.168.1.1] |
The current version of PDNS has been tested on as many as 136 processors simulating a 600,000+ node network topology. We have run on Myrinet interconnected systems, Ethernet TCP/IP network connected systems, and using shared memory on multiprocessor systems. The overall speedup achieved by running a distributed simulation is dependent on many factors, so it cannot accurately be predicted for any given simulation. In our testing environment, the speedup achieved when running on eight systems varies from about eight (near perfect speedup) to less than one.
All of the modifications to existing ns code only work with the
2.27 release. You will also need the libSynk libraries, which can
be compiled for Intel Linux, Intel Solaris, Sparc Solaris, SGI Irix, HP
UX, Macintosh OS X, and Microsoft Windows systems. These instructions
assume you have loaded the ns source directory at
~/ns-allinone-2.27, and libSynk at ~/libsynk.
If either one is located elsewhere, be sure to modify these instructions
accordingly. Please do not download the PDNS source with Internet
Explorer!
| Source | Release Date |
| ns-2.27 | January 18, 2004 |
| libSynk | December 29, 2003 or later |
| pdns-2.27-v1a | March 16, 2004 |
| pdns-gtemu-2.27-v1a | March 16, 2004 |
Download either pdns-2.27-v1a or pdns-gtemu-2.27-v1a. pdns-2.27-v1a is the most compatible version and can be compiled using Intel's icc/ecc or gcc-3.2 (or previous) series compilers. If you need GTemulator capability, download pdns-gtemu-2.27-v1a; please be aware, however, that this version is only compatible with Linux systems and gcc.
Decompress libSynk:
| cd $HOME |
| gunzip -c libsynk-current.tar.Z | tar -xvpf - |
| cd libsynk |
If you are using Myrinet, you must modify the appropriate makefiles before compiling. For instance, using Myrinet under Linux:
1. Edit the Makefile in ~/libsynk so CFLAGS contains -DGM_AVAILABLE=1 and LDLIBS contains -lgm.
2. Edit the Makefile in ~/libsynk/fdkcompat so LDLIBS contains -lgm.
3. Add -O2, -O3, or any other optimization parameters for your architecture.
Now create the libraries:
| make |
| cd fdkcompat |
| make |
Decompress the baseline ns software:
| cd $HOME |
| gunzip -c ns-allinone-2.27.tar.gz | tar -xvf - |
Move the PDNS patch file (pdns_2.27_v1a.gz)
to the ~/ns-allinone-2.27/ns-2.27 directory and patch the
stock ns:
| cd $HOME/ns-allinone-2.27/ns-2.27 |
| gunzip -c pdns_2.27_v1a.gz | patch -p3 |
Next, edit the ~/ns-allinone-2.27/ns-2.27/Makefile.in
file:
1. Set the KITHOME macro on line 64 to reflect the correct directory of libSynk.
2. -lgm on line 103 of the Makefile.in file.
3. -DUSE_COMPRESSION on line 68.
4. Makefile under CCOPT.
Install ns normally:
| cd $HOME/ns-allinone-2.27 |
| ./install |
The resulting binary for Parallel/Distributed ns will be
~/ns-allinone-2.27/ns-2.27/pdns.
The pdns software uses libSynk to coordinate the startup
synchronization of the distributed simulation. The startup uses the
NODEINFO environment variable (and, for TCP communications, usually
FMTCP_SESSIONNAME as well) to
coordinate the creation of the inter-process communication channels (sockets,
shared memory, etc.). These environment variables are described in
detail in the libSynk usage
documentation.
Note that you may specify the initial TSO Queue heap size and increments at
run time through the environment variables: BRTI_HEAPSIZE
and BRTI_HEAPINCR. The default initial heap size is
1000 events and the default increment is 10000 events.
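To get a feel for these two settings, the sketch below reads them with the documented defaults and estimates how many growth steps a given peak event count would trigger. It assumes the heap grows by exactly one increment each time it fills, which the text above does not confirm; `heap_settings` and `growth_steps` are hypothetical helpers, not part of pdns or libSynk.

```python
import math

DEFAULT_HEAPSIZE = 1000    # documented default initial size (events)
DEFAULT_HEAPINCR = 10000   # documented default increment (events)

def heap_settings(env):
    """Read BRTI_HEAPSIZE and BRTI_HEAPINCR from a mapping such as os.environ."""
    size = int(env.get("BRTI_HEAPSIZE", DEFAULT_HEAPSIZE))
    incr = int(env.get("BRTI_HEAPINCR", DEFAULT_HEAPINCR))
    return size, incr

def growth_steps(peak_events, size, incr):
    """Increments needed to hold peak_events, assuming fixed-step growth."""
    if peak_events <= size:
        return 0
    return math.ceil((peak_events - size) / incr)

size, incr = heap_settings({})  # nothing set: defaults apply
print(size, incr, growth_steps(25000, size, incr))
```

If the estimated number of steps is large for your workload, setting a larger BRTI_HEAPSIZE up front may be worth trying.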
test.tcl requires 3 pdns instances. To run, modify
run_test.sh to reflect the correct hostname(s) to be
used. Note the use of export for bash-style shells;
modify appropriately for csh, tcsh, etc. You do not need 3 machines to
run 3 pdns instances for this small script. You can place the same
hostname 3 times in the NODEINFO string (and the
script will ssh 3 times to the same host to spawn 3 pdns instances
accordingly). The file out.0 should have 5 dlink hops, 3
rlink hops, 8 total hops. The file out.1 should have 5
dlink hops, 2 rlink hops, 7 total hops. The file out.2
should have 0 dlink hops, 0 rlink hops, 0 total hops. dlink hops
stands for duplex-link hops or local hops within a pdns instance,
while rlink hops stands for remote-link hops or hops that span two
pdns instances. Note that rlink hops invoke libSynk's communication
primitives, which use one or a combination of shared memory, TCP/IP,
or Myrinet.
lrl.tcl only requires 2 pdns instances. Again, modify
run_lrl.sh to your environment. The file out.0
should have 10 dlink hops, 5 rlink hops, 15 total hops. The file
out.1 should have 0 dlink hops, 5 rlink hops, 5 total
hops.
This release of pdns also includes parts of emulation support for routing
real application traffic via pdns networks (designed and developed
by Kalyan Perumalla). This
is achieved using the Veil emulation framework, which captures the
application data at the socket API layer. The data capture portion is
included in the veil overloading library, while the data
injection/emitting code is included in pdns as the
LiveApplication Agent in
rti/liveapp.cc. For additional information, see the Veil homepage.