CS4251: COMPUTER NETWORKS II

Project description

Design and implementation of a reliable transport protocol
over an unreliable and dynamic datagram network


Introduction

In this project you will design and implement your own network protocols! This is a challenging and exciting task that is necessary for understanding computer networks in depth.

In this handout, we give you a rather loose specification for the functionality of the protocols that you will design. You have significant flexibility in adjusting this specification to your own "protocol-design taste". Even though there are large similarities between the following specification and the TCP/IP protocols, keep in mind that there are also significant differences.

You will work on this term-project in groups of 3 or 4 students. The selection of group members is very important. Note that your final project grade will be the same for all the members of a group. Also note that if you do not select group partners by the given deadline (August 26), we will have to select your group members randomly.

The implementation of the project can be done in Java or C/C++.


Emulated network topology

The topology of the emulated network, i.e., the network nodes and the interconnecting links, will be specified by an Initial Topology Configuration (ITC) script. An example of an ITC script for a network with 4 nodes and 4 bi-directional links follows:

1 akron.cc.gatech.edu 17878 2 3 1000
2 akron.cc.gatech.edu 17879 1 4 1200
3 boise.cc.gatech.edu 17879 1 4 600
4 dallas.cc.gatech.edu 17890 2 3 800

Each line of the ITC corresponds to a network node X. The arguments are as follows:

  1. The unique Node ID (NID) of node X
  2. The name of the host that will be emulating X
  3. The UDP port number used for all communications to/from X
  4. The NID of the first node that X is connected to
  5. The NID of the second node that X is connected to
  6. The link's Maximum Transmission Unit (MTU) in bytes

The ITC script will follow the format and structure of the previous example. It will not include any comments, but it may include arbitrary blank spaces between successive arguments, or blank lines.

Suppose that your executable program is called net4251, and the previous ITC script is called itc_1. To emulate the network of the previous example you would need to run four instances of net4251, as shown next. Note that net4251 takes 2 arguments- NID of the corresponding node and the filename of the ITC script.
At akron.cc.gatech.edu.edu run: net4251 1 itc_1
At akron.cc.gatech.edu.edu run: net4251 2 itc_1
At boise.cc.gatech.edu.edu run: net4251 3 itc_1
At dallas.cc.gatech.edu.edu run: net4251 4 itc_1

Note that the ITC script defines the initial network topology. This initial network will be connected, i.e., there will be a path from each node to all other nodes. Reading the ITC script, each node will identify its neighbors in the network. After this initialization, your routing protocol should be able to find the other nodes in the network and the next hop to these nodes. The routing protocol should also detect topology changes and react to them by determining new shortest-path routes. Two types of topology changes can occur:

  1. A node failure, meaning that the process that emulates a network node, together with all its links, is aborted. All the corresponding links to/from that node are disabled
  2. A link failure, meaning that a particular link is disabled.
Both failures can be permanent or temporary (i.e., an aborted node or link can be restarted). There will be no introduction of nodes or links that were not present in the original topology.

Emulation of point-to-point links

The point-to-point links specified at the ITC script will be emulated by passing UDP/IP messages between two directly connected network nodes. For instance, if node-1 is connected to node-2, any message that node-1 sends to node-2 will have to be encapsulated in a UDP/IP packet and sent to node-2 through a UDP socket.

The MTU of each link is specified in their ITC script. Note that the MTU does not include the UDP/IP headers that will encapsulate your network-layer packets. The presence of different links with different MTUs means that you need to support network-layer packet fragmentation.

Given that the local area network of CoC is quite reliable, we do not expect to see any network errors (which makes the project less interesting!). To make things more challenging, we will introduce a ``garbler'' module between your network layer protocol and UDP/IP. Whenever you send a packet from one node to another, the garbler will be introducing, according to a certain probability, the following types of errors :

  1. Packet loss (the entire packet is lost)
  2. Packet corruption (one or more bit errors)
  3. Packet duplication

The garbler module source code will be provided by us, and it will have to be integrated with your code. To use the garbler, you should replace each instance of the socket system call sendto with our ``garbled'' version sendto_garbled. The two functions have exactly the same arguments.
NOTE: Besides the garbled UDP sockets, you should not use any other communication primitives in this project.


The network protocol

The network protocol will be using the emulated point-to-point links to transfer packets from one network node to another, even when the two nodes are not directly connected. This network layer functionality will be offered by a Datagram Network Protocol (DNP) that you will design. DNP should have the following salient features:


The routing protocol

Each network node will maintain a routing table. The routing table gives the next-hop for each destination node. For instance, node X may learn from its routing table that a packet destined to node Y should be first forwarded to node Z (X must be directly connected to Z for this to happen). Note that DNP at node X does not know the entire route that the packet will follow to its destination Y; it simply knows what the next hop should be node Z.

The routing table will be constructed by a Distance Vector Routing Protocol (DVRP) that you will design. DVRP should have the following salient features:

Keep in mind that the DVRP packets are also subject to being ``garbled'' when sent to the network. Specifically, DVRP packets have to be encapsulated in DNP packets, and DNP is not reliable. So, it is important to provide DVRP with reliability to minimize losses, errors, and duplication.

You can assume that the number of network nodes will be up to 32, the number of network links up to 128, and the number of links per node up to 8.


The transport protocol

DNP offers an end-to-end packet delivery service, but it is unreliable, it does not provide flow-control, and it does not provide byte-stream communication semantics. This is the task of the Reliable Transport Protocol (RTP) that you will design.

RTP should have the following salient features:


The application

At the highest layer, your project will implement a simple file transfer application. This application, called Non-blocking File Downloads (NFD) will be running at each node of the emulated network topology.

NFD will provide the user with a command-line interface. This interface should be non-blocking, in the sense that each command should return before the corresponding operation (e.g., a file transfer) is actually completed. Note that the non-blocking nature of the application is of major importance for your implementation. It means, besides other things, that your implementation will have to be multi-threaded, with one thread for each ongoing transfer.

The NFD interface will support establishment of connections to remote nodes, transfers of files, control of the window size for each connection, and a few other ``network management'' operations.

Specifically, the following list gives the operations that the command-line interface should support. Notice that all commands and their arguments should be case-insensitive, and that arbitrary blank spaces/lines are allowed.

  1. start_service(P): this command establishes a service point at the local node, say X. The command returns a Service ID (SID) S, that is unique for each service point of node X. Other nodes can connect to S, and download files from that service point. The argument P is the maximum number of open connections that S can accept; requests for more than P connections are rejected.
    Examples of resulting messages:
    "start_service(3) - Success: SID=123, MaxCons=3"
    "start_service(0) - Failure: MaxCons=0 bad argument"

  2. stop_service(S): this command terminates the service point with SID S at the local node. Any open connections at that SID are aborted.
    Examples of resulting messages:
    "stop_service(123) - Success: SID=123 is terminated"
    "stop_service(200) - Failure: SID=200 does not exist"

  3. connect(Y, S, W): this command establishes a connection from the local node, say X, to the service point with SID S of the remote node Y. The window for the connection should be W. The command returns a Connection ID (CID) C, that is unique for each open connection at the client.
    Examples of resulting messages:
    "connect(2, 123, 15000) - Success: DestNode=2, SID=123, CID=3, Win=15000B"
    "connect(3, 123, 15000) - Failure: DestNode=3 connection request timed out"
    "connect(3, 123, 15000) - Failure: DestNode=3 rejected connection request"

  4. close(C): this command closes the connection with CID C.
    Examples of resulting messages:
    "close(3) - Success: CID=3 is closed"
    "close(7) - Failure: CID=7 is not open"
    "close(8) - Failure: CID=8 cannot be closed"

  5. download(C, F): this command downloads a file with name F through the connection with CID C. The transfer is possible only if that connection is not currently busy with another download operation. The file F should be present at the current working directory of the process running at the remote end of the connection. After the transfer is completed, a message should report the duration, average throughput, and Round-Trip Time (RTT) of the transfer. The same connection can be used for multiple download operations.
    Examples of resulting messages:
    "download(4, foo) - Success: File=foo has been downloaded (duration=3sec, throughput=160KBps)"
    "download(4, bar) - Failure: File=bar does not exist"
    "download(5, foo) - Failure: CID=5 does not exist"
    "download(6, bar) - Failure: CID=6 is busy with other transfer"
    "download(7, bar) - Failure: CID=7 is broken"

  6. set_garbler(L, C, D): this command controls the garbler characteristics at the local network node. All arguments of this command are in integer format.
    Examples of resulting messages:
    "set_garbler(1, 1, 2) - Success: Loss=1%, Corruption=1%, Duplication=2%"
    "set_garbler(120, 1, 2) - Failure: Loss=120% bad argument"
    NOTE: We will provide you with the code for this command.

  7. route_table: this command prints the local routing table, i.e., for each network node, show the corresponding next-hop and the cost of the route in terms of number of hops.

  8. link_down(N): this command disables the link from the local node to node N.
    Examples of resulting messages:
    "link_down(2) - Success: link to node-2 is down"
    "link_down(7) - Failure: link to node-7 does not exist"
    "link_down(3) - Failure: link to node-3 is already down"

  9. link_up(N): this command enables the link from the local node to node N.
    Examples of resulting messages:
    "link_up(2) - Success: link to node-2 is up"
    "link_up(7) - Failure: link to node-7 does not exist"
    "link_up(3) - Failure: link to node-3 is already up"

  10. debug: this command is optional. You will find it useful to have such a command, however, for debugging and monitoring your project. It can report all important information about the local node, such as active service points, opened connections, various statistics, etc.

  11. exit: exit the program gracefully.

Design report

The first milestone of this project will be to design your overall system, and write a design report. This report will guide your implementation later, and coordinate the group's efforts. The importance of starting with a thorough design cannot be overemphasized!

A major part of the design is the specification of all the required protocols: DNP, DVRP , RTP, and NFD. The specification of a protocol usually involves:

Together with the specification of each protocol, your report should also describe how do you plan to implement each protocol. For instance, your implementation plan should describe things such as:

Your report has to also include a detailed time schedule (say weekly) that you plan to follow in implementing this project. Keep in mind that large software projects such as this require a significant amount of testing and debugging after all pieces are brought together. Our suggestion is to reserve at least 20% of the overall time-frame for testing and debugging of the entire system. Also, you may want to introduce major implementation milestones. For example, you can try to have the complete project running, without the garbler, by Apr 1.

Finally, your report should show how do you plan to distribute each implementation task among the group members. A good approach is to have two persons being responsible for each piece of code, even if one of them is the main programmer.


Milestones, deliverables, and grading

The project will have to adhere to the following milestones and deadlines.

  1. Form project groups: Aug 26 (email TA).
  2. Design report: Sep 28 (due in class).
  3. Electronic submission of basic implementation: Nov 4 (followed by demos)
  4. Electronic submission of complete implementation: Dec 3, 5:00pm (followed by demos)

The basic implementation should include the entire protocol suite, but without the following features:

  1. There will be no changes in the topology (i.e., you can ignore node/link failures).
  2. There will be no garbler (i.e., you can ignore the possibility of losses/errors).
  3. All links will have the same MTU (i.e., you can ignore fragmentation).

The project accounts for 35% of the overall course grade. The grade for the project will be determined at the end of the semester taking into account both the design report and the functionality of the resulting implementation (as that was evaluated during the two demos). The performance of your implementation, in terms of throughput or routing convergence delay, cannot affect your grade negatively, unless if your protocols perform extremely slow.