CS6250: COMPUTER NETWORKS

Project description

Design and implementation of a reliable transport protocol
over an unreliable and dynamic datagram network

Revision 2 (Sept 24, 2002)


Introduction

In this project you will design and implement your own network protocol hierarchy! This is a challenging and exciting task, that is necessary for understanding computer networks in depth.

In this handout, we give you a rather loose specification for the functionality of the protocols that you will design. You have significant flexibility in adjusting this specification to your own `protocol-design taste'. Even though there are large similarities between the following specification and the TCP/IP protocols, keep in mind that there are also significant differences.


Emulated network topology

The topology of the emulated network, i.e., the network nodes and the interconnecting links, will be specified by an Initial Topology Configuration (ITC) script. An example of an ITC script for a network with 4 nodes and 6 unidirectional links follows:

1 akron.cc.gatech.edu 17878 2 3
2 akron.cc.gatech.edu 17879 4
3 boise.cc.gatech.edu 17879 2 4
4 dollas.cc.gatech.edu 17890 1

Each line of the ITC corresponds to a network node X. The first argument is the unique Node ID (NID) of node X, the second is the name of the host that will be emulating X, the third argument is the port number used for all communications to/from X, while the following arguments are the NIDs of the nodes that X is connected to. Note that the links are unidirectional. In the previous example for instance, node-1 can send messages to node-2, but it cannot receive messages directly from node-2.

The ITC script will follow the format and structure of the previous example. It will not include any comments, but it may include arbitrary blank spaces between successive arguments, or blank lines.

Suppose that your executable program is called net6250, and the previous ITC script is called itc_1. To emulate the network of the previous example you would need to run four instances of net6250, as shown next. Note that in each command line you have to specify the NID of the corresponding node, as well as the filename of the ITC script.
At akron.cc.gatech.edu.edu run: net6250 1 itc_1
At akron.cc.gatech.edu.edu run: net6250 2 itc_1
At boise.cc.gatech.edu.edu run: net6250 3 itc_1
At dollas.cc.gatech.edu.edu run: net6250 4 itc_1

Note that the ITC script defines the initial network topology. After initialization, your routing protocol should be able to detect topology changes and react to them determining new shortest-path routes. Two types of topology changes can occur:

  1. A node failure, meaning that the process that emulates a network node, together with all its links, is aborted.
  2. A link failure, meaning that a particular link is disabled.
Both failures can be permanent or temporary (i.e., an aborted node or link can be restarted). There will be no introduction of nodes or links that were not present in the original topology.

Emulation of point-to-point links

The point-to-point links specified at the ITC script will be emulated passing UDP/IP messages between two directly connected network nodes. For instance, if node-1 is connected to node-2, any message that node-1 sends to node-2 will have to be encapsulated in a UDP/IP packet and send to node-2 through a UDP socket.

All point-to-point links will have a Maximum Transmission Unit (MTU) size of 1100 bytes (excluding the UDP/IP headers).

Given that the local area network of CoC is quite reliable, we would not expect to see any network errors (which makes the project less interesting!). To make things more challenging, we will introduce a ``garbler'' module between your network layer protocol and UDP/IP. The garbler will be introducing, according to a certain probability, the following types of errors whenever you send a packet from one node to another:

  1. Packet loss (the entire packet is lost)
  2. Packet corruption (one or more bit errors)
  3. Packet duplication

The garbler implementation will be provided by us, and it will have to be linked to your code. To use the garbler, you should replace each instance of the socket system call sendto with our ``garbled'' version sendto_garbled. The two functions have exactly the same arguments.
NOTE: Besides the garbled UDP sockets, you should not use any other communication primitives in this project.


The network and the network-control protocols

The network protocol will be using the emulated point-to-point links to transfer packets from one network node to another, even when the two nodes are not directly connected. This network layer functionality will be offered by a Datagram Network Protocol (DNP) that you will design. DNP should have the following salient features:

DNP should also offer the following two network error reporting services. These two services can be thought of as an additional protocol, called Datagram Network Control Protocol (DNCP).

  1. When a packet cannot be routed to its destination because the destination node is currently unreachable, DNCP should reply to the original sender of the packet with a ``node-unreachable'' message. The goal of this message is to notify the sender that the packet transfer was not currently possible because the destination node is unreachable.
  2. When a packet's TTL reaches zero, the network node that discards the packet should reply to the original sender of the packet with a second type of DNCP message. The goal of this ``TTL-expired'' message is to notify the sender that the packet transfer was not possible because the TTL expired. The TTL-expired message should include the NID of the node that discarded the packet.


The routing protocol

Each network node will be maintaining a routing table. The routing table gives the next-hop for each destination node. For instance, node X may learn from its routing table that a packet destined to node Y should be first forwarded to node Z (X must be directly connected to Z for this to happen). Note that DNP at node X does not know the entire route that the packet will follow to its destination Y; it simply knows what the next hop should be node Z.

The routing table will be constructed by a Link-state Routing Protocol (LRP) that you will design. LRP should have the following salient features:

Keep in mind that the LRP packets are also subject to being ``garbled'' when sent to the network. Specifically, LRP packets have to be encapsulated in DNP packets, and DNP is not reliable. So, it is important to provide LRP with reliability to losses, errors, and duplication.

You can assume that the number of network nodes will be up to 32, the number of network links up to 128, and the number of links per node up to 8.


The transport protocol

DNP offers an end-to-end packet delivery service, but it is unreliable, it does not provide flow-control, and it does not provide byte-stream communication semantics. This is the task of the Reliable Transport Protocol (RTP) that you will design.

RTP should have the following salient features:

Note: an important design decision is whether RTP will use sequence numbers based on bytes or packets. Packet-based sequence numbers are simpler, but they require fixed-length packets.


The application

At the highest layer, your project will implement a simple file transfer application. This application, called Non-blocking File Downloads (NFD) will be running at each node of the emulated network topology.

NFD will provide the user with a command-line interface. This interface should be non-blocking, in the sense that each command should return before the corresponding operation (e.g., a file transfer) is actually completed. Note that the non-blocking nature of the application is of major importance for your implementation. It means, besides other things, that your implementation will have to be multi-threaded, with one thread for each ongoing transfer.

The NFD interface will support establishment of connections to remote nodes, transfers of files, control of the window size for each connection, and a few other ``network monitoring and management'' operations.

Specifically, the following list gives the operations that the command-line interface should support. Notice that all commands and their arguments should be case-insensitive, and that arbitrary blank spaces/lines are allowed.

  1. start_service(P): this command establishes a service point at the local node, say X. The command returns a Service ID (SID) S, that is unique for each service point of node X. Other nodes can connect to S, and download files from that service point. The argument P is the maximum number of open connections that S can accept; requests for more than P connections are rejected.
    Examples of resulting messages:
    "Success: SID=123, MaxCons=3"
    "Failure: MaxCons=0 bad argument"

  2. stop_service(S): this command terminates the service point with SID S at the local node. Any open connections at that SID are aborted.
    Examples of resulting messages:
    "Success: SID=123 is terminated"
    "Failure: SID=123 does not exist"

  3. connect(Y, S, W): this command establishes a connection from the local node, say X, to the service point with SID S of the remote node Y. The window for the connection should be W (a multiple of 1000 bytes). The command returns a Connection ID (CID) C, that is unique for each open connection at service point S.
    Examples of resulting messages:
    "Success: DestNode=2, SID=123, CID=3, Win=15000B"
    "Failure: DestNode=2 does not exist or unreachable"
    "Failure: SID=123 does not exist"
    "Failure: SID=123 rejected connection request"
    "Failure: Win=260 bad argument"

  4. close(C): this command closes the connection with CID C.
    Examples of resulting messages:
    "Success: CID=3 is closed"
    "Failure: CID=3 is not open"
    "Failure: CID=3 cannot be closed"

  5. download(C, F): this command downloads a file with name F through the connection with CID C. The transfer is possible only if that connection is not currently busy with another download operation. The file F should be present at the current working directory of the process running at the remote end of the connection. After the transfer is completed, a message should report the duration, average throughput, and Round-Trip Time (RTT) of the transfer. The same connection can be used for multiple download operations.
    Examples of resulting messages:
    "Success: File=foo has been downloaded (duration=3sec, throughput=160KBps, RTT=100msec)"
    "Failure: File=foo does not exist"
    "Failure: CID=3 does not exist"
    "Failure: CID=3 is busy with other transfer"
    "Failure: CID=3 is broken"

  6. set_garbler(L, C, D): this command controls the garbler characteristics at the local network node. All arguments of this command are in integer format.
    Examples of resulting messages:
    "Success: Loss=1%, Corruption=1%, Duplication=2%"
    "Failure: Loss=120% bad argument" NOTE: We will provide you with the code for this command.

  7. route_table: this command prints the local routing table, i.e., for each network node, show the corresponding next-hop and the cost of the route in terms of number of hops.

  8. traceroute(N): this command prints the route, as a sequence of nodes, from the local node to node N. Note that the implementation of this command requires using the TTL DNP header field, and the functionality of DNCP.
    Examples of resulting messages:
    "Success: route to node-1: 2,4,6,1
    "Failure: node-1 does not exist"
    "Failure: node-1 is unreachable"

  9. link_down(N): this command disables the link from the local node to node N.
    Examples of resulting messages:
    "Success: link to node-1 is down"
    "Failure: link to node-1 does not exist"
    "Failure: link to node-1 is already down"

  10. link_up(N): this command enables the link from the local node to node N.
    Examples of resulting messages:
    "Success: link to node-1 is up"
    "Failure: link to node-1 does not exist"
    "Failure: link to node-1 is already up"

  11. debug: this command is optional. You will find it useful to have such a command, however, for debugging and monitoring your project. It can report all important information about the local node, such as active service points, opened connections, varius statistics, etc.

  12. exit: exit the program gracefully.

    Design report

    The first part of this project will be to design your overall system, and write a design report. This report will guide your implementation later, and coordinate the group's efforts. The importance of starting with a thorough design cannot be overemphasized!

    A major part of the design is the specification of all the required protocols: DNP, DNCP, LRP, RTP, and NFD. The specification of a protocol usually involves:

    Together with the specification of each protocol, your report should also describe how do you plan to implement each protocol. For instance, your implementation plan should describe things such as:

    Your report has to also include a detailed time schedule (say weekly) that you plan to follow in implementing this project. Keep in mind that large software projects such as this require a significant amount of testing and debugging after all pieces are brought together. Our suggestion is to reserve at least 20% of the overall time-frame for testing and debugging of the entire system. Also, you may want to introduce major implementation milestones. For example, you can try to have the complete project running, without the garbler, by Nov 1.

    Finally, your report should show how do you plan to distribute each implementation task among the group members. A good approach is to have two persons being responsible for each piece of code, even if one of them is the main programmer.


    Milestones, deliverables, and grading

    The project will have to adhere to the following milestones and deadlines.

    1. Form 4-member groups (email instructor): Sep 10.
    2. Design report: Oct 1 (due in class).
    3. Midpoint progress report: Nov 5.
    4. Final code electronic submission: Dec 2, 5:00pm.
    5. Demos (30 min per group): Dec 5 and 6 (tentative).

    The midpoint progress reports will be informal interviews of each group with the instructors to talk about the progress of the implementation and about any particular problems that the group faces.

    The project accounts for 40% of the overall course grade. The grade for the project will be determined at the end of the semester taking into account both the design report and the resulting implementation.