MCS-377 Lab 2: Transport Layer (Fall 2004)

Due: October 22, 2004

Objective

You and a partner will analyze the packets exchanged between pairs of computers that are using TCP to transfer a file. Because the network between each pair of computers has been specially configured to randomly discard some fraction of the packets it carries, you will be able to observe, in particular, TCP's response to lost segments. Because we have ten packet traces from a network using one loss rate and ten others from a network with a different loss rate, you will also be able to observe the impact of loss rate on throughput.

Packet traces

Students in spring of 2002 used a packet sniffer to capture the same 114602-byte file being transferred 20 times: 10 times on network A and 10 times on network B, where the two networks differ in their (artificially introduced) loss rates. The sniffer they used, tcpdump, is more primitive than ethereal, but luckily stores its packet trace files in the same format (because it is based on the same packet capture library). Thus, you'll be able to use ethereal for your analysis, as described in the next section.

The file transfer was done using raw TCP, not some application-level protocol such as FTP or HTTP. Therefore, the only data contained in the TCP segments are the file's bytes themselves, not any commands or status codes.

The students used a feature of tcpdump that saves only the first 68 bytes of each packet to the file. This keeps the trace files smaller, but still includes all the header information you need. However, you cannot expect to see the full packet contents, beyond the headers, in ethereal. In particular, you would not be able to reconstruct the file that was being transferred.

The networks (A and B) were both configured in a rather unusual fashion:

                                        discard
                                           ^
                                           | x%
                                      +----------+
                       sniffer here   | random   |
+---------+                       |   | packet   | 100-x%  +-----------+
| Sending |-----------------------+-->| selector |-------->| Receiving |
|   PC    |        +----------+       +----------+         |    PC     |
|         |<-------| random   |<--+------------------------|           |
+---------+ 100-x% | packet   |   |                        +-----------+
                   | selector |   and sniffer here too
                   +----------+
                        | x%
                        V
                     discard

This means that the packet traces you have from the sniffer contain all packets that either PC transmitted (recall, the receiver transmits ACKs). However, not all of those packets made it to their destination, with effects you will see. (In more realistic settings, the sniffer is typically downstream from some of the packet loss, and so doesn't capture all transmitted packets.)

Network A and network B differ in the percentage of packets that are discarded (i.e., artificially lost), shown as x% in the diagram. Within a particular network, both directions of traffic are subject to the same loss rate.

The packet traces are all contained in the directory /Net/solen/home/m/a/max/www-docs/courses/F2004/MCS-377/labs/lab2/traces/ . The ten from network A are a01, a02, a03, a04, a05, a06, a07, a08, a09, and a10. Similarly, the ten from network B are b01, b02, b03, b04, b05, b06, b07, b08, b09, and b10.

As described in the next section, you will do some broad analysis of all 20 traces, and then more in-depth analysis on one particular pair of traces. I will assign each lab group a number from 1 to 10. If you are lab group 5, for example, you will do your in-depth analysis on traces a05 and b05. We can later pool together the whole class's results from the in-depth analyses.

Data extraction

You will analyze the previously captured packet traces using ethereal. When you start up ethereal, it may ask you for the root password, which is needed for packet capture. Since we will be using pre-captured files, you can click on the button that says to run unprivileged.

When you open up one of the trace files in ethereal, it displays a scrollable list of all the packets, including initial connection setup, the transmission (and sometimes retransmission) of data and its acknowledgments, and the connection tear-down. Because of a quirk of the way the packet traces were collected, the connection is torn down considerably after the last of the data has been transferred and acknowledged. Therefore, you should not measure the entire duration of the TCP connection; it would be unrealistically long. Instead, you will use the time from the first data-containing packet to the time of the last.

To focus on just the data-containing packets, you can apply a display filter, tcp.len > 0. After doing this, you can locate three facts of interest:

Next, from the Statistics menu you can select TCP Stream Graph and from that submenu select Time-Sequence Graph (Stevens). By examining the resulting graph for horizontal gaps, you can identify timeouts that occurred. (I will show an example of this in class.) You should count the number of timeouts that occur in each trace file. Note that there is a huge horizontal gap between the last data transmission and the connection shutdown, due to the previously mentioned trace-collection artifact; you should not count this as a timeout. (Alternatively, you could look for the timeouts while scrolling through the list of packets. However, I think it is valuable to see the visual representation of the data, to get some qualitative feel for how the timeouts are sized and located.)

The final piece of information you should collect from all twenty trace files is the number of data-containing packets that are retransmissions. Some of these are triggered by the timeouts you just identified, but others are triggered by repeated duplicate acknowledgments, i.e., the fast retransmit feature of TCP. Ethereal contains a TCP analysis module which, if it were working correctly, would identify packets as being "retransmissions" or "fast retransmissions." However, this feature is currently buggy. This has two consequences:

Since our packet traces do not contain any genuinely out of order segments, you can find the total number of data-containing packets that are retransmissions simply by counting the total of those that are marked as "out of order" and those that are marked as either kind of retransmission. Actually, you don't have to do the counting, as ethereal will do it for you.

The first step is to apply another display filter to show just those packets that contain data and are marked as either retransmissions or out of order. (Packets marked as fast retransmissions are also marked as retransmissions, so they will be included.) This display filter can be expressed as

tcp.len > 0 && (tcp.analysis.retransmission || tcp.analysis.out_of_order)

Having done this, the "D:" number at the bottom of the window should show the quantity you are to record.

(Incidentally, ethereal is open-source software, so fixing its bugs would make a nice project for a student looking to make a contribution.)

For the two trace files that are specific to your lab group, you should clear any filtering you have applied and then examine the packets leading up to each timeout, and the first packet retransmitted upon each timeout, in order to determine the following quantities:

Data analysis

For each of the 20 traces, compute the elapsed time from first data-containing packet to last, and use that to compute the average throughput. (Recall that 114602 bytes were transmitted.)
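The throughput calculation above can be sketched in a few lines of Python. This is only an illustration: the function name and the timestamps are made up for the example, and you would substitute the times you read off from ethereal for each trace.

```python
FILE_SIZE_BYTES = 114602  # size of the transferred file, from the lab description


def average_throughput(first_data_time, last_data_time, nbytes=FILE_SIZE_BYTES):
    """Return average throughput in bytes per second.

    Both times are in seconds, taken from the first and last
    data-containing packets of one trace.
    """
    elapsed = last_data_time - first_data_time
    if elapsed <= 0:
        raise ValueError("last data packet must come after the first")
    return nbytes / elapsed


# Hypothetical timestamps (seconds), not taken from any real trace.
print(average_throughput(0.25, 10.25))
```

Whatever units you choose (bytes per second, kilobits per second), use the same ones for all 20 traces so the numbers are comparable.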

One key question is whether the two networks have significantly different throughputs. Looking at the numbers, what can you say qualitatively? Are those from one network uniformly smaller than from the other, or is there some overlap? If there is overlap, do the two groups of numbers still seem to be clearly distinguishable, or might they all plausibly be coming from the same population? You may also want to look at the numbers graphically. For example, you could do a scatter plot, with the throughputs as the y-coordinate and a network number (1 for A, 2 for B) as the x-coordinate. This will give you two vertical columns of points, one for each network. You can then see how different they look.
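If you would like a quick numeric check of the overlap question before plotting, something like the following sketch would do. The function name and the sample numbers are hypothetical; the check only compares the ranges of the two groups and is no substitute for looking at the points themselves.

```python
def ranges_overlap(a_values, b_values):
    """Return True if the [min, max] ranges of the two groups overlap."""
    low = max(min(a_values), min(b_values))    # larger of the two minimums
    high = min(max(a_values), max(b_values))   # smaller of the two maximums
    return low <= high


# Made-up throughputs, purely for illustration.
network_a = [9000.0, 11000.0, 12500.0]
network_b = [4000.0, 6000.0, 9500.0]
print(ranges_overlap(network_a, network_b))
```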

To get a quantitative handle on whether the throughputs might plausibly have come from a single population, you can use a statistical test. The simplest applicable test would be Fisher's exact test for 2x2 contingency tables. The idea is to identify which ten throughputs are the smallest ten and which ten are the largest ten. Call these "small" and "large" throughputs respectively. Then make a 2x2 table containing four counts: how many small throughputs were measured on network A, how many small ones on network B, how many large ones on A, and how many large ones on B. You can arrange this as follows:

small A    small B
large A    large B

Now take these four numbers and plug them into a program for computing Fisher's exact test. If the resulting two-tailed p value is very small, then you have strong evidence that networks A and B really do have different throughputs. You can find programs for this test on the web, in the form of interactive web pages.
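If you prefer to compute the test yourself, the following is a minimal sketch using only Python's standard library. It assumes the 20 throughputs are all distinct (so the split into smallest ten and largest ten is unambiguous), and it uses one common definition of the two-tailed p value: the sum of the probabilities of all tables, with the same row and column totals, that are no more likely than the observed one. The function names are made up for this example, and a web calculator that uses a different two-tailed convention may give a slightly different number.

```python
from math import comb


def median_split_table(a_values, b_values):
    """Return [[small_A, small_B], [large_A, large_B]] for two throughput lists."""
    labeled = sorted([(t, "A") for t in a_values] + [(t, "B") for t in b_values])
    half = len(labeled) // 2
    small = [net for _, net in labeled[:half]]   # the smallest half
    large = [net for _, net in labeled[half:]]   # the largest half
    return [[small.count("A"), small.count("B")],
            [large.count("A"), large.count("B")]]


def fisher_two_tailed(table):
    """Two-tailed p value of Fisher's exact test for a 2x2 table."""
    (a, b), (c, d) = table
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x,
        # given the fixed row and column totals.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small tolerance so that ties with the observed table are included.
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))


# Hypothetical throughputs in which every A value exceeds every B value.
table = median_split_table(range(11, 21), range(1, 11))
print(table, fisher_two_tailed(table))
```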

Next, you can calculate the fraction of data-containing packets that are retransmissions, and the ratio of timeouts to data-containing packets. Are the two networks distinct in these regards? If they are, is the network with more retransmissions the same as the one with more timeouts and the same as the one with lower throughput, as one would expect?

For the two traces where you examined the timeouts in depth, calculate the total time spent in the timeouts. Suppose these timeouts were eliminated, but the traces were otherwise unchanged. How much higher would each of the throughputs have been?
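The "what if the timeouts were eliminated" calculation might look like the sketch below. The function name, the elapsed time, and the timeout durations are all hypothetical; you would plug in the values you measured for your group's two traces.

```python
FILE_SIZE_BYTES = 114602  # size of the transferred file, from the lab description


def throughput_without_timeouts(elapsed, timeout_durations, nbytes=FILE_SIZE_BYTES):
    """Throughput in bytes/second if the time spent in timeouts were removed.

    elapsed is the time in seconds from first to last data-containing packet;
    timeout_durations is a list of the individual timeout lengths in seconds.
    """
    remaining = elapsed - sum(timeout_durations)
    if remaining <= 0:
        raise ValueError("timeouts cannot account for the whole transfer")
    return nbytes / remaining


# Made-up numbers: a 10-second transfer containing two timeouts.
actual = FILE_SIZE_BYTES / 10.0
improved = throughput_without_timeouts(10.0, [1.0, 1.5])
print(actual, improved, improved / actual)
```

The last printed value is the factor by which the throughput would have increased.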

Also, for the two traces you examined in depth, calculate an estimate of the packet loss rate (i.e., the x% from the diagram at the start of this lab description). You identified the retransmissions that resulted from lost ACKs. Assume that the other retransmissions correspond one-to-one with lost data-containing packets. (This is only approximately true.) Thus, by subtracting the number of lost-ACK timeouts from the number of retransmissions, you have an approximate number of lost data-containing packets. Divide this by the total number of data-containing packets to get a percentage. We will later pool these estimates together to get class-wide aggregate estimates of network A's and network B's loss rates.
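As a worked illustration of this estimate, here is a short Python sketch. The function name and the counts are invented for the example, and the result inherits the approximation described above.

```python
def estimate_loss_rate(retransmissions, lost_ack_timeouts, data_packets):
    """Approximate packet loss rate as a fraction between 0 and 1.

    retransmissions:   data-containing packets that are retransmissions
    lost_ack_timeouts: timeouts attributed to lost ACKs rather than lost data
    data_packets:      total number of data-containing packets in the trace
    """
    lost_data_packets = retransmissions - lost_ack_timeouts
    return lost_data_packets / data_packets


# Hypothetical counts, not from any real trace.
rate = estimate_loss_rate(retransmissions=12, lost_ack_timeouts=2, data_packets=100)
print(f"{rate:.1%}")
```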

Report

Be sure that your report does not assume the reader already knows what you did. You may assume reasonable background knowledge of networking, and should refer to external sources of information (such as RFCs or program documentation) where appropriate. Present quantitative data clearly, using well formatted tables (e.g., align decimal points) and graphs.


Course web site: http://www.gac.edu/~max/courses/F2004/MCS-377/
Instructor: Max Hailperin <max@gac.edu>