MCS-394 Lab 2: Transport Layer (Spring 2002)

Due: March 27, 2002

Objective

You and a partner will capture and analyze the packets exchanged between two computers that are using TCP to transfer a file. Because the computers have been specially configured to randomly discard some fraction of the TCP segments they receive, you will be able in particular to observe TCP's response to lost segments. You will identify interesting events within the packet trace, estimate the loss rate, and calculate the delivered data transfer rate. Hopefully you will have enough time to do this for more than one loss rate.

Data Acquisition

You will use either of two clusters of three computers to gather your data. Later, if you have time after analyzing your first packet trace, you can use the other cluster to gather a second set of packets. (The two clusters are configured with different loss rates.)

Both clusters of machines are located in the back right-hand corner of OHS 329, as seen entering the room from the hallway. Each machine has a piece of masking tape on it identifying the machine's IP address and indicating that it is for MCS-394 only. The machines in one cluster are 10.0.0.1, 10.0.0.2, and 10.0.0.3. The machines in the other cluster are 10.0.0.4, 10.0.0.5, and 10.0.0.6. Each cluster also has an Ethernet hub, to which the three computers in that cluster are connected.

Within each of the clusters, the three machines are configured identically (aside from IP address). Therefore, you can use any of the three to send the file, either of the remaining two to receive the file, and the third computer to do the packet trace capture. When sending the file, you will need to specify the IP address to send it to, so be sure to note that address. You will need to log into each machine as root (the "super user"), with the password I divulge in class. (Actually, in the first lab period, we are likely to have a whole queue of lab groups using the machines, so it will make sense to leave them logged in.) For the packet capture program to work, you need super-user privileges on that machine. The TCP sending and receiving could be done perfectly well as a normal user, except that I haven't bothered to create any normal user accounts on these machines.

The order in which you give the commands is somewhat important; you need to have the file-receiving command and the packet-capturing command running before you do the file-sending command. (Otherwise, you will get a "connection refused" error, if you aren't running the receiving program, or won't capture all the packets, if you aren't running the capturing program.)

On the machine where you want to receive the file, you will use the nc program, also known as netcat. This is a very general program for communicating using TCP (or UDP). You can look at the documentation for the full story, but a suitable command line would be

nc -n -l -p 6789 >/dev/null

This will listen for a connection on port 6789 and put everything received into the "file" /dev/null. (The special "file" /dev/null isn't actually a file at all, but rather a bottomless pit into which bytes can be put to discard them. If you really wanted a copy of the sent file, you could redirect output wherever you wanted the copy.)

On the machine where you want to capture the packets, you will use the tcpdump command. Again, this program has lots of options, which you can read about in the documentation. All you want to do now is capture the packets in a raw, binary form; you can do the human-readable analysis later on a different computer. Therefore, a command like the following will suffice:

tcpdump -w /tmp/trace1

Note the 1 on the end of the filename; my suggestion is that each time you capture a trace, you increment this number, while keeping notes somewhere of what circumstances each trace was collected under.
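
One way to follow this numbering suggestion without overwriting an earlier trace is to let the shell find the first unused number. The following is only a sketch: the function name next_trace is made up for this example, and it assumes your traces are named trace1, trace2, and so on within a single directory.

```shell
# next_trace DIR: print the first name of the form DIR/traceN (N = 1, 2, ...)
# that does not exist yet.  next_trace is a hypothetical helper, not a
# standard command.
next_trace() {
    dir=$1
    n=1
    while [ -e "$dir/trace$n" ]; do
        n=$((n + 1))
    done
    echo "$dir/trace$n"
}
```

With this defined, you could start a capture using tcpdump -w "$(next_trace /tmp)". You would still need to keep your own notes on the circumstances of each trace.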

On the machine where you are sending the file, you can use nc again, but with a different command line. You could use a command like

nc -n -w 3 10.0.0.x 6789 </usr/lib/libcrypt.a

but with the x replaced by the appropriate number (in the range 1-6) to complete the IP address of the receiving machine. This will open a TCP connection to port 6789 on that machine and transmit the contents of /usr/lib/libcrypt.a. (I chose this file because it seems to be a reasonable length. If you want more data to analyze, you could send a longer file.) After the file is transmitted, both nc programs will exit back to the respective shell prompts. That is your sign that it is safe to end the tcpdump.

To stop capturing packets with tcpdump, you can type a control-C. Then insert a DOS-formatted floppy into the capturing machine's drive and "mount" it using the command

mount /mnt/floppy

Now you can move your captured data onto the floppy and then unmount the floppy, using the commands

mv /tmp/trace1 /mnt/floppy
umount /mnt/floppy

Be sure to wait until the floppy drive's light goes out before you eject the disk.

Data Analysis

Be sure to leave a copy of your trace files on the floppy, as well as copying them into your home directory for analysis. This is because you will need to submit your floppy along with your lab report, so that I can easily check your work. To copy the files into your home directory, insert the floppy in one of the normal computers, and mount it as indicated above. Copy the file over, for example by using a command such as

cp /mnt/floppy/trace1 .

and then unmount the floppy, again using the command listed above. (The example cp command ends with a space and period, to specify copying into the current directory.)

You can now run tcpdump again, with different command-line options, in order to get a human-readable version of the packet trace. On our normal machines, we don't have tcpdump installed in the standard search path, so you will have to specify the pathname of tcpdump in my MCS-394 directory. A typical command would be

~max/MCS-394/tcpdump -r trace1 >trace1.out

After doing this, you can look at trace1.out, either on the screen or by printing it out. In principle, you don't need any more tools than your eyes and your brain. In practice, you may want to do what the professionals do, and use the computer to help you locate interesting patterns in the data.

There are a variety of general-purpose tools that may be helpful in the data analysis. Each of these programs has a man page describing it, and there is also documentation in some of the books in the lab monitor room, such as Linux: The Textbook. You can also ask for help. The commands below are just examples, not intended to imply what you will actually want to do. Each of these programs reads from standard input and writes to standard output. You can read from or write to a file by using < or >, and can send the output from one program directly into the input of another program using |. The first example below selects only lines containing one or more digits, a colon, and then again one or more digits. The second replaces the string "foo bar" in each line that contains it with "baz". The third replaces everything from the first colon to the end of the line with nothing (i.e., deletes it). The fourth sorts lines in numerical order, assuming they start with numbers. The fifth eliminates all lines that occur only once, that is, lines without an immediately adjacent duplicate.

grep -E '[0-9]+:[0-9]+'
sed 's/foo bar/baz/'
sed 's/:.*//'
sort -n
uniq -D
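
To show how these tools can be chained with |, here is one possible pipeline that lists the sequence-number ranges appearing more than once in a trace, which are candidates for retransmissions. Treat it as a sketch under assumptions: the sample lines are made up for illustration (they are not from a real trace), they imitate the classic tcpdump text format of first:last(nbytes), and the function name repeated_ranges is hypothetical. It uses uniq -d, which prints one copy of each repeated line, rather than -D, which prints every copy.

```shell
# Made-up sample lines in the style of tcpdump's text output; they are not
# from a real trace, and your tcpdump's format may differ in detail.
sample='12:00:00.000100 10.0.0.1.1025 > 10.0.0.2.6789: . 1:1449(1448) ack 1 win 5840
12:00:00.000300 10.0.0.1.1025 > 10.0.0.2.6789: . 1449:2897(1448) ack 1 win 5840
12:00:00.000900 10.0.0.2.6789 > 10.0.0.1.1025: . ack 1449 win 8688
12:00:00.210000 10.0.0.1.1025 > 10.0.0.2.6789: . 1449:2897(1448) ack 1 win 5840
12:00:00.210700 10.0.0.2.6789 > 10.0.0.1.1025: . ack 2897 win 8688'

# repeated_ranges (a hypothetical helper): read trace text on standard input
# and print each first:last sequence range that appears more than once.
repeated_ranges() {
    # Keep only lines carrying a first:last(nbytes) field (data segments),
    # reduce each to its first:last range, sort so that equal ranges become
    # adjacent, and keep one copy of each range that is repeated.
    grep -E '[0-9]+:[0-9]+\([0-9]+\)' |
    sed 's/.* \([0-9][0-9]*:[0-9][0-9]*\)([0-9]*).*/\1/' |
    sort -n |
    uniq -d
}

printf '%s\n' "$sample" | repeated_ranges    # prints 1449:2897
```

On a real trace you would run something like repeated_ranges <trace1.out instead. A repeated range only tells you that a segment was sent again; deciding why still takes a look at the surrounding packets.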

Whatever tools you choose to use to help you with the analysis, there is one anomaly that you need to ignore. As an artifact of how the nc program is working, there will be a long delay (several seconds) between when the last data is acknowledged and when the FIN packets are exchanged to shut down the connection. If you were to include that delay, the throughput would seem much worse than it really is, through no fault of TCP's. (The nc program simply delays closing the connection.) Therefore, only measure the elapsed time to the last ACK of data. Here are some key items you should be looking for:

How many bytes of data were sent, not including duplicates? How long did it take to send this data? Hence, what was the delivered throughput?
How many timeouts occurred? How long were the pauses these typically introduced? Were there any extra-long pauses due to a segment timing out a second time? What portion of the total elapsed time was spent in these timeout pauses? How high would the throughput have been without them?
What fraction of the original data segments were retransmitted? Which specific ones? Were any repeatedly retransmitted?
For which of the retransmissions does the evidence suggest loss of the original data segment (or of a previous retransmission of the data segment) as the cause? For which ones does a lost ACK seem to be the cause? For which ones are you unable to identify a cause?
What fraction of the total data segments transmitted (both original ones and retransmissions) are apparently being discarded?
For what fraction of the ACK segments do you have evidence of the segment being discarded? This fraction will probably be smaller than for data segments, not because ACK segments are less likely to be discarded, but rather because sometimes they are discarded without any obvious symptoms resulting. Why is this?
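
The arithmetic behind the first two items can be sketched with awk. This is only an illustration under assumptions: the function names throughput and pauses are made up, the sample lines and byte count are invented rather than measured, timestamps are assumed to have the form hh:mm:ss.ffffff at the start of each line, and the trace is assumed not to cross midnight. Throughput is simply the number of distinct data bytes divided by the elapsed time up to the last ACK of data.

```shell
# throughput BYTES T0 T1: print bytes per second, where T0 and T1 are the
# timestamps at which you start and stop the clock (T1 being the last ACK of
# data, not the FIN exchange).  A hypothetical helper.
throughput() {
    awk -v bytes="$1" -v t0="$2" -v t1="$3" '
        function secs(t,   p) { split(t, p, ":"); return p[1]*3600 + p[2]*60 + p[3] }
        BEGIN { printf "%.0f\n", bytes / (secs(t1) - secs(t0)) }'
}

# pauses LIMIT: read trace text on standard input and, for each gap between
# consecutive packets longer than LIMIT seconds, print the gap length and the
# timestamp of the packet that ended it.  A hypothetical helper.
pauses() {
    awk -v limit="$1" '
        function secs(t,   p) { split(t, p, ":"); return p[1]*3600 + p[2]*60 + p[3] }
        { now = secs($1)
          if (NR > 1 && now - prev > limit) printf "%.3f %s\n", now - prev, $1
          prev = now }'
}

# Made-up sample lines, not taken from a real trace.
trace='12:00:00.000300 10.0.0.1.1025 > 10.0.0.2.6789: . 1449:2897(1448) ack 1 win 5840
12:00:00.210000 10.0.0.1.1025 > 10.0.0.2.6789: . 1449:2897(1448) ack 1 win 5840
12:00:00.210700 10.0.0.2.6789 > 10.0.0.1.1025: . ack 2897 win 8688'

throughput 100000 12:00:00.000000 12:00:04.000000   # prints 25000
printf '%s\n' "$trace" | pauses 0.1                 # prints 0.210 12:00:00.210000
```

A long gap is only a candidate timeout. You still have to check what was sent just after it, and choose a threshold that suits the pauses you actually see in your trace.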

Assuming you have time to experiment with the other cluster of three machines, you should be sure to compare them. How does the different loss rate impact the delivered throughput?

Report

Be sure that your report does not assume the reader already knows what you did. You may assume reasonable background knowledge of networking, and should refer to external sources of information (such as RFCs or program documentation) where appropriate. Remember to also submit the floppy with your trace files on it, and be sure that floppy is labeled with your names.

Possible Extensions

There are lots of additional investigations you could do. If you want to do a run with the artificial segment loss turned off, to see how much higher the throughput is, I can easily arrange that for you. I can also turn on selective acknowledgement, if you want to see what difference (if any) that makes. Of course, to get rigorous scientific evidence, you probably should use more than a single relatively short file transmission.

Course web site: http://www.gac.edu/~max/courses/S2002/MCS-394/
Instructor: Max Hailperin <max@gac.edu>