MCS-378 Lab 1: Scheduling Experiments, Fall 2002

Due: September 23, 2002

Introduction

In this lab, your group will experiment with programs that load up the Linux system with processor and/or filesystem activity. Be sure to kill these processes before logging out; they are designed precisely to be resource hogs, so you will otherwise be quite unpopular with whoever uses the machine next. If you are running one of these programs in the foreground in a shell window (as I suggest), you can just use control-C to kill it. Otherwise, you can use the kill command. The reason to run the programs in the foreground, each in its own shell window, is that they print periodic reports of how long it took to do the last batch of iterations.

The two programs you will use are runner.c and writer.c; each is linked into the web version of this assignment. One does lots of running and the other does lots of writing to a file. To compile them you would use the commands

cc -o runner runner.c
cc -o writer writer.c
To run the runner (for example), you would use the command
./runner

There is almost no programming to do in this lab, and the experimental procedure is relatively simple and spelled out for you, so the key will be for your group to write a good lab report that describes what you observed and provides some interpretation of those observations.

Characterizing Your Experimental Environment

You are conducting a scientific experiment and writing a scientific report. As such, it is important that you report not only the data you obtain (and your interpretation of it), but also the conditions under which that data was obtained. This allows readers to more fully understand your results, and it also provides the necessary information for anyone who wants to replicate or extend your experiment.

You should take all your data on a single machine, since the computers in our lab differ from one another. You should also try to be sure little else is running on the computer; use the "top" command and check that nothing is getting any appreciable percentage of the CPU time.

Your report should specify the exact version of Linux you used, and a reasonable level of detail about the hardware: processor model name and clock speed, cache size (level 2), memory size, motherboard chipset, and disk drive model. You can find these pieces of information in /proc/version, /proc/cpuinfo, /proc/meminfo, /proc/pci, and /proc/ide/hda/model. Ask if you need help locating the appropriate information.

Experimental Procedure

  1. Run one, two, three, and four copies of runner and observe how long they report they are taking (as a function of how many there are). Also, in yet another shell window, run the top program and see what percentage of the CPU each runner process is getting (again, as a function of how many there are).
  2. Run two copies of runner and have a partner log into the same machine and run one additional copy of runner. (Thus you will have three copies running, two under one user and one under another.) Again, observe their reported times and top's report of their CPU percentages. Is Linux's scheduler giving a fair share of the CPU to each user or to each process? Are there particular application settings in which this design decision would seem particularly appropriate or inappropriate?
  3. Run one copy of writer and observe its reported times, top's report of its CPU percentage, and vmstat's report of the blocks written out per second. (To run vmstat, in another shell window give the command
    vmstat 10
    
    The first line of statistics covers the time since the machine was booted and isn't useful. However, thereafter a new line will be output every 10 seconds (since you specified 10) reporting on activity in the preceding 10 seconds. The column headed "bo" is the one showing blocks written out per second.) Is the writer doing lots of actual writing to disk? (You can also look at the machine's disk light and listen for disk sounds.)
  4. Follow up on the preceding question by reading the man page for the fsync system call, which ensures data written to a file descriptor is actually written to disk. (An alternative to the man page would be to look it up in Stevens's book, which we have in the monitor room.) Insert an appropriate call to fsync in the inner loop body of writer.c so that it is forced to go to disk every time it writes a character. Redo the observations from the preceding question. You may need to reduce the number of iterations that are timed. Are the results quite different?
  5. Start one runner going, and then in another shell window start one of your modified writers and observe their performance reports and the statistics from top and vmstat. How does each program's performance when they are run together compare with its performance when run alone? Does this tell you anything about the Linux scheduler, or about the potential utility of running multiple programs at the same time?


Instructor: Max Hailperin