MCS-378 Lab 1: Scheduling Experiments, Fall 2002
Due: September 23, 2002
Introduction
In this lab, your group will experiment with programs that load a
Linux system with processor and/or filesystem activity. Be sure
that you kill these processes off before logging out, as you will
otherwise be quite unpopular with whoever uses the machine next, since
these processes are designed precisely to be resource hogs. If you
are running one of these programs in the foreground in a shell window
(as I suggest), you can just use control-C to kill it. Otherwise you
can use the kill command. The reason to run the programs in the
foreground, each in its own shell window, is that they print
periodic reports of how long it took to do the last batch of
iterations.
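If you do end up with one of these programs running outside the foreground, the kill command needs its process ID. The following is a minimal sketch of the idea, using sleep as a harmless stand-in for runner or writer (an assumption made only so that the example is safe to try); the variable names are hypothetical.

```shell
# Stand-in for a resource hog such as ./runner; sleep is used here only so
# that the sketch is harmless to run.
sleep 300 &
hog_pid=$!    # $! holds the process ID of the most recent background job

# Ask the process to terminate (SIGTERM is kill's default signal).
kill "$hog_pid"

# Reap the process; wait returns a nonzero status for a signal death,
# so the "|| true" keeps that from looking like a failure.
wait "$hog_pid" 2>/dev/null || true

# kill -0 sends no signal; it only tests whether the process still exists.
if kill -0 "$hog_pid" 2>/dev/null; then
    still_running=yes
else
    still_running=no
fi
echo "still running: $still_running"
```

If you no longer know the process ID, running ps -u followed by your user name lists your processes along with their IDs.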
The two programs you will use are runner.c and writer.c; each is linked
into the web version of this assignment. The first does lots of running
(keeping the processor busy) and the second does lots
of writing to a file. To compile them you would use the commands
cc -o runner runner.c
cc -o writer writer.c
To run the runner (for example) you would use
./runner
There is almost no programming to do in this lab, and the
experimental procedure is relatively simple and is spelled out for
you, so the key is going to be for your group to write a good lab
report that describes what you observed and provides some interpretation
of those observations.
Characterizing Your Experimental Environment
You are conducting a scientific experiment and writing a scientific
report. As such, it is important that you report not only the data
you obtain (and your interpretation of it), but also the conditions
under which that data was obtained. This allows readers to more fully
understand your results, and it also provides the necessary
information for anyone who wants to replicate or extend your
experiment.
You should take all your data on a single machine, since the computers
in our lab differ from one another. You should also try to be sure
little else is running on the computer; use the "top" command and
check that nothing is getting any appreciable percentage of the CPU
time.
Your report should specify the exact version of Linux you used, and a
reasonable level of detail about the hardware: processor model name
and clock speed, cache size (level 2), memory size, motherboard chip
set, and disk drive model. You can find these pieces of information
in /proc/version, /proc/cpuinfo, /proc/meminfo, /proc/pci, and
/proc/ide/hda/model. Ask if you need help locating the appropriate
information.
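One way to gather these files into a single listing for your report is sketched below. The helper name show_file is hypothetical, and the sketch assumes nothing beyond the paths listed above; a file that a given kernel does not provide is reported as missing rather than treated as an error.

```shell
# Print one file with a header, or a note if this system does not provide it.
show_file() {
    if [ -r "$1" ]; then
        printf '== %s ==\n' "$1"
        cat "$1"
    else
        printf '== %s == (not available on this system)\n' "$1"
    fi
}

# The files named in the assignment, in the order they are listed there.
for f in /proc/version /proc/cpuinfo /proc/meminfo /proc/pci /proc/ide/hda/model
do
    show_file "$f"
done
```

You would still need to pick out the relevant lines (model name, clock speed, cache size, and so on) by hand; the sketch only collects the raw material in one place.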
Experimental Procedure
- Run one, two, three, and four copies of runner and observe how long
they report they are taking (as a function of how many there are).
Also, in yet another shell window, run the top program and see what
percentage of the CPU each runner process is getting (again, as a
function of how many there are).
- Run two copies of runner and have a partner log into the same machine
and run one additional copy of runner. (Thus you will have three
copies running, two under one user and one under another.) Again,
observe their reported times and top's report of their CPU
percentages. Is Linux's scheduler giving a fair share of the CPU to
each user or to each process? Are there particular application
settings in which this design decision would seem particularly
appropriate or inappropriate?
- Run one copy of writer and observe its reported times, top's report of
its CPU percentage, and vmstat's report of the blocks written out per second.
(To run vmstat, in another shell window give the command
vmstat 10
The first line of statistics covers the whole time since the machine
was booted, and isn't useful here. Thereafter, a new line will be
output every 10 seconds (since you specified 10), reporting on activity
in the preceding 10 seconds. The column headed "bo" is the one showing
blocks written out per second.) Is the writer doing lots of actual writing to
disk? (You can also look at the machine's disk light and listen for
disk sounds.)
- Follow up on the preceding question by reading the man page for the
fsync system call, which ensures that data written to a file descriptor
is actually written to disk. (An alternative to the man page would be
to look it up in Stevens's book, which we have in the monitor room.)
Insert an appropriate call to fsync in the inner loop body of writer.c
so that it is forced to go to disk every time it writes a character.
Redo the observations from the preceding question. You may need to
reduce the number of iterations that are timed. Are the results quite
different?
- Start one runner going, and then in another shell window start one of
your modified writers and observe their performance reports and the
statistics from top and vmstat. How does each program's performance
when the two are run together compare with its performance when run
alone? Does this tell you anything about the Linux scheduler, or about
the potential utility of running multiple programs at the same time?
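For the vmstat observations in the procedure above, it can help to pull out just the "bo" column. The following is a minimal sketch, not part of the lab's provided programs. The function name bo_column is hypothetical, and it assumes that vmstat prints a header line containing the column name "bo" before its data lines.

```shell
# Read vmstat output on standard input and print only the "bo" value of
# each sample. The column is located by name in the header line rather
# than by a fixed position, since the set of columns can vary.
bo_column() {
    awk '
        # Until the header has been seen, look for the field labelled "bo".
        col == 0 {
            for (i = 1; i <= NF; i++) if ($i == "bo") col = i
            next
        }
        # Data lines start with a number; repeated header lines do not.
        $1 ~ /^[0-9]+$/ {
            samples++
            # Skip the first sample, which covers the time since boot.
            if (samples > 1) print $col
        }
    '
}
```

For example, the command
vmstat 10 4 | bo_column
would take four samples ten seconds apart and print the "bo" values of the last three.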
Instructor: Max Hailperin