For this entire lab, we will concentrate exclusively on Linux's normal
scheduling algorithm, used for processes with the scheduling policy
SCHED_OTHER. (All normal processes use this policy.) We
will ignore the real-time scheduling policies SCHED_FIFO and SCHED_RR.
The scheduler has two primary objectives:
To observe the scheduler's behavior, you will need two test programs
to run. One is just a do-nothing infinite loop, to provide a
CPU-bound process. The other is a loop that uses two operating system
calls. One is nanosleep, which asks the operating system
to put the current process to sleep for a specified number of seconds
and nanoseconds. You will use values large enough to sleep for
several seconds. This is done each time around the loop, with
essentially nothing else. Thus the process is mostly asleep (not
runnable), waking up (running) only very briefly every few seconds.
The other system call the program uses, gettimeofday,
does what it says. (You should be familiar with it from the prior
lab.) The program uses gettimeofday twice each time
around the loop - once before and once after the
nanosleep. By subtraction it computes the extra delay
(latency) beyond the requested sleep time. The program then prints
out this latency for you to see. The programs from the prior lab
should help you write the test programs you need for this lab, but if
you have trouble, please ask: this is not intended to be the focus of
the lab, so you shouldn't get stuck here. To read the on-line
documentation of the system calls, you can use a command like
man nanosleep
To run one of the programs at a specified "niceness" (priority) level,
you can use the system program called nice. (See the
on-line documentation.) For example, to run a program called
./runner at approximately half the normal priority, you
would do
nice -10 ./runner
Do some experiments as in the previous lab, using top, to
see how multiple CPU-bound processes share the CPU when they have
different priority levels (as well as when they have the same
priority). When you change the scheduler, verify that this CPU
sharing behavior remains essentially unchanged.
To experiment with responsiveness, you should first provide the sleeping process with some competition for the CPU by running one or more CPU-bound processes. You can choose how many and of what priority - this is part of your experimental design, and should be chosen to provoke the results you want. Now run one copy of the repeatedly sleeping program (again, at a priority level of your choice). Observe the latencies it encounters in waking up from its sleeps. You should repeat this several times under "identical" circumstances (same competing processes, same version of the scheduler), since it may matter when in the scheduling cycle the process starts.
Now modify the scheduler and, using the modified kernel, repeat the experiments. See below for more on how to make a modified kernel. To get full credit for this lab assignment, you need to do the following - in either order:
counter
when counter values are recalculated. Demonstrate a
circumstance under which this makes a significant difference in
latency, and explain why.
I will issue each team four diskettes. One of these diskettes is write protected, has "2.4.0-test7 orig. boot" written on it, and contains the Linux 2.4.0-test7 kernel, suitable for booting any of our course's PCs. (You put the floppy in the drive and turn the machine on.) You should leave this diskette write-protected. A second diskette for each team has written on it (and is) "MS-DOS formatted," and may contain some uninteresting Windows setup files (from OmniTech), which you can delete. You can put one of these disks in an already booted Linux machine, or equally well in one of the normal lab machines, and mount it using the command
mount /mnt/floppy
At this point, the diskette is available to you as the /mnt/floppy directory. When you are done reading or writing it, before you physically eject the diskette you should give the command
umount /mnt/floppy
You will need to mount this MS-DOS diskette in order to transport files back and forth between our experimental machines and the normal lab machines. You will need to do this in order to print files, maintain safe backup copies, etc. (Don't treat the hard drive of your experimental machine as a safe place to leave your work.) The experimental machines are not networked. The final two diskettes for each team are for you to make experimental, modified, kernel boot disks as described below.
/usr/src/linux/kernel/sched.c, which is the scheduler.
(That annotated copy is linked here as PostScript and as text.)
Read it with an eye to the specific de-tuning objectives listed above.
Your changes should be very small: just a handful of characters. This
is mostly a read-and-understand lab (with experimental testing of your
understanding, possibly resulting in revised understanding).
Remember when reading the code that we are only concerned with
the scheduling policy SCHED_OTHER. Also, you can ignore
all code that is specifically for SMP (symmetric multi-processor)
systems. In the annotated version of the code, I've indicated how to
tell which code is for SMP systems.
Note that the normal networked lab machines are running an older version of the Linux kernel than we are using on our experimental machines. (The 2.4.0-test7 version is "bleeding edge.") Thus you will need to do your editing starting from the sched.c that is on the experimental machine, not the version on the normal networked lab machines.
To rebuild the kernel after editing the sched.c file, put a blank
disk in the floppy drive and, with your current directory being
/usr/src/linux, give the command
make bzdisk
To boot your new kernel, leave that disk in the drive and give the command
reboot
Remember that kernel hacking is difficult, and debugging especially so. So don't feel bad if your modifications don't work at first. Be sure to consult with me.
Instructor: Max Hailperin