In this lab, your team should carry out an experimental project that is essentially Programming Project 6.2 from the textbook, though the list of items below specifies some variations from what is in the book. You should then write a report that explains what you did (including describing your experimental computer system), the results you observed, and your analysis of those results.
Contrary to the book, don't bother experimenting with programs that access a memory-mapped file in random order. Although that can be interesting, you'll have enough to do just doing a high-quality job of analyzing the performance of programs that read sequentially.
Rather than only experimenting with two madvise settings, sequential and random, you should also include a third option: the "normal" style of prepaging (which you can explicitly set, though it is also the default if you don't use madvise at all).
Given that you will only be experimenting with a single kind of program behavior (sequential access), the only variation will be in the madvise setting. As such, you might want to consider the option of having a single, unified program that takes an argument specifying the setting to use. If you'd rather follow the book's suggestion of using separate (but nearly identical) programs, that's OK too.
There is no need for you to write a program from scratch, if you don't want to. Instead, you can get a big head start on the programming by using the cpmm.cpp program from Chapter 8 as a starting point.
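If you do take the unified-program approach, here is a minimal sketch of what such a program might look like. It is not cpmm.cpp and is not taken from the book; the overall structure, the mode names (normal, sequential, random), and the choice to touch one byte per page are all just illustrative assumptions.

// Illustrative sketch only: map a file, apply the madvise setting named
// on the command line, and read the mapping sequentially.
#include <fcntl.h>      // open
#include <sys/mman.h>   // mmap, madvise, MADV_*
#include <sys/stat.h>   // fstat
#include <unistd.h>     // close, sysconf
#include <cstdio>       // perror
#include <iostream>
#include <string>

int main(int argc, char *argv[]) {
    if (argc != 3) {
        std::cerr << "usage: " << argv[0] << " normal|sequential|random file\n";
        return 1;
    }
    std::string mode = argv[1];
    int advice;
    if (mode == "normal") {
        advice = MADV_NORMAL;
    } else if (mode == "sequential") {
        advice = MADV_SEQUENTIAL;
    } else if (mode == "random") {
        advice = MADV_RANDOM;
    } else {
        std::cerr << "unrecognized mode: " << mode << "\n";
        return 1;
    }

    int fd = open(argv[2], O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    struct stat st;
    if (fstat(fd, &st) < 0) {
        perror("fstat");
        return 1;
    }
    size_t length = st.st_size;

    void *addr = mmap(nullptr, length, PROT_READ, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    if (madvise(addr, length, advice) < 0) {
        perror("madvise");
        return 1;
    }

    // Read the mapping sequentially, touching one byte per page; printing
    // the running sum keeps the compiler from optimizing the reads away.
    const char *bytes = static_cast<const char *>(addr);
    size_t page_size = sysconf(_SC_PAGESIZE);
    unsigned long sum = 0;
    for (size_t i = 0; i < length; i += page_size) {
        sum += static_cast<unsigned char>(bytes[i]);
    }
    std::cout << "sum of sampled bytes: " << sum << "\n";

    munmap(addr, length);
    close(fd);
    return 0;
}

Touching one byte per page is enough to force every page of the file to be faulted in; reading every byte instead would mainly add user-mode CPU time without much changing the paging behavior.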
The book says to reboot the computer before each run. Although that would be ideal, it slows down the experimentation a lot. Given that you should do a bunch of runs with each of the three paging modes, and should still have time after the experimentation to do some substantial analysis and writing, you will be glad to know that a shortcut is possible. In my experience, as long as the file you are reading is large (for example, 1 GB), repeated runs without rebooting don't behave very differently from the first run. That is, the retention of pages from prior runs in memory is small enough relative to the file size to not make much difference. However, if you are doing multiple runs without rebooting, then it becomes even more important than usual to scramble together the repetitions of the three different paging modes. If you do all the runs of one, then all the runs of another, and finally all the runs of the third, you're not treating the three fairly and may have introduced a confounding factor in your results.
To produce a big file to use for your testing, you can use a command like
dd if=/dev/zero of=/tmp/bigfile bs=1M count=1K
This copies bytes from one file to another. The input file is /dev/zero, an endless source of zero bytes. The output file is /tmp/bigfile. Be sure you use a pathname in /tmp rather than in your home directory; that way the file will be on the local hard disk rather than on the network file server. The block size is 1M; that is, the zero bytes will be read in and written back out in chunks of 1 MB at a time. The count of 1K means that 1024 of the 1 MB blocks are written, for a total of 1 GB.
Don't just measure the total elapsed time for each run. Instead, record more detailed information: the user and system CPU times, the elapsed time, and the number of major and minor page faults. You can get all this information by prefacing the command you want to time with \time. Be sure to include the backslash; otherwise you will be using the shell's built-in timing command, which doesn't report all of this information.
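For example, if the unified program sketched earlier were compiled to an executable named readseq (a name chosen here purely for illustration), timed runs that interleave the three modes, as recommended above, could be driven by a shell loop along these lines:

# Five repetitions of each mode is just an example; adjust as needed.
for i in 1 2 3 4 5; do
    for mode in normal sequential random; do
        echo "repetition $i, mode $mode"
        \time ./readseq $mode /tmp/bigfile
    done
done

The report that \time writes to standard error after each run includes the user and system CPU times, the elapsed time, and the counts of major and minor page faults, which is exactly the information called for above. You may also want to vary the order of the three modes from one repetition to the next rather than always running them in the same order.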
In analyzing your data for your report, here are some questions you should address:
How big a difference does prepaging make? Be sure you put the size of any differences you measure into the context of how much run-to-run variation you observe.
Is the normal mode more like sequential or more like random?
If the prepaging mode affects the elapsed time, can you tell where the difference comes from by looking at how much of the elapsed time is user-mode CPU time, how much is system CPU time, and how much is the remainder, which is presumably mostly time spent waiting for disk? If a combination of factors is at play, be specific about how much each contributes.
How well does the number of major page faults correlate with each of the various measures of time?
Does the sum of major and minor page faults remain approximately constant, as one might expect?
Within the CPU time, it is easy to see why the system time (spent in the operating system kernel) might vary. But any variation in the user time (spent running instructions from the application program itself) is harder to explain. Perhaps any variation in user time is just a misattribution error; the operating system may not be precisely accounting for how much of the time is spent in each mode. Is there any evidence to support this idea?
Instructor: Max Hailperin