MCS-378 Lab 3: Filesystem Locality, Fall 2002

Due: November 20, 2002

In this lab, your team will experimentally explore the performance implications of the ext2 filesystem's policy for choosing the location on disk of newly allocated inodes (files).

In general, it will improve performance if inodes that are commonly accessed in close succession are located near one another on disk. Because it is common to access multiple files in the same directory one after another, and in close proximity to accessing the directory itself, the ext2 filesystem normally tries to place inodes nearby to the parent directory's inode. In particular, it tries to place the new inode in the same block group as the parent directory's inode.

However, if this policy were always followed, even for (sub)directory inodes, it would try to bunch all the inodes in the whole filesystem in a single block group, which clearly isn't possible if the whole disk (consisting of multiple block groups) is going to be used. Some countervailing policy is clearly needed to spread the filesystem across the disk.

Therefore, the ext2 filesystem uses a different policy for placing a new inode if the new inode is for a (sub)directory. This alternative policy puts considerably greater emphasis on even spreading across the disk than on locality. (Recent research suggests that this was carried too far, and so this policy is being changed in the new version of Linux under development.)

These policies are implemented in the ext2_new_inode procedure in the file /usr/src/linux-2.4/fs/ext2/ialloc.c. In particular, there is a conditional statement near the top of the procedure controlled by

if (S_ISDIR(mode)) {

The two branches of this conditional statement provide the two policies. You should read this part of the procedure to understand the two policies. (The rest of the procedure concerns other mechanical details that aren't relevant to this lab.) Note that a large chunk of the first branch of this conditional statement is commented out. Also, there are lots of grungy details that I don't want to stand between you and understanding the overall policies; be sure to ask me for help. In particular, the procedures le16_to_cpu and le32_to_cpu just return unchanged the number they are given; you can pretend they aren't even there. (They exist for the sake of running Linux on other processor architectures; the goal is to always store the bytes on disk in the same order, even though processors differ on the order they use internally.)

There is almost zero programming to do in this lab and the basic experimental procedure is relatively simple and is spelled out for you (though you are invited to expand on it), so the key is going to be for your group to write a good lab report that reports what you observed and provides some interpretation of those observations.

The comment at the top of the procedure is apparently out of date, and doesn't correctly document the actual policies. Rewrite it to be correct.
Make a variant kernel (as in the prior lab) in which the policy normally used only for allocating directory inodes is instead always used. You should also make a normal, unmodified, kernel onto a boot floppy, so that you can do experimental comparisons between the two and be sure there are no other differences. (Booting off the hard disk should have no other differences, but it is best to be sure.) Each group will need three floppies: two for the kernels and one (MS-DOS formatted) for transferring files. If I get enough of the floppies from the last lab back, I'll supply them to you for this lab. Otherwise you may need to provide your own.
Now your goal is to compare the two kernels, both for performance (speed) and also to see directly that the inode placement is in fact different. The basic experimental procedure is as follows:
1. Boot one of your kernels and log in, doing nothing extraneous. You may need to wait a while for background activity to subside, particularly if your computer has been shutdown overnight.
2. Time the benchmark by doing the command
```
time sh -c 'tar Ccf /usr - lib | tar Cxf /tmp -'
```
  This copies the directory tree /usr/lib to /tmp/lib, by using the tar program twice: once to pack up all the files in the original directory, and then again to unpack them into the new directory. You should look for the elapsed (real) time reported by the time command. Looking at the CPU time as well (particularly the system part) will give you some indication whether any change in the elapsed time is coming from the actual execution of the allocation policy taking longer, or whether it is coming from increased disk access time.
3. Now measure the inode placement locality of the normal files and directories in the /tmp/lib directory tree. A separate web page explains how to do this.
4. Record all your experimental findings somewhere other than on the experimental computer's hard disk, remove all temporary data files created in the course of the previous step (measuring placement locality), and then remove the /tmp/lib directory tree itself. That way the filesystem should be back the way it started.
5. Now you can reboot again, using the same kernel or a different one, and start the procedure over from the top. You should do multiple trials with each of the kernels, ideally in a randomized order.
At this point, you have experimented with the impact on one simple benchmark of one simple change in filesystem policy. To more fully characterize how inode allocation affects performance, it would be desirable to try other benchmarks and/or other policy changes. Feel free to do any extensions of this kind, as time permits.

Instructor: Max Hailperin