MC78 Lab 3: A File System (Fall 1996)
Due: November 12, 1996
Goals of the lab
In this lab your team will read and understand a primitive file system
implementation and make the following improvements to it:
-
Add synchronization where necessary so that multiple threads can
safely use the file system concurrently. (As provided to you, the file
system assumes it is only being used by one thread.)
-
The filesystem's data structures must be protected against damage from
races.
-
You must also ensure that threads can share a file. Each thread that
separately opens the file should have its own seek position within the
file, so that two threads can each read the same file sequentially,
independently of one another. All file system operations must be atomic
and serializable.
For example, it must never happen that a read observes half the data
written in a single Write as having been written but the other half
not. Moreover, if a Write finishes and then a Read starts, the Read
must observe the effect of the Write.
-
When a file is deleted, threads with the file already open may
continue to read and write the file until they close the file.
Deleting a file (FileSystem::Remove) must prevent further
opens on that file, but the disk blocks for the file cannot be
reclaimed until the file has been closed by all threads that currently
have the file open.
Hint: to do this part, you will probably find you need to maintain
a table of open files.
-
Modify the file system to allow the maximum size of a file to be as large
as the disk (128Kbytes). In the basic file system, each file is limited
to a file size of just under 4Kbytes. Each file has a header
(class FileHeader) that is a table of direct pointers to the disk blocks
for that file. Since the header is stored in one disk sector, the
maximum size of a file is limited by the number of pointers that will
fit in one disk sector. Increasing the limit to 128Kbytes will probably,
but not necessarily, require you to implement doubly indirect blocks.
-
Implement extensible files. In the basic file system, the file
size is specified when the file is created. One advantage of this
is that the FileHeader data structure, once created, never changes.
In UNIX and most other file systems, a file is initially created
with size 0 and is then expanded every time a write is made off the
end of the file. Modify the file system to allow this; as one test
case, allow the directory file to expand beyond its current limit
of ten files. In doing this part, be careful that concurrent
accesses to the file header remain properly synchronized.
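The hint about a table of open files can be sketched as follows. This is only a minimal sketch, not the Nachos code: the class name OpenFileTable, its three methods, and the use of std::mutex (standing in for the Lock class from your prior lab) are all assumptions made so the example stands alone. Files are keyed by header sector number, on the assumption that the header sector uniquely identifies a file.

```cpp
#include <cassert>
#include <map>
#include <mutex>

// Hypothetical bookkeeping for one file that some thread has open.
struct OpenEntry {
    int openCount = 0;           // number of opens not yet closed
    bool removePending = false;  // Remove was called while still open
};

// Hypothetical table of open files, keyed by header sector number.
// std::mutex stands in for the Nachos Lock class from the prior lab.
class OpenFileTable {
  public:
    // Record an open; fails if the file has already been removed.
    bool Open(int headerSector) {
        std::lock_guard<std::mutex> guard(lock);
        OpenEntry &e = table[headerSector];
        if (e.removePending) return false;
        e.openCount++;
        return true;
    }

    // Record a close; returns true if the caller should now reclaim
    // the file's disk blocks (the last close of a removed file).
    bool Close(int headerSector) {
        std::lock_guard<std::mutex> guard(lock);
        auto it = table.find(headerSector);
        if (it == table.end()) return false;  // not open: nothing to do
        if (--it->second.openCount > 0) return false;
        bool reclaim = it->second.removePending;
        table.erase(it);
        return reclaim;
    }

    // Record a removal; returns true if the blocks can be reclaimed
    // immediately because no thread has the file open.
    bool Remove(int headerSector) {
        std::lock_guard<std::mutex> guard(lock);
        auto it = table.find(headerSector);
        if (it == table.end()) return true;
        it->second.removePending = true;
        return false;
    }

  private:
    std::mutex lock;
    std::map<int, OpenEntry> table;
};
```

In this design the table only answers the question "may the blocks be reclaimed now?"; actually freeing the sectors and updating the directory would remain the job of the file system code that calls it.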
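To see why reaching the 128Kbyte limit will probably call for doubly indirect blocks, here is the arithmetic as a sketch. The 128Kbyte disk size and the just-under-4Kbyte limit come from the text above; the 128-byte sector size, the 4-byte sector numbers, and the two bookkeeping words in the header are assumptions about the provided code that you should check against disk.h and filehdr.h.

```cpp
#include <cassert>

// Assumed values; verify against disk.h and filehdr.h.
constexpr int kSectorSize = 128;       // bytes per sector (assumed)
constexpr int kPointerSize = 4;        // bytes per sector number (assumed)
constexpr int kDiskSize = 128 * 1024;  // 128Kbytes, from the lab text

// Assumed header layout: two bookkeeping words, then direct pointers.
constexpr int kNumDirect = (kSectorSize - 2 * kPointerSize) / kPointerSize;
// Number of sector numbers that fit in one indirect block.
constexpr int kPointersPerBlock = kSectorSize / kPointerSize;

// Largest file if every header slot points at a data sector.
constexpr int kMaxDirect = kNumDirect * kSectorSize;
// Largest file if every header slot points at a singly indirect block.
constexpr int kMaxSingle = kNumDirect * kPointersPerBlock * kSectorSize;
// Largest file if one header slot points at a doubly indirect block
// and the remaining slots stay direct.
constexpr int kMaxOneDouble = (kNumDirect - 1) * kSectorSize
    + kPointersPerBlock * kPointersPerBlock * kSectorSize;

static_assert(kMaxDirect == 3840, "just under 4Kbytes");
static_assert(kMaxSingle < kDiskSize, "single indirection falls short");
static_assert(kMaxOneDouble >= kDiskSize, "one doubly indirect block suffices");
```

Under these assumptions, single indirection alone tops out at 122880 bytes, a little short of the disk size. Other designs, such as chaining several header sectors together, could also reach the limit, which is why doubly indirect blocks are likely but not strictly required.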
Origin of this lab
The code and assignment for this lab come from the Nachos simulated
operating system developed by Tom Anderson at Berkeley for educational
use.
The files
The files you will need for this lab are in the directory
~max/www-docs/MC78/lab3/code. The simplest thing for you
to do is to make a copy of the whole directory by running the following
command in a shell window:
cp -pr ~max/www-docs/MC78/lab3/code .
This will give you your own code directory as a
subdirectory of whatever directory you were in when you ran the cp
command. The relevant files are all in the filesys
subdirectory. The most interesting are the following (you already have
printouts of all of these):
-
fstest.cc - this contains the code to test out the filesystem, and as
such exemplifies its use
-
filesys.h and filesys.cc - this is the main interface to the filesystem
-
directory.h and directory.cc - this manages directory files that handle
the mapping of file names to file header locations
-
filehdr.h and filehdr.cc - this is essentially the ``inode'' for the
Nachos filesystem, a disk block showing which disk blocks the file
occupies
-
openfile.h and openfile.cc - this is the class that you get when you
open a file, and on which you then perform the read and write
operations; it maps those file reads and writes into appropriate disk
sector reads and writes
-
synchdisk.h - this is the interface below the filesystem, providing
synchronous access to a simulated disk (represented by the Unix file
DISK); synchronous access means that rather than starting a disk access
going and then being interrupted when it completes, the thread blocks
until the access is done
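As an illustration of what it means for openfile.cc to map file reads and writes onto disk sector reads and writes, here is a small sketch of the index arithmetic involved. The SectorSpan struct and the SpanFor function are hypothetical names, and the 128-byte sector size is an assumption; consult openfile.cc and disk.h for what the provided code really does.

```cpp
#include <cassert>

constexpr int kSectorSize = 128;  // assumed; check disk.h

// Hypothetical description of the sectors touched by one read or write.
struct SectorSpan {
    int firstSector;  // index (within the file) of the first sector touched
    int lastSector;   // index (within the file) of the last sector touched
    int startOffset;  // byte offset of the range within the first sector
};

// Compute which file-relative sectors cover the bytes
// [position, position + numBytes). Requires position >= 0, numBytes > 0.
SectorSpan SpanFor(int position, int numBytes) {
    SectorSpan s;
    s.firstSector = position / kSectorSize;
    s.lastSector = (position + numBytes - 1) / kSectorSize;
    s.startOffset = position % kSectorSize;
    return s;
}
```

For example, under the assumed sector size, a 100-byte access starting at position 100 touches file sectors 0 and 1, beginning 100 bytes into sector 0. The file header then translates each file-relative sector index into an actual disk sector number.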
Compiling and testing the program
Since the given code already implements a marginally working
filesystem, all you need to do is to replace code/threads/synch.h
and code/threads/synch.cc with the ones from your prior lab,
and then you can cd to code/filesys and do the command make.
This will recompile/assemble/link everything, resulting in a program
called nachos.
To try the program out, here are some examples of what you can do:
nachos -f
nachos -cp test/small foo
nachos -p foo
These format the ``disk,'' copy the Unix file test/small
into the Nachos file foo, and then print that Nachos file
out. (See fstest.cc for how the copying and printing is
done.)
You could also use the -t option to nachos
to run the performance test, but you'll either
have to complete the assignment first or (temporarily) modify the
performance test to copy less data and pre-allocate the file the right
size. That's because the test as written copies more data than the
maximum file size (until you increase that maximum) and allocates the
file with size 0 on the assumption that it will automatically grow,
which it won't until you add that feature.
Debugging output
To produce debugging output from your code, you can insert DEBUG
lines as in the prior lab. The only thing new is
that for the debugging output from the filesystem module, the letter
f is used instead of t.
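A DEBUG line tagged with the letter f might look like the one in the sketch below. So that the example compiles on its own, it supplies a simplified stand-in for DEBUG and DebugIsEnabled; in the lab you would use the versions already provided with the code instead. The AllocateSectors routine and the enabledFlags variable are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <cstring>

// Stand-in for the provided debugging facility, so this sketch
// compiles alone. Assumed here: only the 'f' flag is enabled.
static const char *enabledFlags = "f";

// Report whether debugging output for the given flag is turned on.
bool DebugIsEnabled(char flag) {
    return flag != '\0' && std::strchr(enabledFlags, flag) != nullptr;
}

// Print a printf-style message, but only if the flag is enabled.
void DEBUG(char flag, const char *format, ...) {
    if (!DebugIsEnabled(flag)) return;
    va_list args;
    va_start(args, format);
    std::vfprintf(stdout, format, args);
    va_end(args);
    std::fflush(stdout);
}

// Hypothetical file system routine showing where a DEBUG line might go.
int AllocateSectors(int numSectors) {
    DEBUG('f', "Allocating %d sectors\n", numSectors);
    return numSectors;
}
```

Tagging file system messages with f keeps them separate from the thread messages tagged t, so each kind can be examined without the other.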
What to turn in
Turn in a jointly-authored lab report containing your changes to the
files you needed to change. You should also describe briefly the
logic behind your program.
Possible extensions
There are a lot of other directions you can go with modifying the file
system if you want to do more. You can introduce hierarchical
directories and pathnames. You can introduce caching of disk blocks
and measure the reduction in disk traffic, or strategically place
header blocks and other blocks and measure the reduction in average
seek distance. You can make changes such that the filesystem will
remain consistent even if the system crashes mid-operation, or write
an ``fsck'' style program to repair inconsistencies post facto.
Instructor: Max Hailperin