
Our textbook remarks on page 24 that the cost per die goes up more
than linearly with die size because there are not only fewer dies per
wafer (at a roughly fixed cost per wafer) but also a lower yield
(fraction of the dies that work), since there is a roughly fixed
density of defects per square centimeter and so a larger die is more
likely to contain a defect. (A die that contains even a single defect
is considered nonworking; testing at the semiconductor factory is
designed to discover these dies, and they are simply thrown away.) In
this problem, you will examine this effect.
Estimate the ratio of the cost per die of the Pentium Pro processor
to the cost per die of the Pentium processor. The captions of figures
1.15 and 1.16 on pages 23 and 25 specify the number of dies per wafer
at 100% yield and the die areas. One equation on page 48 relates the
cost per die to the number of dies per wafer and the yield (in the
obvious way), while another equation on that page relates the yield to
the density of defects and the die area. (You will not need to use
the approximate equation for dies per wafer given on that page, since
we know the actual number of dies per wafer.) I don't know Intel's
current defect density, but it is probably in the ballpark of 1 defect
per square centimeter, so use that in your calculations. How does the
ratio of costs compare with the ratio of die areas for these two
processors?
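
As a starting point, the calculation can be sketched in Python as below. This is only a sketch under stated assumptions: the function names are hypothetical, and the yield formula shown is an assumed form that you should check against the equation on page 48 and replace if it differs. The dies-per-wafer counts and die areas are left as parameters for you to fill in from the figure captions. If the captions give areas in square millimeters, convert them to square centimeters first, since the defect density is per square centimeter.

```python
def die_yield(defects_per_cm2, die_area_cm2):
    """Fraction of dies that work.

    ASSUMPTION: this form of the yield equation is a guess at what the
    textbook gives on page 48; substitute the book's equation if it differs.
    """
    return 1.0 / (1.0 + defects_per_cm2 * die_area_cm2 / 2.0) ** 2


def cost_per_die(wafer_cost, dies_per_wafer, yield_fraction):
    """Cost per working die: wafer cost spread over the dies that work."""
    return wafer_cost / (dies_per_wafer * yield_fraction)


def cost_ratio(dies_a, area_a_cm2, dies_b, area_b_cm2, defects_per_cm2=1.0):
    """Ratio of cost per die of chip A to cost per die of chip B.

    The wafer cost is assumed equal for both chips, so it cancels in the
    ratio and any positive placeholder value works.
    """
    wafer_cost = 1.0  # placeholder; cancels out of the ratio
    cost_a = cost_per_die(wafer_cost, dies_a, die_yield(defects_per_cm2, area_a_cm2))
    cost_b = cost_per_die(wafer_cost, dies_b, die_yield(defects_per_cm2, area_b_cm2))
    return cost_a / cost_b
```

Calling cost_ratio with the Pentium Pro values as chip A and the Pentium values as chip B gives the ratio the problem asks for, which you can then compare with area_a_cm2 / area_b_cm2.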

Consider two different implementations, M1 and M2, of the
same instruction set. There are three classes of instructions (A, B, and C)
in the instruction set. M1 has a clock rate of 400 MHz and M2 has a clock
rate of 200 MHz. The average number
of cycles for each instruction class on M1 and M2 is given in the
following table:
Class  CPI on M1  CPI on M2  C1 usage  C2 usage  3rd party usage
A      4          2          30%       30%       50%
B      6          4          50%       20%       30%
C      8          3          20%       50%       20%
The table also summarizes how three different compilers use
the instruction set. C1 is a compiler produced by the makers of M1,
C2 is a compiler produced by the makers of M2, and the third compiler
is a third-party product. Assume that each compiler
uses the same number of instructions for a given program but that the
instruction mix is as described in the table.
Using C1, how much faster can the makers of M1 claim M1 is compared to M2?
Using C2, how much faster can the makers of M2 claim M2 is compared to M1?
If you purchase M1, which compiler would you use?
If you purchase M2, which compiler would you use?
Which machine would you purchase, assuming all other criteria were
identical, including cost?
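
One way to organize the arithmetic for these questions is sketched below in Python. The numbers come straight from the table; the names (CLOCK_HZ, average_cpi, speedup, and so on) are hypothetical and chosen only for illustration. Because each compiler is assumed to emit the same number of instructions, comparing the average time per instruction is enough.

```python
# Clock rates and per-class CPI values, taken from the table above.
CLOCK_HZ = {"M1": 400e6, "M2": 200e6}
CPI = {
    "M1": {"A": 4, "B": 6, "C": 8},
    "M2": {"A": 2, "B": 4, "C": 3},
}
# Instruction mix produced by each compiler, as fractions.
MIX = {
    "C1": {"A": 0.30, "B": 0.50, "C": 0.20},
    "C2": {"A": 0.30, "B": 0.20, "C": 0.50},
    "third party": {"A": 0.50, "B": 0.30, "C": 0.20},
}


def average_cpi(machine, compiler):
    """Average CPI: per-class CPI weighted by the compiler's mix."""
    return sum(MIX[compiler][cls] * CPI[machine][cls] for cls in "ABC")


def seconds_per_instruction(machine, compiler):
    """Average time per instruction = average CPI / clock rate."""
    return average_cpi(machine, compiler) / CLOCK_HZ[machine]


def speedup(fast, slow, compiler):
    """How many times faster `fast` is than `slow` for the given compiler."""
    return seconds_per_instruction(slow, compiler) / seconds_per_instruction(fast, compiler)


for compiler in MIX:
    for machine in CLOCK_HZ:
        ns = seconds_per_instruction(machine, compiler) * 1e9
        print(f"{compiler:12s} {machine}: {ns:.2f} ns per instruction")
```

A speedup greater than 1 means the first machine named is the faster one for that compiler; the printed per-instruction times are what you would compare when choosing a compiler for a given machine.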

For the following set of variables, identify all of the
subsets which can be used to calculate execution time.
Each subset should be minimal, i.e.,
not contain any variable which is not needed.
{CPI, clock rate, cycle time, MIPS, number of instructions in program,
number of cycles in program}
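
To check a candidate subset, it can help to write the relationship down as code and confirm that it reproduces a known execution time. The Python sketch below does this for a few of the standard relationships; it is not claimed to be the complete answer. The program figures in it (2 million instructions, CPI of 1.5, 100 MHz clock) are made up purely for illustration, and the function names are hypothetical.

```python
import math

# Hypothetical program, invented only to exercise the formulas.
instructions = 2_000_000
cpi = 1.5
clock_rate_hz = 100e6

# Quantities derived from the three values above.
cycle_time_s = 1.0 / clock_rate_hz
cycles = instructions * cpi
reference_time_s = cycles * cycle_time_s
mips = instructions / (reference_time_s * 1e6)


def time_from_cycles_and_cycle_time(cycles, cycle_time_s):
    return cycles * cycle_time_s


def time_from_cycles_and_clock_rate(cycles, clock_rate_hz):
    return cycles / clock_rate_hz


def time_from_instructions_cpi_clock_rate(instructions, cpi, clock_rate_hz):
    return instructions * cpi / clock_rate_hz


def time_from_instructions_and_mips(instructions, mips):
    return instructions / (mips * 1e6)


# Every relationship should agree with the reference execution time.
candidates = [
    time_from_cycles_and_cycle_time(cycles, cycle_time_s),
    time_from_cycles_and_clock_rate(cycles, clock_rate_hz),
    time_from_instructions_cpi_clock_rate(instructions, cpi, clock_rate_hz),
    time_from_instructions_and_mips(instructions, mips),
]
all_agree = all(math.isclose(t, reference_time_s) for t in candidates)
print(all_agree)
```

A subset is minimal when dropping any one of its variables leaves no way to write such a function.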

Last year's MC48 students discovered in one of their labs the
following facts about a particular program (TeX) processing a
particular input file on a particular processor (the R3000):
Instruction class  Fraction of instructions  CPI for class
loads              .337                      1.20
other              .663                      1.15

Using techniques that we'll study in chapter 6, it would be possible
to design a new processor (let's call it the S3000) otherwise like the
R3000 but such that each load instruction would be replaced with one or
two instructions with CPI of approximately 1.00. The data from last
year's lab showed that the average load instruction would be replaced
by 1.67 of these new CPI 1.00 instructions. This would increase the
total number of instructions needed to execute the program, but reduce
the average CPI.

What is the average CPI for this program execution on the R3000?
How about on the hypothetical S3000?

Let's use the variable I to designate the number of
instructions it takes to execute TeX on the R3000. What is the number
of instructions that will be necessary on the S3000?

Suppose the clock rates of the S3000 and R3000 are identical. If
the performance of TeX on the two processors is stated in MIPS, which
processor has the higher MIPS rate? How much faster (as measured in
MIPS) is whichever machine is faster?

Suppose that the performance of TeX on the two processors is instead
measured by the total execution time for the program. Now which
processor is faster, and by how much?
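
The bookkeeping for the four questions above can be sketched in Python as follows. All quantities are expressed per original R3000 instruction, so an instruction count of 1.0 stands for I. The sketch assumes that the non-load instructions are unchanged on the S3000 (same count, same CPI of 1.15), which is how "otherwise like the R3000" is read here; the variable names are hypothetical.

```python
# Measured data from the table above.
LOAD_FRACTION = 0.337
OTHER_FRACTION = 0.663
LOAD_CPI = 1.20
OTHER_CPI = 1.15

# S3000 design: each load becomes 1.67 instructions of CPI 1.00 on average.
REPLACEMENTS_PER_LOAD = 1.67
REPLACEMENT_CPI = 1.00

# R3000, in units of I (instructions) and I cycles.
r3000_instructions = 1.0
r3000_cycles = LOAD_FRACTION * LOAD_CPI + OTHER_FRACTION * OTHER_CPI
r3000_cpi = r3000_cycles / r3000_instructions

# S3000: ASSUMPTION that non-load instructions keep their count and CPI.
s3000_instructions = OTHER_FRACTION + LOAD_FRACTION * REPLACEMENTS_PER_LOAD
s3000_cycles = (OTHER_FRACTION * OTHER_CPI
                + LOAD_FRACTION * REPLACEMENTS_PER_LOAD * REPLACEMENT_CPI)
s3000_cpi = s3000_cycles / s3000_instructions

# With identical clock rates, MIPS is proportional to 1 / CPI and
# execution time is proportional to the total cycle count.
mips_ratio_s_over_r = r3000_cpi / s3000_cpi
time_ratio_s_over_r = s3000_cycles / r3000_cycles

print(f"R3000 CPI {r3000_cpi:.4f}, S3000 CPI {s3000_cpi:.4f}")
print(f"S3000 instructions: {s3000_instructions:.4f} * I")
print(f"MIPS ratio (S3000 / R3000): {mips_ratio_s_over_r:.4f}")
print(f"Time ratio (S3000 / R3000): {time_ratio_s_over_r:.4f}")
```

Comparing the last two printed ratios shows why the two performance measures need not agree about which processor is faster.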