When turning in a homework problem, be sure to indicate the exercise number. These will be the reference numbers I use in reporting back your standing on the homework.
Exercise A.x1: Throughout this problem, assume that only one matrix multiplication can be underway at a time.
By reading the graph in Figure A.7.3 (page A-52) as closely as you can, estimate the GFLOPS rates for 64×64 and 128×128 matrices using the GeForce 8800 GTX GPU to do the matrix multiplication.
Similarly, estimate the GFLOPS rates for each of these two matrix sizes using the Core 2 Quad CPU to do the matrix multiplication.
If the workload consists entirely of multiplications of 64×64 matrices, which processor would you use? Approximately how many times faster would it be than the other one?
If the workload consists entirely of multiplications of 128×128 matrices, which processor would you use? Approximately how many times faster would it be than the other one?
Exercise A.x2: This is a continuation of the prior problem. Again, throughout this problem, assume that only one matrix multiplication can be underway at a time. Now suppose the workload consists of a mix of the two sizes of matrix multiplications. Each size is done equally often. (For example, a program might do 100 multiplications of 64×64 matrices and 100 multiplications of 128×128 matrices.) Keep in mind that the larger matrices require eight times as many floating-point operations to multiply.
What is the overall GFLOPS rate for each processor on this mixed workload?
Which processor is faster for the mixed workload and how many times faster is it than the other?
If you are processing the mixed workload on a system that includes both processors, you could also do the small multiplications on the CPU and the large multiplications on the GPU. What is the overall GFLOPS rate that this strategy would achieve?
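As a rough illustration of the arithmetic behind Exercise A.x2, the following Python sketch computes an overall GFLOPS rate as total floating-point operations divided by total time, with the multiplications running one at a time. It assumes the common 2n³ operation count for an n×n multiplication, which is consistent with the factor of eight mentioned above. The function names and the rates 10 and 40 GFLOPS are hypothetical placeholders, not values read from Figure A.7.3; substitute your own estimates.

```python
def matmul_flops(n: int) -> int:
    """Approximate operation count for an n x n matrix multiply.

    Assumes the usual 2 * n**3 estimate (n**3 multiplies plus about n**3 adds).
    """
    return 2 * n**3


def overall_gflops(jobs):
    """Overall rate for jobs run one at a time.

    jobs is a list of (flops, rate_in_gflops) pairs. The overall rate is
    total work divided by total time, not the average of the rates.
    """
    total_flops = sum(flops for flops, _ in jobs)
    total_seconds = sum(flops / (rate * 1e9) for flops, rate in jobs)
    return total_flops / total_seconds / 1e9


# Hypothetical placeholder rates, NOT taken from the figure.
small_rate = 10.0  # GFLOPS on 64x64 matrices (assumed)
large_rate = 40.0  # GFLOPS on 128x128 matrices (assumed)

# The larger multiplication needs eight times as many operations.
ratio = matmul_flops(128) / matmul_flops(64)

# One small and one large multiplication stand in for an equal mix.
mixed = overall_gflops([
    (matmul_flops(64), small_rate),
    (matmul_flops(128), large_rate),
])

print(f"work ratio: {ratio}")
print(f"overall rate: {mixed:.1f} GFLOPS")
```

Because the large multiplications account for eight ninths of the work, the overall rate is pulled toward the rate achieved on the large size rather than sitting halfway between the two rates.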
Instructor: Max Hailperin