CISC 372: Parallel Programming
Fall 2001
Individual Lab Assignment 2

Due at 5pm Monday, Nov 12, 2001

Objectives

The objectives of this assignment are

Procedure

1.
Read Chapter 7 of the text, PPMPI. Make sure that you can compile and run the fox.c program with 1 and with 4 processors. It can be found on porsche in ~saunders/ppmpi/chap07/fox.c .

The program fox.c reads in two matrices A and B and computes their product C = A*B. We will modify fox.c to obtain an efficient program that computes the square of the Hilbert matrix, C = H*H. For any given order n, the n by n Hilbert matrix has 1/(i+j+1) in its (i,j) position for i in [0..n) and j in [0..n).

2.
Modify fox.c as follows:
  1. Take the matrix order n from the command line instead of stdin. (This change is optional.)
  2. Instead of a root process reading in A and B and sending the blocks to the processes, each process creates the entries in its block of H by appropriate use of the formula 1/(i + j + 1). Note: Each process performs this initialization of its block of H once. Thereafter the algorithm must proceed without using specific knowledge that the blocks are those of a Hilbert matrix. So it is illegal to just create another block rather than getting it by communication from the appropriate process!
  3. In main(), have process zero time the call to function fox() using MPI_Wtime(). The main purpose of this lab is to determine how parallel efficiency varies with the size of the matrices being multiplied when 4 processors are used.
  4. To limit memory problems, get rid of the global variable temp_mat. Also be sure no process has more than 4 matrix blocks allocated at any one time.
  5. To avoid printing huge matrices, modify Print_matrix so that just the first entry in each block is printed and the sum of all the entries is printed. Thus if p=4 processes are used and n=1000, then 5 numbers are printed: C[0,0], C[0,500], C[500,0], C[500,500], and S, where S is the sum of the 1000000 entries of C.
3.
  1. Now take performance runs using p = 1 and p = 4 and a range of values of n which are powers of 2 and lead to run times up to a minute or two.
  2. Create a graph that displays all of your timings. The horizontal axis is n and the vertical axis is the elapsed time. Two curves are shown on this graph, for p = 1 and for p = 4.
  3. Create a second graph that displays the parallel efficiency of your timings. The horizontal axis is n and the vertical axis is the parallel efficiency: T_1(n)/(4*T_4(n)).

Experimental Report

Your experimental report should consist of the following sections in this order. It is strongly recommended that you type your report using a word processor rather than handing in a hand-written report.
1.
Cover Page: Title, author, course number and semester, date.
2.
Project Summary: In one paragraph, summarize the activities of the lab.
3.
Data:
  1. Timings chart.
  2. Timings graph.
  3. Efficiency graph.
(Clearly label each graph.) Remark: Each graph depicts the same information. The only difference is how well the information is taken in by the human reader.
4.
Analysis: Give explanations of your data. Explain what you think causes the difference between what you get and perfect speedup (i.e. perfect 100% parallel efficiency). Also, importantly, discuss the variation in the parallel efficiency. Does it decay or improve as n gets larger? Why?
5.
Conclusions: This section consists of a discussion of what you learned from the various aspects of the lab. Discuss possible reasons for inconsistencies or discrepancies in your data versus what you expected beforehand.
6.
Appendix: Your modified fox.c code. NOTE: Modify the header comment (Input, Output, Notes) as appropriate. Include your name and the date, but retain reference to the original form. More generally, the code in fox.c is not well commented. Each function should have a comment saying what it does. It is not even stated whether the inputs local_A and local_B to function fox() can be the same or not. May they be the same (in the version you end up with)?

Please staple all parts of your lab together, and label each piece. Be prepared to discuss your results on the day that the assignment is due.

Criteria for Evaluation

Your lab will be evaluated according to the following criteria:
  1. 5 pts: cover page.
  2. 10 pts: project summary.
  3. 20 pts: chart of timings, discussion, two graphs.
  4. 20 pts: analysis.
  5. 10 pts: conclusions.
  6. 20 pts: appendix: code (modifications made, memory use, design issues).
  7. 15 pts: appendix: code internal documentation (comments).