CISC 372: Parallel Programming
Fall 2001
Individual Lab Assignment 1

First deliverable due at start of class, Friday, Sept 14, 2001
Second deliverable due at start of class, Friday, Sept 21, 2001

Objectives

The objective of this assignment is to become familiar with the use of MPI on the University of Delaware Alpha cluster, to learn how to time MPI programs, and to learn how to write experimental reports.

Procedure

1. Read through Sections 1 and 2 of the Beginner's Guide to MPI on the University of Delaware DEC Alpha Cluster. Make sure that you can compile and run the trap program with several different numbers of processors, and that you understand how it works. The program can be found in ~saunders/trap.c/ on porsche.
2. Modify trap so that it works correctly with any number of processors, p, not just when p divides n, the number of trapezoids, evenly. The first deliverable is to submit this version of trap. Include 10 runs, on 1, 2, 3, 4, 5, 6, 7, 10, 12, and 20 processes.
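One common way to handle the case where p does not divide n is a block decomposition in which the first n mod p ranks each take one extra trapezoid. The sketch below is an illustration under that assumption, not the required solution; the helper name block_range and its parameter names are hypothetical and do not come from trap.c.

```c
#include <assert.h>

/* Hypothetical helper (not part of trap.c): computes which trapezoids a
 * given rank owns when n trapezoids are split across p processes.
 * The first (n % p) ranks each take one extra trapezoid, so the counts
 * differ by at most one and always sum to n. */
void block_range(int rank, int p, long n, long *first, long *count) {
    assert(p > 0 && rank >= 0 && rank < p && n >= 0);

    long base = n / p;   /* trapezoids every rank gets */
    long extra = n % p;  /* leftover trapezoids, one each to the low ranks */

    *count = base + (rank < extra ? 1 : 0);
    /* Ranks below this one contributed `base` each, plus one extra for
     * every lower rank that received a leftover trapezoid. */
    *first = rank * base + (rank < extra ? rank : extra);
}
```

Assuming h = (b - a)/n is the trapezoid width, a rank would then integrate from a + first*h to a + (first + count)*h using count trapezoids. When p exceeds n, some ranks receive count = 0 and contribute nothing to the sum.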
3. The remainder of this assignment concerns the second (main) deliverable. Modify trap in the following ways:
  1. Use double instead of float. Learn a way in emacs or vi to change all instances of "float" to "double" in one command. Also change the output to print more digits of the final integral.
  2. Take the number n of trapezoids as a command line parameter rather than as literal data in the code.
  3. Link with a separately compiled file containing
    1. The function f(x) to be integrated.
    2. The low endpoint, a, of the integration.
    3. The high endpoint, b, of the integration.
    Declare the two variables a and b as "extern double", and don't assign values to them. Delete the definition of f(x) at the end of trap.c, but don't delete the declaration of its signature within your trap function. The separately compiled file will define all three.

    There will be three files you can use with your trap code.

    1. f1.o contains a function f(x) which is just the constant 1.0. The interval of integration is 0 to 10, so the expected integral is 10.
    2. f2.o contains the squaring f(x) that was defined at the bottom of trap.c (but uses double now), and a=0.0 and b=1.0 exactly as in the original example. The expected integral is 1/3.
    3. f3.o contains a mystery function and mystery interval of integration. We don't know the expected value of the integral, but we expect the same value for the integral with any number of processes.
    To compile, for example with f2: "mpicc my_trap.c f2.o -o my_trap"
  4. Lastly, and most importantly, insert calls to MPI_Wtime() so that the process with rank 0 records a start time just after learning its rank and an end time just before printing the answer. It then prints the elapsed time along with the answer.
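To test the linking arrangement before using the provided object files, it can help to write your own stand-in source file. The sketch below is a guess at what a source for f2.o might look like, based only on the description above (f(x) = x squared, a = 0.0, b = 1.0); the file name f2_standin.c is hypothetical, and the actual contents of the course's f2.o are not known.

```c
/* f2_standin.c: hypothetical stand-in for the provided f2.o.
 * It defines the three symbols that trap.c declares but does not define. */

/* Endpoints of the integration interval. trap.c refers to them through
 * the declarations "extern double a;" and "extern double b;". */
double a = 0.0;
double b = 1.0;

/* Integrand: f(x) = x^2, so the exact integral over [0, 1] is 1/3. */
double f(double x) {
    return x * x;
}
```

A stand-in like this could be compiled together with your program in the same way as the object files, for example "mpicc my_trap.c f2_standin.c -o my_trap".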
4.
  1. Now take performance runs using f3 as the function to integrate, using each of 1 through 15 processes.
  2. Create a graph that displays all of your timings. The horizontal axis is the number of processes and the vertical axis is the elapsed time. For this assignment, you can create the graph either by hand or on the computer, using a tool that generates the graph from the input data. For later assignments, you must use a computer to generate the graphs. Two different approaches to generating these graphs from the output of your MPI programs will be given on the course web site.
  3. Create a second graph that displays the speedups of your timings. The horizontal axis is the number of processes and the vertical axis is the speedup.
  4. Create a third graph that displays the parallel efficiency of your timings. The horizontal axis is the number of processes and the vertical axis is the efficiency.
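    The speedup and efficiency values for the second and third graphs follow from the timings by the usual definitions: speedup is the one-process time divided by the p-process time, and efficiency is speedup divided by p. The sketch below assumes the baseline is the one-process run of the same program, which the assignment does not state explicitly; the function names are hypothetical.

```c
#include <assert.h>

/* Speedup on p processes: one-process time divided by p-process time.
 * t1 and tp are elapsed times in seconds. */
double speedup(double t1, double tp) {
    assert(tp > 0.0);
    return t1 / tp;
}

/* Parallel efficiency: speedup divided by the number of processes.
 * A value of 1.0 corresponds to perfect linear speedup. */
double efficiency(double t1, double tp, int p) {
    assert(p > 0);
    return speedup(t1, tp) / (double)p;
}
```

    For example, a run that takes 8.0 seconds on one process and 4.0 seconds on four processes has a speedup of 2.0 and an efficiency of 0.5.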

    Hints for Getting Good Timing Data

    In order to get reliable timing data, you need to make sure that no one else is using the cluster. To do this, run the /usa/saunders/bin/allps command just before each performance run. Also, past experience shows that leaving the timings until the last evenings before the due date makes reliable timing runs very difficult to obtain, because many other people will be trying to use the cluster for their own timings.
    Experimental Report

    Your experimental report should consist of the following sections in this order. It is strongly recommended that you type your report using a word processor rather than handing in a hand-written report.
    1. Cover Page: Title, author, course number and semester, date.
    2. Project Summary: In one paragraph, summarize the activities of the lab.
    3. Data:
      1. A chart of the 15 raw collected timings. You may average the results of two runs. If, for some number of processes, you get what you consider an anomalous timing and you get two other timings that are closer to each other, you may average the two closer ones. However, document these actions. In the spirit of science, we don't just ignore data we don't like. Explain how the timing numbers you actually use for the chart and graphs were obtained, and mention any timings you rejected along with your rationale for rejecting them. Note that you will probably intersperse timing runs with code modification. For the report, you can ignore without comment the timings you obtained with preliminary versions of the code. Indeed, you must do this. It is crucial that all timings discussed in the report concern the final version of your code.
      2. Timings graph.
      3. Speedup graph.
      4. Efficiency graph.
      (Clearly label each graph.) Remark: Each graph depicts the same information. The only difference, and one to think about, is how well the information is taken in by the human reader.
    4. Analysis of parallel trap: Explain your data. Explain what you think causes the difference between what you get and perfect linear speedup (i.e., perfect 100% parallel efficiency). Also, importantly, discuss the variation in the parallel efficiency. Does it decay uniformly as the number of processes increases, or not? Why? What happens when the number of processes exceeds the number of processors?
    5. Conclusions: Discuss what you learned from the various aspects of the lab. Discuss possible reasons for inconsistencies or discrepancies between your data and what you expected to happen.
    6. Appendix: Your trap.c code.

    Please staple all parts of your lab together, and label each piece. Be prepared to discuss your results on the day that the assignment is due.

    Criteria for Evaluation

    Your lab will be evaluated according to the following criteria:
    1. 25 pts: First deliverable.
    Experimental Report (second deliverable):
    1. 5 pts: cover page.
    2. 10 pts: project summary.
    3. 20 pts: chart of timings, discussion, three graphs.
    4. 20 pts: analysis.
    5. 10 pts: conclusions.
    6. 10 pts: appendix: code modifications (other than for first deliverable).