CISC 367: INTRODUCTION TO PARALLEL PROGRAMMING
Spring 1998
Programming Assignment 2
Due Date: start of class, Tuesday, March 10, 1998

Objectives

The objective of this assignment is to write some simple MPI programs to learn how to accomplish communication between processes, and how to time MPI programs.

Procedure

  1. Read through Section 3 of the Beginner's Guide to MPI on the University of Delaware DEC Alpha Cluster, and chapter 3 of your textbook.

  2. Centralized Summation: Write an MPI program that computes the global sum of the data local to each process in the following way. First, for ease of creating local data, use the rank of each process as its local data, so you are essentially computing the sum of the ranks. Assuming that there are N processes, the global sum should be computed by having processes 1 through N-1 send their ranks to process 0, which receives them and computes the global sum itself. The processes should not have to send their ranks in any particular order, and process 0 should not have to receive them in any particular order. That is, process 0 should not explicitly try to receive the ranks in the order 1 through N-1, as this causes undue synchronization. Your centralized summation program should be written in a general manner so that it works for any number of processes N > 1 without recompilation. Time from just before the processes start sending data until just before process 0 prints the result; do not time the printing. Take 3 timings and average them for 4 processes, and then for 8 processes.
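    As a sketch of the communication pattern (not the required solution), the centralized version might take the shape below. The tag and variable names are illustrative assumptions; the essential points are that process 0 receives with MPI_ANY_SOURCE so the ranks can arrive in any order, and that MPI_Wtime brackets only the communication and summation, not the printing.

```c
/* Sketch of centralized summation: one possible shape, not the required
   solution.  Tag value and variable names are illustrative. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int rank, nprocs, sum, value, i;
    double start, finish;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    MPI_Barrier(MPI_COMM_WORLD);   /* line everyone up before timing */
    start = MPI_Wtime();

    if (rank == 0) {
        sum = rank;                /* process 0's own local data */
        for (i = 1; i < nprocs; i++) {
            /* MPI_ANY_SOURCE: accept the ranks in whatever order they arrive */
            MPI_Recv(&value, 1, MPI_INT, MPI_ANY_SOURCE, 0,
                     MPI_COMM_WORLD, &status);
            sum += value;
        }
        finish = MPI_Wtime();      /* stop the clock before printing */
        printf("sum = %d, time = %f seconds\n", sum, finish - start);
    } else {
        MPI_Send(&rank, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}
```

    Note that because only process 0 sees the finish time, this measures the interval from the barrier to just before the print, which matches the timing requirement above.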

  3. Shifting Summation: Write an MPI program that computes the global sum of the data local to each process in the following way. Again, assume that the rank of the process is its local data. Assuming that there are N processes, process N-1 should send its rank to the next lower numbered process, N-2, which adds its own rank to the running sum (i.e., we call this a partial sum) and sends the current partial sum to the next lower ranked process. This continues until process 0 receives the partial sum from process 1. Process 0 then adds its rank (I know, it is zero, but we are mimicking summation of local data) and prints out the final global sum. Time from the start of process N-1 sending its data until just before process 0 prints. Take 3 timings and average them for 4 processes, and then for 8 processes. Again, your code should be written in a general manner, working for any number of processes N > 1 without recompilation.
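    The chain of sends described above might be sketched as follows (again, an illustrative shape rather than the required solution; tag and names are assumptions). Each process except N-1 receives from its right neighbor, adds its own rank, and passes the partial sum left:

```c
/* Sketch of shifting summation: partial sums flow from rank N-1 down
   to rank 0.  One possible shape, not the required solution. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int rank, nprocs, sum, partial;
    double start, finish;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    MPI_Barrier(MPI_COMM_WORLD);
    start = MPI_Wtime();

    if (rank == nprocs - 1) {
        sum = rank;                          /* start of the chain */
        MPI_Send(&sum, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD);
    } else {
        MPI_Recv(&partial, 1, MPI_INT, rank + 1, 0, MPI_COMM_WORLD, &status);
        sum = partial + rank;                /* add local data to partial sum */
        if (rank > 0) {
            MPI_Send(&sum, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD);
        } else {
            finish = MPI_Wtime();            /* end of the chain at rank 0 */
            printf("sum = %d, time = %f seconds\n", sum, finish - start);
        }
    }

    MPI_Finalize();
    return 0;
}
```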

  4. Tree Summation: Write an MPI program that computes the global sum of the data local to each process in the following way. Again, assume that the local data is the rank of the process. Assume that the number of processes, N > 1, is a power of 2 (i.e., 2, 4, 8). The communication and summation should be performed in a tree-like pattern. The following diagram shows the communication for 8 processes, where the numbers indicate the process that is doing the summation at that point in the tree computation. So, process 1 sends its rank to process 0, which computes a partial sum and waits for process 2 to send the partial sum of ranks 2 and 3. Then, process 0 adds the partial sum from process 2 to its current sum. Then, process 0 waits for the partial sum from process 4 and computes the final global sum. Each interior node in this logical tree is a partial sum computation. The original local data (i.e., the ranks) are at each process at the bottom of this tree of computation. Time from the start of computing partial sums at the bottom of the tree until just before process 0 prints the final result. Take 3 timings and average them for 4 processes, and then for 8 processes.

                      0
                     /\
            0                  4
           /\                 /\
       0         2        4         6
      /\        /\        /\        /\  
    0    1    2    3    4    5    6    7
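    One common way to code this pattern (a sketch under the power-of-2 assumption; names and tag are illustrative) is a loop over levels with a doubling step. At step s, processes whose rank is an odd multiple of s send their partial sum to the process s below them and drop out; processes whose rank is a multiple of 2s receive and accumulate:

```c
/* Sketch of tree summation for N a power of 2: one possible shape,
   not the required solution. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int rank, nprocs, sum, incoming, step;
    double start, finish;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    MPI_Barrier(MPI_COMM_WORLD);
    start = MPI_Wtime();

    sum = rank;                              /* local data */
    for (step = 1; step < nprocs; step *= 2) {
        if (rank % (2 * step) == step) {
            /* sender at this level: pass the partial sum left, then drop out */
            MPI_Send(&sum, 1, MPI_INT, rank - step, 0, MPI_COMM_WORLD);
            break;
        } else if (rank % (2 * step) == 0) {
            /* receiver at this level: accumulate the incoming partial sum */
            MPI_Recv(&incoming, 1, MPI_INT, rank + step, 0,
                     MPI_COMM_WORLD, &status);
            sum += incoming;
        }
    }

    if (rank == 0) {
        finish = MPI_Wtime();
        printf("sum = %d, time = %f seconds\n", sum, finish - start);
    }

    MPI_Finalize();
    return 0;
}
```

    For 8 processes this reproduces the diagram above: at step 1 processes 1, 3, 5, 7 send; at step 2 processes 2 and 6 send; at step 4 process 4 sends, and process 0 holds the global sum of 28.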
    

  5. Create a single graph that displays all of your timings for 4 and 8 processes. The horizontal axis should be the number of processes involved in the summation, and the vertical axis should be the timings. The graph should include only your average timings, so there should be 6 points on the graph. A bar graph might be a nice depiction of this data, with a different bar for each of the three methods at a given number of processes. You should use a tool that automatically creates the graph from the data inputs. Two different approaches to generating these graphs from the output of your MPI programs are given on the CISC 367 web site.

Hints for Getting Good Timing Data

In order to get reliable timing data, you need to make sure that no one else is using the cluster. To check this, run the spyall command just prior to making a performance run. Also, past experience shows that leaving the timing runs until the last evenings before the assignment due date makes reliable timings very difficult to obtain, because many other people will be trying to get access to the cluster for their own timings.

Experimental Report

Your experimental report should consist of the following sections in this order. It is strongly recommended that you type your report using a word processor rather than handing in a hand-written report.

  1. Cover Page:
    Title, author, course number and semester, date.
  2. Project Summary:
    In one paragraph, summarize the activities of the lab.

  3. Hypothesis:
    State and justify your hypothesis: which method for summation do you believe should take the longest time, and which should take the shortest time? Why? Explain your hypothesis in the context of a large versus a small number of processes, and a large versus a small amount of computation at each node.
  4. Analysis:
    1. Collected Timing Data. This section includes both a chart of the raw collected timings, and the graph depicting the data pictorially.
    2. Analysis. In English, explain your timing results. Describe your observations concerning the different methods and how they compare to your hypothesis. Explain why you believe the timings are different or similar between the methods.

  5. Conclusions:
    This section consists of a discussion of what you learned from the various aspects of the lab. Discuss possible reasons for inconsistencies or discrepancies in your data versus what you would have expected to happen.
  6. General Remarks:
    Any comments on the lab itself that you would like to make in terms of suggestions for improvement or problems you experienced.

  7. Appendix:
    Your code for the three versions of summation.

Please staple all parts of your lab together, and label each piece. Be prepared to discuss your results on the day that the assignment is due.


The translation was initiated by Lori Pollock on Wed Feb 25 12:25:20 EST 1998

