CIS 367: INTRODUCTION TO PARALLEL COMPUTING
Spring 1998
Programming Assignment 5
Due Date: Start of class, Thursday, May 7, 1998
The objective of this assignment is to
code and experiment with using different data distributions on frequently
encountered
problems.
Textbook, chapter 10, Jacobi iteration.
- Write a transpose
function that takes at least a square matrix, n to indicate that it is an n by n
matrix, and an indication of the data distribution as arguments. The
transpose function should correctly perform a matrix transpose operation
on the matrix, given the indicated distribution. The transpose operation
is defined as follows:
Given a matrix A, where the element in row i and column j of A is denoted
by A(i,j), the transpose of A(i,j) = A(j,i). That is, the rows and columns
are interchanged.
Your transpose function should support both (block,*) and (*,block) distributions.
Hint:
You should use additional storage for the transpose of A; so you end up
with two matrices, A and the transpose of A,
distributed by the same distribution when you are done.
Another Hint:
Before digging in and using Send's and Receive's, think about all of the
collective operations we have learned.
- Use your functions from the previous assignment to initialize the matrix A
such that A(i,j) = i* 2, distribute A by (block,*),
print out the original A, and print both A and
the transpose of A after transpose is called, to print from each process,
indicating its rank and local part of A and transpose of A, respectively.
Also, initialize the matrix B such that B(i,j) = 2 * i, distribute B by
(*,block), and print out the original B and the transpose of B after
transpose is called, using your print routine from the last assignment
to print from each process as above.
Make sure your program takes the size n as a command line argument.
You should create space for the local distributed parts of the arrays
A and B and their transposes, according
to the distributions being used for each array, in order to conserve space on each processor as we did for the previous assignment.
- Compile, debug, and execute your program for n = 8, 32, and 128, each
with number of processes 4 and 8. Take timings of the
transpose operation only, for each data distribution. Record these timings.
A two-dimensional binary cellular automaton consists of a
two-dimensional rectangular
grid together with an initial assignment of zeroes and ones to the cells
of the grid and a "next-state" function for updating the values in the cells.
The next-state function is applied simultaneously to all the cells. Typically,
the value of the next-state function at an individual cell will depend on the
current value at the grid point and the four neighbors immediately
above, below, to the right, and to the left of the cell.
However, it is also quite
common for the next-state function to depend on the four
diagonally adjacent neighbors
-- upper-left neighbor, upper-right neighbor, lower-left neighbor,
and lower-right neighbor.
John Conway's game of Life is a well-known example. In it, cells with ones are "alive", while cells with zeroes are "dead". Updating corresponds
to finding the "next generation", and the new value of a cell depends on
all eight immediately adjacent neighbors (like a 9 point stencil without
the cell itself). The rules for the evolution of the system are:
- Living cells with one or zero live neighbors will die from loneliness!
- Living cells with two or three live neighbors will survive into the
next generation, quite happily.
- Living cells with four or more neighbors will die from
overcrowding!
- Dead cells with exactly three live neighbors will come to Life!
Your Task:
- Write a sequential program that simulates a two-dimensional binary cellular automaton.
Use Conway's Life as an example of a next-state function, but allow
for the possibility that the next-state function can be changed.
- Your program should take as input (or on the command line)
the order of the automaton (number of rows and
number of columns), the maximum number of generations, and
an initial configuration.
- Your program should print out the initial configuration in an easily
readable format, and then print the state of the system after each new
generation.
You can assume that the grid will be no larger than 32 by 32.
- For grading purposes, run your program with a grid of size 8 by 8,
with an initial configuration of row 4 and column 4 set alive with all other
cells dead.
Also, run your program with a grid of size 32 by 32, with an initial configuration
of row 15 and column 15 set alive with all other cells dead.
Run a script to show your runs, on a version of your program that
only prints the initial and final configuration.
Run it for 50 generations.
- Copy your sequential program into a new file, and modify it to
create a parallel version that uses (block,*) partitioning. Process 0 should
do all of the initialization and printing, but distribute the grid
by (block,*) distribution to each of the processors, let them compute the
system for their grid points, and then send their results to
process 0 at the end for printing.
Use your functions from the previous assignment to do the
distribution and printing.
You will have to change the system's computational loop to deal with
the fact that the grid is distributed.
- Run your code for number of processors = 4 and 8 with a grid of size 32
by 32 initialized such that row 15 and column 15 are alive only.
You will have to comment out your printing after each generation
and only have it print the final
configuration when you want to do the timing runs.
Only time
the grid computation, not the initialization and printing.
For grading purposes, only show the script of a run which prints the initial
configuration and final configuration.
Run it for 50 generations.
- Copy your file again, and modify it to create a parallel version that
uses (cyclic,*) partitioning, and perform the same steps as for (block,*)
above.
Hand in the programs, scripts of the runs, and writeup, as specified in each of
the tasks above.
Your experimental report should consist of the following sections in this
order. It is strongly recommended that you type your report using a
word processor rather than handing in a hand-written report.
-
Cover Page:
Title, author, course number and semester, date.
-
Project Summary:
In one paragraph, summarize the activities of the lab.
-
Matrix Transpose
Hypothesis:
State and justify your hypothesis regarding what you would
expect the relative performance of the two distributions to be for
the matrix transpose.
Note: you should make a hypothesis based on what you believe, BEFORE
you do the performance runs. You will not be marked off for an incorrect
hypothesis if it is justified adequately.
Analysis:
Report your timings for the two data distributions. Explain what you
see in the relative timings.
-
Game of Life
Hypothesis:
First, state and justify your hypothesis regarding what you would expect
the relative performance of each of the distributions
to be for this computation, for large grids.
Note: you should make a hypothesis based on what you believe, BEFORE
you do the performance runs. You will not be marked off for an incorrect
hypothesis if it is justified adequately.
Analysis:
- Collected Timing Data.
This section includes the raw collected timings
as requested above.
- Analysis.
Discuss your results. Be sure to discuss any relationship
between (1) the data distribution and the number of elements communicated during
the Game of Life computation, and (2) the data distribution and the execution
times for the Game of Life computation.
How does it compare to your hypothesis?
-
Conclusions:
This section consists of a discussion of what you learned from the various
aspects
of the lab. Discuss possible reasons for inconsistencies or discrepancies
in your data versus what you would have expected to happen.
-
General Remarks:
Any comments on the lab itself that you would like to make in terms of
suggestions for improvement or problems you experienced.
-
Appendix:
- Your code for the program containing the transpose function and
appropriate calls.
- Script of runs of the transpose program.
- Your code for the sequential and 2 parallel versions of the
Game of Life problem.
- Script of runs of these codes.
This document was generated using the LaTeX2HTML translator Version 95.1 (Fri Jan 20 1995) Copyright © 1993, 1994, Nikos Drakos, Computer Based Learning Unit, University of Leeds.
The command line arguments were:
latex2html -split 0 lab5.tex.
The translation was initiated by Lori Pollock on Wed Apr 29 07:39:20 EDT 1998
Lori Pollock
Wed Apr 29 07:39:20 EDT 1998