CIS 367: INTRODUCTION TO PARALLEL COMPUTING
Spring 1998
Programming Assignment 5

Due Date: Start of class, Thursday, May 7, 1998

Objectives

The objective of this assignment is to code and experiment with using different data distributions on frequently encountered problems.

Reference

Textbook, chapter 10, Jacobi iteration.

Matrix Transpose

Write a transpose function that takes at least a square matrix, n to indicate that it is an n by n matrix, and an indication of the data distribution as arguments. The transpose function should correctly perform a matrix transpose operation on the matrix, given the indicated distribution. The transpose operation is defined as follows:
Given a matrix A, where the element in row i and column j of A is denoted by A(i,j), the transpose of A(i,j) = A(j,i). That is, the rows and columns are interchanged.
Your transpose function should support both (block,*) and (*,block) distributions. Hint: You should use additional storage for the transpose of A; so you end up with two matrices, A and the transpose of A, distributed by the same distribution when you are done. Another Hint: Before digging in and using Send's and Receive's, think about all of the collective operations we have learned.
Use your functions from the previous assignment to initialize the matrix A such that A(i,j) = i* 2, distribute A by (block,*), print out the original A, and print both A and the transpose of A after transpose is called, to print from each process, indicating its rank and local part of A and transpose of A, respectively. Also, initialize the matrix B such that B(i,j) = 2 * i, distribute B by (*,block), and print out the original B and the transpose of B after transpose is called, using your print routine from the last assignment to print from each process as above. Make sure your program takes the size n as a command line argument. You should create space for the local distributed parts of the arrays A and B and their transposes, according to the distributions being used for each array, in order to conserve space on each processor as we did for the previous assignment.
Compile, debug, and execute your program for n = 8, 32, and 128, each with number of processes 4 and 8. Take timings of the transpose operation only, for each data distribution. Record these timings.

The Game of Life

A two-dimensional binary cellular automaton consists of a two-dimensional rectangular grid together with an initial assignment of zeroes and ones to the cells of the grid and a "next-state" function for updating the values in the cells. The next-state function is applied simultaneously to all the cells. Typically, the value of the next-state function at an individual cell will depend on the current value at the grid point and the four neighbors immediately above, below, to the right, and to the left of the cell. However, it is also quite common for the next-state function to depend on the four diagonally adjacent neighbors -- upper-left neighbor, upper-right neighbor, lower-left neighbor, and lower-right neighbor.

John Conway's game of Life is a well-known example. In it, cells with ones are "alive", while cells with zeroes are "dead". Updating corresponds to finding the "next generation", and the new value of a cell depends on all eight immediately adjacent neighbors (like a 9 point stencil without the cell itself). The rules for the evolution of the system are:

Living cells with one or zero live neighbors will die from loneliness!
Living cells with two or three live neighbors will survive into the next generation, quite happily.
Living cells with four or more neighbors will die from overcrowding!
Dead cells with exactly three live neighbors will come to Life!

Your Task:

Write a sequential program that simulates a two-dimensional binary cellular automaton. Use Conway's Life as an example of a next-state function, but allow for the possibility that the next-state function can be changed.
1. Your program should take as input (or on the command line) the order of the automaton (number of rows and number of columns), the maximum number of generations, and an initial configuration.
2. Your program should print out the initial configuration in an easily readable format, and then print the state of the system after each new generation. You can assume that the grid will be no larger than 32 by 32.
3. For grading purposes, run your program with a grid of size 8 by 8, with an initial configuration of row 4 and column 4 set alive with all other cells dead. Also, run your program with a grid of size 32 by 32, with an initial configuration of row 15 and column 15 set alive with all other cells dead. Run a script to show your runs, on a version of your program that only prints the initial and final configuration. Run it for 50 generations.
Copy your sequential program into a new file, and modify it to create a parallel version that uses (block,*) partitioning. Process 0 should do all of the initialization and printing, but distribute the grid by (block,*) distribution to each of the processors, let them compute the system for their grid points, and then send their results to process 0 at the end for printing. Use your functions from the previous assignment to do the distribution and printing. You will have to change the system's computational loop to deal with the fact that the grid is distributed.
Run your code for number of processors = 4 and 8 with a grid of size 32 by 32 initialized such that row 15 and column 15 are alive only. You will have to comment out your printing after each generation and only have it print the final configuration when you want to do the timing runs. Only time the grid computation, not the initialization and printing. For grading purposes, only show the script of a run which prints the initial configuration and final configuration. Run it for 50 generations.
Copy your file again, and modify it to create a parallel version that uses (cyclic,*) partitioning, and perform the same steps as for (block,*) above.

Hand in the programs, scripts of the runs, and writeup, as specified in each of the tasks above.

Experimental Report

Your experimental report should consist of the following sections in this order. It is strongly recommended that you type your report using a word processor rather than handing in a hand-written report.

Cover Page:
Title, author, course number and semester, date.
Project Summary:
In one paragraph, summarize the activities of the lab.
Matrix Transpose

Hypothesis: State and justify your hypothesis regarding what you would expect the relative performance of the two distributions to be for the matrix transpose. Note: you should make a hypothesis based on what you believe, BEFORE you do the performance runs. You will not be marked off for an incorrect hypothesis if it is justified adequately.
Analysis: Report your timings for the two data distributions. Explain what you see in the relative timings.
Game of Life

Hypothesis: First, state and justify your hypothesis regarding what you would expect the relative performance of each of the distributions to be for this computation, for large grids. Note: you should make a hypothesis based on what you believe, BEFORE you do the performance runs. You will not be marked off for an incorrect hypothesis if it is justified adequately.
Analysis:
1. Collected Timing Data. This section includes the raw collected timings as requested above.
2. Analysis. Discuss your results. Be sure to discuss any relationship between (1) the data distribution and the number of elements communicated during the Game of Life computation, and (2) the data distribution and the execution times for the Game of Life computation. How does it compare to your hypothesis?
Conclusions:
This section consists of a discussion of what you learned from the various aspects of the lab. Discuss possible reasons for inconsistencies or discrepancies in your data versus what you would have expected to happen.
General Remarks:
Any comments on the lab itself that you would like to make in terms of suggestions for improvement or problems you experienced.
Appendix:
1. Your code for the program containing the transpose function and appropriate calls.
2. Script of runs of the transpose program.
3. Your code for the sequential and 2 parallel versions of the Game of Life problem.
4. Script of runs of these codes.

About this document ...

The command line arguments were:
latex2html -split 0 lab5.tex.

The translation was initiated by Lori Pollock on Wed Apr 29 07:39:20 EDT 1998

Lori Pollock
Wed Apr 29 07:39:20 EDT 1998

Objectives

Reference

Matrix Transpose

The Game of Life

Experimental Report

Cover Page:

Project Summary:

Matrix Transpose

Game of Life

Conclusions:

General Remarks:

Appendix:

About this document ...