laser.mpi.util.shuffle
Class CycleShiftShuffler

java.lang.Object
  |
  +--laser.mpi.util.shuffle.Shuffler
        |
        +--laser.mpi.util.shuffle.CycleShiftShuffler

public class CycleShiftShuffler
extends Shuffler

A CycleShiftShuffler provides functionality to move around elements of a distributed array using MPI. The context is as follows: each process in an MPI communicator has a local array which we will call localData. The user wishes to move these elements around, within and between processes. To do this, the user provides a map which tells precisely what is to be moved where. The map is specified using "coordinates." A coordinate is a pair of integers (proc, locPos), where proc is the process ID (or "rank" in MPIese), and locPos is the index in the localData array on proc. The map then consists of a list of pairs (source, target) of coordinates. The source is the coordinate for the item one wishes to move, and target is the coordinate for the position one wishes to move it to.

The relation defined by the map must be an injective function. For we do not allow one to move an item to two different locations, nor do we allow moving two distinct items to the same location. However, it is not required that the function be defined on the entire domain (i.e., on all possible coordinates), nor is it required that the function be onto. Therefore, it is possible to move an item from position p to position q, but not to move anything to position p. In this case, the data at position p will not be erased---it will still remain there, along with another copy at position q.

This is a concrete class extending the abstract class Shuffler. The algorithm defined here (described below) is a very efficient one which uses the minimum possible number of Sends and Receives to accomplish the reshuffling. However, the computation required to achieve this is somewhat more complicated than, for example, that of the TranspositionShuffler.

The usual way to use this class is to invoke the static method shuffle. Every proc in the communicator must invoke this method, and with the same map.

This class has been written using the mpiJava package. You must have mpiJava installed to compile and execute this class. If .../mpiJava/lib/classes is in your CLASSPATH you should be able to compile this class normally using javac. To execute from the command line you say, for example, "prunjava 4 laser.mpi.util.shuffle.CycleShiftShuffler". prunjava is a script which comes with mpiJava, and I am assuming it is in your PATH. This executes some tests which are in the test suite. The 4 means to create 4 procs. You can use any positive integer you want in place of 4, and the tests should all pass. For each test, you will see the map, the localData arrays before shuffling, and the localData arrays after shuffling. If everything comes out as expected, this is all you will see. If anything goes wrong, you will also get a message telling you what went wrong. If this happens, please email me with a bug report.

The algorithm: the basic idea is the following. Suppose X is a set. Let M be the set consisting of all pairs (Y,f) where Y is a subset of X and f:Y->X is an injective function. Given two elements of M you can compose them to produce a third: (U,f)(V,g) = (W,h) where W={v in V | g(v) is in U}, and h(w) = f(g(w)) for all w in W. This operation is associative, and there is an identity element (X,e), where e is the identity function on X. So M is a monoid. Notice that not every element of M is invertible; in fact, the invertible elements are precisely those of the form (X,f), where f is a bijection. Hence the group of invertible elements are in 1-1 correspondence with Sym(X), the symmetric group on X.

Suppose X is finite. We describe two different classes of elements of M. Let x_1,x_2,...,x_n be n distinct elements of X. The first element is cycle(x_1,...,x_n) = (X,f) where f(x_1)=x_2, f(x_2)=x_3, ..., f(x_n)=x_1, and f(x)=x for all x not in {x_1,...,x_n}. This is an invertible element.

The second element is shift(x_1,...,x_n) = (X-{x_n},g) where g(x_1)=x_2, g(x_2)=x_3, ..., g(x_{n-1})=x_n, and g(x)=x for all x not in {x_1,...,x_n}. This is not an invertible element, since x_n is not in the domain.

If a is a cycle or shift on x_1,...,x_n and b is a cycle or shift on y_1,...,y_m, we say that a and b are disjoint if {x_1,...,x_n} does not intersect {y_1,...,y_m}.

Now here is the main fact:

THM. Every element of M can be written as a product (unique, up to order) of disjoint cycles and shifts. An element is invertible iff its decomposition in this way contains no shifts.

This generalizes the well-known fact that every permutation can be written as a product of disjoint cycles.

In our case, X is the set of all coordinates which are mentioned as sources or targets in the map. The map itself may be thought of as an element of our monoid M. The domain Y of the map is the set of all coordinates which are sources. We decompose the map into the disjoint cycles and shifts. (The decomposition is unique, up to order.) We then perform the reshuffling on each factor separately. It's not too hard to see how to deal with a single cycle or shift. We will describe the situation for shift; the situation for cycles is only slightly more complicated.

First, we break up the shift into disjoint segments. A segment is a maximal interval in which all of the coordinates have the same proc. Within a segment, there is no need to do any MPI Sends or Receives, since we only need to move the data around locally. Now all it comes down to is this: in the first segment, Send the last element to the proc of the second segment; in the second segment, receive from the proc of the previous segment while sending the last item to the proc of the next segment, etc. In each segment, after data has been sent, the local shift can take place.

Everything here is implemented using only ordinary MPI_Send, MPI_Recv, and MPI_Sendrecv. No wildcards are used (i.e., MPI_ANY_SOURCE or MPI_ANY_TAG). Hence this program is locally deterministic (see Stephen Siegel and George Avrunin, Analyzing MPI, 2003). This means that, for any given input, freedom-from-deadlock and other correctness properties may be ascertained by analyzing a single execution.


Field Summary
private  Node[][] nodes
          An array of length nProcs of arrays of Nodes.
 
Fields inherited from class laser.mpi.util.shuffle.Shuffler
debug, nSendrecvs, nSends
 
Constructor Summary
CycleShiftShuffler()
          Constructs a trivial instance of this class with name "Trivial CycleShiftShuffler".
CycleShiftShuffler(java.lang.String name, mpi.Intracomm comm, int tag, Mapping[] map, java.lang.Object[] localData)
          Constructs a new instance of this class with the given name, communicator, tag, map and localData.
 
Method Summary
(package private)  Node getCycle(Node first)
          A method invoked by getNextFactor after it has been determined that the factor is a cycle.
(package private)  Node getNextFactor()
          Returns the next factor (which is either a cycle or a shift) from the canonical factorization of the map.
(package private)  Node getShift(Node first)
          A method invoked by getNextFactor when it has been determined that the factor is a shift.
static void main(java.lang.String[] args)
          This simply executes a number of tests described in the test cases.
 Shuffler newInstance(java.lang.String name, mpi.Intracomm comm, int tag, Mapping[] map, java.lang.Object[] localData)
          Creates a new instance with the given name, communicator, tag, map and localData.
 Node[][] nodes()
          Returns the nodes array.
 void printMap(java.io.PrintStream out)
          Prints map, and, if debugging, prints node structures as well.
 void printNodes(java.io.PrintStream out)
          Prints out a readable view of the nodes array on proc 0 to the given PrintStream out.
 void printState(java.io.PrintStream out)
          Prints everything to the given PrintStream: the map, the nodes, and the localData.
 void reorder()
          This is the main method which starts the reordering.
(package private)  void reorderFactor(Node start)
          This method shuffles the data based on a given factor.
(package private)  void shift(Node segment)
          This method actually shifts all the local data on a single proc according to a given segment.
static void shuffle(mpi.Intracomm comm, int tag, Mapping[] map, java.lang.Object[] localData)
          This is the main static method which is the usual way for using this class.
 
Methods inherited from class laser.mpi.util.shuffle.Shuffler
addSends, comm, debug, debug, globalNumSendrecvs, globalNumSends, lengths, localData, map, myAssert, myRank, name, nProcs, nSendrecvs, nSends, printArray, printLocalData, tag
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

nodes

private Node[][] nodes
An array of length nProcs of arrays of Nodes. This structure is built up using the information provided by map. The i-th element of this array is an array of length lengths[i]. The j-th element of that array is a Node which corresponds to the coordinate (i,j) (i.e., position j of localData on proc i). That Node contains information such as the the next Node, which is obtained from the map, and the previous Node, which is obtained by inverting the map.

Constructor Detail

CycleShiftShuffler

public CycleShiftShuffler(java.lang.String name,
                          mpi.Intracomm comm,
                          int tag,
                          Mapping[] map,
                          java.lang.Object[] localData)
                   throws mpi.MPIException
Constructs a new instance of this class with the given name, communicator, tag, map and localData. Keep in mind that each proc in the MPI system must create its own instance of this class, and when they do so they should all provide the same map. It is the user's responsibility to make sure this is so, or else anything could happen!

This constructor checks the map to make sure it defines an injective function and that there are no references into localData which exceed the length of that array. It also constructs the lengths array, and the nodes array, and fills in some of the fields in each node.

Parameters:
name - a name to be given to this Shuffler
comm - the MPI communicator to be used for all MPI calls
tag - the MPI tag to be used for all MPI communication
map - the array of Mappings describing how to shuffle the data
localData - the local data array for this proc
Throws:
mpi.MPIException - if an MPI call goes awry
java.lang.IllegalArgumentException - if name, comm, map, or localData is null, or if tag == MPI.ANY_TAG

CycleShiftShuffler

public CycleShiftShuffler()
                   throws mpi.MPIException
Constructs a trivial instance of this class with name "Trivial CycleShiftShuffler". The communicator will be MPI.COMM_WORLD, the tag 0, the mapping will be empty (have length 0), and the localData will be an array of length 0 of Object.

Method Detail

shuffle

public static void shuffle(mpi.Intracomm comm,
                           int tag,
                           Mapping[] map,
                           java.lang.Object[] localData)
                    throws mpi.MPIException
This is the main static method which is the usual way for using this class. The user provides the communicator, tag, map, and the localData, and upon return the data will be moved around according to the instructions in the map. This method is to be called by all procs in the communicator, and with the same map and communicator, (although obviously with different localDatas).

It is best to choose a tag which is not being used by other program components.

Parameters:
comm - The MPI communicator
tag - the MPI tag to be used for all MPI communication in this class
map - the array of Mappings telling what is to be moved where
localData - the local data array for this proc
Throws:
mpi.MPIException - if an MPI call goes awry
java.lang.IllegalArgumentException - if name, comm, map, or localData is null, or if tag == MPI.ANY_TAG

main

public static void main(java.lang.String[] args)
                 throws mpi.MPIException
This simply executes a number of tests described in the test cases.

Parameters:
args - the command line arguments
Throws:
mpi.MPIException - if an MPI call goes awry

newInstance

public Shuffler newInstance(java.lang.String name,
                            mpi.Intracomm comm,
                            int tag,
                            Mapping[] map,
                            java.lang.Object[] localData)
                     throws mpi.MPIException
Creates a new instance with the given name, communicator, tag, map and localData. Keep in mind that each proc in the MPI system must create its own instance of this class, and when they do so they should all provide the same map. It is the user's responsibility to make sure this is so, or else anything could happen!

This method checks the map to make sure it defines an injective function and that there are no references into localData which exceed the length of that array.

Specified by:
newInstance in class Shuffler
Parameters:
name - a name to be given to this Shuffler
comm - the MPI communicator to be used for all MPI calls
tag - the MPI tag to be used for all MPI communication
map - the array of Mappings describing how to shuffle the data
localData - the local data array for this proc
Throws:
mpi.MPIException - if an MPI call goes awry
java.lang.IllegalArgumentException - if name, comm, map, or localData is null, or if tag == MPI.ANY_TAG

nodes

public Node[][] nodes()
Returns the nodes array. This is an array of length nProcs of arrays of Nodes. This structure is built up using the information provided by map. The i-th element of this array is an array of length lengths[i]. The j-th element of that array is a Node which corresponds to the coordinate (i,j) (i.e., position j of localData on proc i). That Node contains information such as the the next Node, which is obtained from the map, and the previous Node, which is obtained by inverting the map.

Returns:
the nodes array

printNodes

public void printNodes(java.io.PrintStream out)
Prints out a readable view of the nodes array on proc 0 to the given PrintStream out. If myRank != 0, this returns without doing anything.

Parameters:
out - the PrintStream to which to print the map.

printState

public void printState(java.io.PrintStream out)
                throws mpi.MPIException
Prints everything to the given PrintStream: the map, the nodes, and the localData.

Parameters:
out - the PrintStream to which to print
Throws:
mpi.MPIException - if something goes wrong with MPI while dealing with the localData.

printMap

public void printMap(java.io.PrintStream out)
Prints map, and, if debugging, prints node structures as well.

Overrides:
printMap in class Shuffler
Parameters:
out - the PrintStream to which to print

getNextFactor

Node getNextFactor()
Returns the next factor (which is either a cycle or a shift) from the canonical factorization of the map. The factor is represented by giving the first node in that factor. The rest of the nodes may be reached by getting the next node, etc. This also finishes setting all the fields in the Node objects in this factor, namely, reached, nextSeg, and segEnd.

Returns:
a Node representing the next factor

getCycle

Node getCycle(Node first)
A method invoked by getNextFactor after it has been determined that the factor is a cycle. The parameter first may be any node in the cycle. The method returns an actual first node in the cycle, which means that, if the cycle has more than one segment, the first node will be at the beginning of a segment.

Parameters:
first - a Node in the cycle to use for starts
Returns:
a possibly different Node which is guaranteed to begin a segment if this cycle has more than one segment

getShift

Node getShift(Node first)
A method invoked by getNextFactor when it has been determined that the factor is a shift. The parameter first must be the first element in the shift. The method just returns first, after completing the filling in of the Node fields.

Parameters:
first - the first node in this shift
Returns:
first (the same node it was given)
Throws:
java.lang.RuntimeException - if first is null
java.lang.RuntimeException - if first.segStart is false
java.lang.RuntimeException - if first.prev is null

shift

void shift(Node segment)
This method actually shifts all the local data on a single proc according to a given segment. The parameter segment must be a node which begins a segment. If we call the elements of the segment a, b, c, d, then what this does is: copy c to d, then copy b to c, then copy a to b. Notice that d gets overwritten, so you should not invoke shift until you have done what you wanted with d (usually that means you have sent d to another proc). The segment may or may not be a complete cycle. This method should only be called if myRank == segment.proc (else an exception will be thrown).

Parameters:
segment - the first Node in a segment
Throws:
java.lang.RuntimeException - if segment is null
java.lang.RuntimeException - if segment.segStart is false
java.lang.RuntimeException - if segment.segEnd is null
java.lang.RuntimeException - if myRank != segment.proc

reorder

public void reorder()
             throws mpi.MPIException
This is the main method which starts the reordering. It simply loops through all factors, invoking reorderFactor on each.

Specified by:
reorder in class Shuffler
Throws:
mpi.MPIException - if something goes awry with the MPI

reorderFactor

void reorderFactor(Node start)
             throws mpi.MPIException
This method shuffles the data based on a given factor. The factor is represented by its first Node, start. The factor may be a cycle or a shift.

Parameters:
start - the start Node for a cycle or shift
Throws:
mpi.MPIException - if something goes awry with MPI