Adding Parallelism to Dylan


WARNING: The following material is under construction. Large sections of the following material are out of date.

The Extensions

To make a parallel language from Dylan, we've needed to make changes in three areas:

Multiple Environments
For programming in a distributed memory model, each process must have a private copy of the program's environment, and the processes communicate through explicit message passing. So we've changed the basic definition of things slightly: instead of defining a single environment, a program defines an environment template, which describes the common basis of a collection of cooperating environments. One of these environments is called the root environment; the program's entry point is executed in the root environment. The module containing the basic classes for parallelism includes a variable holding a vector of the environments that make up the current program.

Each object in the system exists in exactly one of the program's environments. Normally, when an object is passed as an argument to a call in a remote environment, it is passed by value in a message to the remote environment.
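
The by-value rule can be modeled in a few lines. The sketch below is in Python rather than Dylan, because the extensions described here are not available in a runnable Dylan; the names Environment, environments, and call are illustrative assumptions, and a deep copy stands in for packing an argument into a message.

```python
import copy

class Environment:
    """One private environment (hypothetical model, not the Dylan class)."""
    def __init__(self, name):
        self.name = name

    def call(self, fn, *args):
        # Arguments cross the environment boundary by value: the callee
        # works on deep copies, which stand in for a message payload.
        return fn(*[copy.deepcopy(a) for a in args])

# Assumed stand-in for the vector of environments kept in the
# parallelism module; the first entry plays the root environment.
environments = [Environment("root"), Environment("env1")]

def append_zero(seq):
    seq.append(0)
    return seq

local = [1, 2, 3]
result = environments[1].call(append_zero, local)
```

After the call, local is unchanged in the calling environment, while result holds the modified copy that came back.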

Remote Objects
To allow useful distributed programming, some objects should not be passed by value. These are primarily objects that for some reason must be bound to a particular environment (a good example would be an open file handle). To support these sorts of objects, we allow certain objects to be remote objects which, rather than being copied when passed as arguments, are passed using remote references.
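
A minimal Python model of the distinction, assuming a hypothetical RemoteRef handle and a marshal step that are not part of the Dylan design itself: ordinary arguments are copied, while a remote reference travels unchanged and keeps pointing at the object in its home environment.

```python
import copy

class Environment:
    def __init__(self, name):
        self.name = name
        self.objects = {}   # objects owned by this environment, by key

class RemoteRef:
    """Handle naming an object that stays in its home environment."""
    def __init__(self, home, key):
        self.home = home
        self.key = key

def marshal(arg):
    # Remote references are sent as-is; everything else is copied.
    return arg if isinstance(arg, RemoteRef) else copy.deepcopy(arg)

env1 = Environment("env1")
env1.objects["log"] = []    # stands in for something like an open file handle
log_ref = RemoteRef(env1, "log")

def write_line(ref, words):
    # Runs on behalf of the home environment and touches the real object.
    target = ref.home.objects[ref.key]
    target.append(" ".join(words))
    return len(target)

words = ["hello", "world"]
sent = [marshal(a) for a in (log_ref, words)]
count = write_line(*sent)
```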

Remote Calling
To allow multiple environments to cooperate, we need to provide some mechanism for calling procedures in remote environments. We provide a simple RPC-based mechanism:

A procedure call which has remote references as arguments will be performed in the environment of the remote objects. If the remote arguments of a call do not exist in the same environment, then the call is erroneous, raising the same error as mapping over non-aligned collections. The call will be performed synchronously, blocking until the results of the remote call are returned.
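
The placement rule can be sketched as follows. This is a Python model under stated assumptions: AlignmentError is a made-up name (the text only says the error matches the one for non-aligned collections), and running the call in the current environment when no argument is remote is also an assumption.

```python
import copy

class Environment:
    def __init__(self, name):
        self.name = name

class RemoteRef:
    def __init__(self, home, value):
        self.home = home
        self.value = value

class AlignmentError(Exception):
    """Hypothetical name for the non-aligned remote arguments error."""

def target_environment(args, current):
    # Collect the home environments of all remote arguments.
    homes = {a.home for a in args if isinstance(a, RemoteRef)}
    if not homes:
        return current          # assumption: no remote arguments, run locally
    if len(homes) > 1:
        raise AlignmentError("remote arguments live in different environments")
    return homes.pop()

def remote_call(fn, args, current):
    env = target_environment(args, current)
    # Non-remote arguments are copied into the message; the caller then
    # blocks until fn returns, which models the synchronous call.
    payload = [a if isinstance(a, RemoteRef) else copy.deepcopy(a) for a in args]
    return env, fn(*payload)

root, env1, env2 = Environment("root"), Environment("env1"), Environment("env2")
a = RemoteRef(env1, 10)
b = RemoteRef(env1, 32)
c = RemoteRef(env2, 5)

where, total = remote_call(lambda x, y: x.value + y.value, (a, b), root)

try:
    remote_call(lambda x, y: x.value + y.value, (a, c), root)
    misaligned_rejected = False
except AlignmentError:
    misaligned_rejected = True
```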

Alternatively, an asynchronous remote call may be performed using the dispatch special form:

dispatch proc-call(args) => test-fn, val-fn :: <function>

Dispatch returns two functions. The first function returns a boolean result which indicates whether the remote call has completed or not. The second blocks until the remote call is complete, and then returns the result.
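
The two-function protocol can be modeled with a thread and an event. This is a Python sketch, not the Dylan special form: a local thread stands in for the remote environment, and re-raising an error from the second function is an assumption, since the text does not say what happens when the remote call fails.

```python
import threading

def dispatch(fn, *args):
    """Start fn(*args) asynchronously; return (test_fn, val_fn)."""
    done = threading.Event()
    box = {}

    def run():
        try:
            box["value"] = fn(*args)
        except BaseException as exc:   # assumption: errors surface in val_fn
            box["error"] = exc
        finally:
            done.set()

    threading.Thread(target=run, daemon=True).start()

    def test_fn():
        # Non-blocking: has the call completed?
        return done.is_set()

    def val_fn():
        # Blocks until the call is complete, then returns its result.
        done.wait()
        if "error" in box:
            raise box["error"]
        return box["value"]

    return test_fn, val_fn

gate = threading.Event()

def slow_square(n):
    gate.wait()        # held back until the caller opens the gate
    return n * n

test_fn, val_fn = dispatch(slow_square, 7)
before = test_fn()     # the call cannot have finished yet
gate.set()
value = val_fn()
after = test_fn()
```

The caller can poll test_fn while doing other work and call val_fn only when it actually needs the result.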

Composite Trees
The description of how composite trees will be implemented is extremely tentative and incomplete. In actuality, much of what follows is a description of the lower-level method used to implement composite declarations. The actual declarations will be written using define composite and define node forms, which will be transformed into the form described below using Dylan macros.

Rationale for Recent Changes

Lots of stuff in the preceding explanation is under consideration for change. The biggest thing is eliminating the send and construct, and rebuilding the par construct to do something closer to map and reduce.

The alternative way of implementing parallelism that we're considering is to allow methods to be located on a particular processing node. Remote calls can be managed by assigning each method to an environment, and altering the method lookup system to include the ability to send a message that triggers the invocation of a method in the environment in which it resides.

To make this alternative work, we need to work out a set of method selection rules that will decide which environment the method belongs to. The problem comes about mainly in calls which involve multiple objects:

For example, consider the following call:

search(r1,x,#[1,2,3],r2)

In this code, suppose that r1 is a remote object living in environment env1, and r2 is a remote object living in environment env2. The semantics for this call could be defined in four different ways:

  1. execute a local search method in the current environment, with remote references to r1 and r2.
  2. execute the search method for the call in environment env1, with r2 as a remote argument, and pass x and #[1,2,3] in a message.
  3. execute the search method for the call in environment env2, with r1 as a remote argument, and pass x and #[1,2,3] in a message.
  4. reject the call as invalid since its remote arguments are not aligned.

My current inclination is the last: I like the semantics of it. Dylan tends to be fairly applicative and value-oriented, so it makes sense that objects that aren't specifically remote objects should be transportable in a message, and that the location where the call occurs should be the location where the remote objects are located.

One obvious limitation to the above is that a call involving a remote object will only be treated as a remote call if the object is a specializing argument.
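
A small Python model of this limitation, assuming a hypothetical routing_homes helper and a list of specializing positions supplied by hand (in Dylan these would come from the method's parameter specializers): only remote references found in specializing positions influence where the call runs.

```python
class RemoteRef:
    def __init__(self, home):
        self.home = home

def routing_homes(args, specializing_positions):
    """Homes of remote references found at specializing positions only."""
    return {args[i].home for i in specializing_positions
            if isinstance(args[i], RemoteRef)}

r1 = RemoteRef("env1")
args = (r1, "x", [1, 2, 3])

# Method specialized on its first parameter: r1 is seen, call goes to env1.
seen = routing_homes(args, [0])

# Method specialized only on its second parameter: r1 is never examined,
# so the call is treated as an ordinary local call.
unseen = routing_homes(args, [1])
```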

carroll@cis.udel.edu