Parallel Programming Environments

Introduction

To implement a parallel algorithm, you need to construct a parallel program. The environment within which parallel programs are constructed is called the parallel programming environment. Programming environments correspond roughly to languages and libraries, as the examples below illustrate: HPF, for instance, is a set of extensions to Fortran 90 (a "parallel language"), while MPI is a library of function calls.

There are hundreds of parallel programming environments. To understand them and organize them meaningfully, we need a classification scheme. In this note, we organize programming environments in terms of their core programming models. This classification is imperfect, since a single programming environment can fall under more than one programming model (for example, the Linda coordination language can be viewed either as a distributed-data-structure model or as a coordination model).

We describe each class of programming environments in general terms, and then give a very small sampling of the most important programming environments in each category.

Formal models

Some programming environments are defined in terms of detailed, formal theories. These programming environments provide guarantees that less formal approaches cannot. For example, it is possible to build programming environments that are deterministic and that strictly preserve a program's sequential semantics. Unfortunately, formal programming environments use language constructs that are unfamiliar to traditional programmers, which creates a barrier to their adoption.

Logic programming models

Programming environments based on logic programming use declarative as opposed to imperative semantics. The most common approaches are based on Prolog's first-order predicate calculus. Concurrency is introduced in one of three ways: and-parallelism (executing multiple predicates concurrently), or-parallelism (evaluating multiple guards concurrently), or through explicit mapping of predicates linked together through single-assignment variables.
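None of the logic programming environments discussed here are Python-based, but the or-parallel idea — trying alternative clauses for the same goal concurrently and accepting the first success — can be sketched in ordinary Python. The clause names below are hypothetical stand-ins for Prolog clauses.

```python
# Hypothetical sketch of or-parallelism: two alternative "clauses" for the
# same goal run concurrently; the goal succeeds if either clause does.
from concurrent.futures import ThreadPoolExecutor, as_completed

def clause_a(x):      # one alternative way to establish the goal
    return x % 2 == 0

def clause_b(x):      # another alternative
    return x > 10

def solve(x):
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(c, x) for c in (clause_a, clause_b)]
        # any() returns as soon as one completed clause reports success
        return any(f.result() for f in as_completed(futures))

print(solve(15))  # True
```

A real or-parallel Prolog would also manage variable bindings per branch; this sketch only captures the "race the alternatives" control structure.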

Functional programming models

Functional programming languages use declarative semantics and some form of lambda calculus to express the operation of a program. LISP is perhaps the oldest and best known of the functional languages. In a pure functional language, functions have no side effects. Hence, executing functions as soon as their input data are available provides a natural way to achieve concurrent execution.
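The dataflow-style scheduling this implies — run a function as soon as its inputs exist — can be sketched in Python (used here purely for illustration; the functions are hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):    # pure: result depends only on its argument
    return x * x

def add(a, b):    # pure as well
    return a + b

with ThreadPoolExecutor() as pool:
    # The two squares have no data dependence on each other, so they may
    # execute concurrently; add() runs once both of its inputs are available.
    f1 = pool.submit(square, 3)
    f2 = pool.submit(square, 4)
    result = add(f1.result(), f2.result())

print(result)  # 25
```

Because the functions have no side effects, the runtime is free to reorder or overlap them without changing the program's meaning — which is exactly why purity makes concurrency "natural."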

Compositional models

These models are based on explicitly distinguishing among ways in which programs can be composed (put together to form larger programs). In our context, the most pertinent forms of composition are sequential (in which programs execute in sequence) and parallel (in which programs execute "concurrently", where concurrent execution is frequently modeled as interleaved execution of the elements of the composition).
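The two forms of composition can be made concrete with a small sketch (Python is used only for illustration; the `seq` and `par` combinators are hypothetical):

```python
import threading

def seq(*tasks):
    # Sequential composition: each task runs after the previous one finishes.
    for t in tasks:
        t()

def par(*tasks):
    # Parallel composition: tasks run concurrently; join before returning,
    # so the composed unit still has a well-defined end.
    threads = [threading.Thread(target=t) for t in tasks]
    for th in threads:
        th.start()
    for th in threads:
        th.join()

log = []
par(lambda: log.append("a"), lambda: log.append("b"))  # order may interleave
seq(lambda: log.append("c"), lambda: log.append("d"))  # order is fixed
print(sorted(log[:2]), log[2:])  # ['a', 'b'] ['c', 'd']
```

Note that the parallel part is modeled here exactly as the text describes: an interleaving of its elements, with only the sequential part guaranteeing order.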

Informal models

These models are pragmatic and are motivated by the experiences of parallel programmers as opposed to the theories of computer scientists. Consequently, these models don't fall into categories as neatly as the formal models do, and there is some overlap between categories. For example, a coordination library such as MPI can be used to write SPMD data-parallel algorithms.

Data-parallel models

Data-parallel programming models view a parallel computation in terms of a single instruction stream operating on multiple sets of data. The multiple data sets provide the source of the concurrency.
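The essence — one operation, many data elements — can be sketched in Python (an illustration only; real data-parallel environments such as HPF express this in the language itself):

```python
from concurrent.futures import ThreadPoolExecutor

data = [1, 2, 3, 4, 5, 6, 7, 8]

def scale(x):          # the single operation applied to every element
    return 2 * x

# One logical "instruction stream" (scale) applied across many data
# elements; the runtime distributes the elements among worker threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    result = list(pool.map(scale, data))

print(result)  # [2, 4, 6, 8, 10, 12, 14, 16]
```

The programmer specifies *what* happens to each element; the decomposition across workers — the actual source of concurrency — is implicit in the map.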

Coordination models

A coordination model [Mattson96] defines the mechanism used to coordinate the execution of distinct processes within a parallel program. We use the term coordination to refer to the three fundamental issues in concurrency: communication, synchronization, and process management. Coordination is explicit and occurs at distinct points within a program. Coordination models are usually combined with well-known sequential languages, so they have been very successful at attracting programmers. The most common way to use coordination models is to write SPMD programs with data-driven parallelism. Programming environments that use coordination models usually fall into one of two classes:

Communication-based coordination models

Communication between concurrent tasks takes place through the exchange of messages or some other discrete communication event. The semantics of the communication usually provide synchronization as well as communication.
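The "synchronization comes for free" property can be sketched with a blocking channel. This is a Python stand-in for a message-passing library such as MPI, not MPI itself; the queue plays the role of the channel.

```python
import queue
import threading

chan = queue.Queue()        # stands in for a message-passing channel

def producer():
    chan.put(42)            # send: communicates the value

def consumer(out):
    out.append(chan.get())  # receive blocks until a message arrives:
                            # the communication event doubles as
                            # synchronization between the two tasks

results = []
t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer, args=(results,))
t2.start()                  # consumer may start first; it simply waits
t1.start()
t1.join()
t2.join()
print(results)  # [42]
```

The consumer needs no separate lock or barrier: the fact that `get()` cannot complete before `put()` has happened is the coordination.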

Shared-memory coordination models

Coordination is provided by the semantics of operations on shared data structures.
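A Linda-like distributed data structure illustrates this: tasks never exchange messages directly, but a read of the shared structure blocks until a matching entry exists. The `TupleSpace` class below is a hypothetical toy, not an actual Linda implementation.

```python
import threading

class TupleSpace:
    """Toy shared data structure: a dict guarded by a condition variable.
    rd() blocks until a matching entry exists (Linda-like semantics)."""

    def __init__(self):
        self._data = {}
        self._cv = threading.Condition()

    def out(self, key, value):          # deposit an entry
        with self._cv:
            self._data[key] = value
            self._cv.notify_all()

    def rd(self, key):                  # blocking read of an entry
        with self._cv:
            while key not in self._data:
                self._cv.wait()
            return self._data[key]

space = TupleSpace()
t = threading.Thread(target=lambda: space.out("answer", 42))
t.start()
print(space.rd("answer"))  # 42 -- blocks until the value has been deposited
t.join()
```

Coordination here lives entirely in the semantics of `out` and `rd`; neither task knows the other's identity.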

Object-based models

Objects are traditionally used to provide data abstraction. However, the same basic concept can be used to provide concurrency abstractions. The object-oriented system provides a framework within which a parallel program can be constructed.
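One common way an object can abstract concurrency is the "active object" idea: method calls become messages to a thread owned by the object, so callers never manage threads themselves. The sketch below is a hypothetical illustration in Python, not any particular object-based environment.

```python
import queue
import threading

class ActiveCounter:
    """Toy active object: calls are queued to a private worker thread,
    so the object's state is only ever touched by that one thread."""

    def __init__(self):
        self._inbox = queue.Queue()
        self._value = 0
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            msg, reply = self._inbox.get()
            if msg == "inc":
                self._value += 1
            elif msg == "get":
                reply.put(self._value)
            elif msg == "stop":
                break

    def inc(self):
        self._inbox.put(("inc", None))

    def get(self):
        reply = queue.Queue()
        self._inbox.put(("get", reply))
        return reply.get()      # wait for the worker's answer

    def stop(self):
        self._inbox.put(("stop", None))

c = ActiveCounter()
for _ in range(3):
    c.inc()
print(c.get())  # 3
c.stop()
```

The caller sees ordinary method calls; the concurrency machinery is hidden inside the object, which is precisely the abstraction this class of models provides.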

Shared-memory models

These models replace communication with implicit data sharing through a single address space. Shared-memory programming models are the native model on multiprocessor systems using a symmetric operating system (the so-called SMP systems). Implicit communication between tasks, however, is attractive enough that researchers have extended the model to non-SMP systems as well. We can define two classes of shared-address-space programming environments:

Explicit-task shared-memory models

These systems have at their core the idea of very lightweight processes that share an address space. These processes are called threads. Since the address space is shared, these systems generally ignore communication and focus on ways to control the relations between threads and to enforce ordering constraints on data access. Creation of and interaction between threads are explicit in these systems.
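The flavor of explicit-thread programming — create the threads yourself, and guard shared data against unconstrained interleaving — can be sketched in Python (illustrative only; environments in this class are typically C libraries such as Pthreads):

```python
import threading

counter = 0
lock = threading.Lock()

def worker(n):
    global counter
    for _ in range(n):
        with lock:      # guard the read-modify-write so updates can't
            counter += 1  # interleave and lose increments

# Thread creation and joining are fully explicit, as the text describes.
threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 4000
```

There is no send or receive anywhere: the threads "communicate" simply by touching `counter`, and the lock is what enforces the ordering constraint.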

Implicit-task shared-memory models

Some shared-memory systems use higher-level constructs as opposed to explicitly controlled threads of execution. For example, a system might express concurrency in terms of distributed loop iterations or parallel sections.
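The "parallel sections" construct can be sketched as follows (a Python illustration of the idea behind directive-based systems such as OpenMP-style environments, not their actual syntax):

```python
from concurrent.futures import ThreadPoolExecutor

def section_a():           # first independent block of work
    return sum(range(100))

def section_b():           # second independent block of work
    return max(range(100))

# "Parallel sections": the two blocks are handed to a runtime-managed pool,
# which decides how to map them onto threads. No thread objects, locks, or
# joins appear in the programmer's code.
with ThreadPoolExecutor() as pool:
    fa = pool.submit(section_a)
    fb = pool.submit(section_b)

print(fa.result(), fb.result())  # 4950 99
```

This is the key contrast with the explicit-task class above: the programmer names the units of concurrent work, and the system supplies and controls the threads.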

A list of key programming environments

The following is a rough draft of a list of the most important parallel programming environments.
