Introduction to AlgorithmStructure Design Patterns

Pattern Name: Introduction

AlgorithmStructure Design Space

Intent:

This note is the introduction to the AlgorithmStructure design space. The patterns in this design space help the programmer organize exploitable concurrency (i.e., collections of tasks and shared data) into parallel algorithms.

Motivation:

To create a parallel program, you need two things. First, you must have concurrency: It must be possible to decompose your problem into tasks that can execute simultaneously. Second, you must be able to map these tasks onto units of execution (usually threads or processes) so you can take advantage of the concurrency in a parallel program.

The first of these requirements -- finding exploitable concurrency -- is addressed in the FindingConcurrency design space. After you work through the patterns in the FindingConcurrency design space, you will have decomposed your problem into tasks, ordered groups of tasks, and data that could be concurrently updated by the tasks.

The second requirement -- how to take those tasks/data/groups and map them onto units of execution -- is addressed in the present design space, which we call AlgorithmStructure. In a way, in this design space we are refining the parallel algorithm design from abstract concurrency into a form that can be realized on the target machine.

Applicability:

The patterns in this design space are used after the designer has decomposed the problem to expose exploitable concurrency; i.e., he or she has broken the problem down into one or more collections of tasks, task-local data, and global or shared data. Decomposing a problem to identify concurrency was addressed in the FindingConcurrency design space and will not be addressed here.

It is equally important to understand the issues not addressed by these patterns. At this stage of the pattern language, we are still dealing with high-level algorithms. Some issues pertaining to the final program and the target machine are considered, but for the most part, we are still dealing with high-level descriptions of the algorithms and not with specific implementation issues. These issues will be address in the lower two design spaces, SupportingStructures and ImplementationMechanisms.

Observe that the goal of this note is to introduce the patterns in the AlgorithmStructure design space. We won't discuss how the patterns are used or even how you select a given pattern. That is the topic of the ChooseStructure pattern.

Structure:

The patterns in this design space fall into three broad groups. The grouping reflects the major organizing principle used by the designer in understanding the parallel algorithm. For example, if the concurrency is best understood in terms of the way the data is decomposed, that data decomposition becomes the "major organizing principle". This idea will become clearer as we start working with the patterns. For now, we just want to introduce the major patterns in this design space and briefly discuss the role of composition or hierarchy in applying these patterns.

The patterns.

"Organize by ordering" patterns.

These patterns are used when the ordering of groups of tasks is the major organizing principle for the parallel algorithm. This group has two members, reflecting two ways task groups can be ordered. One choice represents "regular" orderings that do not change during the algorithm; the other represents "irregular" orderings that are more dynamic and unpredictable.

PipelineProcessing: The problem is decomposed into ordered groups of tasks connected by data dependencies.
AsynchronousComposition: The problem is decomposed into groups of tasks that interact through asynchronous events.

"Organize by tasks" patterns.

These patterns are those for which the tasks themselves are the best organizing principle. There are many ways to work with such "task-parallel" problems, making this the largest pattern group.

EmbarrassinglyParallel: The problem is decomposed into a set of independent tasks. Most algorithms based on task queues and random sampling are instances of this pattern.
SeparableDependencies: The parallelism is expressed by splitting up tasks among units of execution (threads or processes). Any dependencies between tasks can be pulled outside the concurrent execution by replicating the data prior to the concurrent execution and then reducing the replicated data after the concurrent execution. This pattern works when variables involved in data dependencies are written but not subsequently read during the concurrent execution.
ProtectedDependencies: The parallelism is expressed by splitting up tasks among units of execution. In this case, however, variables involved in data dependencies are both read and written during the concurrent execution and thus cannot be pulled outside the concurrent execution but must be managed during the concurrent execution of the tasks.
DivideAndConquer: The problem is solved by recursively dividing it into subproblems, solving each subproblem independently, and then recombining the subsolutions into a solution to the original problem.

"Organize by data" patterns.

These patterns are those for which the decomposition of the data is the major organizing principle in understanding the concurrency. There are two patterns in this group, differing in how the decomposition is structured (linearly in each dimension or recursively).

GeometricDecomposition: The problem space is decomposed into discrete subspaces; the problem is then solved by computing solutions for the subspaces, with the solution for each subspace typically requiring data from a small number of other subspaces. Many instances of this pattern can be found in scientific computing, where it is useful in parallelizing grid-based computations, for example.
RecursiveData: The problem is defined in terms of following links through a recursive data structure.

The ChooseStructure pattern.

This design space also includes the ChooseStructure pattern, which addresses the question of how to select an appropriate pattern.

The role of composition/hierarchy.

Finally, at this point we need to discuss the hierarchical nature of the AlgorithmStructure patterns. In many cases, a single AlgorithmStructure pattern will not let you take advantage of all the available concurrency. For example, a problem may naturally map into a pipeline of task groups. This is usually very coarse-grained, however, and to take full advantage of the concurrency available in the problem, the task groups themselves may need to be parallelized. In all likelihood, these task groups would map onto other AlgorithmStructure patterns, leading to a hierarchical design with the pipeline pattern at the top organizing the task-groups and other patterns within each task group.

A second example may help drive this important point home. The Accelerated Strategic Computing Initiative (ASCI) is a large program to use extreme-scale computers with thousands of processors to solve complex simulation problems. When you need to run on so many processors, you can't let any concurrency go to waste. Hence, most of the ASCI algorithms are hierarchical, the most common combinations being a GeometricDecomposition pattern for the high-level program organization and a ProtectedDependencies pattern for the update of local data.