Pattern Name: TaskDecomposition

FindingConcurrency Design Space


Intent:

This pattern addresses the question "How do you decompose a problem into tasks that can execute concurrently?"

Also Known As:

Motivation:

This pattern addresses the issues raised during a primarily task-based decomposition. The key to an effective task decomposition is ensuring that the tasks are sufficiently independent that maintaining dependencies takes only a small fraction of the program's overall execution time. It is also important to ensure that the execution of the tasks can be equally shared among the ensemble of processing elements (the so-called load-balancing problem).

Applicability:

Use this pattern when

Implementation:

It is very rare that a task-based decomposition can be carried out automatically. You have to do this by hand based on your knowledge of the problem and the code required to implement it.

In a task-based decomposition, you look at your problem as a collection of distinct tasks, paying particular attention to

As a first pass, try to identify as many tasks as possible; it is much easier to start with too many tasks and merge them later on than to start with too few tasks and later try to split them. The key is that the individual tasks execute for the most part independently of each other.

You can find tasks in many different places:

This is only a partial list of where you can find tasks. Keep in mind the competing forces mentioned in the DecompositionStrategy pattern:

Once you have your tasks, you need to look at the data decomposition implied by the tasks. The DataDecomposition pattern may help you with this analysis.

Examples:

Medical imaging.

Consider the medical imaging problem described earlier (in the Examples section of the DecompositionStrategy pattern). In this application, a point inside a model of the body is selected randomly, a radioactive decay is allowed to occur at this point, and the trajectory of the emitted particle is followed. To create a statistically significant simulation, thousands if not millions of trajectories are followed.

It is natural to associate each trajectory with a task. These tasks are particularly simple to manage concurrently since they are completely independent. Another important point is to make sure there are enough of them so we can support a range of computer systems, from those with a small number of processing elements to massively parallel computers.

With the basic tasks in hand, we can now consider the corresponding data decomposition and define what portions of the problem space need to be associated with each task. In this case, each task needs to hold the information defining the trajectory at any point, but that is all. The more challenging issue is the model of the body. You can't deduce this from the description of the problem we've given, but it turns out that the body model can be huge. Since it is a read-only model, there is no problem if there is an effective shared-memory system, since each task can read data as needed. If the target system uses distributed memory, however, it may be necessary to replicate the body model on each processing element. This can be very time-consuming and can waste a great deal of memory. Thus, for such target systems a data-based decomposition, described in the Examples section of the DataDecomposition pattern, may work better.

Matrix multiplication.

Consider the standard multiplication of two matrices (C = A · B). We can produce a task-based decomposition of this problem by considering the calculation of each element of the product matrix as a separate task. Each task needs access to one row of A and one column of B. This decomposition has the advantage that all the tasks are independent, and because all the data that is shared among tasks (A and B) is read-only, it will likely be straightforward to implement in a shared-memory environment. In a distributed-memory environment, however, the requirement that each task have access to a row of A and a column of B may lead to excessive memory use and/or communication, so this decomposition may not be effective. See the Examples section of the DataDecomposition pattern for a discussion of other decompositions more suited to distributed-memory environments. (In fact, a data-based decomposition may be more effective for most current platforms -- the above task-based decomposition works well only if access to memory elements is roughly uniform, which is not the case with cache-based computers.)

Known Uses:

Task-based decompositions are used in many applications. The distance geometry code (DGEOM) described in [Mattson96] uses a task based decomposition.