Data Structures, Algorithms, & Applications in C++
Chapter 17, Exercise 37

(a)
We may prove that Prim's algorithm of Figure 17.13 always constructs a minimum-cost spanning tree by using the transformation technique used for the loading problem as well as for the proof of correctness of Kruskal's method. We need to establish the following: (1) Prim's method results in a spanning tree whenever a spanning tree exists, and (2) the spanning tree generated is of minimum cost.

Let G be any weighted undirected graph (i.e., G is an undirected network). From Section 17.9.3, we know that an undirected graph has a spanning tree iff it is connected. Prim's method fails only if there are no edges connecting vertices in TV with those not in TV, a condition that arises only if the graph is not connected.

Now let us prove that the constructed spanning tree T is of minimum cost. Since G has a finite number of spanning trees, it has at least one of minimum cost. Let U be such a minimum-cost spanning tree. Both T and U have exactly n-1 edges. If T = U, then T is of minimum cost and we have nothing to prove. Therefore, assume that T != U. Let k, k < n-1, be such that the first k edges added to T are also in U and the k+1th edge added to T is not in U.

We shall show that T and U have the same cost by transforming U into T. This transformation will be done in at most n-1-k steps. At each step the value of k is increased by at least 1. Further, the cost of U will not change as a result of the transformation. Consequently, after at most n-1-k steps of transformation U will have the same cost as the initial U and will consist of exactly those edges that are in T. Therefore, T is of minimum cost.

Each step of the transformation involves adding to U one edge, e, from T and removing one edge, f, from U. The edges e and f are selected in the following way:
  1. Let e be the k+1th edge added to T. By definition of k, this edge is not in U. Let TV be the set of vertices in the tree just before edge e is added. Let R be the set of remaining vertices. e joins a vertex in TV and one in R.
  2. When e is added to U, a unique cycle is created. Let f be an edge, other than e, on this cycle that joins a vertex in TV with one in R. Such an f must exist as the edges on this cycle, other than e, form a path between a vertex in TV and one in R. Note that f is not one of the first k+1 edges added to T because at the time e, which is the k+1th edge added to T, is added there is no edge in T that joins a vertex in TV and one in R.


From the way e and f are selected, it follows that V = U + {e} - {f} is a spanning tree and that at least the first k+1 edges added to T are in V. We need to show that the cost of V is the same as that of U. Clearly, the cost of V is the cost of U plus the cost of edge e minus the cost of edge f. If the cost of e is less than the cost of f, then the spanning tree V has a smaller cost than the tree U, which is impossible. If e has a higher cost than f, then f would have been added to T before e by Prim's algorithm. But it was not. So edges e and f must have the same cost. Hence V has the same cost as U.

(b)
Let e be the number of edges in the graph. We may assume that e >= n - 1; otherwise, the graph is not connected and so has no spanning tree. It is possible to implement Prim's method so as to have complexity O(e + n log n). Such an implementation requires the use of the adjacency list representation of a graph as well as the data structure Fibonacci Heap which we have not studied in this book. An O(e log n) implementation is possible when the graph is represented using adjacency lists and we employ a min heap to assist in the selection of edges. When an adjacency matrix is used in conjunction with a min heap, the resulting implementation has complexity O(n2 + e log n). Replacing the use of the modified min heap with a Fibonacci heap will result in an O(n2) complexity when adjacency matrices are used and an O(e + n log n) complexity when adjacency lists are used. By using the strategy employed in the single-source all-destinations shortest paths solution of Program 17.3 (i.e., maintain a list of vertices not yet included in the spanning tree, on each round select the vertex that is closest to a selected vertex), we obtain an O(n2) implementation. This O(n2) implementation is the best implementation to use when the graph has O(n2) edges. When the number of edges is O(n), for example, it is advantageous to use adjacency lists together with a min heap. When the number of edges is O(n1.5), for example, it is advantageous to use adjacency lists together with a Fibonacci heap.

Regardless of whether we use a Fibonacci heap, a min heap, or a chain of unselected vertices, the basic implementation strategy is the same. We begin by adding vertex 1 to the set of selected vertices, that is the vertices in TV. We go through n-1 stages. In each of these, a new vertex is selected. This vertex is the one that is nearest to an already selected vertex (i.e., it is a vertex that is not in TV and is connected to a vertex in TV with the least cost edge). To make this selection, we define, for every vertex v not in TV, a distance which equals the cost of the least cost edge that joins this vertex to any vertex in TV. At each stage, the vertex with least distance is selected.

Suppose that vertex v is selected, its inclusion into TV may reduce the distance values of its adjacent vertices that are currently not in TV. So we need to perform the operations: select the minimum distance vertex and decrease some distance values. Alhough these operations are done best using a Fibonacci heap, they are done fairly efficiently using either (1) a min heap which is augmented by an array location to keep track of the location in heap[] of the distance value of a vertex or (2) a chain and a distance array as used in the shortest paths solution of Program 17.3.