Data Structures, Algorithms, & Applications in C++
Chapter 12, Exercise 39
Suppose we have four runs to merge. Let the lengths of these four runs be
10, 2, 5, and 2. One way to merge these four runs into a single run
is to first merge runs 1 and 2 at a cost of 10 + 2 = 12;
then merge runs 3 and 4 at a cost of 5 + 2 = 7; and finally
merge the result of the last two merges at a cost of 12 + 7 = 19.
This merge pattern can be depicted as an extended binary tree in which two
runs are merged at each node; the external nodes represent
the initial runs; and the weights of the external nodes are the
lengths of the runs to be merged.
The weighted external path length of this binary tree
is the total cost of merging the runs using the merge pattern
defined by the binary tree.
A minimum-cost merge pattern can be found by determining
a binary tree whose external weights are the run lengths and whose
weighted external path length is minimum. A Huffman tree with
wieghts equal to the initial run lengths has minimum external
path length. Therefore the Huffman tree defines the minimum-cost
merge pattern.