Data Structures, Algorithms, & Applications in C++
Chapter 18, Exercise 47
 (a)

We do this by using induction on
n
.
Induction Base
When n = 0
the binary tree has no internal
node and 1
external node.
For this tree E = I = n = 0
.
Therefore, E = I + 2n
.
Induction Hypothesis
Let m
be any integer >= 0
.
Assume that E = I + 2m
for all binary trees that
have m
internal nodes.
Induction Step
We will show that E = I + 2n
for all binary trees
that have m + 1
internal nodes.
Consider any binary tree T
that has
m + 1
internal nodes.
Remove any one of the internal nodes that is a leaf.
The resulting tree, T'
has
m
internal nodes.
From the induction hypothesis it follows that
E' = I' + 2m
where
E'
and
I'
are, respectively, the external and
internal path lengths of
T'
.
Suppose that the removed leaf was at level level
of T
.
It follows that
E = E' + level + 2
and that
I = I' + level
where
E
and
I
are, respectively, the external and
internal path lengths of
T
.
Therefore,
E = E' + level + 2
= I' + 2m + level + 2
= I  level + 2m + level + 2
= I + 2(m + 1)
 (b)

The internal path length of a binary tree that has
n
internl nodes is sum (i=1 to n) level_{i}
where level_{i}
is the length of the
path from the root to internal node i
.
In a complete binary tree one level_{i}
is
0
,
two level_{i}
s are
1
,
four level_{i}
s are
2
,
eight level_{i}
s are
4
, and so on.
Further, there is no binary tree that has more
level_{i}
s equal to
j
for any nonnegative
integer j
.
Therefore, a complete binary tree minimizes the internal path length.
 (c)

We will prove this equality by using induction on
n
.
Induction Base
When n = 0
the right side evaluates to
1*0  2^{1} + 2 = 0
,
which equals the internal path length of a binary tree
that has no internal nodes.
Induction Hypothesis
Let m
be any integer >= 0
.
Assume that the equality is correct for all binary trees that
have n = m
internal nodes.
Induction Step
We will show that I = (n+1)p  2^{p+1} + 2
for all binary trees
that have n = m + 1
internal nodes.
Consider any binary tree T
that has
m + 1
internal nodes.
Remove any one of the internal nodes that is a leaf.
The resulting tree, T'
has
m
internal nodes.
We consider the two cases
(1) p = floor(log_{2}(m+2)) =
floor(log_{2}(m+1))
and
(2) p = floor(log_{2}(m+2)) =
floor(log_{2}(m+1)) + 1
.
Case 1:
From the induction hypothesis it follows that the internal path length of
T'
is
(m+1)p  2^{p+1} + 2
. The internal path length
of
T
is p
more than that
of
T'
. Therefore, the internal path length of
T
is
(m+1)p  2^{p+1} + 2 + p
= (m+2)p  2^{p+1} + 2
.
Case 2:
From the induction hypothesis it follows that the internal path length of
T'
is
(m+1)(p1)  2^{p} + 2
. The internal path length
of
T
is p1
more than that
T'
. Furthermore,
2^{p} = m + 2
.
Therefore, the internal path length of
T
is
(m+1)(p1)  2^{p} + 2 + p  1
= (m+1)p  (m+1)  (m+2) + 2 + p  1
= (m+2)p  2(m+2) + 2
= (m+2)p  2^{p+1} + 2
 (d)

From part (a) we see that a binary tree than has minimum internal
path length also has minimum external path length. Therefore,
part (b) implies that complete binary trees have minimum external path length also.
Now from part (c) we get
(n+1)p  2^{p+1} + 2n + 2
as the minimum value of E
.
 (e)

From parts (a) and (b) it follows that a complete binary tree has
the smallest average external path. From part (d) we
see that this smallest average external path length is
((n+1)p  2^{p+1} + 2n + 2)/(n + 1) =;
p  2^{p+1}/(n + 1) + 2 = Omega(log n)
.
Therefore the average external path length of every binary tree
that has n
internal nodes is
Omega(log n)
.