Some adaptations were made to write mathematical notation on the keyboard:
1. "a^b" means "a to the power b" or "a with superscript b".
2. "a_b" means "a with the subscript b".
3. "<=" means "less than or equal to".
4. "Theta" is the Greek capital theta.
Purposes of the Course
- Learn to estimate and use information on running time:
This is covered in Chapters 1-4 and in Appendix B. There
are a couple of points that could be emphasized more. First,
worst-case run time can be more important than average-case run time
if the algorithm is used in a time-critical application such
as controlling a robot's response to potentially dangerous situations. Second,
although our study of average-case run times assumes all initial
situations are equally likely, this assumption may fail.
In that case, we can be misled. For example, the average
running time for Quicksort is better than for Mergesort, but if
the input is frequently already sorted, then Quicksort may be slower.
- Learn to recognize different types of algorithms and their
strengths and weaknesses: Some information on this
is given at the start of the notes for Chapters 2-4.
- Be introduced to the subject of obtaining lower bounds for
problems: See Chapters 7 and 8.
- Learn some things about designing algorithms:
Mostly this is learning by studying algorithms. In general,
designing an efficient algorithm for a new problem is difficult. There
are some general principles to keep in mind.
- Try to solve the problem by thinking in terms of an algorithm
type.
- Be open to the idea of keeping track of additional data in
designing an algorithm. Examples where this is needed:
Exercises 2.42 and 3.33, the D^(k)[i][j]
arrays in Floyd's algorithm (without them, there's no smaller
problem to build on for dynamic programming!), the touch[ ] array in Dijkstra's algorithm
(although this could be considered just a matter of efficiency).
- If numbers are involved, sorting may improve efficiency. (Kruskal's
algorithm sorts edges, sorting is done in the 0-1 Knapsack problem
to obtain an order for making decisions, and there are many other
examples.)
- Keeping track of additional data may improve efficiency.
- Appropriate data structures may improve efficiency.
This may be as simple as a sorted list (noted earlier), or may
be more complex as in Kruskal's algorithm.
Brief Notes on Neapolitan and Naimipour Text
You can step through some of the algorithms
in the text by going to http://students.ceid.upatras.gr/~papagel/, clicking on "Algorithms Tutoring Web Page",
and choosing an algorithm. (Thanks to Long Huynh, 188 student,
Spring 2000.)
- page vii: Read to learn about the nature
of the text.
- page viii: Read to get a quick overview of the
text.
Chapter 1
- page 26: You can do arithmetic with O(...)
notation beyond what the text discusses. I'll
discuss it very briefly in class since it appears in the literature.
- page 37: Also useful: If g(n) is in o(f(n)), then f(n) + rg(n)
is in Theta(f(n)) for all real r. (#7 in the text can only be used
when r > 0.) For example, n^2 - 5n is in Theta(n^2); this follows
with r = -5, a case #7 does not cover.
- page 37: Also useful: n! is in
Theta((n/e)^(n+1/2)).
- For analyzing algorithms, it is useful to know that
1^k + 2^k + ...+ n^k is a polynomial
of degree k+1 whose leading term is
n^(k+1)/(k+1). This is true even if
we omit some terms at the start of the sum and add or omit some
at the end. For example, the sum could start at 3^k
and end at (n+1)^k. It follows that the sum
is n^(k+1)/(k+1) plus something
in o(n^(k+1)). To illustrate, if we
must sum k(n-k) for k from 1 to n, we can think of this as
"n times the sum of k" minus "the sum of k^2". This
gives us n(n^2/2) - n^3/3 = n^3/6 plus something in o(n^3). Hence
the sum is in Theta(n^3).
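A quick numeric check of this estimate (a sketch in Python; the function name is ours):

    # Compare the exact sum of k(n-k) for k = 1..n with the n^3/6 estimate.
    def exact_sum(n):
        return sum(k * (n - k) for k in range(1, n + 1))

    for n in (10, 100, 1000):
        print(n, exact_sum(n), n**3 / 6)
    # The ratio of the two columns tends to 1, so the sum is in Theta(n^3).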
Chapter 2
- Overview of Divide and Conquer: Works well when the
problem can be divided into a few significantly smaller problems
and the solutions combined into a solution for the original problem
without extensive effort. Determining the run-time complexity class
leads to a recursion. Obtaining a recursion is often easy, but
not always. The divide and conquer approach is referred to as
top down.
- page 51: Be careful in reading the analysis of
the binary search algorithm. When n is even
(e.g., a power of 2), the list is split into two unequal pieces,
with the right-hand piece being the larger and containing exactly
n/2 elements.
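A minimal sketch of the split (assuming 1-based indexing with mid = floor((low+high)/2), as in the text's pseudocode):

    # For even n, the piece above mid has exactly n/2 items and the
    # piece below mid has n/2 - 1, so the right-hand piece is larger.
    def split_sizes(n):
        low, high = 1, n
        mid = (low + high) // 2
        return mid - low, high - mid   # (left size, right size)

    for n in (2, 4, 8, 16):
        print(n, split_sizes(n))       # e.g. 8 -> (3, 4)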
- Sec. 2.7: One problem with mostly theoretical
calculations as done in this section is that you may leave something
out (for example, overhead in calling a routine) or you may miscalculate
the time for the basic operation. An alternative approach
is to get actual times if the operating system allows you to
do that. Fortunately, finding the exact value
of n at which the threshold occurs is seldom important
because the times for the two algorithms are usually close together
for a range of values of n near the threshold. You
can see this in Table 2.5 on page 82.
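If you do want actual times, something like the following rough sketch can work (a hypothetical comparison of insertion sort with merge sort; the crossover point depends on the machine and the code):

    import random, time

    def insertion_sort(a):
        for i in range(1, len(a)):
            key, j = a[i], i - 1
            while j >= 0 and a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = key

    def merge_sort(a):
        if len(a) <= 1:
            return a
        mid = len(a) // 2
        left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i]); i += 1
            else:
                out.append(right[j]); j += 1
        return out + left[i:] + right[j:]

    reps = 1000
    for n in (8, 16, 32, 64, 128):
        data = [random.random() for _ in range(n)]
        t0 = time.perf_counter()
        for _ in range(reps):
            insertion_sort(data[:])    # copy each time; both loops pay for it
        t1 = time.perf_counter()
        for _ in range(reps):
            merge_sort(data[:])
        t2 = time.perf_counter()
        print(n, (t1 - t0) / reps, (t2 - t1) / reps)

On typical machines the two times are close over a range of n, illustrating the point above.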
Chapter 3
- Overview of Dynamic Programming: Works well
when the solution can be built up step by step from solutions to smaller
problems, frequently using the principle of optimality (p. 103).
The approach is referred to as bottom up. In contrast to divide
and conquer, dynamic programming code contains loops instead
of recursive calls. Analysis is usually fairly easy.
- Section 3.1: Algorithms 3.1 and 3.2 can be improved
for large k by using the first equality in the section
to note that the binomial coefficient is unchanged when k
is replaced by n-k, whence we may assume that k <= n/2.
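A sketch of the improved computation (our own rendering in Python, not the book's pseudocode):

    # Compute C(n, k) bottom up with a single row, first replacing k by
    # n - k so that k <= n/2, which cuts the work when k is large.
    def binomial(n, k):
        k = min(k, n - k)              # C(n, k) = C(n, n-k)
        if k < 0:
            return 0
        row = [1] + [0] * k            # row[j] holds C(i, j) for current i
        for i in range(1, n + 1):
            for j in range(min(i, k), 0, -1):  # right to left reuses the row
                row[j] += row[j - 1]
        return row[k]

    print(binomial(10, 8))             # 45, computed as C(10, 2)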
- page 91: How bad can this algorithm be? The
binomial coefficient is largest when k is the floor of
n/2. It can be shown that the algorithm computes
Theta(2^n/sqrt(n))
binomial coefficients. Compare with the worst case of Algorithm 3.2.
- Section 3.2: It may have been nicer if the superscripts
on D[i,j] were one larger because:
- The superscript would be the maximum number of edges allowed
in looking for shortest paths from i to j.
- It would agree with the notation in graph theory.
- It would agree with the notation in matrix theory because
then D^(k) = W^k
where W is thought of as a matrix but matrix multiplication
is done differently: "times" becomes "plus"
and "plus" becomes "minimum". For
example, in A x C,
a[i,1]c[1,j] + ... + a[i,n]c[n,j]
becomes
min(a[i,1]+c[1,j], ..., a[i,n]+c[n,j]).
- page 97: In the first line of Example 3.2, "exemplary"
has one of its secondary meanings "serving as a model or
example", not its primary meaning "commendable".
- page 99: Before considering Floyd's algorithm,
we should look at the naive one: Compute the powers of W
by the modified matrix product discussed above. If
we do that the naive way, there are n-1 matrix products
and each product takes Theta(n^3), giving a Theta(n^4)
every-case running time. Less naive computation of powers
(repeated squaring) requires only the ceiling of lg(n-1)
multiplications, and so the every-case running
time is in Theta(n^3 lg n).
Floyd's algorithm is faster.
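A sketch of the less naive computation (our own rendering; INF marks a missing edge and W[i][i] = 0):

    INF = float('inf')

    def min_plus_product(A, C):
        # The modified matrix product: "times" is +, "plus" is min.
        n = len(A)
        return [[min(A[i][k] + C[k][j] for k in range(n))
                 for j in range(n)] for i in range(n)]

    def all_pairs_by_squaring(W):
        n = len(W)
        D, length = W, 1
        while length < n - 1:          # W^m is stable once m >= n-1
            D = min_plus_product(D, D) # each squaring doubles the bound
            length *= 2
        return D

    W = [[0, 3, INF],
         [INF, 0, 1],
         [2, INF, 0]]
    print(all_pairs_by_squaring(W))    # [[0, 3, 4], [3, 0, 1], [2, 5, 0]]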
- page 100: The following may help you see why Floyd's
algorithm is correct: The "obvious" way to find
the shortest path is to compute longer and longer paths; that
is, start with D^(0) = W and compute D^(1), D^(2),
... as indicated on pages 97-99. Here the superscript
indicates the maximum possible number of internal vertices. Instead,
we can compute another array (also called D),
but now the superscript will indicate the maximum possible value
for a vertex subscript. For example, D^(4)[i, j] is the shortest
path from v_i to v_j where the only
vertices allowed between v_i and v_j
are v_1, v_2, v_3,
and v_4. This is what
Floyd's algorithm is doing, where k = 4
in the example I just gave. The algorithm is able to store
the D^(k) arrays
on top of one another because
D^(k)[i,j] = min(D^(k-1)[i,j], D^(k-1)[i,k] + D^(k-1)[k,j])
and also
D^(k-1)[i,k] = D^(k)[i,k] and D^(k-1)[k,j] = D^(k)[k,j].
Hence it does not matter whether we use D^(k-1) or D^(k)
in the minimum computation in Floyd's algorithm. As with
the "obvious" approach, D^(0) = W because there can
be no intermediate vertices if no subscript can exceed 0.
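A minimal sketch of the algorithm (our own rendering), showing the single array overwritten in place, which the identities above justify:

    INF = float('inf')

    def floyd(W):
        n = len(W)
        D = [row[:] for row in W]      # D starts as D^(0) = W
        for k in range(n):             # now allow v_1, ..., v_{k+1} inside paths
            for i in range(n):
                for j in range(n):
                    if D[i][k] + D[k][j] < D[i][j]:
                        D[i][j] = D[i][k] + D[k][j]
        return D

    W = [[0, 3, INF],
         [INF, 0, 1],
         [2, INF, 0]]
    print(floyd(W))                    # [[0, 3, 4], [3, 0, 1], [2, 5, 0]]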
Chapter 4
- Overview of Greedy: Greedy algorithms are based
on making the best choice at the current stage without regard
for the future. They are usually simpler to visualize
and implement than divide and conquer or dynamic programming
algorithms; however, it is often harder to prove that they are
correct.
- page 144: Prim's algorithm uses the extra data structures
to reduce running time. We could use a simpler approach,
but its running time is of order nm, where m is the number
of edges and so n-1 <= m <= n(n-1)/2,
which can be worse than Prim's algorithm. Here is
the algorithm (a code sketch follows the steps):
- Start with Y = {v_1}
and F the empty set.
- Examine all edges and select an edge e with one end
in Y and one end not in Y such that the weight
of e is as small as possible. Add e to F
and its ends to Y (one end is already in Y).
- While Y is not all vertices, repeat the previous step.
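A sketch of this simpler approach (our own rendering; edges are given as (weight, u, v) triples, vertices are numbered from 1, and the graph is assumed connected):

    def simple_prim(n, edges):
        Y = {1}                        # start with Y = {v_1}
        F = []                         # F starts as the empty set
        while len(Y) < n:              # until Y holds all vertices
            best = None
            for w, u, v in edges:      # scan every edge: Theta(m) per pass
                if (u in Y) != (v in Y):           # exactly one end in Y
                    if best is None or w < best[0]:
                        best = (w, u, v)
            w, u, v = best
            F.append((u, v))
            Y.add(u); Y.add(v)         # one end was already in Y
        return F

    edges = [(1, 1, 2), (3, 1, 3), (2, 2, 3), (5, 3, 4), (4, 2, 4)]
    print(simple_prim(4, edges))       # [(1, 2), (2, 3), (2, 4)]

There are n-1 passes and each pass scans all m edges, giving the order nm bound mentioned above.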
- page 167: The example becomes worse for the greedy-algorithm
thief if the 10 pound item is worth $75: The value
of what he steals goes down even though one of
the items he could steal goes up in value!
Chapter 5
- Overview of Backtracking: This is a general
purpose approach. Like most general purpose tools, it is usually
better to use a more specialized tool (Chapters 2-4) if any is
applicable. Running time can be very sensitive to
the order in which choices are made and to the nature of the promising() function.
Proving correctness is usually fairly simple since one is simply
examining all possibilities except those that have been ruled
out as being impossible. The backtracking may be a source of
programming errors if one is not careful.
- Section 5.3: Monte Carlo methods are used to
estimate the running time of a backtracking algorithm on a specific
problem. This method does not let you determine
the complexity class of the algorithm. Determining
the complexity class of a backtracking algorithm is often quite
difficult and is beyond the scope of this course. On the
other hand, complexity class information is usually not of interest
since the running time of most backtracking programs increases very
rapidly with problem size. (The growth is frequently
exponential.)
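For concreteness, here is a Knuth-style estimator sketched in Python (our own illustration using n-queens; it may differ in detail from the algorithm in Section 5.3): follow one random path of promising nodes, multiplying the branch counts as you go, and average many runs.

    import random

    def promising(cols, col):
        # cols[r] is the column of the queen in row r; test row len(cols).
        row = len(cols)
        return all(c != col and abs(c - col) != row - r
                   for r, c in enumerate(cols))

    def estimate_nodes(n):
        count, product, cols = 1, 1, []     # count the root
        while len(cols) < n:
            children = [c for c in range(n) if promising(cols, c)]
            if not children:
                break
            product *= len(children)        # estimated nodes one level down
            count += product
            cols.append(random.choice(children))
        return count

    n = 8
    trials = [estimate_nodes(n) for _ in range(1000)]
    print(sum(trials) / len(trials))        # the Monte Carlo estimate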
Appendix A
- page 435: induction base is also
called base case, basis, and base
step.
- page 443: There is a simple device for remembering
how to deal with products and quotients of logarithms:
Think of log_a(b) as b/a. You're
allowed to cancel equal factors from numerator and denominator
and to replace ?/(x/y) by ?(y/x),
but do not do any actual division. (Here ? stands
for any expression.) Convert the result back to logarithms. (Factors
of 1 can be dropped and an answer 1/1 means the expression
equals 1.)
Example: Think of log_b(x)/log_b(a)
as (x/b)/(a/b) = x/a,
which corresponds to log_a(x). See
p.443 #7.
Example: Think of ln n as
n/e = (n/2)(2/e),
which corresponds to (ln 2)(lg n). See
p.66.
Example: Think of log_a(b) as
b/a = 1/(a/b), which corresponds
to 1/log_b(a). Thus log_a(b)
= 1/log_b(a).
Example: Think of ln(10)lg(e)log_10(2)
as (10/e)(e/2)(2/10) = 1/1, so
the product of logarithms is 1.
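A quick numeric check of these identities (a sketch; math.log(x, b) is log_b(x)):

    import math

    # log_b(x) / log_b(a) = log_a(x)
    print(math.log(100, 2) / math.log(7, 2), math.log(100, 7))
    # log_a(b) = 1 / log_b(a)
    print(math.log(7, 3), 1 / math.log(3, 7))
    # ln(10) lg(e) log_10(2) = 1
    print(math.log(10) * math.log2(math.e) * math.log10(2))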
- page 463: The square brackets [ ] in the exercises
for section A.7 should be parentheses -- these are binomial coefficients.
Appendix B
- page 465: The title of Section B.1 may be somewhat
misleading. What it discusses is proving that the solution
to a recurrence is correct once you have somehow found the solution.
- page 469: The order of a recursion
is the difference between the largest and smallest subscripts
on t. Thus the order of the recursion at the bottom
of the page is k. Such a recursion requires
exactly k initial conditions; that is, it requires
the first k values of t_i. ("Initial
condition" is defined on page 466.)
- page 472: The order of the characteristic equation
equals the order of the recursion. (See previous note.)
- page 478: Examples B.14 and B.15 may be misleading. If
the recursion has order l (see note for p.469) and the
number of initial conditions exceeds l, only the last
l values should be used. To illustrate, suppose
t_0=1 in Example B.14. The recursion would
give the same values of t_k for k>0, but the
method of solution would not work. The correct approach
is described in Example B.16 on page 481: Use l
initial values of t_k that are given with the problem
and then use the recursion to generate however many more t_k
values are needed. For Theorem B.3 on page 479, the
number of additional values that are needed is d+1. To
illustrate: Example B.14 should have only the one initial
condition t_0=0. Since p(n)=1, we
have d=0. Thus the recursion would then be used
to compute one additional value of t, namely t_1.
- page 486: The method in Section B.3 involves repeated
substitution to convert the recurrence into a summation which
you can evaluate (Ex. B.21) or approximate (Ex. B.22).
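A generic illustration of the method (our own example, not the book's): t(n) = t(n-1) + n with t(1) = 1 unwinds by repeated substitution to t(n) = n + (n-1) + ... + 1 = n(n+1)/2. A quick check:

    def t(n):
        # The recurrence, evaluated directly.
        return 1 if n == 1 else t(n - 1) + n

    for n in (1, 5, 10, 50):
        print(n, t(n), n * (n + 1) // 2)   # the last two columns agree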
- page 490: In the definition of smooth,
2n can be replaced by any kn for any integer
k > 1.
- page 490: The term smooth is unfortunate
since it is misleading and conflicts with the usage in mathematical
analysis. We think of smooth as describing the graph of
the function and so the exponential function 2^n would
be smooth -- but it is not (Ex. B.24). On the
other hand, 2^[lg n] is smooth, but its graph is not.
([x] is the floor of x.) "Smooth" in this
context really means that the function f(n) is
not growing too fast and is not bouncing around. Better
terminology might be moderately growing or almost
polynomial since a smooth function is O(n^k)
for some large enough k and since polynomials are eventually
nondecreasing.
- page 490: Theorem B.4 requires that you solve
the recursion for n a power of b. You can
usually avoid doing this by using Theorem B.6 or the improvement
of it noted below.
- page 493: Theorem B.6 can be improved in two ways:
(a) The initial condition T(s) = d need not be specified.
(b) cn^k can be replaced by g(n) in Theta(f(n)) where g(n)
is eventually positive and f(n) is smooth.
Then:
(1) If f(n) = n^k, the conclusions of the theorem hold.
(2) Otherwise, in (B.5), the first case becomes Theta(f(n)),
the second case is too complicated to state here, and the third
case is unchanged, where we define k to be
the limit of log f(n)/log n as n goes to infinity.