------------------------------------------------------------ COP 3530 Data Structures and Algorithms M. Schmalz 990730 ------------------------------------------------------------ FINAL EXAM REVIEW OUTLINE: (consult also the Exam-1 and Exam-2 reviews) For each algorithm we have studied in this class, know: -- What the algorithm does that makes it different from other, similar algorithms (e.g., merge-sort versus quick-sort) EX.: Merge-sort is tree-structured while quick-sort can have an unbalanced tree structured. Both are sorting algorithms, but merge-sort does not have data-dependent complexity, while quick- sort does. -- Know the algorithm's best- and worst-case complexity, and the reason(s) for that complexity. EX.: Quick-sort has complexity O(n log(n)) when the tree is balanced (e.g., when the pivot of each subsequence equals the median of that subsequence), which is the best case. This occurs because there are O(log n) levels in the tree, with O(n) work required per level. The worst case of O(n^2) complexity occurs when the pivot is at the same end of each subsequence, since O(n) levels occur in the search tree, each of which has O(n) work. -- Be able to compare algorithm implementations in terms of complexity *figures*, in a quantitative manner. Q: Compare the complexity of queue insertion implemented with arrays or lists. A: Arrays: The array elements may have to be shifted forward or backward, to compensate for elements removed during previous dequeuing operations. This requires O(n) I/O operations, and two additions, where n denotes the array size. For list implementations of a simple queue, one dequeues from the head of the list and enqueues (inserts) at the rear of the list. Insertion thus requires O(1) list element insertion operations. Thus, the relative speedup of the list imple- mentation over the array implementation is O(n)/O(1) = O(n). FINAL EXAM TOPICS AND HINTS: 1) Study the class notes on selection, in particular: -- why are most non-tree-structured selection algorithms costly (i.e., O(n^2) complexity)? -- why are sorted sequences required for binary search? -- be able to reproduce two selection algorithms in p-code with a simple sketch as to how they work 2) Know your sorting algorithms!!! -- know how each sorting algorithm discussed in class works, to the extent that you can give a sketch of its operation (including p-code) -- be able to diagram the application of each algorithm to an eight-element vector that contains both positive and negative numbers -- be able to compare the complexity of any two or more sorting algorithms, and tell (in 1-2 sentences) what causes this complexity to occur in *each* algorithm -- why are most non-tree-structured sorting algorithms costly (i.e., O(n^2) complexity)? -- be able to compare and contrast the time and space complexities of any two sorting algorithms (space complexity means the amount of storage required -- this is a "thinking question", so check your algorithms to determine how many variables and array elements are actually required to do sorting in each case) -- be able to discuss, in one or two *clear* sentences, the advantages and disadvantages of each sorting algorithm -- finally, know (for each algorithm) the best- and worst-case input configuration, and why it is the best or worst case. Note that some algorithms' best and worst cases are the same, and know which algorithms have this behavior, and why. 3) Know basics of Arrays, Stacks, Queues, Graphs, and Trees -- what are they? -- what do they look like? -- how is the basic data structural element formed, (e.g., structure and parts), and what role does each part play? -- what are the basic ADT operations for a given ADT type? -- what is the complexity of each ADT operation for a given type of ADT representation (e.g., stacks implemented using arrays, or a graph represented by an adjacency list) Note: There may be one or two questions on the complexity of ADT operations for each of these structures, so know them well. 4) Remember the differences between Stacks and Queues, especially how one functions as a FIFO buffer, and the other as a LIFO buffer. -- For example, I might ask you to push values onto a stack, then pop them off, and also enqueue and dequeue the same values. 5) Study the class notes on heaps and heap-sort: -- know how the heap property makes heap-sort work efficiently -- be able to diagram the application of heap- sort to a given sequence of numbers that contains both positive and negative numbers -- be able to discuss, in one or two *clear* sentences, the advantages and disadvantages of heaps and heap-sort -- finally, know (for heap-sort) the best- and worst-case input configuration, and why it is the best or worst case. 6) Graphs and trees -- -- given a sequence of numbers, be able to build a binary tree, heap, or AVL tree from those numbers -- know the differences between the vertex and edge representations of directed and undir- ected graphs -- given a picture or mathematical description of a graph, be able to construct the (a) edge list, (b) adjacency matrix, or (c) adjacency list data structural representation of that graph. -- know the advantages and disadvantages of depth- and breadth-first traversal algorithms Important!! >> given a graph description or picture of a graph, be able to list the edges and vertices in the sequence that they are visited by depth- and breadth-first traversal starting from a given vertex. 7) Spanning Trees, Minimum Spanning Trees, and Shortest- Path Algorithms -- Know each algorithm and what it does, including Prim's, Kruskal's, Warshall's, Floyd's, and Dijkstra's algorithms, which we discussed in class in detail. -- Know the complexity of each algorithm, as well as in what situations and why the algorithm has that particular complexity -- Given the description of a graph, be able to con- struct (a) minimum spanning tree, (b) the shortest path between two specified vertices, and (c) the shortest path from a given vertex to all other vertices. 8) Hashing and Hash Table Operations -- What is a hash table, and what is the difference between open and closed hashing? -- What is a hash function, and why is it useful? -- What is the collision problem in hashing? What are some strategies that help reduce the incidence of collision in closed hashing schemes? -- Know the ADT operations on hash tables, and their complexities, as well as how they are implemented. -- Know what is universal hashing, and how it improves upon the rehashing schemes we discussed for closed hashing. You do not have to know the mathematics in detail or do proofs. 9) Performance Measurement -- What is the difference between clock time, CPU time, and I/O time, and why is it important to measure each of these times? Hint: Think of the differ- ence between compute- and I/O-bound applications. -- Know how to insert timing calls to measure total execution time, overhead due to loops or function calls, and time required to execute a kernel with- in a loop. It was very nice having you as students -- this was one of the best and most attentive classes I have taught. GOOD LUCK ON YOUR EXAM, AND HAVE A PLEASANT INTERSEMESTER BREAK!!! --EOF--