Data Structures review outline

For each topic, the key questions are: what? and how?

Readings: See the latest info page (Sakai home page) for specific chapter/section links to the ODS textbook, notes, and Sedgewick slides.

Interfaces and their supporting implementations

Stack: add(x)/remove() or push(x)/pop() (LIFO)
1. ArrayStack or ArrayDeque: amortized worst case O(1).
2. Singly or Doubly linked lists: worst case O(1)
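As a rough illustration of the linked-list option above, here is a minimal sketch of a stack on a singly linked list. The names `Node` and `LinkedStack` are made up for this example and are not taken from the course code.

```python
class Node:
    """One cell of a singly linked list."""
    def __init__(self, x, next=None):
        self.x = x
        self.next = next


class LinkedStack:
    """LIFO stack: push and pop touch only the head, so each is O(1) worst case."""
    def __init__(self):
        self.head = None
        self.n = 0

    def push(self, x):
        # The new node becomes the head and points at the old head.
        self.head = Node(x, self.head)
        self.n += 1

    def pop(self):
        if self.head is None:
            raise IndexError("pop from empty stack")
        x = self.head.x
        self.head = self.head.next
        self.n -= 1
        return x
```

Pushing 1, 2, 3 and then popping three times should return the items in reverse order of insertion.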
Queue: add(x)/remove() or enqueue(x)/dequeue() (FIFO)
1. ArrayDeque: amortized worst case O(1).
2. Singly or Doubly linked lists: worst case O(1)
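The array-based option can be sketched as a circular buffer that is rebuilt when it is full or mostly empty. This is only a sketch under assumptions: the class name `ArrayQueue`, the field names, and the exact resize thresholds (grow when full, rebuild at capacity 2n when the array is at least three times the item count) are choices made for this example.

```python
class ArrayQueue:
    """FIFO queue in a circular array; a[(j + i) % len(a)] holds item i."""
    def __init__(self):
        self.a = [None]
        self.j = 0   # index of the front item
        self.n = 0   # number of items stored

    def _resize(self):
        # Copy into a fresh array of capacity 2n (at least 1), front at index 0.
        b = [None] * max(1, 2 * self.n)
        for i in range(self.n):
            b[i] = self.a[(self.j + i) % len(self.a)]
        self.a, self.j = b, 0

    def enqueue(self, x):
        if self.n == len(self.a):
            self._resize()
        self.a[(self.j + self.n) % len(self.a)] = x
        self.n += 1

    def dequeue(self):
        if self.n == 0:
            raise IndexError("dequeue from empty queue")
        x = self.a[self.j]
        self.a[self.j] = None
        self.j = (self.j + 1) % len(self.a)
        self.n -= 1
        if len(self.a) >= 3 * self.n:
            self._resize()
        return x
```

Each resize costs O(n), but it can only happen after a proportional number of cheap operations, which is where the amortized O(1) bound comes from.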
Deque: addFirst(x)/removeFirst(), addLast(x)/removeLast()
1. ArrayDeque: amortized worst case O(1).
2. Doubly linked lists: worst case O(1)
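For the doubly linked option, one common trick is a circular list with a single dummy node, which removes the special cases at both ends. The sketch below assumes that design; `DLDeque` and its method names are illustrative only.

```python
class DLDeque:
    """Deque on a circular doubly linked list with one dummy (sentinel) node."""

    class _Node:
        def __init__(self, x=None):
            self.x = x
            self.prev = self
            self.next = self

    def __init__(self):
        self.dummy = DLDeque._Node()
        self.n = 0

    def _insert_before(self, w, x):
        # Splice a new node holding x in front of node w.
        u = DLDeque._Node(x)
        u.prev, u.next = w.prev, w
        u.prev.next = u
        u.next.prev = u
        self.n += 1

    def _unlink(self, w):
        if w is self.dummy:
            raise IndexError("remove from empty deque")
        w.prev.next = w.next
        w.next.prev = w.prev
        self.n -= 1
        return w.x

    def add_first(self, x):
        self._insert_before(self.dummy.next, x)

    def add_last(self, x):
        self._insert_before(self.dummy, x)

    def remove_first(self):
        return self._unlink(self.dummy.next)

    def remove_last(self):
        return self._unlink(self.dummy.prev)
```

All four operations change a constant number of pointers, so each is O(1) in the worst case.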
Priority Queue: add(x,p)/extractMin() ...decreaseKey(x,new_p)
1. Binary Heap, worst case O(log(n)) per op.
2. Meldable Heap, expected case O(log(n)) per op.
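A minimal sketch of the binary heap option follows, storing the tree implicitly in an array. The class and method names are assumptions for this example, and decreaseKey is left out to keep it short.

```python
class BinaryHeap:
    """Min-heap in an array: children of index i live at 2i+1 and 2i+2."""
    def __init__(self):
        self.a = []  # list of (priority, item) pairs

    def add(self, x, p):
        self.a.append((p, x))
        self._bubble_up(len(self.a) - 1)

    def extract_min(self):
        if not self.a:
            raise IndexError("extract_min from empty heap")
        top = self.a[0]
        last = self.a.pop()
        if self.a:
            # Move the last leaf to the root, then restore the heap property.
            self.a[0] = last
            self._trickle_down(0)
        return top[1]

    def _bubble_up(self, i):
        a = self.a
        while i > 0:
            parent = (i - 1) // 2
            if a[i][0] >= a[parent][0]:
                break
            a[i], a[parent] = a[parent], a[i]
            i = parent

    def _trickle_down(self, i):
        a = self.a
        n = len(a)
        while True:
            smallest = i
            for c in (2 * i + 1, 2 * i + 2):
                if c < n and a[c][0] < a[smallest][0]:
                    smallest = c
            if smallest == i:
                break
            a[i], a[smallest] = a[smallest], a[i]
            i = smallest
```

Both helper loops move along a single root-to-leaf path, whose length is at most about log2(n), which gives the O(log(n)) bound per operation.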
USet: add(x)/remove(x)/find(x). find is exact or fail.
1. hash table (chained or with linear probing)
2. careful choice of hash function is critical to performance
3. O(1) expected performance per op with randomized hash function.
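The chained variant with a randomized hash function might look roughly like the sketch below. It assumes multiplicative hashing with a random odd multiplier over 32-bit words, and it only grows the table (shrinking is omitted); the names `ChainedHashTable`, `W`, and `_grow` are illustrative.

```python
import random

W = 32  # word size assumed for the hash arithmetic


class ChainedHashTable:
    """USet with chaining; the table has 2**d buckets."""
    def __init__(self):
        self.d = 1
        self.t = [[] for _ in range(1 << self.d)]
        self.n = 0
        self.z = random.getrandbits(W) | 1  # random odd multiplier

    def _bucket(self, x):
        # Multiplicative hashing: keep the top d bits of (z * hash(x)) mod 2**W.
        h = ((self.z * hash(x)) & ((1 << W) - 1)) >> (W - self.d)
        return self.t[h]

    def find(self, x):
        for y in self._bucket(x):
            if y == x:
                return y
        return None  # exact match or fail

    def add(self, x):
        if self.find(x) is not None:
            return False
        if self.n + 1 > len(self.t):
            self._grow()
        self._bucket(x).append(x)
        self.n += 1
        return True

    def remove(self, x):
        b = self._bucket(x)
        for i, y in enumerate(b):
            if y == x:
                del b[i]
                self.n -= 1
                return y
        return None

    def _grow(self):
        # Double the number of buckets and rehash every stored item.
        self.d += 1
        old = self.t
        self.t = [[] for _ in range(1 << self.d)]
        for bucket in old:
            for y in bucket:
                self._bucket(y).append(y)
```

Keeping the number of buckets at least as large as the number of items keeps the expected chain length constant, which is what the O(1) expected bound relies on.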
SSet: add(x)/remove(x)/find(x). find is exact or next larger.
Also browsable: first(), next(), prev(), last().
1. Binary Search Tree
2. Plain (unbalanced) Search Tree performs well on average, O(log(n)) per op, but degrades to O(n) per op in the worst case (e.g. keys added in sorted order).
3. Treap has expected performance O(log(n)) per op.
4. 2-3-4 tree has worst case performance O(log(n)) per op.
5. Red-Black tree is best implementation of 2-3-4 idea.
6. Use B-tree when disk storage will be involved (b-1 to 2b-1 keys per node).
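The "exact or next larger" behaviour of SSet find can be sketched on a plain, unbalanced binary search tree. The function names `bst_add` and `bst_find` are made up for this example, and `None` is assumed as the "nothing larger" result.

```python
class BSTNode:
    def __init__(self, x):
        self.x = x
        self.left = None
        self.right = None


def bst_add(root, x):
    """Insert x if absent; returns the root (new if the tree was empty)."""
    if root is None:
        return BSTNode(x)
    u = root
    while True:
        if x < u.x:
            if u.left is None:
                u.left = BSTNode(x)
                break
            u = u.left
        elif x > u.x:
            if u.right is None:
                u.right = BSTNode(x)
                break
            u = u.right
        else:
            break  # already present
    return root


def bst_find(root, x):
    """Return the smallest stored value >= x, or None if every value is < x."""
    best = None
    u = root
    while u is not None:
        if x < u.x:
            best = u.x   # candidate for "next larger"; look for a closer one
            u = u.left
        elif x > u.x:
            u = u.right
        else:
            return u.x
    return best
```

The same search loop works unchanged on a Treap or Red-Black tree, since those differ only in how add and remove rebalance.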
List: add(i,x),remove(i), get(i), set(i,x).
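1. O(log(n)) per op (worst case with a balanced tree, expected with a Treap): can add List ops to SSet implementations (e.g. our Treap exercise for get(i)).
2. Mix of great and horrible: ArrayDeque gives O(1) get and set, but add and remove can take O(n) in the worst case because elements must be shifted.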
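One way to support get(i) on a search tree is to store subtree sizes in the nodes. The sketch below assumes that augmentation and, to stay short, builds a balanced tree directly from a list instead of maintaining sizes during add and remove; `SizedNode`, `build`, and `get` are hypothetical names.

```python
def size(u):
    return 0 if u is None else u.size


class SizedNode:
    def __init__(self, x, left=None, right=None):
        self.x = x
        self.left = left
        self.right = right
        self.size = 1 + size(left) + size(right)


def build(items, lo=0, hi=None):
    """Build a balanced tree whose in-order sequence is items[lo:hi]."""
    if hi is None:
        hi = len(items)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    return SizedNode(items[mid], build(items, lo, mid), build(items, mid + 1, hi))


def get(u, i):
    """Return the item at in-order position i (0-based)."""
    while u is not None:
        k = size(u.left)        # number of items before u in this subtree
        if i < k:
            u = u.left
        elif i > k:
            i -= k + 1          # skip the left subtree and u itself
            u = u.right
        else:
            return u.x
    raise IndexError("index out of range")
```

get follows one root-to-leaf path, so its cost is proportional to the height of the tree.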
Sorting
1. InsertionSort: incremental, not scalable, O(n²), but fast on very small arrays.
2. Divide and conquer approaches
  1. MergeSort: split in half, recursively sort halves, merge. Worst case O(n*log(n)), not "in place".
  2. QuickSort: partition on a random "pivot", recursively sort smalls and bigs. Expected case O(n*log(n)).
3. HeapSort: (1) build a binary heap (2) extractMin, one by one. In place, O(n*log(n)) worst case.
4. Of the above: best in practice is QuickSort.
5. IntrospectiveSort: QuickSort, with InsertionSort under a size threshold and HeapSort over a depth-of-recursion threshold. In place, worst case O(n*log(n)), practical performance of QuickSort.
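The two divide and conquer sorts can be sketched as follows. This is one possible rendering, not the course's reference code: `merge_sort` returns a new list, and `quick_sort` assumes a three-way partition around a randomly chosen pivot.

```python
import random


def merge_sort(a):
    """Return a new sorted list; not in place (the merge uses extra storage)."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def quick_sort(a, lo=0, hi=None):
    """Sort a[lo:hi] in place using a random pivot."""
    if hi is None:
        hi = len(a)
    if hi - lo <= 1:
        return
    pivot = a[random.randrange(lo, hi)]
    # Invariant: a[lo:p] < pivot, a[p:j] == pivot, a[q:hi] > pivot.
    p, j, q = lo, lo, hi
    while j < q:
        if a[j] < pivot:
            a[p], a[j] = a[j], a[p]
            p += 1
            j += 1
        elif a[j] > pivot:
            q -= 1
            a[j], a[q] = a[q], a[j]
        else:
            j += 1
    quick_sort(a, lo, p)   # the "smalls"
    quick_sort(a, q, hi)   # the "bigs"
```

The random pivot is what makes the O(n*log(n)) bound an expectation over the algorithm's own coin flips and not an assumption about the input order.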
Graph: adjacency matrix or array of adjacency lists.
1. Terminology: vertex, edge, path, directed or undirected, weighted or not, connected. Graph G has V vertices, E edges.
2. SSSP: Single Source Shortest Path problem.
3. Breadth First Search, good for SSSP on unweighted graph, uses queue. Worst case time O(E) on connected graph.
4. Dijkstra's algorithm, good for SSSP on positively weighted graph, uses priority queue. Worst case time O(E*log(V)) on connected graph.
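Both SSSP algorithms can be sketched on an array of adjacency lists. The function names and the graph encoding are assumptions for this example; the Dijkstra sketch uses Python's `heapq` and skips stale queue entries instead of calling decreaseKey, which is a substitute for the priority queue interface listed earlier.

```python
import heapq
from collections import deque


def bfs_distances(adj, s):
    """adj[u] lists the neighbours of u; returns hop counts from s (None if unreachable)."""
    dist = [None] * len(adj)
    dist[s] = 0
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if dist[v] is None:      # first visit is along a shortest path
                dist[v] = dist[u] + 1
                q.append(v)
    return dist


def dijkstra(adj, s):
    """adj[u] lists (v, weight) pairs with weight > 0; returns distances from s."""
    dist = [float("inf")] * len(adj)
    dist[s] = 0
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue                 # stale entry: u was already settled with a shorter distance
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist
```

With lazy deletion the queue can hold up to E entries, so each queue operation costs O(log(E)), which is still O(log(V)) up to a constant factor.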
Internal tools used by various data structures.
1. resize(), for dynamic arrays. Double the capacity when full; shrink when only 1/3 full (in ODS, to capacity 2n, so the array is again half full).
2. rotate (left and right)
3. bubble up, trickle down
4. insert
5. partition
6. merge
7. borrow
8. split
9. color flip
10. ...
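As one example from the list above, rotations can be sketched on bare tree nodes without parent pointers. `TNode`, `rotate_right`, `rotate_left`, and `in_order` are illustrative names; each rotation returns the new subtree root, which the caller is assumed to reattach.

```python
class TNode:
    def __init__(self, x, left=None, right=None):
        self.x = x
        self.left = left
        self.right = right


def rotate_right(u):
    """Lift u.left above u; returns the new subtree root."""
    w = u.left
    u.left = w.right
    w.right = u
    return w


def rotate_left(u):
    """Lift u.right above u; returns the new subtree root."""
    w = u.right
    u.right = w.left
    w.left = u
    return w


def in_order(u):
    """List the values in sorted (in-order) sequence."""
    return [] if u is None else in_order(u.left) + [u.x] + in_order(u.right)
```

A rotation changes the shape of the tree but not its in-order sequence, which is why Treaps and Red-Black trees can use it to rebalance without breaking the BST property.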
Tree Invariants
1. Heap property: at every node u, priority of u is less than or equal to that of its children.
2. BST property: at every node u, value at u is greater than all in left subtree, less than all in right subtree.
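Both invariants can be expressed as small checker functions, which can be handy in tests. The sketch assumes an array-based min-heap that allows equal priorities and a BST with distinct values; the names `is_min_heap`, `is_bst`, and `CheckNode` are hypothetical.

```python
class CheckNode:
    def __init__(self, x, left=None, right=None):
        self.x = x
        self.left = left
        self.right = right


def is_min_heap(a):
    """Array-based heap: the parent of index i is (i - 1) // 2."""
    return all(a[(i - 1) // 2] <= a[i] for i in range(1, len(a)))


def is_bst(u, lo=None, hi=None):
    """Every value in the subtree at u must lie strictly between lo and hi."""
    if u is None:
        return True
    if lo is not None and u.x <= lo:
        return False
    if hi is not None and u.x >= hi:
        return False
    # Values to the left are bounded above by u.x, values to the right below.
    return is_bst(u.left, lo, u.x) and is_bst(u.right, u.x, hi)
```

Note that the BST check passes bounds down from all ancestors; comparing each node only with its own children would accept some invalid trees.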