Data Structures review outline
For each topic, the key questions are: what? and how?
Readings: See latest info page (Sakai home page) for specific chapter/section links to ODS textbook, notes, Sedgewick slides.
Interfaces and their supporting implementations
- Stack: add(x)/remove() or push(x)/pop() (LIFO)
  - ArrayStack or ArrayDeque: amortized worst case O(1).
  - Singly or doubly linked lists: worst case O(1).
- Queue: add(x)/remove() or enqueue(x)/dequeue() (FIFO)
  - ArrayDeque: amortized worst case O(1).
  - Singly or doubly linked lists: worst case O(1).
- Deque: addFirst(x)/removeFirst(), addLast(x)/removeLast()
  - ArrayDeque: amortized worst case O(1).
  - Doubly linked lists: worst case O(1).
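
The three interfaces above differ only in which end items leave from. A minimal sketch, assuming Python's collections.deque as a stand-in for ArrayDeque (the course implementations differ internally, but the operation semantics are the same):

```python
from collections import deque

# Stack (LIFO): push and pop happen at the same end.
stack = deque()
for x in (1, 2, 3):
    stack.append(x)                                 # push(x)
stack_order = [stack.pop() for _ in range(3)]       # pop(): newest first

# Queue (FIFO): add at the back, remove from the front.
queue = deque()
for x in (1, 2, 3):
    queue.append(x)                                 # enqueue(x)
queue_order = [queue.popleft() for _ in range(3)]   # dequeue(): oldest first

# Deque: add and remove at both ends.
d = deque()
d.appendleft(2)                                     # addFirst(2)
d.append(3)                                         # addLast(3)
d.appendleft(1)                                     # addFirst(1)
first, last = d.popleft(), d.pop()                  # removeFirst(), removeLast()
```

Here stack_order comes out as [3, 2, 1] and queue_order as [1, 2, 3].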
- Priority Queue: add(x,p)/extractMin() ... decreaseKey(x,new_p)
  - Binary Heap: worst case O(log(n)) per op.
  - Meldable Heap: expected case O(log(n)) per op.
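
A minimal sketch of the Binary Heap option, assuming an array-backed min-heap that stores bare priorities (the class and method names here are illustrative, not the course's code, and decreaseKey is omitted):

```python
class BinaryHeap:
    """Min-heap in a list; the children of index i are 2i+1 and 2i+2."""

    def __init__(self):
        self.a = []

    def add(self, p):
        # Place the new priority at the end, then restore the heap property.
        self.a.append(p)
        self._bubble_up(len(self.a) - 1)

    def extract_min(self):
        # The minimum is at the root; move the last item there and fix downward.
        a = self.a
        top = a[0]
        last = a.pop()
        if a:
            a[0] = last
            self._trickle_down(0)
        return top

    def _bubble_up(self, i):
        a = self.a
        while i > 0:
            parent = (i - 1) // 2
            if a[parent] <= a[i]:
                break
            a[parent], a[i] = a[i], a[parent]
            i = parent

    def _trickle_down(self, i):
        a, n = self.a, len(self.a)
        while True:
            smallest = i
            for c in (2 * i + 1, 2 * i + 2):
                if c < n and a[c] < a[smallest]:
                    smallest = c
            if smallest == i:
                break
            a[i], a[smallest] = a[smallest], a[i]
            i = smallest
```

Both helper loops walk a single root-to-leaf path, which is where the O(log(n)) bound per operation comes from.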
- USet: add(x)/remove(x)/find(x). find is exact or fail.
  - Hash table (chained or with linear probing)
    - Careful choice of hash function is critical to performance.
    - O(1) expected performance per op with randomized hash function.
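
A minimal sketch of a chained hash table with a randomized hash function. The class name, the prime P, and the specific hash family ((a*h + b) mod P) mod m are assumptions chosen for illustration, not necessarily the family used in the course:

```python
import random

class ChainedHashSet:
    P = (1 << 61) - 1  # a large prime (assumed choice for this sketch)

    def __init__(self, buckets=8):
        self.t = [[] for _ in range(buckets)]
        self.n = 0
        # Random multiplier and offset, fixed for the life of the table.
        self.a = random.randrange(1, self.P)
        self.b = random.randrange(0, self.P)

    def _index(self, x):
        return ((self.a * hash(x) + self.b) % self.P) % len(self.t)

    def find(self, x):
        # Exact match or fail (None).
        return x if x in self.t[self._index(x)] else None

    def add(self, x):
        if self.find(x) is not None:
            return False
        if self.n + 1 > len(self.t):
            self._resize()
        self.t[self._index(x)].append(x)
        self.n += 1
        return True

    def remove(self, x):
        chain = self.t[self._index(x)]
        if x in chain:
            chain.remove(x)
            self.n -= 1
            return True
        return False

    def _resize(self):
        # Double the number of buckets and rehash every item.
        items = [x for chain in self.t for x in chain]
        self.t = [[] for _ in range(2 * len(self.t))]
        for x in items:
            self.t[self._index(x)].append(x)
```

Keeping the number of buckets at least n keeps the expected chain length constant, which is what gives O(1) expected time per operation.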
- SSet: add(x)/remove(x)/find(x). find is exact or next larger.
  Also browsable: first(), next(), prev(), last().
  - Binary Search Tree
    - Plain (unbalanced) search tree performs well, O(log(n)) per op, on average, but is O(n) per op in the worst case (e.g. keys added in sorted order).
    - Treap has expected performance O(log(n)) per op.
  - 2-3-4 tree has worst case performance O(log(n)) per op.
    - Red-Black tree is best implementation of 2-3-4 idea.
  - Use B-tree when disk storage will be involved (b-1 to 2b-1 keys per node).
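
To make the "exact or next larger" rule for find concrete, here is a minimal sketch on a plain, unbalanced binary search tree. The Node class and the function names bst_add and bst_find are hypothetical, chosen for this example:

```python
class Node:
    def __init__(self, x):
        self.x = x
        self.left = None
        self.right = None

def bst_add(root, x):
    """Insert x into an unbalanced BST and return the root; duplicates are ignored."""
    if root is None:
        return Node(x)
    u = root
    while True:
        if x < u.x:
            if u.left is None:
                u.left = Node(x)
                break
            u = u.left
        elif x > u.x:
            if u.right is None:
                u.right = Node(x)
                break
            u = u.right
        else:
            break
    return root

def bst_find(root, x):
    """Return the smallest stored value >= x, or None if there is none."""
    u, best = root, None
    while u is not None:
        if x < u.x:
            best = u.x      # u.x is a candidate; look for a smaller one on the left
            u = u.left
        elif x > u.x:
            u = u.right
        else:
            return u.x      # exact match
    return best
```

The search remembers the last node where it turned left, since that node holds the smallest value seen so far that is larger than x.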
- List: add(i,x), remove(i), get(i), set(i,x).
  - O(log(n)) per op (worst case or expected, matching the underlying structure): can add List ops to SSet implementations (e.g. our Treap exercise for get(i)).
  - Worst case is a mix of great and horrible: ArrayDeque is good for get and set, O(1), but bad for add and remove, O(n) when i is near the middle.
- Sorting
  - InsertionSort: incremental, not scalable, O(n^2), but fast on very small arrays.
  - Divide and conquer approaches
    - MergeSort: split in half, recursively sort halves, merge. Worst case O(n*log(n)), not "in place".
    - QuickSort: partition on a random "pivot", recursively sort smalls and bigs. Expected case O(n*log(n)).
  - HeapSort: (1) build a binary heap, (2) extractMin, one by one. In place, O(n*log(n)) worst case.
  - Of the above: best in practice is QuickSort.
  - IntrospectiveSort: QuickSort, with InsertionSort under a size threshold and HeapSort over a depth-of-recursion threshold. In place, worst case O(n*log(n)), practical performance of QuickSort.
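
A minimal sketch of QuickSort with a random pivot. The Lomuto-style partition used here is an assumption made for brevity; the course's partition routine may be organized differently:

```python
import random

def partition(a, lo, hi):
    """Partition a[lo..hi] around a random pivot; return the pivot's final index."""
    r = random.randint(lo, hi)
    a[r], a[hi] = a[hi], a[r]       # park the pivot at the end
    pivot = a[hi]
    i = lo                          # a[lo..i-1] holds items smaller than the pivot
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]       # move the pivot between smalls and bigs
    return i

def quick_sort(a, lo=0, hi=None):
    """Sort the list a in place."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    p = partition(a, lo, hi)
    quick_sort(a, lo, p - 1)        # recursively sort the smalls
    quick_sort(a, p + 1, hi)        # recursively sort the bigs
```

An IntrospectiveSort would wrap this same recursion with a depth counter and switch to HeapSort when the counter runs out.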
- Graph: adjacency matrix or array of adjacency lists.
  - Terminology: vertex, edge, path, directed or undirected, weighted or not, connected.
    Graph G has V vertices, E edges.
  - SSSP: Single Source Shortest Path problem.
    - Breadth First Search: good for SSSP on unweighted graph, uses queue. Worst case time O(E) on connected graph.
    - Dijkstra's algorithm: good for SSSP on positively weighted graph, uses priority queue. Worst case time O(E*log(V)) on connected graph.
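
A minimal sketch of both SSSP methods on adjacency lists, assuming the graph is a dict mapping each vertex to its neighbor list. Note that this Dijkstra variant pushes duplicate entries and skips stale ones rather than calling decreaseKey, a common substitution when the priority queue (here Python's heapq) has no decreaseKey operation:

```python
from collections import deque
import heapq

def bfs_distances(adj, s):
    """Unweighted SSSP: adj[u] is a list of neighbors of u."""
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:           # first visit is along a shortest path
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def dijkstra(adj, s):
    """Weighted SSSP: adj[u] is a list of (v, w) pairs with w > 0."""
    dist = {s: 0}
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue                    # stale entry; a shorter path was found
        for v, w in adj[u]:
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist
```

Both functions return a dict of distances for the vertices reachable from s.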
- Internal tools used by various data structures.
  - resize(), for dynamic arrays. Double when full, shrink to half when 1/3 full.
  - rotate (left and right)
  - bubble up, trickle down
  - insert
  - partition
  - merge
  - borrow
  - split
  - color flip
  - ...
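
A minimal sketch of the resize() rule from the list above on an array-backed stack. It reads "shrink to half when 1/3 full" as "when the array is at most 1/3 full, reallocate so that it is half full"; that reading is an assumption, as are the class and attribute names:

```python
class ArrayStack:
    def __init__(self):
        self.a = [None]     # backing array, capacity 1
        self.n = 0          # number of items stored

    def _resize(self):
        # New capacity 2n (at least 1), so the array ends up half full.
        b = [None] * max(2 * self.n, 1)
        b[:self.n] = self.a[:self.n]
        self.a = b

    def push(self, x):
        if self.n == len(self.a):       # full: capacity doubles
            self._resize()
        self.a[self.n] = x
        self.n += 1

    def pop(self):
        if self.n == 0:
            raise IndexError("pop from empty stack")
        self.n -= 1
        x = self.a[self.n]
        self.a[self.n] = None
        if len(self.a) >= 3 * self.n:   # at most 1/3 full: shrink
            self._resize()
        return x
```

Because a resize is always followed by many cheap operations before the next one, the copying cost averages out to O(1) amortized per push or pop.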
- Tree Invariants
  - Heap property: at every node u, priority of u is no greater than that of its children.
  - BST property: at every node u, value at u is greater than all in left subtree, less than all in right subtree.
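
Both invariants can be checked mechanically. A minimal sketch, assuming the heap is stored in an array (children of index i at 2i+1 and 2i+2) and a BST node is a (left, value, right) tuple or None; these representations are assumptions made for the example, and the heap check uses <= so that equal priorities are allowed:

```python
def is_min_heap(a):
    """True if every non-root entry is at least as large as its parent."""
    return all(a[(i - 1) // 2] <= a[i] for i in range(1, len(a)))

def is_bst(u, lo=None, hi=None):
    """True if every value in the tree u lies strictly between lo and hi,
    and the same holds recursively with the bounds tightened at each node."""
    if u is None:
        return True
    left, x, right = u
    if lo is not None and x <= lo:
        return False
    if hi is not None and x >= hi:
        return False
    return is_bst(left, lo, x) and is_bst(right, x, hi)
```

Note that is_bst passes bounds down from the ancestors: comparing each node only with its own children would miss a value that is in the wrong subtree of a more distant ancestor.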