heapSort, mergeSort, quickSort, insertionSort, introspectiveSort

On deck: insertionSort, and the grand synthesis, introspectiveSort.

heapSort summary: You can sort with heap operations.

phase 1: unsorted to heap. This can be done in O(n) time with the bottom up strategy.
phase 2: heap to sorted. This can be done in O(n log(n)) time. It is essentially just n heap remove() operations.
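
The two phases can be sketched as follows. This is a minimal illustration, not the version in Sorters.h: the names siftDown and heapSortSketch, and the max-heap layout with the children of index i at 2i+1 and 2i+2, are assumptions made for the example.

```cpp
#include <cassert>
#include <utility>  // std::swap

// Hypothetical helper: restore the max-heap property for the subtree rooted
// at index i, looking only at a[0..n-1].
template <typename T>
void siftDown(T *a, int n, int i) {
    while (true) {
        int largest = i;
        int left = 2 * i + 1, right = 2 * i + 2;
        if (left < n && a[largest] < a[left]) largest = left;
        if (right < n && a[largest] < a[right]) largest = right;
        if (largest == i) return;  // heap property holds here
        std::swap(a[i], a[largest]);
        i = largest;
    }
}

template <typename T>
void heapSortSketch(T *a, int n) {
    assert(n >= 0);
    // phase 1: unsorted to heap, bottom up. O(n) total.
    for (int i = n / 2 - 1; i >= 0; --i)
        siftDown(a, n, i);
    // phase 2: heap to sorted. Each step is a heap remove(): move the max to
    // the end of the shrinking heap, then repair the root. O(n log(n)) total.
    for (int m = n - 1; m > 0; --m) {
        std::swap(a[0], a[m]);
        siftDown(a, m, 0);
    }
}
```

Using a max-heap lets the sorted output grow from the back of the same array, which is why no extra array is needed.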

mergeSort summary: Divide and conquer.

Sort halves recursively.
Merge the halves.

Let T(n) be the worst case time for mergeSort. Then T(n) = 2T(n/2) + O(n), which implies T(n) is O(n log(n)).
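
One way to see why the recurrence gives O(n log(n)) is to unroll it. The sketch below assumes, purely for illustration, that n is a power of 2 and that the merge step costs at most cn for some constant c.

```latex
\begin{align*}
T(n) &\le 2\,T(n/2) + cn \\
     &\le 4\,T(n/4) + 2cn \\
     &\le 8\,T(n/8) + 3cn \\
     &\;\;\vdots \\
     &\le 2^k\,T(n/2^k) + k\,cn \\
     &= n\,T(1) + cn\log_2 n \qquad (k = \log_2 n) \\
     &= O(n \log n).
\end{align*}
```

Each unrolling step doubles the number of subproblems and halves their size, so every level contributes the same cn of merging work, and there are log₂ n levels.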

quickSort summary: Divide and conquer.

Partition based on a random pivot, hoping for a smalls set and a bigs set of about equal size.
Sort the smalls and sort the bigs.

Choosing a pivot is like choosing a root for a BST.

Thus the sequence of pivots is like construction of a random BST.

Expected height of random BST = expected depth of recursion in quickSort.

Total cost at each level of recursion is at most O(n).

Thus quickSort has expected runtime O(n log(n)).
[ log(n) levels times n work at each level. ]

insertionSort: not in ODS, but all the sorting algorithms as shown here are in Sorters.h.

template< typename T >
inline int insert(T *a, int i) {
  // Assume a[0..i-1] is sorted. Insert a[i] so that a[0..i] is sorted.
  // Just in case it is of interest, return j such that a[i] ended up at a[j].
  // (Defined first so that it is declared before insertionSort calls it.)

  int j = i;
  while (j > 0 and a[j] < a[j-1]) {
    swap(a[j], a[j-1]);
    j--;
  }
  return j;
}

template< typename T >
void insertionSort(T *a, int n) {
  // Sort the array a[0..n-1].
  // Operator< must be defined for type T (i.e. "a < b" is valid for T a, b).

  for (int i = 1; i < n; ++i)
    insert(a, i);
}

insert runs in O(i) time.

insertionSort calls insert for i in 1..n-1. Worst case cost is

  ∑_{i=1}^{n-1} O(i) = O(n²).

...But it is very fast for very small n.
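
A quick way to see both the O(n²) worst case and the cheap best case is to count swaps. The sketch below is a self-contained variant of the code above; the name countingInsertionSort and the swap counter are additions for illustration only. On a reversed array of length n every pair is out of order, so it performs n(n-1)/2 swaps; on an already sorted array it performs none.

```cpp
#include <cassert>
#include <utility>  // std::swap

// Hypothetical instrumented variant of insertionSort: sorts a[0..n-1] and
// returns the number of swaps performed.
template <typename T>
long countingInsertionSort(T *a, int n) {
    assert(n >= 0);
    long swaps = 0;
    for (int i = 1; i < n; ++i) {
        // a[0..i-1] is sorted; slide a[i] left until it is in place.
        for (int j = i; j > 0 && a[j] < a[j - 1]; --j) {
            std::swap(a[j], a[j - 1]);
            ++swaps;
        }
    }
    return swaps;
}
```

For example, a reversed array of 6 items costs 6*5/2 = 15 swaps, while a sorted array of any length costs 0.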

mergeSort is described in ODS Chapter 11.1.1.
It is implemented in udods/Algorithms.h.

Illustration of mergeSort: detail from example in text.

  13, 8, 5, 2, 4, 0, 6    // a0  from text
  /  /  /    \  \  \  \   
 13  8  5                 // a00
 /    \  \
13                        // a000 - done
       8  5               // a001
      /    \             
     8                    // a0010 - done
            5             // a0011 - done
       5  8               // a001 - merge of a0010 and a0011
  5  8 13                 // a00  - merge of a000 and a001
            2, 4, 0, 6    // a01
           /  /    \  \
          2  4            // a010
         /    \
         2                // a0100 - done         
               4          // a0101 - done         
          2  4            // a010 - merge of a0100 and a0101
                    0, 6  // a011
                   /    \
                  0       // a0110 - done         
                        6 // a0111 - done         
                    0, 6  // a011 - merge of a0110 and a0111
            0, 2, 4, 6    // a01 - merge of a010 and a011
  0, 2, 4, 5, 6, 8, 13    // a0 - merge of a00 and a01
  0, 2, 4, 5, 6, 8, 13    // a is merge of this and 1,3,7,9,10,11,12

Illustration of mergeSort with added base case for n == 2.

  13, 8, 5, 2, 4, 0, 6    // a0  from text
  /  /  /    \  \  \  \   
 13  8  5                 // a00
 /    \  \
13                        // a000 - done
       8  5               // a001
       5  8               // a001 - done (with swap)
  5  8 13                 // a00  - merge of a000 and a001
            2, 4, 0, 6    // a01
           /  /    \  \
          2  4            // a010
          2  4            // a010 - done with no swap
                    0, 6  // a011
                    0, 6  // a011 - done with no swap
            0, 2, 4, 6    // a01 - merge of a010 and a011
  0, 2, 4, 5, 6, 8, 13    // a0 - merge of a00 and a01
  0, 2, 4, 5, 6, 8, 13    // a is merge of this and 1,3,7,9,10,11,12
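
The added base case can be sketched in code as follows. This is an illustration only, not one of the versions in Sorters.h: the name mergeSortBase2 is hypothetical, std::vector replaces the ODS array class, and std::merge stands in for the merge function.

```cpp
#include <algorithm>  // std::merge
#include <cassert>
#include <utility>    // std::swap
#include <vector>

template <typename T>
void mergeSortBase2(std::vector<T> &a) {
    if (a.size() < 2) return;            // 0 or 1 items: done
    if (a.size() == 2) {                 // added base case: at most one swap
        if (a[1] < a[0]) std::swap(a[0], a[1]);
        return;
    }
    // Split as in the illustration: a0 gets the first n/2 items (rounded
    // down), a1 gets the rest.
    std::vector<T> a0(a.begin(), a.begin() + a.size() / 2);
    std::vector<T> a1(a.begin() + a.size() / 2, a.end());
    mergeSortBase2(a0);
    mergeSortBase2(a1);
    // Merge the two sorted halves back into a.
    std::merge(a0.begin(), a0.end(), a1.begin(), a1.end(), a.begin());
}
```

On the 7 items from the illustration, the pairs (8, 5), (2, 4) and (0, 6) are each handled by the base case without any further splitting or merging.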

Alternate merge function:

template<class T>
void merge(array<T> &a0, array<T> &a1, array<T> &a) {
    int i = 0, i0 = 0, i1 = 0;
l1: while (i0 < a0.length and i1 < a1.length) 
        if (compare(a0[i0], a1[i1]) < 0)
            a[i++] = a0[i0++];
        else
            a[i++] = a1[i1++];
l2: while (i0 < a0.length)
        a[i++] = a0[i0++];
l3: while (i1 < a1.length)
        a[i++] = a1[i1++];
}

Notes:

At most one of l2 and l3 will copy anything, because l1 ends only when one of a0, a1 is exhausted.
Typically l2 or l3 will copy only a small number of items.
A block copy (memcpy) could be used in place of loops l2 and l3.
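
The block copy idea can be sketched with std::copy, which standard library implementations commonly turn into a memmove for simple element types. This is an illustration under assumptions: the name mergeBlockCopy is hypothetical, and it uses std::vector and operator< rather than the ODS array class and compare().

```cpp
#include <algorithm>  // std::copy
#include <cassert>
#include <cstddef>    // std::size_t
#include <vector>

// Merge sorted a0 and sorted a1 into a, which must already have size
// a0.size() + a1.size().
template <typename T>
void mergeBlockCopy(const std::vector<T> &a0, const std::vector<T> &a1,
                    std::vector<T> &a) {
    assert(a.size() == a0.size() + a1.size());
    std::size_t i = 0, i0 = 0, i1 = 0;
    // l1: interleave while both inputs have items left.
    while (i0 < a0.size() && i1 < a1.size()) {
        if (a0[i0] < a1[i1]) a[i++] = a0[i0++];
        else                 a[i++] = a1[i1++];
    }
    // l2 and l3 as block copies; at most one of them copies anything.
    std::copy(a0.begin() + i0, a0.end(), a.begin() + i);
    i += a0.size() - i0;
    std::copy(a1.begin() + i1, a1.end(), a.begin() + i);
}
```

Whether the block copy is measurably faster depends on how many items are left over, which the notes above suggest is usually small.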

Speeding up mergeSort: See 3 versions of mergeSort in Sorters.h. Run demo in merge-insertSort.cpp.

quickSort is described in ODS Chapter 11.1.2.
It is implemented in udods/Algorithms.h. (But note we will do it a bit differently.)


template <typename T>
int partition(T *a, int n);

template <typename T>
void quickSort(T *a, int n) { // sort a[0..n-1] into increasing order.
    if (n < 2) return;

    // place random entry (call it "pivot") in first position
    swap(a[0], a[rand()%n]);
    // partition based on that entry
    int i = partition(a, n);
    // now a[0..i-1] are <= pivot and a[i..n-1] are >= pivot

    quickSort(a, i); // sort the i smaller elts.
    quickSort(a+i, n-i); // sort the n-i larger elts.
}

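template <typename T>
int partition(T *a, int n){
    int i = 0, j = n-1;
    // pivot is a[i]
    while (i < j) {
      while (compare(a[i],a[j]) < 0) --j; // stops at j == i at the latest
      if (i == j) break; // nothing left to the right that is < pivot
      swap( a[i++], a[j] ); // now pivot is a[j]
      while (compare(a[i],a[j]) < 0) ++i; // stops at i == j at the latest
      if (i == j) break; // nothing left to the left that is > pivot
      swap( a[i], a[j--] ); // now pivot is back to a[i]
    }
    return i; // location of pivot.
}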

Illustration of partition, a from text, n = 14

  13, 8, 5, 2, 4, 0, 6, 9, 7, 3,12, 1,10,11    
   *
   9, 8, 5, 2, 4, 0, 6,13, 7, 3,12, 1,10,11    // random swap

partition(a,14)

   -                                *  +  +
   1, 8, 5, 2, 4, 0, 6,13, 7, 3,12, 9,10,11    // 1st swap
   -  -  -  -  -  -  -  *           +  +  +
   1, 8, 5, 2, 4, 0, 6, 9, 7, 3,12,13,10,11    // 2nd swap
   -  -  -  -  -  -  -  -     *  +  +  +  +
   1, 8, 5, 2, 4, 0, 6, 3, 7, 9,12,13,10,11    // 3rd swap
   -  -  -  -  -  -  -  -  -  *  +  +  +  +
   1, 8, 5, 2, 4, 0, 6, 3, 7, 9,12,13,10,11    // return i = 9
  /  /  /  /  /  /  /  /  /   \  \  \  \  \
  1, 8, 5, 2, 4, 0, 6, 3, 7,   9,12,13,10,11

quickSort first part

   1, 8, 5, 2, 4, 0, 6, 3, 7                 
   *
   4, 8, 5, 2, 1, 0, 6, 3, 7                  // random swap

partition(a,9)

   -                    *  +
   3, 8, 5, 2, 1, 0, 6, 4, 7                  // 1st swap
   -  *                 +  +
   3, 4, 5, 2, 1, 0, 6, 8, 7                  // 2nd swap
   -  -           *  +  +  +
   3, 0, 5, 2, 1, 4, 6, 8, 7                  // 3rd swap
   -  -  *        +  +  +  +
   3, 0, 4, 2, 1, 5, 6, 8, 7                  // 4th swap
   -  -  -     *  +  +  +  +
   3, 0, 1, 2, 4, 5, 6, 8, 7                  // 5th swap
   -  -  -  -  *  +  +  +  +
   3, 0, 1, 2, 4, 5, 6, 8, 7                  // return i = 4
  /  /  /  /    \  \  \  \  \
 3, 0, 1, 2,     4, 5, 6, 8, 7

quickSort is a randomized algorithm (the choice of pivot for partitioning is random). It runs in expected time O(n log(n)). quickSort can be effectively combined with insertionSort: while disastrous on large arrays, insertionSort is the fastest choice for small arrays. The strategy of quickSortT is to adjust quickSort to take advantage of this.

template <typename T>
void quickSortT(T *a, int n) { // sort a[0..n-1] into increasing order.
    if (n < THRESHOLD) { 
        insertionSort(a, n); 
        return; 
    }

    // place random entry in first position
    swap(a[0], a[rand()%n]);
    // partition based on that entry
    int i = partition(a, n);

    quickSortT(a, i); // sort the smaller elts.
    quickSortT(a+i, n-i); // sort the larger elts.
}

Like quickSort, quickSortT runs in O(n log(n)) expected time, but is sped up by a constant factor by use of insertionSort on small segments of the array.

Speeding up quickSort: See 3 versions of quickSort in Sorters.h. Run demo in quick-insertSort.cpp.

Clicker questions.

Which divide and conquer sorting algorithm begins with two recursive calls and follows up by combining the two sorted bits appropriately?
1. heapSort
2. insertionSort
3. mergeSort
4. quickSort

Which divide and conquer sorting algorithm ends with two recursive calls after first dividing up the array into two appropriate bits?
1. heapSort
2. insertionSort
3. mergeSort
4. quickSort

Which two sorting algorithms can be combined effectively, even though one of them runs in the horrible time O(n^2)?
1. heapSort and insertionSort
2. insertionSort and mergeSort
3. mergeSort and quickSort
4. quickSort and insertionSort

Which sorting algorithm is not "in place", that is, requires allocation of additional array(s) for its operation?
1. heapSort
2. insertionSort
3. mergeSort
4. quickSort

introspectiveSort (history: it is a lot newer than the others) is in Sorters.h.

It is like quickSort, but uses insertionSort for the smaller parts and reverts to heapSort if the recursion gets too deep.

...so it is like quickSortT, with reverting to heapSort added.

Result: An in-place sorting algorithm with worst case runtime O(n log(n)) and the expected case runtime of quickSortT: not just O(n log(n)) expected time, but the actual fast time of quickSortT.
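
A minimal sketch of the idea, not the Sorters.h version. Everything specific here is an assumption made for illustration: the names introSortSketch, introSortRec, insertionSortSketch and partitionSketch, the cutoff SMALL = 16, the depth limit of 2*floor(log2(n)), a simple one-pass partition in place of the partition function in these notes, and std::make_heap/std::sort_heap standing in for heapSort.

```cpp
#include <algorithm>  // std::make_heap, std::sort_heap
#include <cassert>
#include <cstdlib>    // std::rand
#include <utility>    // std::swap

const int SMALL = 16;  // assumed cutoff below which insertionSort is used

template <typename T>
void insertionSortSketch(T *a, int n) {
    for (int i = 1; i < n; ++i)
        for (int j = i; j > 0 && a[j] < a[j - 1]; --j)
            std::swap(a[j], a[j - 1]);
}

// One-pass partition around the pivot a[0]. Returns the pivot's final index;
// everything before it is < pivot, everything after it is >= pivot.
template <typename T>
int partitionSketch(T *a, int n) {
    int last = 0;  // invariant: a[1..last] < pivot
    for (int k = 1; k < n; ++k)
        if (a[k] < a[0]) std::swap(a[++last], a[k]);
    std::swap(a[0], a[last]);
    return last;
}

template <typename T>
void introSortRec(T *a, int n, int depthLeft) {
    if (n < SMALL) { insertionSortSketch(a, n); return; }
    if (depthLeft == 0) {  // recursion too deep: revert to heapSort
        std::make_heap(a, a + n);
        std::sort_heap(a, a + n);
        return;
    }
    std::swap(a[0], a[std::rand() % n]);  // random pivot to first position
    int i = partitionSketch(a, n);
    introSortRec(a, i, depthLeft - 1);                  // smaller elts.
    introSortRec(a + i + 1, n - i - 1, depthLeft - 1);  // larger elts.
}

template <typename T>
void introSortSketch(T *a, int n) {
    assert(n >= 0);
    int depthLimit = 0;
    for (int m = n; m > 1; m /= 2) depthLimit += 2;  // 2*floor(log2(n))
    introSortRec(a, n, depthLimit);
}
```

The depth limit is what bounds the worst case: the quickSort levels above the limit cost O(n) each, and any segment that hits the limit is finished by an O(n log(n)) heap sort. An array of all equal items is a simple input that drives this particular partition to the limit.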

Sorter	in place	asymptotics	pragmatics
insertionSort	yes	awful: O(n²)	fast for small n
mergeSort	no	worst case O(n log(n))	near best, has special uses
quickSort	yes	expected case O(n log(n))	best in practice, give or take
heapSort	yes	worst case O(n log(n))	factor of two or three worse than best
introspectiveSort	yes	worst case O(n log(n))	best in practice, give or take