Sorting lists of items into order is an important problem in computer science. What most people don’t realise is that there are many possible ways to do this.

## Big O Notation

But before we look at individual algorithms, I need to explain big O notation. Big O notation is a measure of the efficiency or power of a particular algorithm. If an algorithm is *O*(*n*) then increasing the number of items to be sorted from 10 to 100 also increases the time taken by the same factor of ten. But if an algorithm is *O*(log(*n*)) then increasing the number of items to be sorted from 10 to 100 only doubles the time taken, as log(10) is 1 and log(100) is 2.

## Bubble Sort

The bubble sort is the beginners sorting algorithm. It works by comparing each consecutive pair of items in the list and swapping them if they are the wrong way around.

Imagine that we are sorting the following six items: 536, 362, 653, 937, 351, 331.

The first operation compares 536 and 362 and because 362 is less than 536 the two items are swapped. The second operation compares 536 and 653, and because they are already the correct way around no swap takes place.

- 536, 362, 653, 937, 351, 331.
- 362, 536, 653, 351, 331, 937.
- 362, 536, 351, 331, 653, 937.
- 362, 351, 331, 536, 653, 937.
- 351, 331, 362, 536, 653, 937.
- 331, 351, 362, 536, 653, 937.

At this point, after five run-throughs and a total of ten swaps (but *many* more comparisons) the list is ordered.

The bubble sort is, on average, *O*(*n*^{2}). This means that if the number of items increases by a factor of ten from 10 to 100 then the time taken increases by a factor of one hundred. The position of items in the list is particularly important in the bubble sort. Large items at the start of the list (rabbits) quickly move to the end of the list, but small items at the end of the list (turtles) take much longer to move to the correct position.

A modification of the bubble sort, known as the *cocktail shaker sort*, aims to reduce the effect of turtles by alternating direction on each run, running forwards and then backwards. For this reason it is sometimes known as the *bidirectional bubble sort*.

Running the list above through the cocktail shaker sort requires one fewer run than the bubble sort, because 331 acts as a turtle in the previous example:

- 536, 362, 653, 937, 351, 331.
- 362, 536, 653, 351, 331, 937.
- 331, 362, 536, 653, 351, 937.
- 331, 362, 536, 351, 653, 937.
- 331, 351, 362, 536, 653, 937.

## Insertion Sort

When human beings sort items, such as playing cards, they usually perform something close to an insertion sort. In the insertion sort each item is “removed” from the list and inserted into its correct place in a new output list.

Consider the same six numbers as in the previous example: 536, 362, 653, 937, 351, 331. In the first operation 362 is removed from the list and inserted before 536, as it is lower. In the second operation 653 is removed from the list but as it is larger than 536 it remains in place.

- 536.
- 362, 536.
- 362, 536, 653.
- 362, 536, 653, 937.
- 351, 362, 536, 653, 937.
- 331, 351, 362, 536, 653, 937.

Like the bubble sort, the insertion sort is also O(n^{2}), making it (again, like the bubble sort) a fairly ineffective way of sorting a list.

## Bogosort

Bogosort (short for “bogus sort”) is the idiot brother of the sorting algorithm world. Running a bogosort is very simple: take the list, randomise it and check if it’s sorted. If it isn’t, repeat the process until it is.

Bogosort is incredibly ineffective. I ran a six-item test ten thousand times and bogosort only correctly sorted the list thirteen times. Bogosort is *O*(*n*×*n*!) which means that increasing the number of items to be sorted from 10 to 100 increases the time taken by a factor of 2.57×10^{152}. For those not familiar with standard form, 2.57×10^{152} is a very large number, 153 digits long. If you built a computer that could do one trillion bogosorts per second, and set one trillion of those computers working simultaneously on a 100-item list it would still take longer than the age of the Universe to properly sort the list. (Of course it’s possible that bogosort would get it correct on the first try, but there’s a better chance of you winning the lottery every week for the rest of your life.)

A modification of bogosort called *bozosort* randomises two items at a time rather than the entire list. It is *O*(*n*!) which means that it is still incredibly inefficient.

## Quicksort

Quicksort is the first of the algorithms described here that is actually used in real applications. It was invented in 1960 by Sir Tony Hoare.

Quicksort is *O*(*n*×log(*n*)) which means that going from 10 items to 100 items increases the time taken by a factor of twenty, but going from 100 to 1000 items increases the time by a factor of fifteen, and this factor gets smaller and smaller as the number of items in the list gets larger.

Quicksort is a “divide and conquer” algorithm. It works by picking an element (called a *pivot*) and then reordering the list so that numbers less than the pivot come before it and numbers larger than it come after it, a process called *partitioning*. These two lists are then split in two using another pivot and the process is repeated recursively until the list is fully sorted.

For example, consider our example list: 536, 362, 653, 937, 351, 331.

We pick a number from the list to act as a pivot. There are a number of ways to decide on a pivot: choosing a number at random, or the item in the middle position is quickest, whereas using the median value works better but requires more time to calculate. A common compromise is to find the median of the first, middle and last item.

We pick 362 as our pivot, and select the first and last items in the list. The first-most selector moves forward until it finds an item larger than the pivot, and the last-most selector moves backward until it finds an item smaller than the pivot. These numbers are then swapped.

- 536, 362, 653, 937, 351, 331.
- 331, 362, 653, 937, 351, 536.

This process is then repeated, until both selectors have selected the same item, which will always be the pivot itself.

- 331, 362, 653, 937, 351, 536.
- 331, 351, 653, 937, 362, 536.
- 331, 351, 362, 937, 653, 536.

At this point all the numbers smaller than the pivot are on one side, and all the numbers larger than the pivot are on the other side; this means that the pivot is in the correct place in the list and can be ignored from now on. The remaining items on the list are then split into two smaller sub-lists either side of the pivot, and the process is repeated using new pivots.

- 331, 351, [362], 937, 653, 536.
- 331, 351, [362], 536, 653, 937.

At this point the list is sorted, so no new partitioning is required and the process halts.

## Mergesort

Mergesort is another divide and conquer algorithm, and invented by legendary computer scientist John von Neumann in 1945. Like Quicksort it is also *O*(*n×*log(*n*)) but it requires more resources to run.

Mergesort operates by splitting the list of *n* items into *n* sub-lists and then merging these small sub-lists into larger sorted sub-lists and then repeating this process.

It’s easier to explain Mergesort if the number of items in our lists is a power of two, so for this example we will add two new numbers to our example list to make: 536, 362, 653, 937, 351, 331, 696, 908.

In the first step our example list is split into eight sub-lists, each containing one item. Then these are combined into four sub-lists, each containing two items.

- [536], [362], [653], [937], [351], [331], [696], [908].
- [362, 536], [653, 937], [331, 351], [696, 908].

These lists are then merged until only one fully sorted list remains.

- [362, 536, 653, 937], [331, 351, 696, 908].
- [331, 351, 362, 536, 653, 696, 908, 937].

Unlike Quicksort, Mergesort is equally fast whether the items are randomly sorted, nearly in the right order, in reverse order, etc.

## Timsort

Timsort is a hybrid algorithm combining merge and insertion sorts. It was invented in 2002 by Tim Peters for use in the Python programming language.

Timsort begins by looking for small runs (called *miniruns*) of data that is already sorted (either small to large or vice versa). If no runs of a long-enough length (minirun is short for “minimum run”) are found then an insertion step takes place to create them. These miniruns are sorted using an insertion sort as explained above, and then combined with each other using Mergesort, again as explained above. The trick of Timsort is how it selects and creates miniruns, but the complexity of this process is such that it is beyond the scope of this post.

The following explanation using our example list isn’t a perfect explanation of Timsort, but it will give you a good idea of how the process works in actuality.

- 536, 362, 653, 937, 351, 331.
- [536], [362, 653, 937], [351], [331].
- [362, 536, 653, 937], [331, 351].
- [331, 351, 362, 536, 653, 937].

If we mandate a minirun length of three then the process is different.

- 536, 362, 653, 937, 351, 331.

At this point [362, 653, 937] is identified as a minirun, but it’s the only one. Timsort therefore creates another minirun using an insertion sort on 536, 351 and 331.

- [362, 653, 937], [331, 351, 536]
- [331, 351, 362, 536, 653, 937].

Timsort works very well with real life data and is *O*(*n×*log(*n*)) in even the worst of cases and has therefore become very popular: as well as in Python it is also the standard algorithm in the Java and GNU Octave programming languages and on the Android platform.

## Heapsort

Heapsort is another commonly used algorithm, invented by J.W.J. Williams in 1964. It is *O*(*n×*log(*n*)) in even the worst of cases, and competes mainly with Quicksort: Heapsort is usually slower than Quicksort but in a worst-case scenario Quicksort is *O*(*n*^{2}) which compares unfavourably with Heapsort.

Heapsort works by first building a *heap* from the list and then adding the largest item from the heap to the output list. The heap is then reconstructed and the process repeats until a fully sorted list has been produced. A heap is a type of binary tree, where each each item creates two branches holding two new items which each create two new branches and so on. In creating a heap from our longer example list (536, 362, 653, 937, 351, 331, 696, 908) the process would go as follows:

- [536].
- [536], [362.
- [536], [362, 653].

With 362 and 653 being on the branches emanating from 536. At this point, because one of the items is larger than the item at the top of the tree, the heap is reorganised so that it looks like:

- [653], [362, 536].

And we start adding more numbers from our list, rebuilding the heap as we go.

- [653], [362, 536], [937

Because 937 is larger than its parent (362) it would swap places with 362, and then because it also larger than 362’s old parent (653, its new parent) it would also swap places with that so that our list becomes:

- [653], [937, 536], [362.
- [937], [653, 536], [362.

We now add the next numbers, follow the process already outlined as we do.

- [937], [653, 536], [362, 351, 331, 696].

Here 653 is the parent of 362 and 351, and 536 is the parent of 331 and 696, so another swap takes place and the process continues.

- [937], [653, 696], [362, 351, 331, 536].
- [937], [653, 696], [362, 351, 331, 536], [908.
- [937], [653, 696], [908, 351, 331, 536], [362.
- [937], [908, 696], [653, 351, 331, 536], [362.

At this point, every number in the heap is lower than its parent number and the contents of the heap are read into a temporary list: 937, 908, 696, 653, 351, 331, 536, 362. The number at the top of the heap must be the largest number, so it is moved to the end of the final output list and it is replaced by the number at the bottom of the heap so that the heap can be rebuilt, thereby moving the next-largest number to the top.

- [362], [908, 696], [653, 351, 331, 536].
- [908], [362, 696], [653, 351, 331, 536].
- [908], [653, 696], [362, 351, 331, 536].

At this point the heap’s contents are read into another temporary list: 908, 653, 696, 362, 351, 331, 536 and the largest number (908) is added to the second-last position in the final output list and the process repeats until a completely sorted list is created.

- [536], [653, 696], [362, 351, 331.
- [653], [536, 696], [362, 351, 331.
- [696], [536, 653], [362, 351, 331.
- [331], [536, 653], [362, 351.
- [536], [331, 653], [362, 351.
- [653], [331, 536], [362, 351.
- [653], [362, 536], [331, 351.
- [351], [362, 536], [331.
- [362], [351, 536], [331.
- [536], [351, 362], [331.
- [331], [351, 362].
- [362], [351, 331].
- [331], [351.
- [351], [331.
- [331].

And now the list is completely sorted: 331, 351, 362, 536, 653, 908, 937, 969.

## The Best Sorting Algorithm

There is no one best sorting algorithm. The choice of algorithm will depend on the resources available, the type of data and how sorted it already is, and so on.

This post was inspired by the video below that “sonifies” fifteen different sorting algorithms:

The october issue of ACM Communication has an interesting article on robustness, ‘Beyond Efficiency’. Bubble Sort is given as an example of a robust algorithm when comparisons may fail occasionally