By the time a computer scientist gets through university (s)he hears about sorting algorithms like a dozen times. Probably (s)he will also learn that a good sorting algorithm - unlike bubble sort - should run in O(n log n) time. And yet once in a while one still runs into a hard coded bubble sort in production code. When that happens to me, I just silently and modestly raise my eyebrows, while the physiological equivalent of a painful and violent shriek crosses my mind. And I'm not the only one who feels like that! It gets even more annoying when the language itself has an efficient sort implementation.
Anyway... I had some free time on my hands, and I thought it would be nice to see: exactly how bad is bad. I implemented 9 different popular and less popular algorithms for sorting, and tested them on the same dataset. I also added the STL sort implementation as a reference. I used g++ as a compiler (with -O3) on an Ubuntu 11.10 box. The cpu is an AMD Athlon 64 X2 Dual Core Processor 4200+@ 1Ghz. Data reading time is not included, running times were measured using the boost::timer library. You can download the sort implementations here: sort.h The functions are similar in design to the STL sort, but < is hard coded into them. I didn't care much about optimizing the code by hand, so don't expect miracles.
To see what these asymptotic behaviors look like, I repeated the experiments on increasingly larger data sets. Each algorithm had 1 second to finish the task, and 1,000,000 random integers was the largest dataset I feed them with. Here is a nice plot as a summary:
Quick conclusion: O(n2) algorithms exceeded the 1 second limit even before O(n log n) times got measurably higher than 0. Even among those bubble is the worst. After that comes selection sort and insertion sort. Of course this is implementation dependent, since all three are quadratic algorithms. Never the less, bubbles simplicity didn't even make it the fastest in it's own league.
The forth algorithm - patience - is a more complex algorithm, that first searches for a partition of monotone series by a greedy approach, and then performs a multi-way merge sort. The implementation of patience was taken from Wikipedia.
Obviously enough the best performance algorithms are all O(n log n) in average. Among those heap sort has a guaranty for worst case O(n log n), but runs a bit slower than the rest, although I used the O(n) algorithm for heap construction. Comb sort is a very interesting improvement on bubble sort, that also has O(n log n) worst case complexity. Merge sort finished 3rd. No wonder it is the choice of algorithm whenever the data doesn't fit into memory: it's easy to implement as a parallel algorithm, can be efficient with external storage, and performs pretty well even among other asymptotically optimal algorithms. The first and second place is basically a tie. As far as I know the STL implementation that comes with g++ uses quick sort, so it is not really surprising, that those two ended up side by side. In spite of it's quadratic worst case time complexity, quick sort does make up to its name in the average.
I was also thinking about implementing some more complex algorithms - like smooth sort or Shell sort - but eventually I decided to postpone that for now. To be honest, the list of sorting algorithms is so long, that one could keep coding far days, before having all of them ready for use. 🙂
The analysis above is in no way conclusive. I already mentioned that running times may depend on actual implementation, but that can only change the order among algorithms with the same asymptotic behavior. Some algorithms however are able to recognize regions that are already sorted, and take advantage of that feature. Most notably Tim sort, patience sort and smooth sort are such algorithms. Since I used a random data set, this is not visible on the plot above, so on real world data these algorithms may outperform the others.