Sorting Algorithm Visualization

Geoffrey Barrett -
Project Image

The inspiration behind this project originated from the fact that when I started developing this website, I had been interviewing for new job opportunities in software engineering.

Often times during the interview process candidates are asked esoteric algorithms questions, and I thought it would be interesting to visualize the differences between various sorting algorithms that are often asked during coding interviews.

The source code can be found in the VizSort repository of my GitHub account.

The algorithms in the VizSort repository has some additional logic to track the sorting state for visualization purposes. Therefore, it is not suggested that this code is used in any production software. Additionally, the code is written in pure Python, making it highly inefficient in comparison to popular third party sorting algorithms.

Bubble Sort

Fig 1.

The Bubble Sort algorithm performs a "pass" on each element until there are no adjacent elements to swap. The visualization is updated on a per pass basis.

Bubble Sort is a comparison-based sorting algorithm, and is the most intuitive to understand using the heatmap depicted in Figure 1. You can see how the algorithm was given it's name as the largest element "bubbles" towards the top of the array (although in this case it is bubbling towards the bottom of the array). The algorithm repeatedly iterates across the elements and swaps adjacent values until all the elements are sorted.

Despite being the most intuitive, Bubble Sort has quite poor performance with an average time-complexity of O(n2) . For comparison, sorters implemented in the popular third-party packages typically have an average time-complexity of O(n log n).

Merge Sort

Fig 2.

Merge Sort algorithm results visualized on a per-divsion basis (starting from the smallest sub-array and ending at the fully sorted array).

Merge Sort is a divide-and-conquer algorithm. The algorithm visualization depicted in Figure 2 recursively divides the elements (of a given row) into sub-arrays until there is a single element in each sub-array. After the division stage, the sub-arrays will be repeatedly merged, producing sorted sub-arrays until the elements have been fully reconstituted as a single array.

The average time-complexity of this algorithm is O(n log n) .

QuickSort

Fig 3.

QuickSort algorithm results visualized on a per-divsion basis (starting from the smallest sub-array and ending at the fully sorted array).

QuickSort is another divide-and-conquer algorithm. Quicksort is marginally faster than the aforementioned Merge Sort algorithm on wider datasets. However, it maintains a similar average time-complexity of O(n log n).

The method visualized here uses the Hoare Partitioning Scheme. As the name suggests, this algorithm involves partitioning (dividing) the data around a pivot (the middle of the sub-array). This algorithm utilizes two pointers (starting at both sides of the sub-array), and swaps elements so that values smaller than the pivot value are on the left side of the array and larger values on the right side of the pivot.

Similar to the Merge Sort algorithm, the partitioning is performed recursively, and then re-constituted to produce the final sorted array.

This sorting algorithm is the default when using numpy.sort() and pandas.DataFrame.sort_values() (which leverages NumPy under the hood).

Algorithm Comparison

Although the visualizations help illuminate the differences between the aforementioned sorting algorithms, it does not fully convey the performance differences between Merge Sort and QuickSort. As previously stated, the QuickSort algorithm is marignally faster than the Merge Sort algorithm, but you might notice from Figures 2 & 3, that it is not so clear.

Number of Divisions

Fig 4.

Comparison of the number of divsions made across the implemented sorting algorithms.

The divide-and-conquer nature of the Merge Sort and QuickSort algorithms made the visualization a bit more complicated. The decision was made to plot on a per-division basis. Updating from the bottom up as the data is reconstituted into the final sorted state.

With the current seed value (42) and dimensions (12 x 10), QuickSort finished 7.14% faster than Merge Sort (QuickSort: 4.67 divisions ± 1.11, Merge Sort: 5.00 divisions ± 0.00). However, 16.66% (2) of the rows in the QuickSort algorithm performed slower than Merge Sort.

Since the Bubble Sorter algorithm is not a divide-and-conquer algorithm, the number of divisions has been set to 0.

Number of Comparison

Fig 5.

The number of comparisons made across the implemented sorting algorithms.

The number of divisions does not tell the whole story. Another import factor in performance is the number of comparisons made by the algorithms.

As seen in Figure 5, with the current seed value, and dimenions, the QuickSort algorithm makes 90.65% fewer comparisons than the Merge Sort algorithm (QuickSort: 17.83 comparisons ± 1.62, Merge Sort: 34.00 comparisons ± 0.00). Providing further justification for why the aforementioned popular third party packages leverage the QuickSort algorithm as their default method for sorting.

References

  1. Barrett, G. M. (2023, July 24). GeoffBarrett/SortViz: Visualizing Various Sorting Methods. GitHub. https://github.com/GeoffBarrett/SortViz

  2. GeeksforGeeks. (2023, July 12). Quicksort - data structure and algorithm tutorials. GeeksforGeeks. https://www.geeksforgeeks.org/quick-sort

  3. NumFOCUS, Inc. (n.d.). Pandas.dataframe.sort_values. pandas.DataFrame.sort_values - pandas 2.0.3 documentation. https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.sort_values.html

  4. NumPy. (n.d.). Numpy.sort. numpy.sort - NumPy v1.25 Manual. https://numpy.org/doc/stable/reference/generated/numpy.sort.html

  5. Wikimedia Foundation. (2023a, June 26). Quicksort. Wikipedia. https://en.wikipedia.org/wiki/Quicksort

  6. Wikimedia Foundation. (2023b, June 27). Merge sort. Wikipedia. https://en.wikipedia.org/wiki/Merge_sort

  7. Wikimedia Foundation. (2023c, July 8). Bubble sort. Wikipedia. https://en.wikipedia.org/wiki/Bubble_sort