Radix sort is a non-comparison-based sorting algorithm that sorts numbers one digit at a time, in a fixed digit order. It’s particularly useful for sorting large sets of integers or strings, and it can be faster than comparison-based algorithms like quicksort when there are many elements and each key has only a few digits. In this blog post, we’ll explain how radix sort works, provide a Python implementation, and explore why it’s an excellent choice for specific use cases.
What is Radix Sort?
Radix sort is a non-comparison-based sorting algorithm that processes the digits (or characters) of its keys one position at a time. The sorting begins with either the least significant digit (LSD) or the most significant digit (MSD); this post uses the LSD variant, which relies on a stable subroutine, such as counting sort, to sort by one digit in each pass. Radix sort is often used to sort large sets of integers, dates, or strings, where it runs in linear time as long as the number of digits per key is bounded by a constant.
How Radix Sort Works:
- Start by sorting the numbers based on their least significant digit (LSD).
- Move to the next digit (second least significant) and sort the array again.
- Repeat this process for all digits, from least significant to most significant.
- After processing all digits, the array is fully sorted.
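The steps above can be sketched in a few lines. This is a minimal illustration, not the implementation used later in this post: it uses one list ("bucket") per digit instead of counting sort, assumes non-negative integers, and the function name lsd_radix_sort_buckets is made up for this example.

```python
def lsd_radix_sort_buckets(values, base=10):
    """Return a sorted copy of a list of non-negative integers."""
    result = list(values)
    if not result:
        return result

    largest = max(result)
    place = 1  # 1s place first, then 10s, 100s, ...
    while largest // place > 0:
        # One bucket per possible digit; appending keeps arrival order,
        # which is what makes each pass stable.
        buckets = [[] for _ in range(base)]
        for value in result:
            buckets[(value // place) % base].append(value)
        # Read the buckets back in digit order.
        result = [value for bucket in buckets for value in bucket]
        place *= base
    return result


print(lsd_radix_sort_buckets([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```

Buckets are easier to read than counting sort, at the cost of allocating a fresh set of lists on every pass.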
Time Complexity:
- Best Case: O(n * k), where n is the number of elements and k is the number of digits.
- Worst Case: O(n * k)
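Space Complexity:
- O(n + k), where n is the number of elements and k here is the range of the processed digits, that is, the number of possible digit values (10 for decimal digits), not the number of digits.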
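To make the two roles of k more concrete, here is a small sketch that counts how many passes a given maximum value needs in different bases. The helper count_passes and the choice of bases are assumptions for illustration only; the implementation in this post always works in base 10.

```python
def count_passes(max_val, base=10):
    """Number of digit passes needed for a non-negative max_val in the given base."""
    passes = 0
    while max_val > 0:
        max_val //= base
        passes += 1
    return passes


n = 1_000_000           # hypothetical number of elements
largest = 2**32 - 1     # hypothetical largest key (32-bit unsigned)
for base in (10, 256, 65536):
    k = count_passes(largest, base)
    print(f"base {base}: {k} passes, about {n * k} digit reads, {base} counters")
```

A larger base means fewer passes over the data but a bigger counting array, which is the trade-off hidden inside O(n * k) time and O(n + k) space.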
Radix Sort Algorithm in Python
Radix sort typically uses counting sort as its stable sorting subroutine to handle sorting the digits. Here’s a Python implementation of radix sort:
# Helper function: counting sort that orders arr by the digit selected by exp
def counting_sort_by_digit(arr, exp):
    n = len(arr)
    output = [0] * n  # Output array to store sorted numbers
    count = [0] * 10  # There are 10 possible digits (0-9)

    # Store count of occurrences of each digit
    for i in range(n):
        index = (arr[i] // exp) % 10
        count[index] += 1

    # Turn counts into running totals, so count[d] is one past the
    # last output position for digit d
    for i in range(1, 10):
        count[i] += count[i - 1]

    # Build the output array, walking backwards so that numbers with
    # the same digit keep their relative order (stability)
    for i in range(n - 1, -1, -1):
        index = (arr[i] // exp) % 10
        output[count[index] - 1] = arr[i]
        count[index] -= 1

    # Copy the sorted numbers back to the original array
    for i in range(n):
        arr[i] = output[i]


# Main radix sort function (sorts in place; assumes non-negative integers)
def radix_sort(arr):
    if not arr:  # Nothing to sort, and max() would raise on an empty list
        return

    # Find the maximum number to determine the number of digits
    max_val = max(arr)

    # Apply counting sort for each digit, from least significant to most significant
    exp = 1  # Initial exponent (1s place)
    while max_val // exp > 0:
        counting_sort_by_digit(arr, exp)
        exp *= 10  # Move to the next digit place (10s, 100s, etc.)


# Example usage:
data = [170, 45, 75, 90, 802, 24, 2, 66]
radix_sort(data)
print("Sorted array:", data)
Output:
Sorted array: [2, 24, 45, 66, 75, 90, 170, 802]
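One example is not much evidence that a sort is correct, so it can be worth checking the implementation against Python’s built-in sorted() on random input. The sketch below repeats a condensed copy of the two functions so that it runs on its own; the trial count, value range, and fixed seed are arbitrary choices for this example.

```python
import random


def counting_sort_by_digit(arr, exp):
    """Stable in-place sort of arr by the decimal digit at place exp."""
    output = [0] * len(arr)
    count = [0] * 10
    for value in arr:
        count[(value // exp) % 10] += 1
    for d in range(1, 10):
        count[d] += count[d - 1]
    for value in reversed(arr):  # backwards walk keeps the pass stable
        d = (value // exp) % 10
        count[d] -= 1
        output[count[d]] = value
    arr[:] = output


def radix_sort(arr):
    """In-place LSD radix sort for non-negative integers."""
    if not arr:
        return
    max_val = max(arr)
    exp = 1
    while max_val // exp > 0:
        counting_sort_by_digit(arr, exp)
        exp *= 10


random.seed(0)  # fixed seed so the run is repeatable
for _ in range(200):
    data = [random.randrange(10_000) for _ in range(random.randrange(1, 50))]
    expected = sorted(data)
    radix_sort(data)
    assert data == expected, (data, expected)
print("all random trials match sorted()")
```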
Step-by-Step Explanation of Radix Sort
To understand how radix sort works, let’s break down the sorting process for the array [170, 45, 75, 90, 802, 24, 2, 66] step by step.
Step 1: Sort by Least Significant Digit (LSD)
In the first pass, we will sort the numbers based on their least significant digit (1s place). This means that we group the numbers by their last digit, and numbers that share a last digit keep the order they had in the original array.
Original array: [170, 45, 75, 90, 802, 24, 2, 66]
Here’s how the sorting works:
- 170 (0), 45 (5), 75 (5), 90 (0), 802 (2), 24 (4), 2 (2), 66 (6)
- Sort by these digits:
[170, 90, 802, 2, 24, 45, 75, 66]
After sorting by the 1s digit, the array becomes:
Array after sorting by 1s digit: [170, 90, 802, 2, 24, 45, 75, 66]
Step 2: Sort by 10s Digit
Next, we move on to the 10s digit (second least significant digit) and sort the array based on these digits:
- 170 (7), 90 (9), 802 (0), 2 (0), 24 (2), 45 (4), 75 (7), 66 (6)
- Sort by these digits:
[802, 2, 24, 45, 66, 170, 75, 90]
After sorting by the 10s digit, the array becomes:
Array after sorting by 10s digit: [802, 2, 24, 45, 66, 170, 75, 90]
Note that 170 stays ahead of 75: both have a 7 in the 10s place, and the stable sort keeps the order they had after the previous pass.
Step 3: Sort by 100s Digit
Now, we sort by the 100s digit (third least significant digit). Numbers that do not have a 100s digit are treated as having a leading zero (0).
- 802 (8), 2 (0), 24 (0), 45 (0), 66 (0), 170 (1), 75 (0), 90 (0)
- Sort by these digits:
[2, 24, 45, 66, 75, 90, 170, 802]
After sorting by the 100s digit, the array becomes:
Array after sorting by 100s digit: [2, 24, 45, 66, 75, 90, 170, 802]
Final Sorted Array:
After sorting by all digits, we get the fully sorted array:
[2, 24, 45, 66, 75, 90, 170, 802]
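If you want to reproduce this walkthrough yourself, the sketch below records the array after every pass. To keep it short, it uses Python’s built-in sorted(), which is guaranteed to be stable, as a stand-in for counting sort; the function name trace_radix_passes is made up for this example.

```python
def trace_radix_passes(values):
    """Return a list of (place, array_after_pass) for non-negative integers."""
    arr = list(values)
    if not arr:
        return []

    snapshots = []
    largest = max(arr)
    exp = 1
    while largest // exp > 0:
        # Stable sort on a single digit, standing in for counting sort.
        arr = sorted(arr, key=lambda x: (x // exp) % 10)
        snapshots.append((exp, arr))
        exp *= 10
    return snapshots


for place, arr in trace_radix_passes([170, 45, 75, 90, 802, 24, 2, 66]):
    print(f"after sorting by {place}s digit: {arr}")
```

Running it prints one line per pass, which you can compare with the arrays shown in each step.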
What Happens After the Code Runs?
After the radix_sort() function runs, the array is fully sorted. Here’s a breakdown of what happens at each stage of the code:
- Find the Maximum Value:
  - The function first determines the maximum value in the array (max_val = 802). This is important because it tells us how many digits the largest number has, so we know how many passes (or digit-based sorts) we need to perform.
- Counting Sort by Each Digit:
  - The function starts sorting the array based on the least significant digit (1s place). After each pass, it moves to the next digit (10s, 100s, etc.) until all digits are sorted.
  - In each pass, counting sort is used to ensure that the digits are sorted while maintaining the stability of the sort (i.e., preserving the relative order of numbers with the same digits).
- Exponent Multiplies:
  - The variable exp controls which digit we are sorting by (1 for 1s place, 10 for 10s place, 100 for 100s place). After each pass, exp is multiplied by 10 to move to the next higher place value.
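Here’s a small sketch of the two pieces of arithmetic behind these stages: picking out a digit with exp, and turning digit counts into positions. The helper names digit_at, digit_counts, and running_totals are made up for this illustration; in the implementation the same steps happen inline inside counting_sort_by_digit.

```python
def digit_at(value, exp):
    """Return the decimal digit of value at place exp (1, 10, 100, ...)."""
    return (value // exp) % 10


def digit_counts(values, exp):
    """Count how many values have each digit 0-9 at place exp."""
    counts = [0] * 10
    for value in values:
        counts[digit_at(value, exp)] += 1
    return counts


def running_totals(counts):
    """Turn per-digit counts into cumulative totals (end positions)."""
    totals = list(counts)
    for d in range(1, len(totals)):
        totals[d] += totals[d - 1]
    return totals


data = [170, 45, 75, 90, 802, 24, 2, 66]

print([digit_at(802, exp) for exp in (1, 10, 100)])  # [2, 0, 8]

counts = digit_counts(data, 1)
print(counts)                  # [2, 0, 2, 0, 1, 2, 1, 0, 0, 0]
print(running_totals(counts))  # [2, 2, 4, 4, 5, 7, 8, 8, 8, 8]
```

In the running totals, the entry for digit d is one past the last output index reserved for numbers with that digit. For example, the two numbers ending in 0 fill positions 0 and 1, and the two ending in 2 fill positions 2 and 3.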
Is Radix Sort a Stable Sorting Algorithm?
Yes, radix sort is a stable sorting algorithm because it preserves the relative order of equal elements. Stability is guaranteed because counting sort, which is used as the subroutine, is itself stable. This means that numbers with the same digit in one place will maintain their order relative to each other when sorting by another place.
Why Stability is Useful:
Stability is particularly important in radix sort because we sort the numbers multiple times by different digits. If we didn’t use a stable sort for each digit, the relative order of numbers with the same digit could be lost, leading to incorrect results.
For example:
- If you’re sorting dates by day, then by month, then by year, stability ensures that when you sort by month, dates in the same month keep the day order from the previous pass, and when you sort by year, dates in the same year keep their month and day order.
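The date example can be sketched with three stable passes, least significant field first. This is an illustration only: it uses Python’s built-in sorted(), which is guaranteed to be stable, in place of counting sort, and the sample dates and the function name sort_dates_lsd are made up for this example.

```python
def sort_dates_lsd(dates):
    """Sort (year, month, day) tuples with one stable pass per field."""
    result = list(dates)
    result = sorted(result, key=lambda d: d[2])  # pass 1: day
    result = sorted(result, key=lambda d: d[1])  # pass 2: month, day order kept within a month
    result = sorted(result, key=lambda d: d[0])  # pass 3: year, month/day order kept within a year
    return result


dates = [
    (2023, 5, 17),
    (2021, 12, 3),
    (2023, 5, 2),
    (2021, 1, 25),
    (2022, 5, 17),
]

print(sort_dates_lsd(dates))
# [(2021, 1, 25), (2021, 12, 3), (2022, 5, 17), (2023, 5, 2), (2023, 5, 17)]
```

If any of the three passes were unstable, the order set up by the earlier passes could be scrambled and the final result would no longer be guaranteed to be chronological.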
Limitations of Radix Sort
While radix sort is efficient, it has some limitations:
- Limited Use Case: Radix sort works best when the keys are short, that is, when the number of digits or characters per key is small. It’s particularly effective for integers or strings but may not work as well for floating-point numbers or arbitrary data types.
- Requires Extra Space: Radix sort requires extra space for the counting array and the output array, leading to O(n + k) space complexity, where k is the number of possible digit or character values (10 for decimal digits).
- Not a General-Purpose Sort: Radix sort is highly specialized for integers or fixed-length strings. For general-purpose sorting, comparison-based algorithms like quicksort or mergesort might be more appropriate.
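One practical restriction of the implementation in this post is that it assumes non-negative integers; with negative values, the digit arithmetic can leave the array in the wrong order. A possible workaround, sketched below under that assumption, is to sort the negative numbers by magnitude separately and stitch the two halves together. The function names are made up for this example, and the non-negative sort uses simple buckets rather than counting sort to keep the sketch short.

```python
def radix_sort_non_negative(values):
    """Return a sorted copy of non-negative integers (bucket-based LSD radix sort)."""
    result = list(values)
    if not result:
        return result
    largest = max(result)
    place = 1
    while largest // place > 0:
        buckets = [[] for _ in range(10)]
        for value in result:
            buckets[(value // place) % 10].append(value)
        result = [value for bucket in buckets for value in bucket]
        place *= 10
    return result


def radix_sort_signed(values):
    """Return a sorted copy of integers that may include negative numbers."""
    # Sort negatives by magnitude, then reverse: the largest magnitude
    # is the smallest value once the sign is restored.
    magnitudes = radix_sort_non_negative([-v for v in values if v < 0])
    negatives = [-m for m in reversed(magnitudes)]
    non_negatives = radix_sort_non_negative([v for v in values if v >= 0])
    return negatives + non_negatives


print(radix_sort_signed([170, -45, 0, -802, 24, -2]))
# [-802, -45, -2, 0, 24, 170]
```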
Conclusion
Radix sort is a powerful, non-comparison-based sorting algorithm, particularly useful for efficiently sorting integers or strings. It works by sorting the digits or characters of the numbers from the least significant to the most significant digit. Thanks to its stable sorting subroutine, radix sort guarantees that the relative order of equal elements is preserved, making it an ideal choice when sorting by multiple attributes.
Congratulations on reading to the end of this tutorial!
For further reading on sorting algorithms in Python, go to the articles:
- How to do Insertion Sort in Python
- How to Do Bubble Sort in Python
- How to Do Selection Sort in Python
- How to do Merge Sort in Python
- How to Do Bucket Sort in Python
- How to Do Heap Sort in Python
- How to Do Pigeonhole Sort in Python
- How To Do Comb Sort in Python
- How to Do Shell Sort in Python
- How To Do TimSort in Python
Go to the online courses page on Python to learn more about Python for data science and machine learning.
Have fun and happy researching!
Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.