How to Do Radix Sort in Python


Radix sort is an efficient, non-comparison-based sorting algorithm that orders numbers by processing their digits one position at a time. It’s particularly useful for sorting large sets of integers or strings and can be faster than comparison-based algorithms like quicksort when there are many elements and the keys have relatively few digits. In this blog post, we’ll explain how radix sort works, provide a Python implementation, and explore why it’s an excellent choice for specific use cases.


What is Radix Sort?

Radix sort is a non-comparison-based sorting algorithm that processes the digits (or characters) of the keys one position at a time. The sorting begins with either the least significant digit (LSD) or the most significant digit (MSD) and uses a stable subroutine, such as counting sort, to sort by the digit in each pass; this post uses the LSD variant. Radix sort is often used to sort large sets of integers, dates, or strings, where its running time is linear in the number of elements as long as the number of digits per key is bounded by a constant.

How Radix Sort Works:

  1. Start by sorting the numbers based on their least significant digit (LSD).
  2. Move to the next digit (second least significant) and sort the array again.
  3. Repeat this process for all digits, from least significant to most significant.
  4. After processing all digits, the array is fully sorted.
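
The four steps above can be sketched with plain Python lists acting as digit buckets. This is only a minimal illustration, assuming non-negative integers and base 10; the name `radix_sort_buckets` is hypothetical, and the sketch is not the counting-sort implementation given later in this post.

```python
def radix_sort_buckets(nums):
    """Return a sorted copy of a list of non-negative integers (LSD, base 10)."""
    items = list(nums)
    if not items:
        return items

    exp = 1  # place value of the digit used in the current pass
    largest = max(items)
    while largest // exp > 0:
        # One bucket per digit; appending keeps the current order within a bucket
        buckets = [[] for _ in range(10)]
        for x in items:
            buckets[(x // exp) % 10].append(x)
        # Concatenate buckets 0..9 to get the order for the next pass
        items = [x for bucket in buckets for x in bucket]
        exp *= 10
    return items


print(radix_sort_buckets([170, 45, 75, 90, 802, 24, 2, 66]))
```

Appending to a bucket never reorders numbers that share a digit, which is what makes each pass stable.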

Time Complexity:

  • Best Case: O(n * k), where n is the number of elements and k is the number of digits.
  • Worst Case: O(n * k)
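
To make the role of k concrete, the sketch below counts how many digit passes an LSD radix sort needs for a given maximum value and multiplies that by the number of elements. The helper name `count_passes` is an assumption made for this illustration.

```python
def count_passes(max_val, base=10):
    """Number of digit passes needed for a non-negative maximum value."""
    passes = 0
    while max_val > 0:
        passes += 1
        max_val //= base
    return passes


data = [170, 45, 75, 90, 802, 24, 2, 66]
n = len(data)
k = count_passes(max(data))  # 802 has three decimal digits
print(f"n = {n}, k = {k}, element visits per full sort = {n * k}")
```

A larger base reduces k at the cost of more buckets per pass, which is the usual trade-off when tuning radix sort.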

Space Complexity:

  • O(n + b), where n is the number of elements and b is the number of possible digit values (the base, which is 10 for decimal digits).

Radix Sort Algorithm in Python

Radix sort typically uses counting sort as its stable sorting subroutine to handle sorting the digits. Here’s a Python implementation of radix sort:

# Helper function: Counting sort to sort array by the digit represented by exp
def counting_sort_by_digit(arr, exp):
    n = len(arr)
    output = [0] * n  # Output array to store sorted numbers
    count = [0] * 10  # There are 10 possible digits (0-9)

    # Store count of occurrences of each digit
    for i in range(n):
        index = (arr[i] // exp) % 10
        count[index] += 1

    # Modify count to store actual positions of digits in output array
    for i in range(1, 10):
        count[i] += count[i - 1]

    # Build the output array by placing numbers in their correct position
    for i in range(n - 1, -1, -1):
        index = (arr[i] // exp) % 10
        output[count[index] - 1] = arr[i]
        count[index] -= 1

    # Copy the sorted numbers back into the original list (in place)
    arr[:] = output

# Main radix sort function (sorts a list of non-negative integers in place)
def radix_sort(arr):
    # Nothing to sort; max() would raise ValueError on an empty list
    if not arr:
        return

    # Find the maximum number to determine the number of digits
    max_val = max(arr)

    # Apply counting sort to sort based on each digit, from least significant to most significant
    exp = 1  # Initial exponent (1s place)
    while max_val // exp > 0:
        counting_sort_by_digit(arr, exp)
        exp *= 10  # Move to the next digit place (10s, 100s, etc.)

# Example usage:
data = [170, 45, 75, 90, 802, 24, 2, 66]
radix_sort(data)
print("Sorted array:", data)

Output:

Sorted array: [2, 24, 45, 66, 75, 90, 170, 802]
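
The implementation above picks digits with `//` and `%` and stops when the maximum runs out of digits, which assumes non-negative integers. One possible way to extend the idea, sketched here rather than taken from the original code, is to sort the magnitudes of the negative values separately and stitch the two halves together. The names `_sort_non_negative` and `radix_sort_signed` are hypothetical.

```python
def _sort_non_negative(nums):
    """LSD radix sort (base 10) for non-negative integers; returns a new list."""
    items = list(nums)
    exp = 1
    largest = max(items, default=0)
    while largest // exp > 0:
        buckets = [[] for _ in range(10)]
        for x in items:
            buckets[(x // exp) % 10].append(x)
        items = [x for bucket in buckets for x in bucket]
        exp *= 10
    return items


def radix_sort_signed(nums):
    """Sort integers of either sign by handling negatives via their magnitudes."""
    magnitudes = [-x for x in nums if x < 0]
    non_negatives = [x for x in nums if x >= 0]
    # A larger magnitude means a smaller value, so reverse before negating
    sorted_negatives = [-m for m in reversed(_sort_non_negative(magnitudes))]
    return sorted_negatives + _sort_non_negative(non_negatives)


print(radix_sort_signed([170, -45, 75, -90, 802, 0, -2, 66]))
```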

Step-by-Step Explanation of Radix Sort

To understand how radix sort works, let’s break down the sorting process for the array [170, 45, 75, 90, 802, 24, 2, 66] step by step.

Step 1: Sort by Least Significant Digit (LSD)

In the first pass, we sort the numbers based on their least significant digit (1s place). No numbers are compared with each other: each number is grouped by its last digit, and numbers that share a digit keep their original relative order.

Original array: [170, 45, 75, 90, 802, 24, 2, 66]

Here’s how the sorting works:

  • 170 (0), 45 (5), 75 (5), 90 (0), 802 (2), 24 (4), 2 (2), 66 (6)
  • Sort by these digits: [170, 90, 802, 2, 24, 45, 75, 66]

After sorting by the 1s digit, the array becomes:

Array after sorting by 1s digit: [170, 90, 802, 2, 24, 45, 75, 66]

Step 2: Sort by 10s Digit

Next, we move on to the 10s digit (second least significant digit) and sort the array based on these digits:

  • 170 (7), 90 (9), 802 (0), 2 (0), 24 (2), 45 (4), 75 (7), 66 (6)
  • Sort by these digits: [802, 2, 24, 45, 66, 170, 75, 90]. Both 170 and 75 have a 7 in the 10s place, so the stable sort keeps 170 ahead of 75, as in the previous pass.

After sorting by the 10s digit, the array becomes:

Array after sorting by 10s digit: [802, 2, 24, 45, 66, 170, 75, 90]

Step 3: Sort by 100s Digit

Now, we sort by the 100s digit (third least significant digit). For numbers that do not have a 100s digit, we treat them as having a leading zero (0).

  • 802 (8), 2 (0), 24 (0), 45 (0), 66 (0), 170 (1), 75 (0), 90 (0)
  • Sort by these digits: [2, 24, 45, 66, 75, 90, 170, 802]

After sorting by the 100s digit, the array becomes:

Array after sorting by 100s digit: [2, 24, 45, 66, 75, 90, 170, 802]

Final Sorted Array:

After sorting by all digits, we get the fully sorted array:

[2, 24, 45, 66, 75, 90, 170, 802]
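
The walkthrough above can be reproduced by recording the array after each digit pass. The sketch below uses Python’s built-in `sorted()`, which is stable, as a stand-in for counting sort; the function name `radix_sort_trace` is an assumption made for this illustration.

```python
def radix_sort_trace(nums):
    """Return a list of (exp, snapshot) pairs, one per digit pass."""
    items = list(nums)
    trace = []
    exp = 1
    largest = max(items, default=0)
    while largest // exp > 0:
        # sorted() is stable, so ties keep their order from the previous pass
        items = sorted(items, key=lambda x: (x // exp) % 10)
        trace.append((exp, list(items)))
        exp *= 10
    return trace


for exp, snapshot in radix_sort_trace([170, 45, 75, 90, 802, 24, 2, 66]):
    print(f"after {exp}s digit: {snapshot}")
```

Running it prints one line per pass, which makes it easy to check a hand-worked trace against what the algorithm actually does.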

What Happens After the Code Runs?

After the radix_sort() function runs, the array is fully sorted. Here’s a breakdown of what happens at each stage of the code:

  1. Find the Maximum Value:
    • The function first determines the maximum value in the array (max_val = 802). This is important because it tells us how many digits the largest number has, so we know how many passes (or digit-based sorts) we need to perform.
  2. Counting Sort by Each Digit:
    • The function starts sorting the array based on the least significant digit (1s place). After each pass, it moves to the next digit (10s, 100s, etc.) until all digits are sorted.
    • In each pass, counting sort orders the numbers by the current digit while keeping the sort stable (i.e., numbers with the same digit in the current position keep their relative order from the previous pass).
  3. Exponent Multiplies:
    • The variable exp controls which digit we are sorting by (1 for 1s place, 10 for 10s place, 100 for 100s place). After each pass, exp is multiplied by 10 to move to the next higher place value.
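
The digit-selection expression used in the code, `(value // exp) % 10`, can be tried on its own. This is a small sketch; the helper name `digit_at` is hypothetical and only wraps that expression.

```python
def digit_at(value, exp):
    """Return the decimal digit of a non-negative integer at place value exp."""
    return (value // exp) % 10


for exp in (1, 10, 100):
    # 45 has no 100s digit, so the expression yields 0 for exp == 100
    print(exp, digit_at(802, exp), digit_at(45, exp))
```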

Is Radix Sort a Stable Sorting Algorithm?

Yes. LSD radix sort is stable as long as the subroutine used for each digit pass is stable, and the counting sort above is: it fills the output array while walking the input from the end, so numbers with the same digit keep their relative order. This means that the order established by earlier passes on less significant digits survives each later pass.
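
A small experiment illustrates why the per-digit pass has to be stable. In this sketch, `stable_pass` relies on Python’s `sorted()`, which is stable, while `reversed_ties_pass` is a deliberately unstable variant that flips the order of numbers sharing a digit. Both names, and the `lsd_sort` driver, are assumptions made for the example.

```python
def stable_pass(items, exp):
    """Order by the digit at exp; ties keep their current order."""
    return sorted(items, key=lambda x: (x // exp) % 10)


def reversed_ties_pass(items, exp):
    """Order by the digit at exp, but reverse the order of ties (not stable)."""
    return sorted(reversed(items), key=lambda x: (x // exp) % 10)


def lsd_sort(nums, digit_pass):
    """Run LSD radix sort on non-negative integers with the given per-digit pass."""
    items = list(nums)
    exp = 1
    largest = max(items, default=0)
    while largest // exp > 0:
        items = digit_pass(items, exp)
        exp *= 10
    return items


data = [170, 45, 75, 90, 802, 24, 2, 66]
print(lsd_sort(data, stable_pass))         # fully sorted
print(lsd_sort(data, reversed_ties_pass))  # not fully sorted
```

With the tie-reversing pass, the last pass undoes the ordering that earlier passes built among numbers below 100, so the result is no longer sorted.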

Why Stability is Useful:

Stability is particularly important in radix sort because we sort the numbers multiple times by different digits. If we didn’t use a stable sort for each digit, the relative order of numbers with the same digit could be lost, leading to incorrect results.

For example:

  • If you’re sorting dates by day, then month, then year, stability ensures that when you sort by month, dates in the same month keep the day order from the previous pass, and when you sort by year, dates in the same year keep their month and day order.
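
The date example can be sketched with three stable passes, from the least significant field (day) to the most significant (year). Python’s `sorted()` stands in for the stable per-field sort here, and the `(year, month, day)` tuples are made-up sample data.

```python
# Hypothetical sample dates as (year, month, day) tuples
dates = [
    (2023, 5, 17),
    (2021, 12, 1),
    (2023, 1, 17),
    (2021, 12, 25),
    (2022, 5, 3),
]

by_day = sorted(dates, key=lambda d: d[2])       # least significant field first
by_month = sorted(by_day, key=lambda d: d[1])    # same-month ties keep day order
by_year = sorted(by_month, key=lambda d: d[0])   # same-year ties keep month/day order

print(by_year)
```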

Limitations of Radix Sort

While radix sort is efficient, it has some limitations:

  1. Limited Use Case: Radix sort works best when the keys are short, that is, when the number of digits or characters per key is small. It’s particularly effective for integers or strings but does not apply directly to floating-point numbers or arbitrary data types, which first need to be mapped to digit-like keys.
  2. Requires Extra Space: Radix sort requires extra space for the counting array and the output array, leading to O(n + b) space complexity, where b is the number of possible digit or character values (10 for decimal digits).
  3. Not a General-Purpose Sort: Radix sort is highly specialized for integers or fixed-length strings. For general-purpose sorting, comparison-based algorithms like quicksort or mergesort might be more appropriate.
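
For the fixed-length string case mentioned above, the same LSD idea applies with characters in place of digits. The sketch below assumes ASCII strings that all have the same length; the function name `radix_sort_strings` and its `width` parameter are hypothetical.

```python
def radix_sort_strings(words, width):
    """LSD radix sort for ASCII strings that all have length `width`."""
    items = list(words)
    # Process character positions from the last (least significant) to the first
    for pos in range(width - 1, -1, -1):
        buckets = [[] for _ in range(128)]  # one bucket per ASCII code
        for word in items:
            buckets[ord(word[pos])].append(word)
        items = [word for bucket in buckets for word in bucket]
    return items


print(radix_sort_strings(["dab", "cab", "bad", "ace", "add"], 3))
```

Strings of varying length would need padding or an MSD variant, which is one reason radix sort is less convenient as a general-purpose sort.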

Conclusion

Radix sort is a powerful, non-comparison-based sorting algorithm, particularly useful for efficiently sorting integers or strings. It works by sorting the keys one digit or character at a time, from the least significant position to the most significant. Thanks to its stable sorting subroutine, radix sort preserves the relative order of equal elements, which makes it a good fit when sorting by multiple attributes.

Congratulations on reading to the end of this tutorial!


Have fun and happy researching!


Suf is a research scientist at Moogsoft, specializing in Natural Language Processing and Complex Networks. Previously he was a Postdoctoral Research Fellow in Data Science working on adaptations of cutting-edge physics analysis techniques to data-intensive problems in industry. In another life, he was an experimental particle physicist working on the ATLAS Experiment of the Large Hadron Collider. His passion is to share his experience as an academic moving into industry while continuing to pursue research.