How to do Counting Sort in Python

by | DSA, Programming, Python, Tips

Introduction

Counting sort is an efficient, non-comparison-based sorting algorithm that works well when the range of input values is known and limited. It is particularly useful for sorting integers and can achieve linear time complexity in many cases. This blog post will explain how counting sort works, provide a Python implementation, and discuss the stability of the algorithm and its significance.


What is Counting Sort?

Counting sort is a sorting algorithm that avoids comparisons by using an auxiliary array (called a counting array) to count the occurrences of each unique element in the input array. It then uses these counts to determine the correct position of each element in the output array.

How Counting Sort Works:

  1. Create a counting array to store the frequency of each element in the input array.
  2. Modify the counting array by accumulating the counts so that each position in the counting array represents the position of an element in the sorted array.
  3. Build the output array by placing elements in their correct positions based on the counting array.
  4. Copy the output array back to the original array to get the sorted result.

Time Complexity:

  • Best Case: O(n + k) where n is the number of elements in the input array and k is the range of input values.
  • Worst Case: O(n + k)

Space Complexity:

  • O(n + k) due to the extra space required for the counting array and the output array.

Counting Sort Algorithm in Python

Here’s how you can implement counting sort in Python:

def counting_sort(arr):
    # Find the maximum value to define the size of the counting array
    max_val = max(arr)

    # Create a counting array with zeros
    count = [0] * (max_val + 1)

    # Store the count of each element in the input array
    for num in arr:
        count[num] += 1

    # Modify the counting array to store cumulative counts
    for i in range(1, len(count)):
        count[i] += count[i - 1]

    # Create an output array to hold the sorted result
    output = [0] * len(arr)

    # Build the output array using the counting array
    for num in reversed(arr):
        output[count[num] - 1] = num
        count[num] -= 1

    # Copy the output array back to the original array
    for i in range(len(arr)):
        arr[i] = output[i]

# Example usage:
data = [4, 2, 2, 8, 3, 3, 1]
counting_sort(data)
print("Sorted array:", data)

Output:

Sorted array: [1, 2, 2, 3, 3, 4, 8]

Step-by-Step Explanation of Counting Sort

Let’s break down how counting sort works on the array [4, 2, 2, 8, 3, 3, 1]:

Step 1: Create a Counting Array

  • First, find the maximum value in the array (max_val = 8), and create a counting array with max_val + 1 elements initialized to 0.
Count array (initialized): [0, 0, 0, 0, 0, 0, 0, 0, 0]

Step 2: Count Each Element

  • Traverse the input array and count the frequency of each element:
Input array: [4, 2, 2, 8, 3, 3, 1]
Count array after counting: [0, 1, 2, 2, 1, 0, 0, 0, 1]

Explanation:

  • 1 appears once, so count array at index 1 becomes 1.
  • 2 appears twice, so index 2 becomes 2.
  • 3 appears twice, so index 3 becomes 2, and so on.

Step 3: Cumulative Count

  • Modify the count array so that each position shows the cumulative count. This tells us where each element should be placed in the output array.
Modified Count array (cumulative count): [0, 1, 3, 5, 6, 6, 6, 6, 7]

Explanation:

  • Index 1 shows 1 (there’s 1 element ≤ 1),
  • Index 2 shows 3 (there are 3 elements ≤ 2),
  • Index 3 shows 5 (there are 5 elements ≤ 3), and so on.

Step 4: Build the Output Array

  • Now go through the input array backwards. Use the cumulative count to place each element in the correct position in the output array. After placing an element, decrease its count by 1.
Output array: [1, 2, 2, 3, 3, 4, 8]

Explanation:

  • Starting from the last element 1, place it in position 0 (based on count array), then decrease the count of 1.
  • Then move to the next element 3, place it in position 4, and so on.

Step 5: Copy the Output to the Original Array

  • Finally, copy the sorted output array back into the original array.
Sorted array: [1, 2, 2, 3, 3, 4, 8]

More In-Depth: Why Place Element 3 in Position 4

When we modify the count array to store cumulative counts, each index tells us how many elements are less than or equal to the value represented by that index. In the case of the value 3, the count array at index 3 is 5. This means there are 5 elements in the array that are less than or equal to 3.

However, since arrays are 0-indexed (the first position is index 0), the fifth element (when counting from the first) will be placed at position 4 (which is index 4).

Let’s break it down:

  1. The count array shows the cumulative count for each value.
Cumulative count array: [0, 1, 3, 5, 6, 6, 6, 6, 7]

For value 3, the count is 5, meaning the fifth position in the sorted array should be for the value 3.

Arrays are 0-indexed, so the “fifth” position is actually index 4 in a 0-indexed array.

When we insert the first 3 from the input array into the output array, we place it in position 4 (index 4). After inserting it, we decrease the count of 3 in the count array to 4, so the next 3 (if any) would be placed in position 3.

Here’s how it looks in practice:

  • Cumulative count array before placing any 3: [0, 1, 3, 5, 6, 6, 6, 6, 7]
    • The count of 3 is 5, so we place the first 3 at index 4 in the output array.
  • Cumulative count array after placing the first 3: [0, 1, 3, 4, 6, 6, 6, 6, 7]
    • Now, the next 3 will go to index 3.

This process ensures all elements are placed in their correct positions in the sorted array.


Is Counting Sort a Stable Sorting Algorithm?

Counting sort is a stable sorting algorithm, meaning it preserves the relative order of equal elements in the input array. In other words, if two elements are equal, they will appear in the same relative order in the sorted array as in the input array.

Why Stability is Useful:

Stability in a sorting algorithm is important when:

  1. Multiple Attributes: Stability is particularly useful when sorting objects with multiple attributes. For example, if you’re sorting a list of students by grade and then by name, a stable sort would maintain the order of students with the same grade.
  2. Consistency: Stability ensures that equal elements retain their relative positions, which is crucial for applications like radix sort, where counting sort is used as a subroutine.

Limitations of Counting Sort

Counting sort is efficient in certain cases, but it has some limitations:

  1. Limited Use Case: Counting sort works well when the range of the input values (k) is not significantly larger than the number of elements (n). When k is much larger than n, counting sort can become inefficient in terms of space complexity.
  2. Not Comparison-Based: Counting sort does not perform comparisons between elements, which limits its flexibility compared to comparison-based algorithms like quick sort or merge sort.

Conclusion

Counting sort is a highly efficient sorting algorithm for specific cases where the range of the input values is known and limited. It is a stable sorting algorithm that preserves the relative order of equal elements, which is useful when sorting data with multiple attributes. While counting sort is not ideal for all datasets, it can be a powerful tool for integer sorting in O(n + k) time complexity.

Congratulations on reading to the end of this tutorial!

For further reading on sorting algorithms in Python, go to the articles:

Go to the online courses page on Python to learn more about Python for data  science and  machine learning.

Have fun and happy researching!

Profile Picture
Research Scientist at Moogsoft | + posts

Suf is a research scientist at Moogsoft, specializing in Natural Language Processing and Complex Networks. Previously he was a Postdoctoral Research Fellow in Data Science working on adaptations of cutting-edge physics analysis techniques to data-intensive problems in industry. In another life, he was an experimental particle physicist working on the ATLAS Experiment of the Large Hadron Collider. His passion is to share his experience as an academic moving into industry while continuing to pursue research. Find out more about the creator of the Research Scientist Pod here and sign up to the mailing list here!