Introduction
Counting sort is an efficient, non-comparison-based sorting algorithm that works well when the range of input values is known and limited. It is particularly useful for sorting integers and can achieve linear time complexity in many cases. This blog post will explain how counting sort works, provide a Python implementation, and discuss the stability of the algorithm and its significance.
What is Counting Sort?
Counting sort is a sorting algorithm that avoids comparisons by using an auxiliary array (called a counting array) to count the occurrences of each unique element in the input array. It then uses these counts to determine the correct position of each element in the output array.
How Counting Sort Works:
- Create a counting array to store the frequency of each element in the input array.
- Modify the counting array by accumulating the counts so that each position in the counting array represents the position of an element in the sorted array.
- Build the output array by placing elements in their correct positions based on the counting array.
- Copy the output array back to the original array to get the sorted result.
Time Complexity:
- Best Case: O(n + k) where
n
is the number of elements in the input array andk
is the range of input values. - Worst Case: O(n + k)
Space Complexity:
- O(n + k) due to the extra space required for the counting array and the output array.
Counting Sort Algorithm in Python
Here’s how you can implement counting sort in Python:
def counting_sort(arr): # Find the maximum value to define the size of the counting array max_val = max(arr) # Create a counting array with zeros count = [0] * (max_val + 1) # Store the count of each element in the input array for num in arr: count[num] += 1 # Modify the counting array to store cumulative counts for i in range(1, len(count)): count[i] += count[i - 1] # Create an output array to hold the sorted result output = [0] * len(arr) # Build the output array using the counting array for num in reversed(arr): output[count[num] - 1] = num count[num] -= 1 # Copy the output array back to the original array for i in range(len(arr)): arr[i] = output[i] # Example usage: data = [4, 2, 2, 8, 3, 3, 1] counting_sort(data) print("Sorted array:", data)
Output:
Sorted array: [1, 2, 2, 3, 3, 4, 8]
Step-by-Step Explanation of Counting Sort
Let’s break down how counting sort works on the array [4, 2, 2, 8, 3, 3, 1]
:
Step 1: Create a Counting Array
- First, find the maximum value in the array (
max_val = 8
), and create a counting array withmax_val + 1
elements initialized to0
.
Count array (initialized): [0, 0, 0, 0, 0, 0, 0, 0, 0]
Step 2: Count Each Element
- Traverse the input array and count the frequency of each element:
Input array: [4, 2, 2, 8, 3, 3, 1] Count array after counting: [0, 1, 2, 2, 1, 0, 0, 0, 1]
Explanation:
- 1 appears once, so count array at index 1 becomes
1
. - 2 appears twice, so index 2 becomes
2
. - 3 appears twice, so index 3 becomes
2
, and so on.
Step 3: Cumulative Count
- Modify the count array so that each position shows the cumulative count. This tells us where each element should be placed in the output array.
Modified Count array (cumulative count): [0, 1, 3, 5, 6, 6, 6, 6, 7]
Explanation:
- Index 1 shows
1
(there’s 1 element ≤ 1), - Index 2 shows
3
(there are 3 elements ≤ 2), - Index 3 shows
5
(there are 5 elements ≤ 3), and so on.
Step 4: Build the Output Array
- Now go through the input array backwards. Use the cumulative count to place each element in the correct position in the output array. After placing an element, decrease its count by 1.
Output array: [1, 2, 2, 3, 3, 4, 8]
Explanation:
- Starting from the last element
1
, place it in position 0 (based on count array), then decrease the count of 1. - Then move to the next element
3
, place it in position 4, and so on.
Step 5: Copy the Output to the Original Array
- Finally, copy the sorted output array back into the original array.
Sorted array: [1, 2, 2, 3, 3, 4, 8]
More In-Depth: Why Place Element 3 in Position 4
When we modify the count array to store cumulative counts, each index tells us how many elements are less than or equal to the value represented by that index. In the case of the value 3
, the count array at index 3
is 5
. This means there are 5 elements in the array that are less than or equal to 3
.
However, since arrays are 0-indexed (the first position is index 0), the fifth element (when counting from the first) will be placed at position 4 (which is index 4).
Let’s break it down:
- The count array shows the cumulative count for each value.
Cumulative count array: [0, 1, 3, 5, 6, 6, 6, 6, 7]
For value 3
, the count is 5
, meaning the fifth position in the sorted array should be for the value 3
.
Arrays are 0-indexed, so the “fifth” position is actually index 4 in a 0-indexed array.
When we insert the first 3 from the input array into the output array, we place it in position 4 (index 4). After inserting it, we decrease the count of 3
in the count array to 4
, so the next 3
(if any) would be placed in position 3.
Here’s how it looks in practice:
- Cumulative count array before placing any
3
:[0, 1, 3, 5, 6, 6, 6, 6, 7]
- The count of
3
is5
, so we place the first3
at index4
in the output array.
- The count of
- Cumulative count array after placing the first
3
:[0, 1, 3, 4, 6, 6, 6, 6, 7]
- Now, the next
3
will go to index3
.
- Now, the next
This process ensures all elements are placed in their correct positions in the sorted array.
Is Counting Sort a Stable Sorting Algorithm?
Counting sort is a stable sorting algorithm, meaning it preserves the relative order of equal elements in the input array. In other words, if two elements are equal, they will appear in the same relative order in the sorted array as in the input array.
Why Stability is Useful:
Stability in a sorting algorithm is important when:
- Multiple Attributes: Stability is particularly useful when sorting objects with multiple attributes. For example, if you’re sorting a list of students by grade and then by name, a stable sort would maintain the order of students with the same grade.
- Consistency: Stability ensures that equal elements retain their relative positions, which is crucial for applications like radix sort, where counting sort is used as a subroutine.
Limitations of Counting Sort
Counting sort is efficient in certain cases, but it has some limitations:
- Limited Use Case: Counting sort works well when the range of the input values (
k
) is not significantly larger than the number of elements (n
). Whenk
is much larger thann
, counting sort can become inefficient in terms of space complexity. - Not Comparison-Based: Counting sort does not perform comparisons between elements, which limits its flexibility compared to comparison-based algorithms like quick sort or merge sort.
Conclusion
Counting sort is a highly efficient sorting algorithm for specific cases where the range of the input values is known and limited. It is a stable sorting algorithm that preserves the relative order of equal elements, which is useful when sorting data with multiple attributes. While counting sort is not ideal for all datasets, it can be a powerful tool for integer sorting in O(n + k) time complexity.
Congratulations on reading to the end of this tutorial!
Read the following articles to learn how to implement Counting Sort:
In C++ – How to do Counting Sort in C++
In JavaScript – How To Do Counting Sort in JavaScript
Have fun and happy researching!
For further reading on sorting algorithms in Python, go to the articles:
- How to do Insertion Sort in Python
- How to Do Bubble Sort in Python
- How to Do Selection Sort in Python
- How to Do Bucket Sort in Python
- How to Do Heap Sort in Python
- How to Do Pigeonhole Sort in Python
- How To Do Comb Sort in Python
- How to Do Shell Sort in Python
- How To Do TimSort in Python
Go to the online courses page on Python to learn more about Python for data science and machine learning.
Have fun and happy researching!
Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.