Pigeonhole Sort is a non-comparison-based sorting algorithm that excels when the range of the input values is small relative to the number of elements. It operates based on the pigeonhole principle, efficiently sorting elements by placing them into “holes” or “slots” and then collecting them back in order.
In this post, we’ll explore how to implement Pigeonhole Sort in C++, discuss its performance characteristics, and examine where it is most effective compared to other sorting algorithms. This algorithm is particularly suited for sorting integer values within a known and limited range.
Time Complexity
- Best, Worst, and Average Case: O(n + r), where
n
is the number of elements andr
is the range (difference between the largest and smallest element). - Space Complexity: O(n + r), as it requires extra space to store the pigeonholes.
Pseudocode for Pigeonhole Sort
function pigeonholeSort(arr): n = length(arr) # Step 1: Find minimum and maximum values in the array minValue = arr[0] maxValue = arr[0] for i from 1 to n-1: if arr[i] < minValue: minValue = arr[i] if arr[i] > maxValue: maxValue = arr[i] # Step 2: Calculate the range of the values range = maxValue - minValue + 1 # Step 3: Create an array of pigeonholes (empty lists) holes = array of empty lists with size 'range' # Step 4: Place each element into its respective pigeonhole for i from 0 to n-1: index = arr[i] - minValue append arr[i] to holes[index] # Step 5: Collect the elements from pigeonholes in order index = 0 for each hole in holes: for each element in hole: arr[index] = element index = index + 1
Explanation of Each Step:
- Step 1: Find Minimum and Maximum Values
- We first identify the smallest (
minValue
) and largest (maxValue
) elements in the array, which help determine the number of pigeonholes needed.
- We first identify the smallest (
- Step 2: Calculate the Range
- The range is calculated as
maxValue - minValue + 1
. This gives us the number of pigeonholes required to accommodate every unique value between the minimum and maximum values.
- The range is calculated as
- Step 3: Create Pigeonholes
- We create an array (or list of lists) called
holes
, where each pigeonhole corresponds to a unique value within the calculated range. Each pigeonhole is initialized as an empty list.
- We create an array (or list of lists) called
- Step 4: Place Elements into Pigeonholes
- For each element in the input array, we calculate its pigeonhole index by subtracting the
minValue
from the element’s value (arr[i] - minValue
). The element is then placed into the corresponding pigeonhole (i.e., appended to the list).
- For each element in the input array, we calculate its pigeonhole index by subtracting the
- Step 5: Collect Elements from Pigeonholes
- Finally, we iterate through the pigeonholes in order, and for each element in the pigeonholes, we place it back into the original array. This gives us the sorted array.
C++ Implementation of Pigeonhole Sort
Here’s the complete C++ code for Pigeonhole Sort:
#include <iostream> #include <vector> using namespace std; void pigeonholeSort(int arr[], int n) { // 1. Find the minimum and maximum values in the array int minValue = arr[0]; int maxValue = arr[0]; for (int i = 1; i < n; i++) { if (arr[i] < minValue) { minValue = arr[i]; } if (arr[i] > maxValue) { maxValue = arr[i]; } } // 2. Compute the range of the elements (maxValue - minValue + 1) int range = maxValue - minValue + 1; // 3. Create an array of pigeonholes (each slot for one value in the range) vector<int> holes[range]; // 4. Place each element in its respective pigeonhole for (int i = 0; i < n; i++) { holes[arr[i] - minValue].push_back(arr[i]); } // 5. Collect the elements back into the original array int index = 0; // To store the position in the original array for (int i = 0; i < range; i++) { for (size_t j = 0; j < holes[i].size(); j++) { arr[index++] = holes[i][j]; } } } // Function to print an array void printArray(int arr[], int n) { for (int i = 0; i < n; i++) { cout << arr[i] << " "; } cout << endl; } int main() { int arr[] = {8, 3, 2, 7, 4, 6, 8, 1, 9}; int n = sizeof(arr) / sizeof(arr[0]); cout << "Original array: "; printArray(arr, n); pigeonholeSort(arr, n); cout << "Sorted array: "; printArray(arr, n); return 0; }
Output:
Original array: 8 3 2 7 4 6 8 1 9 Sorted array: 1 2 3 4 6 7 8 8 9
Example Walkthrough
Consider the input array:
{8, 3, 2, 7, 4, 6, 8, 1, 9}
The minimum value is 1, and the maximum value is 9. The range of values is 9 - 1 + 1 = 9
, so we need nine pigeonholes (from 1 to 9).
We place each element into its respective pigeonhole:
8
goes into pigeonhole7
(since8 - 1 = 7
),3
goes into pigeonhole2
(since3 - 1 = 2
),2
goes into pigeonhole1
(since2 - 1 = 1
),7
goes into pigeonhole6
(since7 - 1 = 6
),4
goes into pigeonhole3
(since4 - 1 = 3
),6
goes into pigeonhole5
(since6 - 1 = 5
),8
goes into pigeonhole7
again (since8 - 1 = 7
),1
goes into pigeonhole0
(since1 - 1 = 0
),9
goes into pigeonhole8
(since9 - 1 = 8
).
After distributing all elements, the pigeonholes will contain:
- Pigeonhole 0:
{1}
- Pigeonhole 1:
{2}
- Pigeonhole 2:
{3}
- Pigeonhole 3:
{4}
- Pigeonhole 4:
{}
- Pigeonhole 5:
{6}
- Pigeonhole 6:
{7}
- Pigeonhole 7:
{8, 8}
- Pigeonhole 8:
{9}
The final step is to collect the elements from the pigeonholes and place them back into the original array, resulting in a sorted array:
Pigeonhole Sort vs. Counting Sort
Both Pigeonhole Sort and Counting Sort are non-comparison-based sorting algorithms, ideal for small ranges of integers. However, they differ in approach and use cases.
- Concept:
- Pigeonhole Sort sorts elements into “pigeonholes” (buckets) based on their values and collects them in order.
- Counting Sort counts occurrences of each value in a count array, then uses cumulative sums to place elements in the sorted order.
- Data Structure:
- Pigeonhole Sort uses an array of lists (pigeonholes) to store elements temporarily.
- Counting Sort uses a single count array to store the frequency of each value.
- Handling Duplicates:
- Pigeonhole Sort stores duplicates in the same pigeonhole.
- Counting Sort increments the count for each duplicate value.
- Space Complexity:
- Both algorithms have O(n + r) space complexity, but Counting Sort generally uses less memory because it only needs a count array.
- Use Cases:
- Pigeonhole Sort is better suited for small ranges with many duplicates.
- Counting Sort excels at counting frequencies and is often used with other algorithms like Radix Sort.
Feature | Pigeonhole Sort | Counting Sort |
---|---|---|
Concept | Buckets based on value | Counting occurrences of values |
Data Structure | Array of lists (pigeonholes) | Single count array |
Duplicates | Stored in the same pigeonhole | Incremented in count array |
Space Complexity | O(n + r) | O(n + r) |
Best Use Case | Small ranges with duplicates | Efficient counting for larger datasets |
Advantages of Pigeonhole Sort
- Efficient for small ranges: Pigeonhole sort can be very efficient when the range of values is small relative to the number of elements.
- Non-comparison-based: Unlike algorithms such as QuickSort or MergeSort, Pigeonhole Sort does not rely on comparing elements, making it ideal for cases where comparisons are costly.
Limitations of Pigeonhole Sort
- Space Complexity: Pigeonhole Sort requires extra space to store pigeonholes, which can be inefficient if the range is large relative to the number of elements.
- Not Suitable for Large Ranges: If the range of values is much larger than the number of elements, this sorting method becomes inefficient in both time and space.
When to Use Pigeonhole Sort
Pigeonhole Sort is not a general-purpose sorting algorithm, but it can be extremely effective in certain scenarios. Here’s when you should consider using Pigeonhole Sort:
- Small Range of Values:
Pigeonhole Sort is ideal when the range of values in the dataset is relatively small compared to the number of elements. - Uniform Distribution of Elements:
If the elements are fairly uniformly distributed across the possible values in the range, - When Comparisons Are Expensive:
Pigeonhole Sort is a non-comparison-based sorting algorithm which doesn’t rely on comparing elements like QuickSort or MergeSort. This makes it useful when comparisons are costly or when working with data types where comparison operations are computationally expensive. - Duplicates Are Common:
If you expect multiple duplicate values in your dataset, Pigeonhole Sort handles them naturally by placing multiple elements into the same pigeonhole, making it efficient.
When Not to Use Pigeonhole Sort
- Large Range of Values:
If the range of values is much larger than the number of elements (e.g., sorting an array of 10 elements with values ranging from 1 to 1,000,000). Pigeonhole Sort is inefficient due to the excessive number of pigeonholes needed, leading to high space complexity. - General-Purpose Sorting:
For general sorting purposes, algorithms like Quick Sort, MergeSort, or Timsort (used in Python’s built-in sort) are better suited. These algorithms are more efficient for large datasets and have better average performance across various input types.
Conclusion
Pigeonhole Sort is a useful sorting algorithm when the range of values and the number of elements are relatively small. However, the algorithm’s space requirements make it less practical for larger datasets with a broader range.
More sophisticated algorithms like Quick Sort or Timsort are typically preferred for general-purpose sorting. Still, Pigeonhole Sort is an excellent tool for your algorithm toolkit when sorting specific data types.
Congratulations on reading to the end of this tutorial!
Read the following articles to learn how to implement Cocktail Sort:
In JavaScript – How to do Cocktail in JavaScript
In Python – How to do Cocktail Sort in Python
Have fun and happy researching!
Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.