Sorting algorithms are fundamental to many computer science problems, and understanding how they work is crucial for optimizing performance. In this post, we’ll explore **Selection Sort**, a simple but inefficient sorting algorithm. We’ll go through the algorithm step-by-step, provide a C++ implementation, and include performance testing to understand its behaviour on different data sets.

## Introduction to Selection Sort

Selection Sort is a comparison-based sorting algorithm that divides the input list into two parts: the sorted part and the unsorted part. Initially, the sorted part is empty, and the unsorted part contains all the elements. The algorithm repeatedly selects the smallest (or largest) element from the unsorted part and moves it to the sorted part.

### Time Complexity:

**Best, Average, and Worst Case**:*O(n²)*

Due to its quadratic time complexity, Selection Sort is generally inefficient for large datasets. However, it’s easy to understand and useful for educational purposes or small data sets.

## Understanding the Selection Sort Algorithm

Selection Sort works by maintaining two parts of the array:

**Sorted Part**: Initially empty, this grows as elements are moved into their final sorted positions.**Unsorted Part**: Initially contains the entire array, and elements are moved out of it one by one.

### Steps:

- Find the smallest element in the unsorted part of the array.
- Swap it with the first element of the unsorted part.
- Move the boundary between the sorted and unsorted parts one step to the right.
- Repeat the process until all elements are sorted.

## Pseudocode for Selection Sort

Here’s a simple pseudocode for Selection Sort to illustrate its structure before jumping into the actual code:

function selectionSort(arr, n): for i from 0 to n - 1: minIndex = i for j from i + 1 to n: if arr[j] < arr[minIndex]: minIndex = j swap arr[i] and arr[minIndex]

### Explanation:

- The outer loop iterates over each element in the array.
- The inner loop finds the minimum element in the remaining unsorted part of the array.
- Once the minimum is found, it’s swapped with the element at the current position
`i`

.

## Implementing Selection Sort in C++

Now that we’ve seen the algorithm and pseudocode, let’s implement Selection Sort in C++.

### Basic C++ Implementation

#include <iostream> #include <vector> using namespace std; // Function to swap two elements void swapElements(int &a, int &b) { int temp = a; a = b; b = temp; } // Selection Sort function void selectionSort(vector<int>& arr) { int n = arr.size(); for (int i = 0; i < n - 1; i++) { int minIndex = i; // Find the minimum element in the remaining unsorted array for (int j = i + 1; j < n; j++) { if (arr[j] < arr[minIndex]) { minIndex = j; } } // Swap the found minimum element with the first element swapElements(arr[i], arr[minIndex]); } } // Function to print the array void printArray(const vector<int>& arr) { for (int num : arr) { cout << num << " "; } cout << endl; } // Main function to demonstrate Selection Sort int main() { vector<int> arr = {64, 25, 12, 22, 11}; cout << "Original array: "; printArray(arr); selectionSort(arr); cout << "Sorted array: "; printArray(arr); return 0; }

### Explanation:

**swapElements Function**: Swaps two elements using a temporary variable.**selectionSort Function**: Implements the selection sort logic, where the minimum element is found and swapped with the current index`i`

.**printArray Function**: Prints the array before and after sorting.**main Function**: Demonstrates the sorting process on a sample array.

**Output:**

Original array: 64 25 12 22 11 Sorted array: 11 12 22 25 64

## Performance Testing Selection Sort

Since Selection Sort has a time complexity of *O(n²)*, it’s expected to perform poorly on large datasets. We will now test its performance with different array sizes and types.

### Test Setup:

**Random Array**: Unordered elements.**Sorted Array**: Already sorted in ascending order (best case).**Reverse Sorted Array**: Sorted in descending order (worst case).

### Performance Testing Code

We’ll use the `<chrono>`

library in C++ to measure the execution time of Selection Sort.

#include <iostream> #include <vector> #include <algorithm> // For std::shuffle #include <chrono> #include <random> using namespace std; // Function to swap two elements void swapElements(int &a, int &b) { int temp = a; a = b; b = temp; } // Selection Sort function void selectionSort(vector<int>& arr) { int n = arr.size(); for (int i = 0; i < n - 1; i++) { int minIndex = i; for (int j = i + 1; j < n; j++) { if (arr[j] < arr[minIndex]) { minIndex = j; } } swapElements(arr[i], arr[minIndex]); } } // Function to generate an array with specific type and size vector<int> generateArray(int size, string type) { vector<int> arr(size); for (int i = 0; i < size; i++) { arr[i] = i + 1; } if (type == "random") { shuffle(arr.begin(), arr.end(), default_random_engine(random_device()())); } else if (type == "reverse") { reverse(arr.begin(), arr.end()); } return arr; } // Function to test the performance of Selection Sort double testSelectionSort(vector<int> arr) { auto start = chrono::high_resolution_clock::now(); selectionSort(arr); auto end = chrono::high_resolution_clock::now(); chrono::duration<double, micro> duration = end - start; return duration.count(); } // Main function to run the performance tests int main() { vector<string> types = {"random", "sorted", "reverse"}; vector<int> sizes = {1000, 5000, 10000}; cout << "Selection Sort Performance Testing:\n"; cout << "Array Size\tType\t\tTime (μs)\n"; for (int size : sizes) { for (string type : types) { vector<int> arr = generateArray(size, type); double time = testSelectionSort(arr); cout << size << "\t\t" << type << "\t\t" << time << "\n"; } } return 0; }

**Output:**

Selection Sort Performance Testing: Array Size Type Time (μs) 1000 random 2148.75 1000 sorted 2202.84 1000 reverse 2369.55 5000 random 52047.7 5000 sorted 51684.8 5000 reverse 51111.3 10000 random 203980 10000 sorted 206320 10000 reverse 207827

It’s important to note that the performance testing times you observe may vary depending on several factors, including:

**Hardware Specifications**: The CPU, memory, and overall system performance can significantly affect how quickly the algorithm runs.**System Load**: If your system is running other processes or tasks while you’re testing, it can impact the measured times.**Compiler Optimizations**: Different C++ compilers and their optimization levels (e.g.,`-O2`

,`-O3`

in GCC) can influence the execution speed. Higher optimization levels often lead to faster runtime but can affect the accuracy of basic time measurements.**Input Variations**: Small fluctuations in random array generation or the exact sequence of operations can lead to minor differences in performance even for the same array size and type.

As a result, while the general trends and relative performance of Selection Sort will remain consistent, the specific time measurements may vary between testing environments.

**Key Observations:**

**Consistent Times Across Data Types:**- Unlike algorithms like Quick Sort, Selection Sort shows similar execution times for random, sorted, and reverse-sorted arrays. This is because Selection Sort always scans the entire unsorted part of the array, regardless of its order.
- For example, for
**Array Size 1000**, the times for random (2148.75 μs), sorted (2202.84 μs), and reverse (2369.55 μs) arrays are very close.

**Linear Increase with Array Size:**- The time taken increases quadratically as the array size increases from 1000 to 10000. This is expected given the
*O(n²)*time complexity of Selection Sort. - For
**Array Size 5000**, the time is around 51,000–52,000 μs, and for**Array Size 10000**, it jumps to around 203,000–207,000 μs.

- The time taken increases quadratically as the array size increases from 1000 to 10000. This is expected given the
**Comparison Across Array Types:**- Since Selection Sort does not take advantage of any order in the data, even sorted arrays do not improve performance. The sorted arrays take just as long to sort as the random or reverse-sorted ones, indicating that the algorithm’s inefficiency is consistent across different types of input.

## Pros and Cons of Selection Sort

**Pros**:

**Simple to Understand and Implement**:- Selection Sort is one of the simplest sorting algorithms, making it a good choice for educational purposes or small programs where efficiency is not a priority.

**In-Place Sorting**:- It does not require any additional memory beyond the array being sorted, making it memory efficient (with a space complexity of
*O(1)*).

- It does not require any additional memory beyond the array being sorted, making it memory efficient (with a space complexity of
**Performs Well on Small Datasets**:- For very small datasets, the performance difference between Selection Sort and more advanced algorithms may not be significant, making it useful in constrained scenarios.

**Predictable Behavior**:- The algorithm’s time complexity is always
*O(n²)*, regardless of the input data. This makes its performance easy to predict in various cases (best, average, worst).

- The algorithm’s time complexity is always

**Cons**:

**Inefficient for Large Datasets**:- With a time complexity of
*O(n²)*, Selection Sort becomes impractical for larger datasets, especially when compared to algorithms like Quick Sort or Merge Sort, which have an average time complexity of*O(n log n)*.

- With a time complexity of
**Does Not Take Advantage of Sorted Data**:- Even if the array is already sorted or partially sorted, Selection Sort still performs all of its comparisons, offering no performance improvements for these cases.

**Not a Stable Sort**:- Selection Sort is not stable, meaning that equal elements may not maintain their relative order after sorting. Stability is often desirable when sorting records with key-value pairs.

**Always Scans Entire Unsorted Array**:- Unlike more sophisticated algorithms (e.g., Quick Sort or Insertion Sort), Selection Sort does not stop early if the array becomes sorted during the process, leading to unnecessary comparisons.

## Conclusion

Selection Sort is easy to implement and understand but inefficient for large datasets due to its *O(n²)* time complexity. It is mainly used for small arrays or when simplicity is more important than performance. Our performance testing demonstrated that it takes significantly more time as the array size increases, especially for reverse-sorted data.

Congratulations on reading to the end of this tutorial!

Read the following articles to learn how to implement Selection Sort:

In Python – How to do Selection Sort in Python

In JavaScript – How to do Selection Sort in JavaScript

In Java – How to do Selection Sort in Java

For further reading on sorting algorithms in C++, go to the articles:

- How To Do Quick Sort in C++
- How To Do Heap Sort in C++
- Shell Sort in C++
- How To Do Merge Sort in C++
- How To Do Comb Sort in C++
- How To Do Insertion Sort in C++
- How To Do Radix Sort in C++
- How To Do Bucket Sort in C++

Have fun and happy researching!

Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.