Insertion Sort is a simple and intuitive sorting algorithm that works similarly to how humans might sort playing cards. The algorithm sorts one element at a time by comparing it to the elements before inserting it in the correct position. In this blog post, we will walk through the steps of implementing Insertion Sort in Rust, compare it with Selection Sort, and provide performance benchmarks on various arrays.
What is Insertion Sort?
Insertion Sort is a comparison-based algorithm that builds a sorted array one element at a time. Starting with the first element (which is trivially sorted), the algorithm moves through the remaining elements and inserts each one into its correct position relative to the previously sorted portion of the array. This process is repeated until the entire array is sorted.
Key Steps in Insertion Sort:
- Start with the second element in the array.
- Compare it with elements before it and shift larger elements from one position to the right.
- Insert the current element into its correct position.
- Repeat this process for all elements in the array.
Below is our visualization tool to see Insertion Sort in real-time!
Why start at index 1?
Insertion Sort works by assuming the first element is already sorted (since a single element is trivially sorted). Therefore, the algorithm starts at index 1 and compares the element at that index with the previous ones, inserting it into its correct position in the sorted portion of the array.
Time Complexity of Insertion Sort
- Best Case (Sorted Data): O(n) – When the array is already sorted, Insertion Sort makes at most one comparison per element, resulting in linear time.
- Average Case: O(n²) – In an unsorted array, each element must be compared to several previous elements.
- Worst Case (Reverse-Sorted Data): O(n²) – In a reverse-sorted array, each element must be compared with all previous elements and moved to the front, resulting in to the worst-case time complexity.
Space Complexity of Insertion Sort
Insertion Sort is an in-place sorting algorithm, meaning it sorts the array without requiring additional storage space. The space complexity is O(1), as only a few extra variables are used during the sorting process.
Pseudocode of Insertion Sort
Here’s the high-level pseudocode for Insertion Sort:
function insertionSort(arr): for i from 1 to length(arr): key = arr[i] j = i - 1 while j >= 0 and arr[j] > key: arr[j + 1] = arr[j] j = j - 1 arr[j + 1] = key
Step-by-Step Explanation of the Pseudocode
- Start iterating from the second element (index 1), as the first element is already sorted.
- Store the current element in
key
. - Compare the
key
with the elements before it (arr[j]
), and shift the larger elements to the right. - Once you find the correct position for
key
, insert it. - Repeat this for each element in the array.
Rust Implementation of Insertion Sort
Let’s now implement Insertion Sort in Rust. As Rust emphasizes memory safety and ownership, there are some specifics about the language you must keep in mind while implementing sorting algorithms. Here’s the Rust implementation:
Rust Code
fn insertion_sort<T: Ord + Clone>(arr: &mut [T]) { let n = arr.len(); for i in 1..n { let key = arr[i].clone(); // Clone the element to avoid moving it let mut j = i; while j > 0 && arr[j - 1] > key { arr[j] = arr[j - 1].clone(); // Shift the element j -= 1; } arr[j] = key; // Insert the element at the correct position } } fn main() { let mut numbers = vec![12, 11, 13, 5, 6]; println!("Before sorting: {:?}", numbers); insertion_sort(&mut numbers); println!("After sorting: {:?}", numbers); }
Rust-Specific Details:
When working with Insertion Sort in Rust, there are a few key concepts and features of the language that need to be addressed:
- Borrowing and Ownership:
- Rust’s ownership model means you cannot simply move elements out of a collection like an array without transferring ownership. In typical cases, moving values out of a slice would result in an error because the element in the array would no longer be valid. This is why we need to clone values in some situations.
- Ord Trait:
- The
Ord
trait in Rust is used to define types that can be ordered. This means that the typeT
must implement theOrd
trait to allow comparisons between elements using operators like<
,>
,<=
, and>=
. In Insertion Sort, comparisons are required to determine the correct order of elements, so we addT: Ord
to the function signature.
- The
- Clone Trait:
- The Clone trait allows for explicit duplication of values in Rust. Some types also implement the Copy trait, which provides implicit duplication through bitwise copying. For non-Copy types, such as large structs or heap-allocated types like
Vec
, Clone is required for explicit duplication. In contexts like sorting algorithms, where we need to retain the original value while working with a duplicate, we use.clone()
. This prevents moving values from their original location, which could lead to borrowing issues. Although types that implement Copy can still use Clone, the compiler will automatically prefer the more efficient Copy behaviour when available.
- The Clone trait allows for explicit duplication of values in Rust. Some types also implement the Copy trait, which provides implicit duplication through bitwise copying. For non-Copy types, such as large structs or heap-allocated types like
Fix in Action:
- Cloning the Element: In the Insertion Sort implementation, we clone the
key
element atarr[i]
to avoid moving it. This allows the algorithm to keep the original element in place while comparing it with other elements. - Cloning During Shifting: While shifting elements in the array to the right, we also clone the shifted elements to ensure no moves are made that could violate ownership or borrowing rules in Rust.
Step-by-Step Explanation of Example with Key Points
Let’s walk through sorting the array [12, 11, 13, 5, 6]
using Insertion Sort in Rust:
- First Iteration (
key = 11
):- We clone
key = 11
and compare it with12
. Since11 < 12
, we shift12
to the right and insert11
at the beginning. - Array after the first iteration:
[11, 12, 13, 5, 6]
.
- We clone
- Second Iteration (
key = 13
):- We clone
key = 13
and compare it with12
. No shifting is required as13 > 12
. - Array remains:
[11, 12, 13, 5, 6]
.
- We clone
- Third Iteration (
key = 5
):- We clone
key = 5
and compare it with13
,12
, and11
. We shift all of these elements to the right and insert5
at the beginning. - Array after the third iteration:
[5, 11, 12, 13, 6]
.
- We clone
- Fourth Iteration (
key = 6
):- We clone
key = 6
and compare it with13
,12
, and11
. We shift12
and11
to the right, then insert6
. - Array after the fourth iteration:
[5, 6, 11, 12, 13]
.
- We clone
The array is now sorted!
Performance Test: Insertion Sort vs Selection Sort
Now that we have a working implementation of Insertion Sort, it’s time to compare its performance against Selection Sort, another simple sorting algorithm. Selection Sort works by repeatedly selecting the smallest element from the unsorted portion of the array and swapping it with the first unsorted element. Both algorithms have an O(n²) time complexity, but their behaviour varies depending on the input data. We will test both algorithms on random, sorted, and reverse-sorted arrays of varying sizes to understand how they perform under different conditions.
We expect Insertion Sort to outperform Selection Sort on sorted arrays due to its O(n) best-case complexity. At the same time, both should exhibit similar performance on random and reverse-sorted arrays.
We’ll use Rust’s std::time::Instant
for benchmarking and for generating arrays, we can utilize the rand
crate to create random arrays and manual loops to generate sorted and reverse-sorted arrays.
When performing your tests, include rand
in the under the [dependencies]
section of your Cargo.toml
file, for example:
[package] name = "insertion_sort" version = "0.1.0" edition = "2021" [dependencies] rand = "0.9.0-alpha.2"
Rust Code for Performance Test
use rand::seq::SliceRandom; use std::time::Instant; // Insertion Sort Implementation fn insertion_sort<T: Ord + Clone>(arr: &mut [T]) { let n = arr.len(); for i in 1..n { let key = arr[i].clone(); let mut j = i; while j > 0 && arr[j - 1] > key { arr[j] = arr[j - 1].clone(); j -= 1; } arr[j] = key; } } // Selection Sort Implementation fn selection_sort<T: Ord>(arr: &mut [T]) { let n = arr.len(); for i in 0..n { let mut min_idx = i; for j in i + 1..n { if arr[j] < arr[min_idx] { min_idx = j; } } arr.swap(i, min_idx); } } // Benchmarking Function fn benchmark(sort_fn: fn(&mut [i32]), arr: &mut [i32], name: &str) { let start = Instant::now(); sort_fn(arr); let duration = start.elapsed(); println!("{}: {:?}", name, duration); } fn main() { let mut rng = rand::thread_rng(); let sizes = [1000, 5000, 10000]; for &size in &sizes { println!("\nArray size: {}", size); // Random array let mut arr: Vec<i32> = (0..size).collect(); arr.shuffle(&mut rng); benchmark(insertion_sort, &mut arr.clone(), "Insertion Sort (random)"); benchmark(selection_sort, &mut arr.clone(), "Selection Sort (random)"); // Sorted array arr.sort(); benchmark(insertion_sort, &mut arr.clone(), "Insertion Sort (sorted)"); benchmark(selection_sort, &mut arr.clone(), "Selection Sort (sorted)"); // Reverse sorted array arr.reverse(); benchmark(insertion_sort, &mut arr.clone(), "Insertion Sort (reverse)"); benchmark(selection_sort, &mut arr.clone(), "Selection Sort (reverse)"); } }
Results
Here are the results from the performance test:
Array Size & Configuration | Insertion Sort | Selection Sort |
---|---|---|
1000 (Random) | 1.6 | 6.9 |
1000 (Sorted) | 0.0 | 8.6 |
1000 (Reverse) | 4.4 | 7.7 |
5000 (Random) | 45.3 | 168.9 |
5000 (Sorted) | 0.1 | 139.9 |
5000 (Reverse) | 113.8 | 170.7 |
10000 (Random) | 153.3 | 581.7 |
10000 (Sorted) | 0.2 | 579.9 |
10000 (Reverse) | 288.2 | 515.0 |
Note: Times are rounded to one decimal place for readability. Actual performance may vary slightly due to system-specific factors.
Analysis of Results:
- Insertion Sort:
- Best on Sorted Arrays: Insertion Sort outperforms Selection Sort by a significant margin on sorted data due to its O(n) best-case complexity. For example, on a sorted array of size 10,000, Insertion Sort took only 0.168 ms, compared to 579.903 ms for Selection Sort.
- Random Data: On random arrays, Insertion Sort consistently outperforms Selection Sort at all sizes. For instance, on a random array of size 5,000, Insertion Sort took 45.295 ms, while Selection Sort took 168.878 ms.
- Reverse-Sorted Data: Insertion Sort performs worse on reverse-sorted data, but it still beats Selection Sort. On a reverse-sorted array of size 10,000, Insertion Sort took 288.175 ms, compared to 514.951 ms for Selection Sort.
- Selection Sort:
- Consistent Performance: Selection Sort has O(n²) time complexity regardless of the input order. It performs consistently across all data types (random, sorted, and reverse-sorted) but is significantly slower than Insertion Sort, especially on sorted arrays.
- Worst on Sorted Arrays: Selection Sort is particularly slow on sorted arrays, where its O(n²) behaviour leads to unnecessary comparisons. As seen in the result for a sorted array of size 10,000, it took 579.903 ms.
- Overall Performance:
- Insertion Sort is the clear winner in nearly all cases, especially for sorted or nearly sorted data. Due to its consistent O(n²) complexity, selection Sort is inefficient for large datasets, particularly when the data is already sorted or partially sorted.
Conclusions
- Insertion Sort should be your preferred choice when the array is small or mostly sorted due to its O(n) best-case performance and overall faster runtime.
- Although conceptually simple, Selection Sort performs poorly compared to Insertion Sort in most cases and should generally be avoided for larger datasets.
When to Use Insertion Sort
- Small Data Sets: Insertion Sort is highly efficient for small arrays, typically those with fewer than 20 elements. Its simplicity and low overhead make it a good choice in these scenarios.
- Nearly Sorted Data: When dealing with data that is already mostly in order, Insertion Sort’s best-case O(n) time complexity makes it exceptionally fast.
- Online Sorting: In situations where you need to sort a continuous stream of incoming data in real time, Insertion Sort can be effective for maintaining a sorted list as new elements arrive.
- As part of Hybrid Algorithms: Many efficient sorting implementations, like Timsort, use Insertion Sort for small subarrays within a more complex algorithm.
Limitations
- Large Data Sets: For larger arrays, Insertion Sort’s average and worst-case O(n²) time complexity make it inefficient compared to algorithms like Quick Sort, Merge Sort, or Heap Sort.
- Randomly Ordered Data: Insertion Sort’s performance degrades significantly on large, unsorted datasets.
Comparison with Other Algorithms
While our benchmarks showed Insertion Sort outperforming Selection Sort in most cases, it’s important to note that more advanced sorting algorithms generally outperform both algorithms for larger datasets.
For general-purpose sorting, especially with larger or randomly ordered datasets, algorithms like Quick Sort, Merge Sort, or Heap Sort are usually preferred due to their O(n log n) average-case time complexity. Many standard library sorting functions, like Python’s sorted()
or Rust’s sort()
, use optimized hybrid algorithms that include Insertion Sort for small subarrays, combining the benefits of different techniques.
Final Thoughts
Insertion Sort is a valuable algorithm to understand and have in your toolkit. Its simplicity makes it easy to implement and reason about, and it performs exceptionally well in its niche use cases. However, it’s not a one-size-fits-all solution.
In practice, the choice of sorting algorithm depends on various factors, including the size of the dataset, its initial order, memory constraints, and the specific requirements of your application. Modern programming language standard libraries often implement sophisticated hybrid sorting algorithms that leverage the strengths of multiple sorting techniques, including Insertion Sort for small subarrays.
Understanding Insertion Sort and its characteristics provides insight into a fundamental sorting technique and helps in appreciating the design decisions behind more complex sorting implementations. Insertion Sort is generally faster, particularly on sorted or nearly sorted data, where it performs significantly better. The benchmark results clearly show that Insertion Sort outperforms Selection Sort across most configurations, making it the preferred choice for most use cases, especially when dealing with small or nearly sorted data.
Congratulations on reading to the end of the tutorial!
Read the following articles to learn how to implement Insertion Sort:
In Python – How to do Insertion Sort in Python
In C++ – How to Do Insertion Sort in C++
In JavaScript – How to do Insertion Sort in JavaScript
In Java – How to do Insertion Sort in Java
Have fun and happy researching!
Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.