Insertion sort is a simple yet effective algorithm for sorting small datasets. It works similarly to how you might sort playing cards in your hands—picking one card at a time and inserting it into its correct position relative to the other cards. In this blog post, we will go over the implementation of insertion sort in Python, with a clear code example and a diagram to illustrate the sorting process.
What is Insertion Sort?
Insertion sort is a comparison-based algorithm that builds a sorted list one element at a time. At each step, it takes one element from the unsorted portion of the list and inserts it into the correct position within the sorted portion.
How Insertion Sort Works:
- Divide the list into two parts: a sorted and an unsorted section.
- Take the first element from the unsorted section and compare it to the elements in the sorted section.
- Insert the element into the correct position by shifting elements in the sorted section to make space if necessary.
- Repeat the process until all elements are sorted.
Time Complexity:
- Best Case: O(n), when the list is already sorted.
- Average Case: O(n²), when the elements are in random order.
- Worst Case: O(n²), when the list is sorted in reverse order.
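To make these bounds concrete, here is a minimal sketch that counts how many comparisons insertion sort performs on a sorted and a reverse-sorted input. The helper name `count_comparisons` is an illustrative assumption, not part of any library, and it sorts a copy so the input is left untouched.

```python
def count_comparisons(arr):
    """Sort a copy of arr with insertion sort and return the number of comparisons."""
    a = list(arr)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1      # one comparison of a[j] against key
            if a[j] <= key:
                break             # key is already in the right place
            a[j + 1] = a[j]       # shift the larger element to the right
            j -= 1
        a[j + 1] = key
    return comparisons


n = 100
print(count_comparisons(range(n)))         # already sorted: n - 1 = 99
print(count_comparisons(range(n, 0, -1)))  # reverse sorted: n(n - 1)/2 = 4950
```

On sorted input each element needs a single comparison, which gives the linear best case. On reverse-sorted input every element is compared with everything before it, which gives the quadratic worst case.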
Pseudocode for Insertion Sort
    function InsertionSort(arr):
        n = length(arr)
        # Outer loop to iterate over unsorted portion of array
        for i from 1 to n-1:
            key = arr[i]
            j = i - 1
            # Move elements of arr[0..i-1], that are greater than key, to one position ahead
            while j >= 0 and arr[j] > key:
                arr[j + 1] = arr[j]
                j = j - 1
            # Insert the key in its correct position
            arr[j + 1] = key
Explanation of the Pseudocode:
- Outer Loop (i): Starts from the second element (i = 1) and iterates through the array. This loop determines the current element to be inserted into the sorted portion of the array.
- Key Element (key): The element at position i (the first element of the unsorted part of the array) is temporarily stored in key, and the insertion process begins by comparing it with the elements in the sorted part of the array.
- Inner Loop (j): Moves leftward through the sorted portion of the array, comparing each element with the key. Elements that are greater than the key are shifted one position to the right to create space for inserting the key.
- Insertion: Once the correct position for the key is found, it is placed there (arr[j + 1]).
- Repeat: The process repeats for every element in the unsorted portion of the array until the entire array is sorted.
How It Works:
- Insertion Sort works similarly to how people arrange playing cards in their hands. You pick up one card at a time and insert it in its correct position relative to the cards already in your hand.
- The algorithm builds a sorted subarray at the beginning of the list and gradually inserts each new element into this sorted subarray.
Insertion Sort Algorithm in Python
Here’s a step-by-step implementation of the insertion sort algorithm in Python:
    def insertion_sort(arr):
        # Traverse indices 1 through len(arr) - 1
        for i in range(1, len(arr)):
            key = arr[i]
            # Move elements of arr[0..i-1] that are greater than key
            # one position ahead of their current position
            j = i - 1
            while j >= 0 and arr[j] > key:
                arr[j + 1] = arr[j]
                j -= 1
            # Insert the key into its correct position
            arr[j + 1] = key

    # Example usage:
    data = [12, 11, 13, 5, 6]
    insertion_sort(data)
    print("Sorted array:", data)
Output:
Sorted array: [5, 6, 11, 12, 13]
Diagram Explanation of Insertion Sort:
To better understand how the algorithm works, let’s visualize the sorting process step by step for the input array [12, 11, 13, 5, 6].
    Step 1: [12, 11, 13, 5, 6] (Initial array)
            [12] | [11, 13, 5, 6] (12 is in sorted part)
    Step 2: [12] becomes [11, 12] by inserting 11 into the correct position.
            [11, 12] | [13, 5, 6]
    Step 3: 13 is already larger than 12, so no changes.
            [11, 12, 13] | [5, 6]
    Step 4: 5 is less than 13, 12, and 11, so shift all three and insert 5.
            [5, 11, 12, 13] | [6]
    Step 5: 6 is less than 13, 12, and 11, but greater than 5. So shift 11, 12, and 13, and insert 6.
            [5, 6, 11, 12, 13] (Final sorted array)
The array is sorted in ascending order by inserting each unsorted element into its proper place in the sorted portion.
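The walkthrough above can also be reproduced in code. The following is a minimal sketch, assuming a hypothetical generator named `insertion_sort_steps` that works on a copy of the input and yields the sorted and unsorted parts after each insertion.

```python
def insertion_sort_steps(arr):
    """Yield (sorted_part, unsorted_part) after each insertion; works on a copy."""
    a = list(arr)
    # Before any insertion, the first element alone forms the sorted part.
    yield a[:1], a[1:]
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
        yield a[:i + 1], a[i + 1:]


for step, (done, todo) in enumerate(insertion_sort_steps([12, 11, 13, 5, 6]), start=1):
    print(f"Step {step}: {done} | {todo}")
# Step 1: [12] | [11, 13, 5, 6]
# Step 2: [11, 12] | [13, 5, 6]
# Step 3: [11, 12, 13] | [5, 6]
# Step 4: [5, 11, 12, 13] | [6]
# Step 5: [5, 6, 11, 12, 13] | []
```

Printing the boundary between the two parts makes it easy to see the sorted portion grow by one element per pass.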
Key Points to Remember:
- In-place sorting: Insertion sort needs only O(1) extra memory because it rearranges elements directly within the input list.
- Stable algorithm: It preserves the relative order of equal elements.
- Efficient for small datasets: It is less efficient for large datasets compared to more advanced algorithms like Quick Sort or Merge Sort but works well for small or nearly sorted data.
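Stability is easiest to see when sorting records by one field. The sketch below is an illustration under assumptions: `insertion_sort_by` and its `key` parameter are hypothetical names introduced here, not part of the implementation shown earlier in this post.

```python
def insertion_sort_by(items, key):
    """In-place insertion sort that orders items by key(item)."""
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        # The strict '>' means elements with equal keys are never moved past each other.
        while j >= 0 and key(items[j]) > key(current):
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current


cards = [(5, "hearts"), (3, "spades"), (5, "clubs"), (3, "diamonds")]
insertion_sort_by(cards, key=lambda card: card[0])
print(cards)
# [(3, 'spades'), (3, 'diamonds'), (5, 'hearts'), (5, 'clubs')]
```

The two 3s and the two 5s keep the order they had in the input. Changing the comparison to `>=` would still sort the list, but it would no longer be stable.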
Limitations of Insertion Sort
While insertion sort is simple and intuitive, it has some notable limitations:
- Inefficient for large datasets: With a time complexity of O(n²) in the worst case, insertion sort becomes slow when dealing with large lists.
- Outperformed on large inputs: Algorithms like quicksort or mergesort run in O(n log n) time on average, so they scale far better as the list grows.
- High number of comparisons: In the worst case, insertion sort performs n(n - 1)/2 comparisons and the same number of shifts, which makes it less efficient than algorithms designed to reduce comparisons.
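One common way to reduce the comparison count is to locate each insertion point with a binary search. The following is a minimal sketch of that variant (often called binary insertion sort), which this post does not otherwise cover; the function name is an assumption, while `bisect_right` comes from Python's standard `bisect` module.

```python
from bisect import bisect_right


def binary_insertion_sort(arr):
    """In-place insertion sort that finds each insertion point by binary search."""
    for i in range(1, len(arr)):
        key = arr[i]
        # Search only the sorted prefix arr[0:i]; bisect_right places key
        # after any equal elements, which keeps the sort stable.
        pos = bisect_right(arr, key, 0, i)
        # Shift arr[pos:i] one slot to the right, then place key.
        arr[pos + 1:i + 1] = arr[pos:i]
        arr[pos] = key


data = [12, 11, 13, 5, 6]
binary_insertion_sort(data)
print(data)  # [5, 6, 11, 12, 13]
```

This brings the number of comparisons down to O(n log n), but elements still have to be shifted one by one, so the overall worst-case running time remains O(n²).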
How Insertion Sort is Used in TimSort
Insertion Sort plays a key role in TimSort by efficiently sorting small subsections of the data, called runs. TimSort is a hybrid sorting algorithm that merges the benefits of Insertion Sort and Merge Sort to handle real-world data more effectively. Here’s how Insertion Sort is integrated into TimSort:
- Sorting Small Runs: TimSort splits the data into runs and uses Insertion Sort to extend any run shorter than a minimum length, which is usually chosen between 32 and 64 elements. Insertion Sort’s worst-case time complexity is O(n²), but on so few elements its low overhead makes it faster in practice than asymptotically better algorithms.
- Taking Advantage of Localized Order: In real-world datasets, runs are often partially sorted or nearly sorted. Insertion Sort excels in these scenarios because it minimizes the number of operations needed to sort such data. By leveraging the existing order within each run, TimSort benefits from the best-case O(n) performance of Insertion Sort when applied to almost-sorted runs.
- Preparation for Merging: After each run is sorted using Insertion Sort, TimSort combines the runs using the merge step of Merge Sort. Because every run is already sorted, the merge phase only has to interleave runs rather than reorder the elements within them.
In summary, Insertion Sort helps TimSort by efficiently handling small segments of data, making it a crucial part of TimSort’s ability to handle real-world, partially ordered datasets.
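The run-then-merge idea can be sketched in a few lines. This is a deliberately simplified illustration and not the real TimSort: it assumes fixed-size runs and plain pairwise merging, and it omits natural-run detection and the other optimizations a production implementation relies on. All function names and the `run_size` parameter are hypothetical.

```python
def insertion_sort_range(arr, lo, hi):
    """Sort the slice arr[lo:hi] in place with insertion sort."""
    for i in range(lo + 1, hi):
        key = arr[i]
        j = i - 1
        while j >= lo and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key


def merge(left, right):
    """Merge two sorted lists into one sorted list, preferring left on ties."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if right[j] < left[i]:
            merged.append(right[j])
            j += 1
        else:
            merged.append(left[i])
            i += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def simplified_hybrid_sort(arr, run_size=32):
    """Return a sorted copy: insertion-sort fixed-size runs, then merge them pairwise."""
    a = list(arr)
    # Phase 1: sort each fixed-size run with insertion sort.
    for lo in range(0, len(a), run_size):
        insertion_sort_range(a, lo, min(lo + run_size, len(a)))
    runs = [a[lo:lo + run_size] for lo in range(0, len(a), run_size)]
    # Phase 2: merge neighbouring runs until a single run remains.
    while len(runs) > 1:
        paired = []
        for k in range(0, len(runs), 2):
            if k + 1 < len(runs):
                paired.append(merge(runs[k], runs[k + 1]))
            else:
                paired.append(runs[k])
        runs = paired
    return runs[0] if runs else []


print(simplified_hybrid_sort([12, 11, 13, 5, 6, 7, 1], run_size=3))
# [1, 5, 6, 7, 11, 12, 13]
```

Even in this reduced form, the division of labour is visible: insertion sort handles the short runs where its low overhead pays off, and merging handles the combination of runs where it would be too slow.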
Conclusion
Insertion sort is an intuitive and easy-to-implement sorting algorithm that can be very useful when dealing with small lists or nearly sorted data. In this post, we covered the insertion sort algorithm in Python with a step-by-step breakdown and a diagram to explain the process.
For large datasets, though, you may want to explore more efficient algorithms like merge sort or quick sort, which have better average-case performance.
Congratulations on reading to the end of this tutorial!
Read the following articles to learn how to implement Insertion Sort:
In C++ – How to Do Insertion Sort in C++
In Rust – How to do Insertion Sort in Rust
In Java – How to do Insertion Sort in Java
For further reading on sorting algorithms in Python, go to the articles:
- How to do Quick Sort in Python (With Code Example and Diagram)
- How to Do Bubble Sort in Python
- How to Do Selection Sort in Python
- How to do Merge Sort in Python
- How to do Counting Sort in Python
- How to Do Radix Sort in Python
- How to Do Bucket Sort in Python
- How to Do Heap Sort in Python
- How to Do Pigeonhole Sort in Python
- How To Do Comb Sort in Python
- How to Do Shell Sort in Python
- How To Do Cocktail Sort in Python
- How To Do TimSort in Python
Go to the online courses page on Python to learn more about Python for data science and machine learning.
Have fun and happy researching!
Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.