TimSort is a hybrid sorting algorithm derived from Merge Sort and Insertion Sort. It was designed to perform well on real-world data, often containing ordered sequences. TimSort is the default sorting algorithm used in Python and JavaScript (in modern engines like V8), and it is optimized for datasets that are already partially sorted or have patterns like runs (consecutive elements already in order).
In this blog post, we’ll explore how TimSort works, its implementation in JavaScript, and the key concepts that make it efficient. We’ll also discuss its time and space complexities and when you should use TimSort.
What is TimSort?
TimSort was developed by Tim Peters in 2002 for Python and is now widely used in JavaScript’s Array.prototype.sort()
method in most modern engines. The algorithm identifies small segments of the array, called runs, that are already sorted. It then uses Insertion Sort to sort small arrays and Merge Sort to combine the runs. This makes it highly efficient when the data has a lot of existing order or partial sorting.
TimSort is designed to minimize the number of comparisons and swaps, making it faster than typical comparison-based algorithms like Quick Sort or Merge Sort in many real-world applications.
procedure TimSort(arr) n = length of arr minRun = calculateMinRun(n) # Step 1: Sort small sub-arrays using Insertion Sort for i = 0 to n - 1 in steps of minRun do InsertionSort(arr, i, min(i + minRun - 1, n - 1)) # Step 2: Merge sorted runs using Merge Sort size = minRun while size < n do for left = 0 to n - 1 in steps of 2 * size do mid = min(left + size - 1, n - 1) right = min(left + 2 * size - 1, n - 1) if mid < right: Merge(arr, left, mid, right) size = 2 * size
Step-by-Step Explanation:
- Step 1: The array is divided into smaller runs, and each run is sorted using Insertion Sort. The minimum size of each run is determined dynamically.
- Step 2: The sorted runs are merged using Merge Sort, gradually increasing the size of the sorted portions until the entire array is sorted.
Time Complexity of TimSort
- Best Case: O(n)—TimSort performs best with linear time complexity when the input array is already sorted.
- Average Case: O(n log n) – TimSort performs at O(n log n) for most inputs, similar to other efficient sorting algorithms.
- Worst Case: O(n log n) – Even in the worst-case scenario, TimSort guarantees O(n log n) time complexity due to its merging process.
Space Complexity of TimSort
- Space Complexity: O(n) – TimSort requires additional space for merging, as it uses a temporary array during the merge step.
JavaScript Implementation of TimSort
Here’s the JavaScript code to implement a simplified version of TimSort:
// Function to perform Insertion Sort on a subarray function insertionSort(arr, left, right) { for (let i = left + 1; i <= right; i++) { let key = arr[i]; let j = i - 1; while (j >= left && arr[j] > key) { arr[j + 1] = arr[j]; j--; } arr[j + 1] = key; } } // Function to merge two sorted subarrays function merge(arr, left, mid, right) { let len1 = mid - left + 1; let len2 = right - mid; let leftArr = new Array(len1); let rightArr = new Array(len2); for (let i = 0; i < len1; i++) leftArr[i] = arr[left + i]; for (let i = 0; i < len2; i++) rightArr[i] = arr[mid + 1 + i]; let i = 0, j = 0, k = left; while (i < len1 && j < len2) { if (leftArr[i] <= rightArr[j]) { arr[k] = leftArr[i]; i++; } else { arr[k] = rightArr[j]; j++; } k++; } while (i < len1) arr[k++] = leftArr[i++]; while (j < len2) arr[k++] = rightArr[j++]; } // Utility function to calculate the minimum run size function calculateMinRun(n) { let r = 0; while (n >= 64) { r |= n & 1; n >>= 1; } return n + r; } // Main TimSort function function timSort(arr) { const n = arr.length; const minRun = calculateMinRun(n); // Step 1: Sort small runs using Insertion Sort for (let i = 0; i < n; i += minRun) { insertionSort(arr, i, Math.min(i + minRun - 1, n - 1)); } // Step 2: Merge sorted runs using Merge Sort for (let size = minRun; size < n; size *= 2) { for (let left = 0; left < n; left += 2 * size) { const mid = Math.min(left + size - 1, n - 1); const right = Math.min(left + 2 * size - 1, n - 1); if (mid < right) { merge(arr, left, mid, right); } } } } // Example usage with a larger array const arr = [20, 45, 17, 23, 60, 50, 99, 15, 30, 10, 85, 33, 75, 40, 5, 55]; console.log("Original array:", arr); timSort(arr); console.log("Sorted array:", arr);
Output:
Original array: [ 20, 45, 17, 23, 60, 50, 99, 15, 30, 10, 85, 33, 75, 40, 5, 55 ] Sorted array: [ 5, 10, 15, 17, 20, 23, 30, 33, 40, 45, 50, 55, 60, 75, 85, 99 ]
Expanded Step-by-Step Walkthrough of TimSort
Let’s walk through the TimSort algorithm using the array:[20, 45, 17, 23, 60, 50, 99, 15, 30, 10, 85, 33, 75, 40, 5, 55]
.
We’ll detail the steps, covering how TimSort sorts smaller runs using Insertion Sort and then merges these sorted runs using Merge Sort.
Step 1: Calculate the Minimum Run Size
The first step in TimSort is to determine the minimum run size. This is the size of the smallest subarrays that will be sorted using Insertion Sort. In our example, the array has 16 elements, so the minimum run size will be calculated dynamically using the following function:
function calculateMinRun(n) { let r = 0; while (n >= 64) { r |= n & 1; n >>= 1; } return n + r; }
For our array of size 16
, this function will return 16
, meaning the entire array will be divided into runs of approximately 16
elements. Since our array is already small, it won’t be broken down further.
Step 2: Sort Small Runs Using Insertion Sort
Next, TimSort applies Insertion Sort to these small runs. In our case, since the array size is smaller than 64, we can treat the entire array as a single run and apply Insertion Sort to it.
Pass 1 (Initial Array):
[20, 45, 17, 23, 60, 50, 99, 15, 30, 10, 85, 33, 75, 40, 5, 55]
Insertion Sort starts by sorting elements from the second element onwards:
- Compare
20
and45
– no swap is needed; the array remains the same. - Compare
45
and17
– swap17
with45
and move17
into its correct position relative to20
. Array becomes:[17, 20, 45, 23, 60, 50, 99, 15, 30, 10, 85, 33, 75, 40, 5, 55]
- Compare
45
and23
– swap23
with45
and move23
into its correct position. Array becomes:[17, 20, 23, 45, 60, 50, 99, 15, 30, 10, 85, 33, 75, 40, 5, 55]
- Continue this process for each element in the array. After each comparison, the current element is shifted left into its correct position relative to the sorted part of the array.
Pass 2 (Array after Insertion Sort):
After all the comparisons, Insertion Sort fully sorts the array as: [5, 10, 15, 17, 20, 23, 30, 33, 40, 45, 50, 55, 60, 75, 85, 99]
Now, we have a single sorted run, and since the array size is relatively small, there’s no need for additional passes.
Step 3: Merge Runs Using Merge Sort
TimSort would typically continue to merge small sorted runs into larger ones using Merge Sort. However, no merging is necessary since our array is already sorted after the first pass with Insertion Sort. TimSort would repeatedly merge adjacent runs for larger arrays with more runs until the entire array is fully sorted.
If we had multiple runs, Merge Sort would:
- Compare and merge adjacent runs into larger sorted subarrays.
- Continue merging until the entire array is sorted.
Let’s break down what would happen with multiple runs:
Example of Merging Two Sorted Runs:
Assume we have two sorted runs:[5, 17, 23, 50]
and [10, 20, 30, 45]
.
Merge Sort will compare elements from both runs and merge them into a single sorted array:
- Compare
5
and10
→ take5
. - Compare
17
and10
→ take10
. - Compare
17
and20
→ take17
. - Continue this process until all elements from both runs are merged into a single sorted run.
The merged result:[5, 10, 17, 20, 23, 30, 45, 50]
Step 4: Final Sorted Array
After the Insertion Sort step, our entire array is already sorted as: [5, 10, 15, 17, 20, 23, 30, 33, 40, 45, 50, 55, 60, 75, 85, 99]
No further merging is required, and the process is complete.
Pros and Cons of Using TimSort
Pros:
- Efficient for Real-World Data: TimSort is particularly fast for datasets that contain ordered or nearly ordered sequences, making it suitable for real-world applications.
- Stable Sort: TimSort preserves the relative order of elements with equal values, making it a stable sorting algorithm.
- Adaptive to Input: TimSort is adaptive, meaning it performs better on data that is already partially sorted.
Cons:
- Slightly More Complex: TimSort is more complex to implement than basic algorithms like Quick Sort or Insertion Sort.
- Space Overhead: TimSort uses additional space for the merging process, which may not be ideal in memory-constrained environments.
When to Use TimSort
TimSort is an ideal sorting algorithm when:
- You have partially sorted or nearly sorted data: Since TimSort is designed to handle ordered sequences efficiently, it’s perfect for real-world data that often has some ordering.
- You need a stable sorting algorithm: TimSort ensures that the relative order of elements is preserved, making it suitable for datasets where stability matters.
- You need an adaptive sorting algorithm: TimSort adapts to the existing order in the data, making it faster in practice than other O(n log n) algorithms.
Conclusion
TimSort is a highly efficient and adaptive sorting algorithm designed for real-world data. By combining the strengths of Insertion Sort and Merge Sort, TimSort performs exceptionally well on datasets that contain ordered sequences. Its stability, adaptability, and efficient handling of partially sorted data make it valuable in any developer’s toolbox.
Congratulations on reading to the end of the tutorial!
Read the following articles to learn how to implement TimSort:
In Python – How to do TimSort in Python
In C++ – How to Do TimSort in C++
in Rust – How to do TimSort in Rust
Have fun and happy researching!
Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.