Introduction
Merge Sort is a classic and highly efficient sorting algorithm based on the divide and conquer paradigm. It splits an array into smaller subarrays, recursively sorts those subarrays, and then merges them back together in a sorted manner. Merge Sort guarantees a time complexity of O(n log n) even in the worst-case scenario, making it one of the most reliable and predictable sorting algorithms.
In this blog post, we’ll explore how Merge Sort works, break down its implementation in Java, and compare its performance to other common sorting algorithms like Quick Sort and Bubble Sort.
What Is Merge Sort?
Merge Sort was developed to efficiently sort large datasets. It splits the input array into two halves, sorts each half, and then merges the two sorted halves into a single sorted array. This process continues recursively until the array is fully sorted. It’s handy when stability (i.e., preserving the order of equal elements) is important or when working with large datasets.
Merge Sort is known for its consistent time complexity of O(n log n) across best, average, and worst cases, making it superior to other algorithms like Quick Sort regarding worst-case performance.
Here’s a fleshed-out blog post on Merge Sort in Java, following a similar structure to the TimSort example:
How Merge Sort Works
The Merge Sort algorithm involves three main steps:
- Divide: Recursively split the input array into two halves until each subarray has one element.
- Conquer: Recursively sort the two halves by dividing them further.
- Combine: Merge the sorted halves back together to form the final sorted array.
Key Concepts:
- Divide and Conquer: The array is recursively divided into smaller subarrays, which are then sorted and merged.
- Merging: During the merge step, two sorted subarrays are combined into one sorted array by comparing elements and inserting them in the correct order.
To see how Merge Sort works, visit our free Sorting Algorithm Visualizer.
Pseudocode for Merge Sort
MergeSort(arr[], l, r): if l < r: 1. Find the middle point m = l + (r - l) / 2 2. Recursively sort the first half: MergeSort(arr, l, m) 3. Recursively sort the second half: MergeSort(arr, m + 1, r) 4. Merge the two halves: merge(arr, l, m, r) merge(arr[], l, m, r): 1. Create temporary arrays L[] and R[] 2. Copy data to temporary arrays L[] and R[] 3. Merge the temporary arrays back into arr[l..r] 4. Copy any remaining elements of L[] and R[] to arr[]
Java Implementation of Merge Sort
Here’s the complete Java implementation of Merge Sort:
public class MergeSort { // Main function that sorts arr[l..r] using merge() public static void mergeSort(int[] arr, int l, int r) { if (l < r) { // Find the middle point int m = l + (r - l) / 2; // Sort first and second halves mergeSort(arr, l, m); mergeSort(arr, m + 1, r); // Merge the sorted halves merge(arr, l, m, r); } } // Merge two halves function public static void merge(int[] arr, int l, int m, int r) { // Find sizes of two subarrays to be merged int n1 = m - l + 1; int n2 = r - m; // Create temp arrays int[] L = new int[n1]; int[] R = new int[n2]; // Copy data to temp arrays for (int i = 0; i < n1; ++i) L[i] = arr[l + i]; for (int j = 0; j < n2; ++j) R[j] = arr[m + 1 + j]; // Merge the temp arrays int i = 0, j = 0, k = l; while (i < n1 && j < n2) { if (L[i] <= R[j]) { arr[k] = L[i]; i++; } else { arr[k] = R[j]; j++; } k++; } // Copy remaining elements of L[] while (i < n1) { arr[k] = L[i]; i++; k++; } // Copy remaining elements of R[] while (j < n2) { arr[k] = R[j]; j++; k++; } } // Helper function to print the array public static void printArray(int[] arr) { for (int num : arr) { System.out.print(num + " "); } System.out.println(); } // Main method public static void main(String[] args) { int[] arr = {38, 27, 43, 3, 9, 82, 10, 39, 44, 73}; System.out.println("Given Array:"); printArray(arr); mergeSort(arr, 0, arr.length - 1); System.out.println("\nSorted Array:"); printArray(arr); } }
Step-by-Step Example: How Merge Sort Works on a Large Array
To understand how Merge Sort operates, let’s walk through a step-by-step example using the array:
Array: [38, 27, 43, 3, 9, 82, 10, 39, 44, 73]
Step 1: Dividing the Array
The array is recursively divided in half until each subarray contains one element:
Initial Array: [38, 27, 43, 3, 9, 82, 10, 39, 44, 73] Split into two halves: Left: [38, 27, 43, 3, 9] Right: [82, 10, 39, 44, 73]
Keep dividing until each subarray contains one element:
Left: [38, 27, 43, 3, 9] -> [38, 27] -> [38], [27] -> [43] -> [3, 9] -> [3], [9] Right: [82, 10, 39, 44, 73] -> [82, 10] -> [82], [10] -> [39] -> [44, 73] -> [44], [73]
Step 2: Merging the Halves
Now, we merge the single-element arrays in sorted order:
- Merge
[38]
and[27]
→[27, 38]
- Merge
[3]
and[9]
→[3, 9]
- Merge
[82]
and[10]
→[10, 82]
- Merge
[44]
and[73]
→[44, 73]
At this point, the array looks like:
Left: [27, 38], [43], [3, 9] Right: [10, 82], [39], [44, 73]
Next, we merge the sorted subarrays:
- Merge
[27, 38]
and[43]
→[27, 38, 43]
- Merge
[3, 9]
with[27, 38, 43]
→[3, 9, 27, 38, 43]
- Merge
[10, 82]
and[39]
→[10, 39, 82]
- Merge
[10, 39, 82]
with[44, 73]
→[10, 39, 44, 73, 82]
Finally, merge the two halves:
- Merge
[3, 9, 27, 38, 43]
with[10, 39, 44, 73, 82]
→[3, 9, 10, 27, 38, 39, 43, 44, 73, 82]
Final Sorted Array:
[3, 9, 10, 27, 38, 39, 43, 44, 73, 82]
Performance Test: Merge Sort vs Quick Sort vs Bubble Sort
Now that we’ve reviewed how Merge Sort works, let’s compare its performance to other standard algorithms: Quick Sort and Bubble Sort. We’ll test the algorithms on random, sorted, reverse-sorted, and partially sorted arrays.
Here’s the benchmarking code in Java, including partially sorted arrays:
import java.util.Arrays; import java.util.Random; public class SortingBenchmark { public static void main(String[] args) { // Define the array sizes to test int[] sizes = {1000, 5000, 10000}; Random random = new Random(); // Loop through each array size for (int size : sizes) { // Generate random arrays of the given size int[] randomArray = random.ints(size, 0, 10000).toArray(); // Random array int[] sortedArray = Arrays.copyOf(randomArray, randomArray.length); // Sorted array Arrays.sort(sortedArray); // Sort the array to make it "Sorted" int[] reverseArray = Arrays.copyOf(sortedArray, sortedArray.length); // Reverse-sorted array reverseArray(reverseArray); // Reverse the sorted array int[] partiallySortedArray = generatePartiallySortedArray(size); // Partially sorted array System.out.println("\nArray size: " + size); // Test Merge Sort System.out.println("Merge Sort:"); benchmark("Random", randomArray, () -> mergeSort(randomArray)); benchmark("Sorted", sortedArray, () -> mergeSort(sortedArray)); benchmark("Reverse", reverseArray, () -> mergeSort(reverseArray)); benchmark("Partially Sorted", partiallySortedArray, () -> mergeSort(partiallySortedArray)); // Test Quick Sort System.out.println("Quick Sort:"); benchmark("Random", randomArray, () -> quickSort(randomArray)); benchmark("Sorted", sortedArray, () -> quickSort(sortedArray)); benchmark("Reverse", reverseArray, () -> quickSort(reverseArray)); benchmark("Partially Sorted", partiallySortedArray, () -> quickSort(partiallySortedArray)); // Test Bubble Sort (for comparison) System.out.println("Bubble Sort:"); benchmark("Random", randomArray, () -> bubbleSort(randomArray)); benchmark("Sorted", sortedArray, () -> bubbleSort(sortedArray)); benchmark("Reverse", reverseArray, () -> bubbleSort(reverseArray)); benchmark("Partially Sorted", partiallySortedArray, () -> bubbleSort(partiallySortedArray)); } } // Reverse an array (used to create reverse sorted array) public static void reverseArray(int[] arr) { int n = arr.length; for (int i = 0; i < n / 2; i++) { int temp = arr[i]; arr[i] = arr[n - i - 1]; arr[n - i - 1] = temp; } } // Generate a partially sorted array where half the array is sorted public static int[] generatePartiallySortedArray(int size) { Random random = new Random(); int[] arr = random.ints(size, 0, 10000).toArray(); Arrays.sort(arr, 0, size / 2); // Sort only the first half of the array return arr; } // Benchmarking method to measure and print the time taken by each sorting algorithm public static void benchmark(String arrayType, int[] arr, Runnable sortingAlgorithm) { int[] arrCopy = Arrays.copyOf(arr, arr.length); // Create a copy to avoid modifying the original array long startTime = System.nanoTime(); // Start the timer sortingAlgorithm.run(); // Run the sorting algorithm long endTime = System.nanoTime(); // End the timer System.out.println(arrayType + ": " + (endTime - startTime) / 1e6 + " ms"); // Print the time taken in milliseconds } // Merge Sort: Initialize recursive merge sort call public static void mergeSort(int[] arr) { mergeSort(arr, 0, arr.length - 1); // Call the recursive merge sort with start and end indices } // Recursive Merge Sort function public static void mergeSort(int[] arr, int l, int r) { if (l < r) { // Find the middle point int m = l + (r - l) / 2; // Recursively sort both halves mergeSort(arr, l, m); mergeSort(arr, m + 1, r); // Merge the sorted halves merge(arr, l, m, r); } } // Merge function to combine two sorted subarrays into one public static void merge(int[] arr, int l, int m, int r) { // Sizes of two subarrays int n1 = m - l + 1; int n2 = r - m; // Temporary arrays for the two subarrays int[] L = new int[n1]; int[] R = new int[n2]; // Copy data into temporary arrays for (int i = 0; i < n1; ++i) L[i] = arr[l + i]; for (int j = 0; j < n2; ++j) R[j] = arr[m + 1 + j]; // Initial indexes of the two subarrays int i = 0, j = 0, k = l; // Merge the two sorted subarrays back into arr[] while (i < n1 && j < n2) { if (L[i] <= R[j]) { arr[k] = L[i]; i++; } else { arr[k] = R[j]; j++; } k++; } // Copy any remaining elements of L[] while (i < n1) { arr[k] = L[i]; i++; k++; } // Copy any remaining elements of R[] while (j < n2) { arr[k] = R[j]; j++; k++; } } // Quick Sort: Initialize recursive quick sort call public static void quickSort(int[] arr) { quickSort(arr, 0, arr.length - 1); // Call the recursive quick sort with start and end indices } // Recursive Quick Sort function public static void quickSort(int[] arr, int low, int high) { if (low < high) { // Partition the array and get the pivot index int pi = partition(arr, low, high); // Recursively sort elements before and after the partition quickSort(arr, low, pi - 1); quickSort(arr, pi + 1, high); } } // Partition function for Quick Sort public static int partition(int[] arr, int low, int high) { int pivot = arr[high]; // Set the pivot element int i = low - 1; // Index of the smaller element for (int j = low; j < high; j++) { // If current element is smaller than or equal to the pivot, swap it if (arr[j] <= pivot) { i++; swap(arr, i, j); } } // Swap the pivot element with the element at the correct position swap(arr, i + 1, high); return i + 1; } // Helper function to swap two elements in the array public static void swap(int[] arr, int i, int j) { int temp = arr[i]; arr[i] = arr[j]; arr[j] = temp; } // Bubble Sort function public static void bubbleSort(int[] arr) { int n = arr.length; for (int i = 0; i < n - 1; i++) { for (int j = 0; j < n - i - 1; j++) { // Swap if the current element is greater than the next element if (arr[j] > arr[j + 1]) { swap(arr, j, j + 1); } } } } }
Results
Array Size & Configuration | Merge Sort (ms) | Quick Sort (ms) | Bubble Sort (ms) |
---|---|---|---|
1000 (Random) | 1.390 | 0.530 | 6.841 |
1000 (Sorted) | 0.142 | 2.922 | 2.838 |
1000 (Reverse) | 0.121 | 3.829 | 1.257 |
1000 (Partially Sorted) | 0.147 | 0.223 | 1.443 |
5000 (Random) | 0.906 | 0.947 | 36.880 |
5000 (Sorted) | 0.855 | 36.624 | 6.701 |
5000 (Reverse) | 0.911 | 15.085 | 20.397 |
5000 (Partially Sorted) | 0.884 | 0.539 | 23.167 |
10000 (Random) | 1.964 | 1.132 | 147.377 |
10000 (Sorted) | 1.072 | 47.479 | 20.824 |
10000 (Reverse) | 1.334 | 53.569 | 74.993 |
10000 (Partially Sorted) | 1.657 | 0.838 | 108.824 |
Note: Times are rounded to three decimal places for readability. Actual performance may vary slightly due to system-specific factors.
Key Takeaways
- Merge Sort and Quick Sort are consistently faster than Bubble Sort:
- Across all test cases and array sizes, Merge Sort and Quick Sort outperform Bubble Sort, especially for large arrays. Bubble Sort‘s O(n²) time complexity results in very slow performance as the array size increases.
- For example, with 10,000 random elements, Merge Sort took 1.964 ms, Quick Sort took 1.132 ms, and Bubble Sort took 147.377 ms.
- Quick Sort struggles with sorted and reverse-sorted arrays:
- Quick Sort‘s performance significantly degrades on sorted and reverse-sorted arrays due to its basic pivot selection strategy (the last element is the pivot). It suffers from worst-case O(n²) performance in these cases.
- For example, on a 10000 reverse-sorted array, Quick Sort took 53.569 ms, while Merge Sort remained efficient at 1.334 ms.
- Merge Sort’s stable performance:
- Merge Sort consistently performs well across all configurations (random, sorted, reverse-sorted, and partially sorted). Its O(n log n) time complexity allows it to handle both small and large arrays efficiently.
- Merge Sort’s best performance was 0.121 ms for 1000 reverse-sorted elements and 0.147 ms for 1000 partially sorted elements.
- Bubble Sort is inefficient for large arrays:
- Bubble Sort‘s performance drops dramatically as the array size increases. For example, on a 10000 random array, Bubble Sort took 147.377 ms, making it unsuitable for large datasets.
- Quick Sort excels with partially sorted arrays:
- Quick Sort performs exceptionally well with partially sorted arrays, showing faster times than Merge Sort and Bubble Sort in these cases.
- For example, with 10,000 partially sorted elements, Quick Sort took just 0.838 ms, faster than Merge Sort, which took 1.657 ms.
Summary
- Merge Sort is a reliable and consistent algorithm across all test cases, making it an excellent choice for scenarios where array order isn’t known beforehand.
- Quick Sort is generally fast but can struggle with sorted or reverse-sorted arrays due to poor pivot selection. It excels with partially sorted arrays.
- Bubble Sort should be avoided for large arrays due to its poor scalability, though it performs decently on small or mostly sorted arrays.
Pros and Cons of Merge Sort
Pros:
- Guaranteed O(n log n) Time Complexity: Merge Sort performs consistently, even in the worst-case scenario.
- Stable Sorting: It preserves the relative order of elements with equal values.
- Best for Large Datasets: Its predictable time complexity makes it ideal for sorting large datasets.
Cons:
- Space Complexity: Merge Sort requires extra space for the temporary arrays used during merging.
- Recursion Overhead: The recursive nature of Merge Sort adds overhead, particularly for small arrays.
When to Use Merge Sort
- Stability is Required: Use Merge Sort when stability is important, i.e., when the order of equal elements should be preserved.
- Sorting Large Datasets: Merge Sort is particularly well-suited for large datasets because of its predictable O(n log n) performance.
- Performance in the Worst Case: If you expect worst-case inputs like already sorted or reverse-sorted arrays, Merge Sort performs more consistently than Quick Sort.
Conclusion
Merge Sort’s predictable performance across various arrays makes it an excellent choice for large datasets or scenarios where worst-case performance is a concern. Its efficiency with sorted and reverse-sorted arrays and consistent O(n log n) time complexity outshine Quick Sort’s vulnerability to poor pivot selection in such cases. Merge Sort outperforms Bubble Sort in all configurations and is more reliable than Quick Sort when dealing with reverse-sorted or partially sorted arrays. While Quick Sort may have better average-case performance on random arrays, Merge Sort’s stability, guaranteed worst-case performance, and memory efficiency make it a highly reliable choice for handling diverse datasets.
Congratulations on reading to the end of this tutorial!
To implement Merge Sort in C++, go to the article How To Do Merge Sort in C++.
To implement Merge Sort in Rust, go to the article How To Do Merge Sort in Rust.
Have fun and happy researching!
Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.