How to do Merge Sort in Python

Introduction

Merge sort is one of the most efficient and reliable sorting algorithms, known for its divide-and-conquer approach. It is particularly useful for sorting large datasets due to its guaranteed O(n log n) time complexity, making it a popular choice in many applications. This blog post walks you through the implementation of merge sort in Python, provides a detailed visual explanation, and explores variations like bottom-up, in-place, parallel, and natural merge sort.

What is Merge Sort?

Merge sort is a divide-and-conquer sorting algorithm that recursively divides an array into smaller sub-arrays until each contains a single element. These sub-arrays are then merged back together in sorted order.

How Merge Sort Works

Divide the array into two halves recursively until you have sub-arrays with one element.
Merge the sub-arrays by comparing elements and combining them in sorted order.
Repeat until all sub-arrays are merged and the entire array is sorted.

Time Complexity

Best Case: O(n log n) — The array is split and merged efficiently.
Worst Case: O(n log n) — Performance remains consistent regardless of input order.

Space Complexity

Space Complexity: O(n), due to the need for temporary arrays to store subarrays during the merge process.
Merge Sort also uses O(log n) space for the recursion stack, but the auxiliary space required for merging dominates, making the total space complexity O(n).

Python Implementation of Merge Sort

Here’s a straightforward implementation of merge sort in Python:

def merge_sort(arr):
    if len(arr) > 1:
        mid = len(arr) // 2  # Find the middle of the array
        left_half = arr[:mid]  # Split into two halves
        right_half = arr[mid:]

        # Recursively sort both halves
        merge_sort(left_half)
        merge_sort(right_half)

        i = j = k = 0

        # Merge the sorted halves
        while i < len(left_half) and j < len(right_half):
            if left_half[i] < right_half[j]:
                arr[k] = left_half[i]
                i += 1
            else:
                arr[k] = right_half[j]
                j += 1
            k += 1

        # Check for remaining elements
        while i < len(left_half):
            arr[k] = left_half[i]
            i += 1
            k += 1

        while j < len(right_half):
            arr[k] = right_half[j]
            j += 1
            k += 1

# Example usage:
data = [12, 11, 13, 5, 6, 7]
merge_sort(data)
print("Sorted array:", data)

Output:

Sorted array: [5, 6, 7, 11, 12, 13]

Diagram Explanation of Merge Sort:

To understand merge sort visually, let’s take the example of sorting the array [12, 11, 13, 5, 6, 7]:

Step 1: [12, 11, 13, 5, 6, 7]  (Original array)

Step 2: Split into two halves:
        Left: [12, 11, 13]
        Right: [5, 6, 7]

Step 3: Recursively split each half:
        Left split: [12] | [11, 13] => [12] | [11] | [13]
        Right split: [5] | [6, 7] => [5] | [6] | [7]

Step 4: Merge sub-arrays:
        Left merge: [11, 12, 13]
        Right merge: [5, 6, 7]

Step 5: Merge the left and right halves:
        [5, 6, 7, 11, 12, 13]

Variations of Merge Sort

There are several interesting variations of merge sort, each optimizing different aspects of the algorithm for specific use cases. Let’s explore some of the most popular variations:

1. Bottom-Up Merge Sort

Bottom-up merge sort avoids recursion and iteratively merges sub-arrays of increasing size. It is useful when recursion overhead is a concern.

Bottom-Up Merge Sort Code

def bottom_up_merge_sort(arr):
    width = 1
    n = len(arr)
    while width < n:
        for i in range(0, n, 2 * width):
            left = arr[i:i+width]
            right = arr[i+width:i+2*width]
            arr[i:i+2*width] = merge(left, right)
        width *= 2

def merge(left, right):
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result

# Example usage:
data = [12, 11, 13, 5, 6, 7]
bottom_up_merge_sort(data)
print("Sorted array (Bottom-Up):", data)

Output:

Sorted array (Bottom-Up): [5, 6, 7, 11, 12, 13]

2. Natural Merge Sort

Natural merge sort leverages the fact that data often contains already sorted sequences (runs). It identifies these runs and merges them, making it more efficient for datasets with naturally ordered subsequences.

Natural Merge Sort Code:

def natural_merge_sort(arr):
    runs = []
    new_run = [arr[0]]

    # Identify runs (naturally ordered subsequences)
    for i in range(1, len(arr)):
        if arr[i] >= arr[i - 1]:
            new_run.append(arr[i])
        else:
            runs.append(new_run)
            new_run = [arr[i]]
    runs.append(new_run)

    # Iteratively merge runs
    while len(runs) > 1:
        new_runs = []
        for i in range(0, len(runs), 2):
            if i + 1 < len(runs):
                new_runs.append(merge(runs[i], runs[i + 1]))
            else:
                new_runs.append(runs[i])
        runs = new_runs

    return runs[0]

def merge(left, right):
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])
    result.extend(right[j:])
    return result

# Example usage:
data = [12, 11, 13, 5, 6, 7, 14, 15, 1, 2, 3]
sorted_data = natural_merge_sort(data)
print("Sorted array (Natural):", sorted_data)

Output:

Sorted array (Natural): [1, 2, 3, 5, 6, 7, 11, 12, 13, 14, 15]

3. Parallel Merge Sort

With the rise of multicore processors, parallel merge sort can leverage multiple threads to improve performance. The array is split and sorted concurrently across multiple cores, and the sorted sub-arrays are then merged in parallel. This can drastically reduce the sorting time for vast datasets.

4. In-Place Merge Sort

Standard merge sort requires O(n) extra space for merging, but in-place merge sort reduces this space complexity to O(1) by sorting the array in place. The implementation is more complex and may not perform as efficiently as the standard merge sort, but it conserves memory.

In-Place Merge Sort Code:

def merge_in_place(arr, start, mid, end):
    start2 = mid + 1
    # If already sorted, return
    if arr[mid] <= arr[start2]:
        return
    while start <= mid and start2 <= end:
        if arr[start] <= arr[start2]:
            start += 1
        else:
            value = arr[start2]
            index = start2
            while index != start:
                arr[index] = arr[index - 1]
                index -= 1
            arr[start] = value

            start += 1
            mid += 1
            start2 += 1

def in_place_merge_sort(arr, l, r):
    if l < r:
        mid = l + (r - l) // 2
        in_place_merge_sort(arr, l, mid)
        in_place_merge_sort(arr, mid + 1, r)
        merge_in_place(arr, l, mid, r)

# Example usage:
data = [12, 11, 13, 5, 6, 7]
in_place_merge_sort(data, 0, len(data) - 1)
print("Sorted array (In-Place):", data)

Output:

Sorted array (In-Place): [5, 6, 7, 11, 12, 13]

Limitations of Merge Sort

While merge sort is highly efficient, it does have certain limitations:

Requires Additional Memory: Merge sort needs O(n) additional space for auxiliary arrays. This can be an issue for large datasets when in-place sorting is necessary.
Not Ideal for Small Arrays: For small datasets, simpler algorithms like insertion sort can outperform merge sort due to lower overhead.
Slower in Practice: Even though merge sort has a better time complexity than algorithms like bubble sort, in practice, quick sort often outperforms it due to lower constant factors.

How Merge Sort is Used in TimSort

Merge Sort is a crucial component of TimSort, handling the merging phase after Insertion Sort is applied to small runs. TimSort leverages Merge Sort to efficiently combine these sorted runs into a single, fully sorted array. Here’s how Merge Sort fits into TimSort:

Merging Sorted Runs: After Insertion Sort is used to sort small segments of the data (runs), TimSort merges these runs using the classical Merge Sort technique. Since the runs are already sorted, Merge Sort can efficiently combine them without re-sorting individual elements, maintaining TimSort’s stability and adaptability for partially sorted data.
Stable Sorting: One of the main benefits of using Merge Sort in TimSort is its stability—it maintains the relative order of equal elements. This is essential for real-world datasets where the order of similar items is often significant, such as sorting by multiple criteria.
Divide and Conquer: TimSort leverages the divide-and-conquer nature of Merge Sort. Once Insertion Sort has sorted the smaller runs, Merge Sort recursively combines them into larger runs. This merging process happens in linear time, ensuring that the overall time complexity remains O(n log n).

In essence, Merge Sort allows TimSort to efficiently combine multiple sorted sections into a final, fully sorted array while retaining stability and minimizing additional operations. This combination of Insertion Sort and Merge Sort makes TimSort particularly well-suited for handling large, real-world datasets.

Conclusion

Merge sort is a powerful, stable, and reliable sorting algorithm with consistent O(n log n) performance. In this post, we explored various implementations of merge sort, including bottom-up, parallel, in-place, and natural merge sort, each suited to different scenarios. Merge sort is a great choice for sorting large datasets, especially when stability and guaranteed performance are needed. However, be mindful of the memory overhead and consider using other variations if necessary.

Congratulations on reading to the end of this tutorial!

Ready to see sorting algorithms come to life? Visit the Sorting Algorithm Visualizer on GitHub to explore animations and code.

To implement Merge Sort in C++, go to the article How To Do Merge Sort in C++.

To implement Merge Sort in Rust, go to the article How To Do Merge Sort in Rust.

To implement Merge Sort in Java, go to the article How To Do Merge Sort in Java.

For further reading on sorting algorithms in Python, go to the articles:

Go to the online courses page on Python to learn more about Python for data science and machine learning.

Have fun and happy researching!

Suf

Senior Advisor, Data Science | [email protected] | + posts

Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.

Buy Me a Coffee