How to Do Heap Sort in Python

by | DSA, Programming, Python, Tips

Heap sort is an efficient, comparison-based sorting algorithm that relies on a binary heap data structure to sort elements in place. It’s known for its time complexity of O(n log n), making it a reliable choice for large datasets. This blog post will explain how heap sort works, provide a Python implementation, and compare it to other sorting algorithms like quick sort and merge sort.


What is Heap Sort?

Heap sort works by first building a max heap from the input data and then repeatedly extracting the maximum element (the root of the heap) to place it at the end of the array. This process ensures that the array is sorted in increasing order. The heap is continuously “reheapified” after each extraction, maintaining its structure.

Heap sort uses a max heap to sort the array in ascending order.

How Heap Sort Works:

  1. Build a max heap from the input data.
  2. Swap the root of the max heap (the largest value) with the last element of the heap.
  3. Reduce the heap size by one and heapify the root.
  4. Repeat the process until the heap is empty and the array is sorted.

Time Complexity:

  • Best Case: O(n log n)
  • Worst Case: O(n log n)
  • Average Case: O(n log n)

Space Complexity:

  • O(1) (heap sort is an in-place sorting algorithm, meaning it doesn’t require additional space).

Heap Sort Algorithm in Python

Here’s how you can implement heap sort in Python:

# Helper function to maintain the max heap property
def heapify(arr, n, i):
    largest = i  # Initialize the largest as root
    left = 2 * i + 1  # Left child index
    right = 2 * i + 2  # Right child index

    # If left child exists and is greater than root
    if left < n and arr[left] > arr[largest]:
        largest = left

    # If right child exists and is greater than the largest so far
    if right < n and arr[right] > arr[largest]:
        largest = right

    # If the largest element is not the root, swap and heapify again
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        heapify(arr, n, largest)

# Main function to perform heap sort
def heap_sort(arr):
    n = len(arr)

    # Step 1: Build a max heap
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)

    # Step 2: Extract elements one by one from the heap
    for i in range(n - 1, 0, -1):
        # Move the current root to the end
        arr[i], arr[0] = arr[0], arr[i]
        # Call heapify on the reduced heap
        heapify(arr, i, 0)

# Example usage:
data = [12, 11, 13, 5, 6, 7]
heap_sort(data)
print("Sorted array:", data)

Output:

Sorted array: [5, 6, 7, 11, 12, 13]

Step-by-Step Explanation of Heap Sort

Let’s break down the heap sort process step by step for the array [12, 11, 13, 5, 6, 7].

Step 1: Build a Max Heap

  • The first step in heap sort is to build a max heap from the input array. A max heap ensures that the largest element is always at the root of the heap.

Max heap after heapify:

[13, 11, 12, 5, 6, 7]

Explanation: In this heap, the root element (13) is the largest, and each parent node is larger than its children.

Step 2: Extract Elements from the Heap

  • Once the max heap is built, we repeatedly extract the maximum element (the root) by swapping it with the last element in the array. We then reheapify the heap to maintain the heap property.

For example:

  • Swap the root (13) with the last element (7).
  • Heapify the remaining elements.

Array after first extraction:

[12, 11, 7, 5, 6, 13]  (13 is in its correct position)

Step 3: Repeat Until Sorted

  • Repeat the extraction and heapification process until the array is fully sorted.

Final sorted array:

[5, 6, 7, 11, 12, 13]

Comparison to Other Sorting Algorithms

1. Heap Sort vs Quicksort

  • Time Complexity: Both quicksort and heap sort have an average time complexity of O(n log n). However, quicksort has a worst-case time complexity of O(n²) if the pivot selection is poor (though this can be mitigated with good pivot selection strategies like randomization or median-of-three).
  • Stability: Quicksort is not stable (unless modified), and heap sort is also not stable. Stability means that equal elements retain their relative order.
  • Space Complexity: Heap sort has a space complexity of O(1), making it an in-place sort. Quicksort has a space complexity of O(log n) due to its recursive calls.

2. Heap Sort vs Merge Sort

  • Time Complexity: Both merge sort and heap sort have a worst-case time complexity of O(n log n). However, merge sort requires additional space (O(n)) for its auxiliary array, while heap sort is an in-place algorithm with O(1) space complexity.
  • Stability: Merge sort is stable, meaning that equal elements retain their original relative order. Heap sort is not stable.
  • Application: Merge sort is typically preferred for sorting linked lists or datasets that require stability, while heap sort is ideal when memory usage is a concern.

3. Heap Sort vs Insertion Sort

  • Time Complexity: Insertion sort has a time complexity of O(n²) in the average and worst case, while heap sort maintains O(n log n). Insertion sort, however, performs better on small datasets or nearly sorted arrays with O(n) best-case complexity.
  • Space Complexity: Both heap sort and insertion sort are in-place algorithms with O(1) space complexity.

Advantages of Heap Sort

  1. In-Place Sorting: Heap sort requires only O(1) extra space, making it memory-efficient compared to algorithms like merge sort.
  2. Guaranteed O(n log n) Performance: Heap sort guarantees O(n log n) time complexity in both the average and worst cases, unlike quicksort, which can degrade to O(n²) in the worst case.
  3. Efficient for Large Datasets: Because heap sort uses a binary heap, it can efficiently handle large datasets with relatively low memory overhead.

Limitations of Heap Sort

  1. Not Stable: Heap sort is not a stable sorting algorithm. This means that if two elements are equal, their relative order in the original array might not be preserved in the sorted array.
  2. Slower in Practice Compared to Quicksort: Although heap sort has a guaranteed time complexity of O(n log n), quicksort typically outperforms it in practice due to lower constant factors in quicksort’s recursive calls and cache efficiency.
  3. Not Adaptive: Unlike insertion sort, heap sort does not perform better on nearly sorted data. Its time complexity remains O(n log n) regardless of the initial ordering of the elements.

Conclusion

Heap sort is a robust sorting algorithm with consistent time complexity across all cases, making it a good choice for large datasets. While it may not be as fast in practice as quick sort or space-efficient as merge sort for specific tasks, heap sort’s in-place sorting, and O(n log n) performance make it a valuable tool when memory is limited.

Congratulations on reading to the end of this tutorial!

For further reading on sorting algorithms in Python, go to the articles:

Have fun and happy researching!

Profile Picture
Research Scientist at Moogsoft | + posts

Suf is a research scientist at Moogsoft, specializing in Natural Language Processing and Complex Networks. Previously he was a Postdoctoral Research Fellow in Data Science working on adaptations of cutting-edge physics analysis techniques to data-intensive problems in industry. In another life, he was an experimental particle physicist working on the ATLAS Experiment of the Large Hadron Collider. His passion is to share his experience as an academic moving into industry while continuing to pursue research. Find out more about the creator of the Research Scientist Pod here and sign up to the mailing list here!