Imagine you have been tasked with sorting a list of people by some criterion or priority; how would you go about it? Doing so manually without a proper approach can take a lot of time and might not even produce an accurate result. Now take that list of people to be a massive dataset of numbers and values. Instinctively, you will not do the dirty work by hand but will look for a data structure that captures the priority relationship between data points. Binary heaps, which are tree-like data structures, are well suited to this. Wondering what one looks like? Read on to learn more about this interesting data structure, including what a binary heap is and how it represents relationships between a node and its (at most) two children.
Table of contents
- What is a Binary Heap?
- How is Binary Heap Represented?
- Binary Heap Properties
- Types of Binary Heaps
- Binary Heap Operations
- Implementing Binary Heap
- Time Complexity of Binary Heap Operations
- Applications of Binary Heap
- Binary Heap vs Binary Search Tree
- Similarities Between Binary Search Trees and Binary Heaps
- Real-world Examples of Using Binary Heaps
- Advanced Topics in Binary Heap
- Final Word
- Frequently Asked Questions
What is a Binary Heap?
A binary heap is a data structure that stores data as a complete binary tree. The word “binary” here implies that each node of this tree, including the root, has at most two children. In addition, the keys follow a heap order: in a max-heap, the key (or value) of each node ‘x’ is greater than or equal to the keys of its children, and in a min-heap it is less than or equal to them.
Here’s an example to help you understand better. Imagine a hospital’s emergency room and a queue of patients. The priority queue can be represented as a binary heap in the following manner.
- Patient with the most urgent health requirement (highest priority) is placed at the front (root of the tree).
- Other patients (nodes) are arranged in a way that a patient’s (node’s) condition is more critical or at least equivalent to that of the succeeding patients (children of the node).
How is Binary Heap Represented?
Binary heaps are usually represented using arrays. The nodes of the tree are stored level by level, from left to right, so the position of an element in the array determines its parent and children without any explicit pointers.
As heap elements are stored in an array, each one has an index. The root or the starting value is stored at the 1st position. For the element at position i in the array,
- The left child is at position 2*i.
- The parent is at position ⌊i/2⌋.
- The right child is at position 2*i + 1.
Here, let’s say that you want to find the children of G. As per the array, G is at the 5th position, so i = 5. To know its left child’s position, use 2*i. That’ll give you J as the 10th element.
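The position formulas above can be tried out with a short Python sketch. The helper names and the sample values are assumptions made for illustration; slot 0 of the list is left unused so that list indices match the 1-based positions used in the text.

```python
# 1-based positions, as in the text: the root lives at position 1.
def left(i: int) -> int:
    return 2 * i

def right(i: int) -> int:
    return 2 * i + 1

def parent(i: int) -> int:
    return i // 2  # floor(i / 2)

# Hypothetical max-heap contents; index 0 is an unused placeholder.
heap = [None, 90, 80, 70, 60, 50, 40, 30, 20, 10, 5]

i = 5
print("element at position", i, "is", heap[i])
print("its left child is at position", left(i), "and holds", heap[left(i)])
print("its parent is at position", parent(i), "and holds", heap[parent(i)])
```

For position 5, the left child is at position 10 and the parent is at position 2, matching the formulas.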
Binary Heap Properties
A binary heap has two defining properties.
Shape Property (or Complete Binary Tree Property)
Binary heaps should be “complete” to satisfy the shape property. Being complete implies that every level of the tree is fully filled except possibly the last one, which is filled from left to right.
Heap Order Property (Binary Min Heap and Binary Max Heap)
The heap order property states that the key of every parent is ordered consistently with respect to the keys of its children: never smaller than them in a max-heap, and never larger than them in a min-heap. It ensures that the highest (or lowest) priority element is always at the root of the heap, making it efficient for priority queue operations.
Types of Binary Heaps
There are two types of binary heaps: a min-heap and a max-heap.
Min-Heap
In a minimum binary heap structure, for every node ‘x’, the key (value) stored in x is less than or equal to the keys (values) stored in its children. In other words, the minimum element is always at the root, and the value of each parent node is less than or equal to the values of its children.
As you can see in the image above, the key (value) of the root is less than the keys of its children.
Max-Heap
Contrary to a min-heap, for every node ‘x’, the key (value) stored in it should be greater than or equal to the keys (values) stored in its children. Simply put, the maximum element is always at the root, and the value of each parent node is greater than or equal to the values of its children.
As you can see here, the key (value) of the root is greater than the keys of its children.
Binary Min Heap vs Binary Max Heap
Criteria | Binary Min Heap | Binary Max Heap |
Definition | For any node ‘x,’ the key (value) in ‘x’ is less than or equal to the keys (values) in its children. | For any node ‘x,’ the key (value) in ‘x’ is greater than or equal to the keys (values) in its children. |
Root Key (or Value) | Minimum | Maximum |
Order when the root is extracted repeatedly | Ascending | Descending |
Insertion | The new element is added at the end of the heap and sifted up until the min-heap property holds. | The new element is added at the end of the heap and sifted up until the max-heap property holds. |
Deletion | The minimum value (root value) is removed and the last element in the heap replaces it. The heap is then adjusted to satisfy the min-heap property. | The maximum value (root value) is deleted. The last element of the heap replaces it, and the heap is finally adjusted to satisfy the max-heap property. |
Application | Used where the minimum is to be assessed. | Used where the maximum is to be assessed. |
Binary Heap Operations
Binary heaps are widely used for sorting, for finding the minimum or maximum, and for prioritization. As a result, there is a standard set of operations that you can perform on a binary heap. The most commonly used heap operations are mentioned below.
getMin() and getMax()
The getMin() operation returns the minimum value (root element) of a binary min heap. Contrarily, getMax() returns the maximum value (root) of a binary max heap.
Insertion via insert()
You can insert new elements using the insert() operation. The new element is placed in the next free position of the bottom level (the end of the array) to maintain the complete tree property. The order property is then restored by comparing the newly inserted value with its parent and swapping the two as long as they are out of order.
Deletion via delete()
This operation deletes the root (min in min-heap and max in max-heap). Then the last element of the heap is moved to the root position, and the entire heap is adjusted to satisfy the required order property. The adjustment is done by comparing the new root value with its children’s value.
extractMin() and extractMax()
The extractMin() operation removes and returns the minimum element (root) from a binary min heap, following the same steps as delete(). Likewise, the extractMax() operation removes and returns the maximum (root) from a binary max heap.
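As a rough illustration, these operations map onto Python's standard heapq module, which maintains a min-heap inside a plain list. The mapping of getMin(), insert() and extractMin() to heapq calls is this example's interpretation, and the max-heap half relies on the common trick of storing negated keys, because heapq only provides a min-heap.

```python
import heapq

# Min-heap: heapq keeps the smallest item at index 0 of a plain list.
min_heap = []
for value in [7, 3, 9, 1, 5]:
    heapq.heappush(min_heap, value)       # insert()

smallest = min_heap[0]                    # getMin(): peek at the root
removed = heapq.heappop(min_heap)         # extractMin(): remove and return the root

# Max-heap behaviour through negated keys (an assumption of this sketch).
max_heap = [-v for v in [7, 3, 9, 1, 5]]
heapq.heapify(max_heap)
largest = -max_heap[0]                    # getMax()
removed_max = -heapq.heappop(max_heap)    # extractMax()

print(smallest, removed, min_heap[0])       # 1 1 3
print(largest, removed_max, -max_heap[0])   # 9 9 7
```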
Implementing Binary Heap
Here are some of the common ways to implement Binary Heap:
Array Representation
The most efficient way to represent binary heaps is via arrays. For a binary heap represented as an array, the following properties hold:
- Parent-child Relationship
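- For any element at position i (with the root at position 1), its left child is at 2*i, its right child is at 2*i + 1, and its parent is at floor(i/2).
- If the root is stored at index 0 instead, as is usual with zero-based arrays, the children of the element at index j are at 2*j + 1 and 2*j + 2, and the parent of any element at index j (j > 0) is located at index floor((j-1)/2), where floor() rounds down to the nearest integer.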
- Heap Order
- Min-Heap: the value of each parent node is less than or equal to the values of its children nodes.
- Max-Heap: the value of each parent node is greater than or equal to the values of its children nodes.
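To make the two properties concrete, here is a small Python sketch that uses zero-based indices (root at index 0) and checks whether a list satisfies the min-heap or max-heap order. The function names are illustrative assumptions, not part of any library.

```python
def parent(j: int) -> int:
    """Index of the parent of index j (valid for j > 0, root at index 0)."""
    return (j - 1) // 2

def children(j: int, n: int) -> list:
    """Indices of the children of index j that exist in a heap of size n."""
    return [c for c in (2 * j + 1, 2 * j + 2) if c < n]

def is_min_heap(a: list) -> bool:
    # Every non-root element must be >= its parent.
    return all(a[parent(j)] <= a[j] for j in range(1, len(a)))

def is_max_heap(a: list) -> bool:
    # Every non-root element must be <= its parent.
    return all(a[parent(j)] >= a[j] for j in range(1, len(a)))

print(is_min_heap([1, 3, 2, 7, 5, 4]))           # True
print(is_max_heap([9, 8, 6, 7, 5, 4, 2, 3, 1]))  # True
print(is_min_heap([9, 8, 6, 7, 5, 4, 2, 3, 1]))  # False
print(children(3, 9))                            # [7, 8]
```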
Heapify Algorithm
If you have an array and wish to convert it into a valid binary heap, you have to use a heapify algorithm. It is an essential step in building a heap, and the same sifting routines restore the heap property after an operation like insertion or deletion. Heapify relies on two primary sifting procedures: sift-down and sift-up. (Heapify should not be confused with heap sort, which uses a heap to sort an array.)
- Sift-Down (Bubble-down): It is used when the heap property is violated at an index (usually the root of a subtree). To ensure that the subtree satisfies the heap order property, sift-down starts at the violating node and moves its value down the tree. In a min-heap, the value is compared with the smaller of its two children and swapped with it if that child is smaller. The procedure continues until the value is less than or equal to both of its children or it reaches a leaf.
Pseudocode for Sift-Down Heapify
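A minimal Python sketch of sift-down for a min-heap stored in a zero-based list could look like this. The function name sift_down and its parameters are illustrative assumptions rather than a fixed API.

```python
def sift_down(heap: list, i: int, size: int) -> None:
    """Move heap[i] down until neither child is smaller (min-heap, 0-based)."""
    while True:
        smallest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < size and heap[left] < heap[smallest]:
            smallest = left
        if right < size and heap[right] < heap[smallest]:
            smallest = right
        if smallest == i:
            return  # heap order holds at this node
        heap[i], heap[smallest] = heap[smallest], heap[i]
        i = smallest  # continue from the child position

data = [8, 1, 2, 3, 4, 5, 6]  # only the root violates the min-heap order
sift_down(data, 0, len(data))
print(data)  # [1, 3, 2, 8, 4, 5, 6]
```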
- Sift-Up (Bubble-up): It is used when a new element is added to the heap at the last position and ensures that the newly inserted element is moved up to its correct position to satisfy the heap property. In a max-heap, this algorithm swaps a node whose value is larger than its parent’s with that parent (hence moving it up). The process carries on until the value is no longer larger than the parent above it or the node becomes the root. In a min-heap, the comparison is reversed.
Pseudocode for Sift-Up Algorithm
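A minimal Python sketch of sift-up for a max-heap stored in a zero-based list could look like this. The function name sift_up is an illustrative assumption; for a min-heap the comparison would be flipped.

```python
def sift_up(heap: list, i: int) -> None:
    """Move heap[i] up while it is larger than its parent (max-heap, 0-based)."""
    while i > 0:
        parent = (i - 1) // 2
        if heap[i] <= heap[parent]:
            break  # heap order holds, stop
        heap[i], heap[parent] = heap[parent], heap[i]
        i = parent

data = [9, 7, 8, 3, 5]        # a valid max-heap
data.append(10)               # new element goes to the last position
sift_up(data, len(data) - 1)
print(data)  # [10, 7, 9, 3, 5, 8]
```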
Insertion and Deletion Operations
Insertion
- Add the new element at the next free position of the bottom level, that is, at the end of the array.
- Compare its value to the parent node. If it’s a max heap and the new element > parent, or if the heap is a min heap and the new element < parent, swap the new element with its parent.
- Repeat the above step till the heap property gets satisfied.
Illustration
Max Heap: [9, 8, 6, 7, 5, 4, 2, 3, 1]
Step 1: Add a new element, say 14, at the end. [9, 8, 6, 7, 5, 4, 2, 3, 1, 14]
Step 2: Compare 14 with its parent (5), and swap because 14 is larger. [9, 8, 6, 7, 14, 4, 2, 3, 1, 5]
Step 3: Keep repeating till the property is satisfied: 14 is swapped with 8 and then with 9. The output here will be [14, 9, 6, 7, 8, 4, 2, 3, 1, 5]
Deletion
- Remove the root and move the last element of the heap into its place.
- Compare this new root with child nodes.
- If the heap is a max heap, swap the root with the greater child if the child>root. If the heap is a min-heap, swap the root with its smaller child if the child is smaller than the root.
- Repeat the above step till the property is restored.
Illustration
Original Max Heap: [9, 8, 6, 7, 5, 4, 2, 3, 1]
Step 1: Replace the root (9) with the last element (1) [1, 8, 6, 7, 5, 4, 2, 3, 1]
Step 2: Remove the last element (1) [1, 8, 6, 7, 5, 4, 2, 3]
Step 3: Compare the new root (1) with its larger child (8) and swap if necessary [8, 1, 6, 7, 5, 4, 2, 3]
Repeat till the heap property is restored (1 is swapped with 7 and then with 3) and you get: [8, 7, 6, 3, 5, 4, 2, 1]
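The insertion and deletion steps for a max heap can be traced with a short Python sketch that starts from the max heap [9, 8, 6, 7, 5, 4, 2, 3, 1]. The function names insert and delete_root are assumptions for this example, the list is zero-based, and delete_root expects a non-empty heap.

```python
def insert(heap: list, value) -> None:
    """Append value, then sift it up (max-heap, 0-based list)."""
    heap.append(value)
    i = len(heap) - 1
    while i > 0 and heap[i] > heap[(i - 1) // 2]:
        p = (i - 1) // 2
        heap[i], heap[p] = heap[p], heap[i]
        i = p

def delete_root(heap: list):
    """Remove and return the maximum; the last element replaces the root."""
    root = heap[0]
    last = heap.pop()
    if heap:  # nothing to fix if the heap held a single element
        heap[0] = last
        i, n = 0, len(heap)
        while True:
            largest = i
            for c in (2 * i + 1, 2 * i + 2):
                if c < n and heap[c] > heap[largest]:
                    largest = c
            if largest == i:
                break
            heap[i], heap[largest] = heap[largest], heap[i]
            i = largest
    return root

a = [9, 8, 6, 7, 5, 4, 2, 3, 1]
insert(a, 14)
print(a)  # [14, 9, 6, 7, 8, 4, 2, 3, 1, 5]

b = [9, 8, 6, 7, 5, 4, 2, 3, 1]
print(delete_root(b), b)  # 9 [8, 7, 6, 3, 5, 4, 2, 1]
```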
Time Complexity of Binary Heap Operations
In this section, we’ll explore the time complexity of binary heap operations. Different operations take different amounts of time to perform.
Analysis of Insertion and Deletion Operations
- Insertion: This operation adds a new element to the binary heap and sifts up the element to maintain heap order property.
- Time Complexity of Insertion: The best case is when the inserted element already satisfies the heap order property at the last position. In this case, no swap is needed and the time complexity is O(1).
In general, the time complexity is O(log n), where n is the number of elements in the heap. This is because we keep comparing the element with its parent until the parent-child values satisfy the condition, and the height of the heap limits this to about log n checks and swaps.
The worst case occurs when the element has to be sifted up the entire height of the heap, all the way to the root. The time taken is then proportional to log n.
- Deletion: Deletion involves removing the root element and replacing it with the last element in a heap, followed by sift-down (bubble-down) to maintain the heap property.
- Time Complexity of Deletion: The best case is when the element moved to the root already satisfies the heap order property with respect to its children. In this case, the time complexity is O(1).
Generally, it is O(log n), where n is the number of elements in the heap. The number of swaps ranges from 0 in the best case up to about log n, the height of the heap.
The worst case occurs when the element is sifted down the entire height of the heap. The time complexity is then proportional to log n.
Analysis of Heapify Operations
Unlike insertion and deletion, building a heap with heapify sifts down every node except the leaves. Simply put, the operation does not start from the root but from the last non-leaf node, and works backwards to the root. Let’s calculate the time complexity of the sift-down heapify operation to have a better understanding. Levels are counted here from the bottom, so the leaves are at level 0.
- At level 0, no work is needed, as the leaves have no child nodes.
- At the level above (level 1), the cost per node is O(1), as each node can be sifted down by only one level.
- Similarly, as you go up, the cost per node increases by 1 per level until it reaches O(log n) at the root.
- At the same time, the number of nodes per level halves as you go up. So if level 0 has about n/2 nodes, level 1 has about n/4, and so on.
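- As a result, with about n/2^(h+1) nodes at level h and a cost of h per node, the total number of checks and shifts is approximately
$$T(n) = \sum_{h=0}^{\log_2 n} h \cdot \frac{n}{2^{h+1}} = 0 \cdot \frac{n}{2} + 1 \cdot \frac{n}{4} + 2 \cdot \frac{n}{8} + \dots$$
This can be simplified to:
$$T(n) = \frac{n}{2} \sum_{h=0}^{\log_2 n} \frac{h}{2^{h}} \le \frac{n}{2} \cdot 2 = n$$
- The series of h/2^h converges to 2, so the total work is bounded by a constant multiple of n. As we are only talking about an approximation, this can be stated as the asymptotic result O(n).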
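The linear bound can be checked empirically with a small Python sketch that builds a min-heap bottom-up and counts the swaps performed. The function name build_min_heap, the random test data and the input sizes are assumptions chosen for illustration.

```python
import random

def build_min_heap(a: list) -> int:
    """Heapify `a` in place, bottom-up; return the number of swaps performed."""
    n = len(a)
    swaps = 0
    # Leaves need no work, so start at the last non-leaf node.
    for start in range(n // 2 - 1, -1, -1):
        i = start
        while True:
            smallest = i
            for c in (2 * i + 1, 2 * i + 2):
                if c < n and a[c] < a[smallest]:
                    smallest = c
            if smallest == i:
                break
            a[i], a[smallest] = a[smallest], a[i]
            swaps += 1
            i = smallest
    return swaps

random.seed(0)
for n in (1_000, 10_000, 100_000):
    data = [random.random() for _ in range(n)]
    swaps = build_min_heap(data)
    print(n, swaps, swaps <= n)  # the swap count stays below n
```

The swap count grows roughly in proportion to n rather than n log n, in line with the derivation.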
Applications of Binary Heap
Binary heaps hold significant importance as they make the implementation of sorting and prioritizing algorithms more efficient. Here are some standard applications.
Priority Queues
One of the most widely used applications of binary heaps is in priority queues: queues where elements have associated priorities and need to be processed based on their priority. These queues are used for task scheduling, event simulations, and graph algorithms such as Dijkstra’s and Prim’s.
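As an illustration, a priority queue can be sketched in Python on top of the standard heapq module. The PriorityQueue class, the convention that a lower number means higher urgency, and the sample patients are all assumptions of this example. A running counter breaks ties so that items with equal priority leave in arrival order.

```python
import heapq
import itertools

class PriorityQueue:
    """Hypothetical priority queue; lower priority number = more urgent."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker: arrival order

    def push(self, item, priority: int) -> None:
        heapq.heappush(self._heap, (priority, next(self._counter), item))

    def pop(self):
        _priority, _order, item = heapq.heappop(self._heap)
        return item

    def __len__(self) -> int:
        return len(self._heap)

er = PriorityQueue()
er.push("sprained ankle", priority=3)
er.push("chest pain", priority=1)
er.push("broken arm", priority=2)
er.push("severe bleeding", priority=1)

served = []
while er:
    served.append(er.pop())
print(served)
# ['chest pain', 'severe bleeding', 'broken arm', 'sprained ankle']
```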
Heapsort Algorithm
Heap sort is an efficient sorting algorithm that utilizes binary heaps. To sort in ascending order in place, the algorithm builds a max-heap from the input array and then repeatedly swaps the maximum (the root) with the last element of the shrinking heap; a min-heap is used in the same way for descending order. The final output is a sorted array. Heap sort is often used in scenarios where in-place sorting with a guaranteed worst-case performance of O(n log n) is required.
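A compact Python sketch of in-place heap sort in ascending order, using a max-heap, might look as follows. The function names are illustrative assumptions, and the code favours clarity over speed.

```python
def heap_sort(a: list) -> None:
    """Sort `a` in place in ascending order using a max-heap."""

    def sift_down(i: int, size: int) -> None:
        while True:
            largest = i
            for c in (2 * i + 1, 2 * i + 2):
                if c < size and a[c] > a[largest]:
                    largest = c
            if largest == i:
                return
            a[i], a[largest] = a[largest], a[i]
            i = largest

    n = len(a)
    # Phase 1: build a max-heap bottom-up.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(i, n)
    # Phase 2: move the current maximum to the end and shrink the heap.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(0, end)

values = [5, 1, 9, 3, 7, 2, 8]
heap_sort(values)
print(values)  # [1, 2, 3, 5, 7, 8, 9]
```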
Binary Heap vs Binary Search Tree
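By now, you must have learned what a binary heap is. Let’s talk about a binary search tree: a binary tree (at most 2 children at each node) in which every key in a node’s left subtree is smaller than the node’s key and every key in its right subtree is larger. A binary heap and a binary search tree might sound similar, but there are some differences.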
Criteria | Binary Heap | Binary Search Tree |
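Structure | A complete binary tree that satisfies the heap order property. | A binary tree (at most 2 children at each node) that satisfies the search order property: smaller keys in the left subtree, larger keys in the right subtree. |
Order | Partially ordered: only parents and children are ordered relative to each other. | Completely ordered: an in-order traversal visits the keys in sorted order. |
Duplication | Allows duplicates | Typically does not allow duplicates. |
Insertion or Deletion | O(log n) | O(log n) if the tree is balanced, O(n) in the worst case for an unbalanced tree |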
Operations | Efficient for insertion, deletion, and retrieval of the minimum (or maximum) element | Efficient for searching, insertion, deletion, and traversal |
Common Applications | Priority queues, heap sort, event-driven simulations | Efficient searching, ordered data storage, dictionary, symbol tables |
Similarities Between Binary Search Trees and Binary Heaps
- Both binary heaps and binary search trees follow a hierarchical structure, with nodes and leaves.
- Each node in both can have at most two children.
- In both binary heaps and binary search trees, each node (except for the root) has a parent node.
- Only the way this relationship is stored differs: a binary search tree usually keeps explicit links to the left child and right child, while in a binary heap the parent-child relationship is defined by the position of nodes in the array representation.
- Both can be defined recursively.
- Each subtree in a heap or a tree is itself a heap or a tree, respectively.
- Binary heaps and binary trees can be traversed using similar traversal algorithms, such as inorder, preorder, and postorder traversals.
Real-world Examples of Using Binary Heaps
Binary heaps are used in many ways in the real world. Here are some examples.
- Job Scheduling: Binary heaps can schedule tasks/jobs based on their deadlines. Jobs with an earlier deadline are prioritized, ensuring they’re executed first. Each job is assigned a priority, and a max heap is used to keep track of the highest priority job. The heap is continuously updated as more jobs arrive and the existing ones get completed.
- Memory Management: A binary heap can be used to keep track of free memory blocks by size, so that a suitable block can be found quickly when storage is allocated and deallocated dynamically. Note that the memory region called “the heap,” which malloc() and free() manage, is a separate concept that merely shares its name with the heap data structure.
- Dijkstra’s Algorithm: Dijkstra’s is a well-known graph algorithm for finding the shortest paths between nodes. Heap data structures are widely used to implement the priority queue in Dijkstra’s algorithm. The algorithm utilizes a min heap to efficiently extract the node with the smallest distance from a source node.
- Operating System Process Scheduling: The techniques used by operating systems for scheduling processes often use binary heaps. Processes are prioritized, and the process with the highest priority is tracked using a min-heap. The scheduler chooses and starts the process at the top of the heap, and the heap is updated as necessary as tasks are completed, or new ones are added.
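- Median Computation: Binary heaps are used to find the running median of a stream of values. Two heaps are maintained: a max-heap for the elements smaller than or equal to the current median and a min-heap for the elements greater than it. Keeping the two heaps balanced in size means the median can always be read from their roots.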
- Network Routing: Link-state routing protocols, such as Open Shortest Path First (OSPF), compute routes with Dijkstra’s algorithm, which can use a binary heap. The heap maintains the candidate routes so that the one with the lowest cost or distance can be chosen next.
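The two-heap median technique from the list above can be sketched in Python with the standard heapq module. The class name RunningMedian and its methods are assumptions for this example; the lower half is stored with negated values because heapq only provides a min-heap, and median() expects at least one value to have been added.

```python
import heapq

class RunningMedian:
    """Hypothetical helper that tracks the median of the values added so far."""

    def __init__(self):
        self._low = []   # max-heap of the smaller half (values stored negated)
        self._high = []  # min-heap of the larger half

    def add(self, x) -> None:
        # Route x through the lower half so every low value <= every high value.
        heapq.heappush(self._low, -x)
        heapq.heappush(self._high, -heapq.heappop(self._low))
        # Rebalance: low holds as many elements as high, or one more.
        if len(self._high) > len(self._low):
            heapq.heappush(self._low, -heapq.heappop(self._high))

    def median(self):
        if len(self._low) > len(self._high):
            return -self._low[0]
        return (-self._low[0] + self._high[0]) / 2

rm = RunningMedian()
for value in [5, 2, 9, 1]:
    rm.add(value)
    print(value, "->", rm.median())
# medians after each step: 5, 3.5, 5, 3.5
```

Each add() costs O(log n) and each median() lookup costs O(1), since only the two roots are inspected.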
Advanced Topics in Binary Heap
D-ary Heap
D-ary heaps are an advanced variation of binary heaps where each internal node can have up to ‘D’ children instead of at most two. They offer reduced tree height and often better cache performance compared to binary heaps, at the cost of comparing up to D children at each sift-down step.
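The array layout of a binary heap carries over to D-ary heaps with slightly different index arithmetic. The Python sketch below uses zero-based indices; the function names are assumptions for illustration, and setting d = 2 gives back the binary heap formulas.

```python
def d_ary_parent(i: int, d: int) -> int:
    """Parent index of node i (i > 0) in a d-ary heap rooted at index 0."""
    return (i - 1) // d

def d_ary_children(i: int, d: int, n: int) -> range:
    """Indices of the existing children of node i in a heap of n elements."""
    first = d * i + 1
    return range(first, min(first + d, n))

def d_ary_height(n: int, d: int) -> int:
    """Number of edges from the root down to the last element."""
    height, i = 0, n - 1
    while i > 0:
        i = d_ary_parent(i, d)
        height += 1
    return height

print(list(d_ary_children(0, 4, 10)))  # [1, 2, 3, 4]
print(d_ary_parent(7, 4))              # 1
print(d_ary_height(1_000_000, 2), d_ary_height(1_000_000, 4))  # 19 10
```

For one million elements, the height drops from 19 with two children per node to 10 with four, which is the reduced tree height mentioned above.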
Fibonacci Heap
Fibonacci heaps are a kind of heap data structure that supports additional operations, such as decreasing a key or merging two heaps, more efficiently. Insertion, decrease-key, and merge run in O(1) amortized time, while extracting the minimum takes O(log n) amortized time, making them suitable, at least in theory, for algorithms like Dijkstra’s algorithm that decrease keys frequently.
Final Word
We’re sure that you must have learned quite a bit about binary heaps and their importance in sorting algorithms, as well as in the real world. These robust data structures are highly efficient in managing, sorting, and prioritizing data. Whether you’re using a priority queue or arranging elements in ascending or descending order, binary heaps provide a scalable solution. To explore more intricacies of these data structures and learn more about other similar ones, Analytics Vidhya is an excellent choice. Analytics Vidhya is a comprehensive online platform that offers verified educational content, tutorials, articles, and courses on various topics related to data science, algorithms, and programming. Not only this, with courses like the AI and ML Blackbelt Program, AV teaches you how these modern-day technologies are aiding standard data operations. So without further ado, head over to the website.
Frequently Asked Questions
A. A binary heap is a hierarchical, tree data structure, whereas a stack is a linear, last-in-first-out data structure. The call stack, for example, holds the temporary variables of running functions.
A. Building a binary heap from an array of n elements takes O(n) time complexity.
A. They are stored as arrays, where parent-child relationships are calculated using index calculations.
A. Heap sorting can be done at banks. Those who are only there to withdraw or deposit via machines can go in first, as they will take less time. People who have a more time-consuming requirement can be sent in later.