Balanced merge is a type of sorting algorithm where data is stored on multiple tapes (or other forms of storage). The tapes are divided into two groups: odd-numbered tapes and even-numbered tapes. The data on each tape is initially sorted, and the algorithm’s goal is to merge the data from all of the tapes into a single sorted output.
Approach: Here’s an approach for the above algorithm:
The algorithm works by repeatedly making passes over the data, merging pairs of tapes at a time. On each pass, the tapes are divided into pairs, and each pair’s data is merged into a temporary output tape. After all, pairs have been processed, the data on the temporary output tapes are copied back to the original tapes, and the process is repeated with a new set of tape pairs until all of the data is on a single tape.
Illustration: Here is an illustration with the approach used:
Suppose we have the following list of data that we want to sort: data[] = [5, 2, 4, 6, 1, 3]
Here are the steps involved in the sorting process:
- Step 1: Initialize the tapes and divide the data among them. We create three empty tapes and divide the data among them like this: tapes = [[], [], []]
for i, x in enumerate(data):
tapes[i % 3].append(x)
# tapes is now [[5, 4, 1], [2, 6, 3], []]
- Step 2: Sort the data on each tape using an internal sorting technique. We can use the built-in sort function to sort the data on each:
for tape in tapes:
tape.sort()
# tapes is now [[1, 4, 5], [2, 3, 6], []]
- Step 3: Repeatedly make passes over the tapes, merging pairs of tapes at a time. On each pass, we divide the tapes into pairs and merge the data from each pair into a temporary output tape using the merge function:
while len(tapes) > 1:
new_tapes = []
for i in range(0, len(tapes), 2):
tape1 = tapes[i]
tape2 = tapes[i+1] if i+1 < len(tapes) else None
new_tapes.append(merge(tape1, tape2))
tapes = new_tapes
# After the first pass, tapes is [[1, 2, 3, 4, 5, 6]]
- Step 4: Copy the data from the temporary output tapes back to the original tapes. Since we only have one tape at this point, we can skip this step.
- Step 5: Repeat the process until all of the data is on a single tape. Since we only have one tape at this point, the sorting process is complete.
The final result is a sorted version of the original data:
sorted_data = tapes[0]
# sorted_data is [1, 2, 3, 4, 5, 6]
The steps involved in implementing the balanced merge sorting algorithm are:
- Initialize the tapes and divide the data among them.
- Sort the data on each tape using an internal sorting technique.
- Repeatedly make passes over the tapes, merging pairs of tapes at a time.
- Copy the data from the temporary output tapes back to the original tapes.
- Repeat the process until all of the data is on a single tape.
Here is the code that demonstrates this approach:
C++
// C++ code for the above approach #include <bits/stdc++.h> using namespace std; vector< int > merge(vector< int > tape1, vector< int > tape2) { if (tape2.size() == 0) { return tape1; } // Merge the data from tape1 and tape2 // into a temporary output tape vector< int > output_tape; int i = 0, j = 0; // Merge all the tapes left while (i < tape1.size() && j < tape2.size()) { if (tape1[i] < tape2[j]) { output_tape.push_back(tape1[i]); i++; } else { output_tape.push_back(tape2[j]); j++; } } output_tape.insert(output_tape.end(), tape1.begin() + i, tape1.end()); output_tape.insert(output_tape.end(), tape2.begin() + j, tape2.end()); return output_tape; } // Function for Balanced merge sort vector< int > balanced_merge_sort(vector< int > data, int num_tapes) { // Initialize the tapes vector<vector< int >> tapes(num_tapes); // Divide the data among the tapes for ( int i = 0; i < data.size(); i++) { tapes[i % num_tapes].push_back(data[i]); } // Sort each tape for ( int i = 0; i < num_tapes; i++) { sort(tapes[i].begin(), tapes[i].end()); } // Repeatedly make passes over the tapes, // merging pairs of tapes at a time while (tapes.size() > 1) { vector<vector< int >> new_tapes; for ( int i = 0; i < tapes.size(); i += 2) { vector< int > tape1 = tapes[i]; vector< int > tape2 = (i + 1 < tapes.size()) ? tapes[i + 1] : vector< int >(); new_tapes.push_back(merge(tape1, tape2)); } tapes = new_tapes; } // Return the final merged tape return tapes[0]; } // Driver code int main() { // Input vector< int > data = {5, 2, 4, 6, 1, 3}; // Function call vector< int > sorted_data = balanced_merge_sort(data, 3); for ( int i = 0; i < sorted_data.size(); i++) { cout << sorted_data[i] << " " ; } cout << endl; return 0; } // This code is contributed by lokeshpotta20. |
Java
// Java code for the above approach import java.io.*; import java.util.*; class GFG { public static List<Integer> merge(List<Integer> tape1, List<Integer> tape2) { if (tape2.size() == 0 ) { return tape1; } // Merge the data from tape1 and tape2 // into a temporary output tape List<Integer> outputTape = new ArrayList<>(); int i = 0 , j = 0 ; // Merge all the tapes left while (i < tape1.size() && j < tape2.size()) { if (tape1.get(i) < tape2.get(j)) { outputTape.add(tape1.get(i)); i++; } else { outputTape.add(tape2.get(j)); j++; } } outputTape.addAll(tape1.subList(i, tape1.size())); outputTape.addAll(tape2.subList(j, tape2.size())); return outputTape; } // Function for Balanced merge sort public static List<Integer> balancedMergeSort(List<Integer> data, int numTapes) { // Initialize the tapes List<List<Integer> > tapes = new ArrayList<>(); for ( int i = 0 ; i < numTapes; i++) tapes.add( new ArrayList<>()); // Divide the data among the tapes for ( int i = 0 ; i < data.size(); i++) { tapes.get(i % numTapes).add(data.get(i)); } // Sort each tape for ( int i = 0 ; i < numTapes; i++) { tapes.get(i).sort(Integer::compareTo); } // Repeatedly make passes over the tapes, // merging pairs of tapes at a time while (tapes.size() > 1 ) { List<List<Integer> > newTapes = new ArrayList<>(); for ( int i = 0 ; i < tapes.size(); i += 2 ) { List<Integer> tape1 = tapes.get(i); List<Integer> tape2 = (i + 1 < tapes.size()) ? tapes.get(i + 1 ) : new ArrayList<>(); newTapes.add(merge(tape1, tape2)); } tapes = newTapes; } // Return the final merged tape return tapes.get( 0 ); } public static void main(String[] args) { // Input List<Integer> data = List.of( 5 , 2 , 4 , 6 , 1 , 3 ); // Function call List<Integer> sortedData = balancedMergeSort(data, 3 ); for ( int i = 0 ; i < sortedData.size(); i++) { System.out.print(sortedData.get(i) + " " ); } System.out.println(); } } // This code is contributed by lokesh. |
Python3
# Python code for the above approach # Function for Balanced merge sort def balanced_merge_sort(data, num_tapes): # Initialize the tapes tapes = [[] for _ in range (num_tapes)] # Divide the data among the tapes for i, x in enumerate (data): tapes[i % num_tapes].append(x) # Sort each tape for tape in tapes: tape.sort() # Repeatedly make passes over the tapes, # merging pairs of tapes at a time while len (tapes) > 1 : new_tapes = [] for i in range ( 0 , len (tapes), 2 ): tape1 = tapes[i] tape2 = tapes[i + 1 ] if i + 1 < len (tapes) else None new_tapes.append(merge(tape1, tape2)) tapes = new_tapes # Return the final merged tape return tapes[ 0 ] def merge(tape1, tape2): if tape2 is None : return tape1 # Merge the data from tape1 and tape2 # into a temporary output tape output_tape = [] i = j = 0 # Merge all the tapes left while i < len (tape1) and j < len (tape2): if tape1[i] < tape2[j]: output_tape.append(tape1[i]) i + = 1 else : output_tape.append(tape2[j]) j + = 1 output_tape.extend(tape1[i:]) output_tape.extend(tape2[j:]) return output_tape # Driver code # Input data = [ 5 , 2 , 4 , 6 , 1 , 3 ] # Function call sorted_data = balanced_merge_sort(data, 3 ) print (sorted_data) |
C#
// C# code for the above approach using System; using System.Collections.Generic; public class GFG { public static List< int > merge(List< int > tape1, List< int > tape2) { if (tape2.Count == 0) { return tape1; } // Merge the data from tape1 and tape2 // into a temporary output tape List< int > outputTape = new List< int >(); int i = 0, j = 0; // Merge all the tapes left while (i < tape1.Count && j < tape2.Count) { if (tape1[i] < tape2[j]) { outputTape.Add(tape1[i]); i++; } else { outputTape.Add(tape2[j]); j++; } } outputTape.AddRange( tape1.GetRange(i, tape1.Count - i)); outputTape.AddRange( tape2.GetRange(j, tape2.Count - j)); return outputTape; } // Function for Balanced merge sort public static List< int > balancedMergeSort(List< int > data, int numTapes) { // Initialize the tapes List<List< int > > tapes = new List<List< int > >(); for ( int i = 0; i < numTapes; i++) tapes.Add( new List< int >()); // Divide the data among the tapes for ( int i = 0; i < data.Count; i++) { tapes[i % numTapes].Add(data[i]); } // Sort each tape for ( int i = 0; i < numTapes; i++) { tapes[i].Sort(); } // Repeatedly make passes over the tapes, // merging pairs of tapes at a time while (tapes.Count > 1) { List<List< int > > newTapes = new List<List< int > >(); for ( int i = 0; i < tapes.Count; i += 2) { List< int > tape1 = tapes[i]; List< int > tape2 = (i + 1 < tapes.Count) ? tapes[i + 1] : new List< int >(); newTapes.Add(merge(tape1, tape2)); } tapes = newTapes; } // Return the final merged tape return tapes[0]; } static public void Main() { // Input List< int > data = new List< int >{ 5, 2, 4, 6, 1, 3 }; // Function call List< int > sortedData = balancedMergeSort(data, 3); for ( int i = 0; i < sortedData.Count; i++) { Console.Write(sortedData[i] + " " ); } Console.WriteLine(); } } // This code is contributed by lokeshmvs21. |
Javascript
// JavaScript code for the above approach // Function to merge two sorted tapes function merge(tape1, tape2) { // If one of the tapes is empty, // return the other tape if (tape2.length == 0) { return tape1; } // Merge the data from tape1 and tape2 // into a temporary output tape let output_tape = []; let i = 0, j = 0; // Merge all the tapes left while (i < tape1.length && j < tape2.length) { if (tape1[i] < tape2[j]) { output_tape.push(tape1[i]); i++; } else { output_tape.push(tape2[j]); j++; } } output_tape.push(...tape1.slice(i)); output_tape.push(...tape2.slice(j)); return output_tape; } // Function for Balanced merge sort function balanced_merge_sort(data, num_tapes) { // Initialize the tapes let tapes = new Array(num_tapes); for (let i=0; i<num_tapes; i++) { tapes[i] = []; } // Divide the data among the tapes for (let i = 0; i < data.length; i++) { tapes[i % num_tapes].push(data[i]); } // Sort each tape for (let i = 0; i < num_tapes; i++) { tapes[i].sort( function (a, b) { return a - b; }); } // Repeatedly make passes over the tapes, // merging pairs of tapes at a time while (tapes.length > 1) { let new_tapes = []; for (let i = 0; i < tapes.length; i += 2) { let tape1 = tapes[i]; let tape2 = (i + 1 < tapes.length) ? tapes[i + 1] : []; new_tapes.push(merge(tape1, tape2)); } tapes = new_tapes; } // Return the final merged tape return tapes[0]; } // Driver code let data = [5, 2, 4, 6, 1, 3]; // Input let sorted_data = balanced_merge_sort(data, 3); console.log(sorted_data); // Output |
[1, 2, 3, 4, 5, 6]
Time Complexity: O(n log n)
Auxiliary Space: O(n)
Advantages of Balanced Merge:
- Can take advantage of parallelism to speed up the sorting process
- The time complexity of O(n log n), which is generally considered to be efficient for large amounts of data
- Can handle large amounts of data
Disadvantages of Balanced Merge:
- Requires additional space to store the data on the tapes and the temporary output tapes during the merge process
- May not be the most efficient choice for smaller datasets
Benefits of using this over other sorting techniques:
- Good choice for sorting large amounts of data when multiple tapes (or other forms of sequential storage) are available
- Can be used to take advantage of parallelism to speed up the sorting process
- Maybe a good choice when you need to sort data in an external storage device (such as a tape drive or hard drive)
Ready to dive in? Explore our Free Demo Content and join our DSA course, trusted by over 100,000 neveropen!