Sometimes, while working with a tuple matrix, we encounter a lot of similar data: rows contain the same elements, only the ordering of the tuples differs. It is often desirable to remove such duplicate rows. This kind of problem can have applications in domains such as web development and day-to-day programming. Let's discuss certain ways in which this task can be performed.
Input : test_list = [[(4, 5), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]
Output : {((4, 5), (3, 2))}

Input : test_list = [[(4, 6), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]
Output : {((3, 2), (4, 6)), ((4, 5), (3, 2))}
Method #1: Using set() + tuple() + sorted() + list comprehension

The combination of the above functions can be used to solve this problem. In this, we first sort each row so that equivalent rows become identical tuples, and then add the rows to a set, which automatically removes the duplicates.
Python3
# Python3 code to demonstrate working of
# Remove Similar Rows from Tuple Matrix
# Using set() + tuple() + sorted() + list comprehension

# initializing list
test_list = [[(4, 5), (3, 2)], [(2, 2), (4, 6)], [(3, 2), (4, 5)]]

# printing original list
print("The original list is : " + str(test_list))

# Remove Similar Rows from Tuple Matrix
# sorting each row normalizes tuple order, so equivalent rows
# become identical tuples and the set removes the duplicates
res = set(tuple(sorted(sub)) for sub in test_list)

# printing result
print("Tuple matrix after removal : " + str(res))

The original list is : [[(4, 5), (3, 2)], [(2, 2), (4, 6)], [(3, 2), (4, 5)]]
Tuple matrix after removal : {((2, 2), (4, 6)), ((3, 2), (4, 5))}
Time Complexity: O(n), where n is the length of the list test_list
Auxiliary Space: O(n), as a new set res of up to n rows is created, where n is the number of rows in test_list
Method #2: Using set() + frozenset() + generator expression

This is yet another way to solve this problem. In this, we perform the task of order-insensitive comparison using frozenset(), which is hashable and compares equal regardless of element order.
Python3
# Python3 code to demonstrate working of
# Remove Similar Rows from Tuple Matrix
# Using set() + frozenset() + generator expression

# initializing list
test_list = [[(4, 5), (3, 2)], [(2, 2), (4, 6)], [(3, 2), (4, 5)]]

# printing original list
print("The original list is : " + str(test_list))

# Remove Similar Rows from Tuple Matrix
# frozenset is hashable and order-insensitive, so duplicate rows collapse
res = set(frozenset(sub) for sub in test_list)

# printing result
print("Tuple matrix after removal : " + str(res))

The original list is : [[(4, 5), (3, 2)], [(2, 2), (4, 6)], [(3, 2), (4, 5)]]
Tuple matrix after removal : {frozenset({(4, 5), (3, 2)}), frozenset({(4, 6), (2, 2)})}
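Note that the result above contains frozensets rather than tuples. If plain tuples are needed downstream, a small post-processing step (a sketch, using sorted() to impose a deterministic element order on each row) can convert them back:

```python
# deduplicate via frozenset, then convert rows back to sorted tuples
test_list = [[(4, 5), (3, 2)], [(2, 2), (4, 6)], [(3, 2), (4, 5)]]

rows = set(frozenset(sub) for sub in test_list)

# sorted() gives each row a deterministic element order again
res = {tuple(sorted(fs)) for fs in rows}
print(res)
```

This keeps the order-insensitive deduplication of frozenset() while restoring the tuple-of-tuples shape the other methods produce.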
Method #3 : Using dict + tuple() + list
- Convert each sublist in the matrix to a tuple so that we can use it as a dictionary key later.
- Create an empty dictionary to store the unique tuples.
- Loop through each tuple in the matrix and check if it exists in the dictionary.
- If it doesn’t exist, add it to the dictionary.
- Convert the dictionary values back to a list and return it.
Python3
def remove_similar_rows(matrix):
    # convert each sublist in the matrix to a tuple
    tuples = [tuple(row) for row in matrix]

    # create an empty dictionary
    seen = {}

    # iterate over the tuples
    for tup in tuples:
        # use the sorted tuple as a key to avoid issues with tuple order
        key = tuple(sorted(tup))

        # add the tuple to the dictionary if it hasn't been seen before
        if key not in seen:
            seen[key] = tup

    # return the values (i.e. unique tuples) of the dictionary as a list
    return list(seen.values())

# test the function with the same input matrices as before
test_list = [[(4, 5), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]
print(remove_similar_rows(test_list))

test_list = [[(4, 6), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]
print(remove_similar_rows(test_list))
[((4, 5), (3, 2))]
[((4, 6), (3, 2)), ((3, 2), (4, 5))]
Time complexity: O(n)
Auxiliary Space: O(n)
Method #4: Using the numpy library to remove duplicates:
Algorithm:
1. Initialize the input list test_list.
2. Convert the list to a numpy array using np.array().
3. Use numpy's unique() function to remove duplicates, specifying axis=0.
4. Convert the numpy array back to a list using the tolist() function.
5. Print the unique list.
Python3
import numpy as np

# initializing list
test_list = [[(4, 6), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]

# printing original list
print("The original list is : " + str(test_list))

# convert the list to a numpy array
arr = np.array(test_list)

# use numpy's unique() function to remove duplicate rows
unique_arr = np.unique(arr, axis=0)

# convert the numpy array back to a list of lists
unique_list = unique_arr.tolist()

# print the unique list
print(unique_list)

# This code is contributed by Jyothi pinjala
Output:
The original list is : [[(4, 6), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]] [[[3, 2], [4, 5]], [[4, 6], [3, 2]]]
Time complexity:
The time complexity of this algorithm is O(n log n), where n is the number of elements in the input list. The reason for this is that the np.unique() function uses sorting algorithms to find the unique elements, which takes O(n log n) time in the worst-case scenario.
Space complexity:
The space complexity of this algorithm is O(n), where n is the number of elements in the input list. This is because we are creating a numpy array and a list of the same size as the input list to store the unique elements.
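Because np.unique() sorts its output, the original row order is lost in the result above. If the first-seen order of rows matters, a possible workaround (a sketch, using the return_index argument of np.unique()) recovers it:

```python
import numpy as np

test_list = [[(4, 6), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]
arr = np.array(test_list)

# return_index gives the first position of each unique row;
# sorting those positions restores first-seen order
_, idx = np.unique(arr, axis=0, return_index=True)
ordered = arr[np.sort(idx)].tolist()
print(ordered)
```

Note that, like the plain np.unique() version, this only collapses rows whose tuples appear in exactly the same order.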
Method #5: Using a loop to iterate over each row, with a separate seen list to check whether the row has already been added to the unique list.
Step-by-step approach:
- Initialize an empty list unique_list to store the unique tuples.
- Initialize an empty list seen to keep track of the tuples that have been seen already.
- Loop through each tuple in the original list test_list.
- If the tuple is not in the seen list, add it to both unique_list and seen.
- Print the unique_list.
Below is the implementation of the above approach:
Python3
# initializing list
test_list = [[(4, 6), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]

# initialize an empty list to store unique tuples
unique_list = []

# initialize an empty list to keep track of seen tuples
seen = []

# loop through each row in the original list
for tpl in test_list:
    # if the row has not been seen yet, add it to both lists
    if tpl not in seen:
        unique_list.append(tpl)
        seen.append(tpl)

# print the unique list
print(unique_list)
[[(4, 6), (3, 2)], [(3, 2), (4, 5)]]
Time complexity: O(n^2), where n is the length of the original list.
Auxiliary space: O(n), as we need to store the seen list.
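One caveat: this method compares rows as-is, so two rows containing the same tuples in a different order are both kept. A possible variation (a sketch) normalizes each row with sorted() before the membership test, matching the behavior of the earlier methods:

```python
# rows 1 and 2 contain the same tuples in different order
test_list = [[(4, 5), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]

unique_list = []
seen = []

for tpl in test_list:
    # normalize tuple order so reordered rows compare equal
    key = sorted(tpl)
    if key not in seen:
        unique_list.append(tpl)
        seen.append(key)

print(unique_list)
```

Here only the first row survives, because the other two rows normalize to the same sorted key.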
Method#6: Using Recursive method.
Algorithm:
1. If the list is empty, return the unique list (base case).
2. If the first row has not been seen yet, add it to the unique list and the seen list.
3. Recursively call the function with the rest of the list.
4. Return the unique list.
Python3
def remove_duplicates_recursive(lst, unique_lst=None, seen=None):
    # avoid mutable default arguments, which would persist across calls
    if unique_lst is None:
        unique_lst = []
    if seen is None:
        seen = []

    # base case: the list is empty
    if not lst:
        return unique_lst

    # if the first row has not been seen yet, record it
    if lst[0] not in seen:
        unique_lst.append(lst[0])
        seen.append(lst[0])

    # recursively call the function with the rest of the list
    return remove_duplicates_recursive(lst[1:], unique_lst, seen)

test_list = [[(4, 6), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]
unique_list = remove_duplicates_recursive(test_list)
print(unique_list)
[[(4, 6), (3, 2)], [(3, 2), (4, 5)]]
Time complexity:
– The function makes one recursive call per row, but each call performs a membership test on the seen list (O(n)) and slices the list with lst[1:] (O(n)).
– Therefore, the overall time complexity of the algorithm is O(n^2), where n is the number of rows.
Space complexity:
– We create two lists to store unique tuples and seen tuples, so the space complexity is O(n), where n is the length of the list; the recursion also uses O(n) stack depth, plus the intermediate list slices created at each call.
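The implementation above passes None sentinels instead of mutable default arguments. The reason, shown in this small illustrative sketch, is that a default list is created once and shared by every call:

```python
def bad_append(item, acc=[]):
    # the default list is created once at definition time
    # and shared by every call that omits acc
    acc.append(item)
    return acc

first = bad_append(1)
second = bad_append(2)
print(first, second)  # both names refer to the same shared list
```

Had remove_duplicates_recursive used list defaults, a second call on a different matrix would still "remember" rows from the first call.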
Method #7: Using set() + map() + tuple():
Algorithm:
- Initialize the input list.
- Convert each row to a tuple and build a set of these tuples, which removes exact duplicates.
- Convert the set back to a list of lists.
- Print the unique list.
Python3
# initializing list
test_list = [[(4, 6), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]

# printing original list
print("The original list is : " + str(test_list))

# create a set of tuples from the original list
tuple_set = set(map(tuple, test_list))

# convert the set back to a list of lists
unique_list = [list(t) for t in tuple_set]

# print the unique list
print(unique_list)

# This code is contributed by Rayudu.
The original list is : [[(4, 6), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]] [[(3, 2), (4, 5)], [(4, 6), (3, 2)]]
Time complexity:
O(n) on average, where n is the number of rows in the input list: each row is converted to a tuple and inserted into a set in amortized constant time.
Auxiliary Space:
O(n), where n is the number of rows: the set of tuples and the resulting list of lists each hold at most n rows.
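Because a plain set() does not preserve insertion order, the rows in the result above may come out in any order. If first-seen order matters, a compact alternative (a sketch, relying on dictionaries preserving insertion order in Python 3.7+) is dict.fromkeys():

```python
test_list = [[(4, 6), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]

# dict keys are unique and keep insertion order (Python 3.7+),
# so exact duplicate rows collapse while first-seen order is preserved
unique_list = [list(t) for t in dict.fromkeys(map(tuple, test_list))]
print(unique_list)
```

Like the set-based version, this collapses only exact duplicate rows; combine it with a sorted() key if reordered rows should also merge.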