Saturday, November 16, 2024

Python – Ways to remove duplicates from list

This article covers a common list operation: building a list of unique elements from a list that may contain duplicates. Removing duplicates comes up in many applications, so the techniques are worth knowing in Python.

Ways to Remove duplicates from the list:

The sections below cover each method in turn, along with its time and space complexity.

Remove Duplicates from the list using the set() Method

The most popular way to remove duplicates from a list is the set() constructor. Its main drawback is that the original order of the elements is not preserved.

Python3




# initializing list
test_list = [1, 5, 3, 6, 3, 5, 6, 1]
print("The original list is : " + str(test_list))

# using set() to remove duplicates from the list
test_list = list(set(test_list))

# printing list after removal
# the original ordering is not preserved
print("The list after removing duplicates : " + str(test_list))


Output

The original list is : [1, 5, 3, 6, 3, 5, 6, 1]
The list after removing duplicates : [1, 3, 5, 6]


Time Complexity: O(n)
Space Complexity: O(n)
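
Because a set is unordered, the order of the result should not be relied upon; the sorted-looking output above is an implementation detail of how small integers are stored. A minimal sketch of two ways to handle that, using the hypothetical helper names unique_unordered and unique_sorted, which are assumptions for illustration and not part of the original code:

```python
def unique_unordered(items):
    """Return the distinct items in arbitrary order."""
    return list(set(items))


def unique_sorted(items):
    """Return the distinct items in ascending order (items must be comparable)."""
    return sorted(set(items))


words = ["pear", "apple", "pear", "fig", "apple"]

# The order of this list can differ between runs, so only its contents are checked.
unordered = unique_unordered(words)
print(sorted(unordered))     # ['apple', 'fig', 'pear']

# Sorting gives a deterministic result, but not the original order.
print(unique_sorted(words))  # ['apple', 'fig', 'pear']
```

If the original order matters, one of the order-preserving methods in the following sections is a better fit.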

Remove duplicates from the list using list comprehension 

This method builds the result with a list comprehension, as a one-line shorthand for an explicit loop. Unlike set(), it keeps the first occurrence of each element in its original order.

Python3




# initializing list
test_list = [1, 3, 5, 6, 3, 5, 6, 1]
print("The original list is : " + str(test_list))

# using list comprehension to remove duplicates from the list
# the comprehension is used only for its side effect of appending to res
res = []
[res.append(x) for x in test_list if x not in res]

# printing list after removal
print("The list after removing duplicates : " + str(res))


Output

The original list is : [1, 3, 5, 6, 3, 5, 6, 1]
The list after removing duplicates : [1, 3, 5, 6]


Time Complexity: O(n^2), because each "x not in res" check scans the result list
Space Complexity: O(n)
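
A membership test against a list scans it element by element, whereas a test against a set is a constant-time lookup on average. The following is a minimal sketch of the same comprehension with a set tracking what has been seen; the function name unique_in_order is an assumption for illustration, and the elements must be hashable:

```python
def unique_in_order(items):
    """Return distinct hashable items, keeping the first occurrence of each."""
    seen = set()
    # seen.add(x) returns None, so the condition is true only the first time x appears.
    return [x for x in items if not (x in seen or seen.add(x))]


test_list = [1, 3, 5, 6, 3, 5, 6, 1]
print(unique_in_order(test_list))  # [1, 3, 5, 6]
```

This variant makes a single pass over the input, so it runs in O(n) time on average while still preserving order.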

Remove duplicates from the list using list comprehension with enumerate() 

A list comprehension combined with enumerate() can also do this. For each element, it checks whether the same value already appears earlier in the list and skips it if so. It preserves the list order.

Python3




# initializing list
test_list = [1, 5, 3, 6, 3, 5, 6, 1]
print("The original list is : " + str(test_list))

# using list comprehension + enumerate() to remove duplicates from the list
res = [i for n, i in enumerate(test_list) if i not in test_list[:n]]

# printing list after removal
print("The list after removing duplicates : " + str(res))


Output

The original list is : [1, 5, 3, 6, 3, 5, 6, 1]
The list after removing duplicates : [1, 5, 3, 6]


Time Complexity: O(n^2)
Space Complexity: O(n)
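
One property of this approach is that it compares elements with == and never hashes them, so it also works when the elements are unhashable, such as nested lists. A small sketch under that assumption; the function name unique_by_equality is hypothetical:

```python
def unique_by_equality(items):
    """Keep the first occurrence of each item, comparing with == only."""
    return [item for n, item in enumerate(items) if item not in items[:n]]


rows = [[1, 2], [3], [1, 2], [3], [4, 5]]
print(unique_by_equality(rows))  # [[1, 2], [3], [4, 5]]

# set(rows) would fail here because lists are not hashable.
try:
    set(rows)
except TypeError as exc:
    print("set() failed:", exc)
```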

Remove duplicates from the list using collections.OrderedDict.fromkeys()

This is one of the fastest order-preserving methods. OrderedDict.fromkeys() builds a dictionary whose keys are the unique elements in first-seen order, and that dictionary is then converted to a list. It works for any hashable elements, including strings.

Python3




# using collections.OrderedDict.fromkeys()
from collections import OrderedDict

# initializing list
test_list = [1, 5, 3, 6, 3, 5, 6, 1]
print("The original list is : " + str(test_list))

# using collections.OrderedDict.fromkeys() to remove duplicates from the list
res = list(OrderedDict.fromkeys(test_list))

# printing list after removal
print("The list after removing duplicates : " + str(res))


Output

The original list is : [1, 5, 3, 6, 3, 5, 6, 1]
The list after removing duplicates : [1, 5, 3, 6]


Time Complexity: O(n)
Space Complexity: O(n)
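
Since Python 3.7, the built-in dict also preserves insertion order, so the same idea works without an import. A minimal sketch, assuming Python 3.7 or later:

```python
test_list = [1, 5, 3, 6, 3, 5, 6, 1]

# dict keys are unique and, from Python 3.7 on, keep insertion order.
res = list(dict.fromkeys(test_list))
print(res)  # [1, 5, 3, 6]

# The same call works on any iterable of hashable items, such as a string.
letters = list(dict.fromkeys("banana"))
print(letters)  # ['b', 'a', 'n']
```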

Remove duplicates from list using “in”, “not in” operators

In this method, we iterate through the input list while maintaining a second list that holds the elements seen so far. Before appending an element to the second list, we check with "not in" whether it is already there. Elements that are already present are skipped, so the second list ends up with the duplicates removed.

Python3




# initializing list
test_list = [1, 5, 3, 6, 3, 5, 6, 1]
print("The original list is : " + str(test_list))
 
res = []
for i in test_list:
    if i not in res:
        res.append(i)
 
# printing list after removal
print("The list after removing duplicates : " + str(res))


Output

The original list is : [1, 5, 3, 6, 3, 5, 6, 1]
The list after removing duplicates : [1, 5, 3, 6]


Time Complexity: O(n^2)
Space Complexity: O(n)
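
The explicit loop is easy to extend, for example to decide what counts as a duplicate. The sketch below is an illustration and not part of the original method: the helper unique_by_key and its key parameter are hypothetical names, and it tracks seen values in a set, so the keys must be hashable.

```python
def unique_by_key(items, key=None):
    """Keep the first item for each distinct key(item), preserving order."""
    seen = set()
    result = []
    for item in items:
        marker = item if key is None else key(item)
        if marker not in seen:
            seen.add(marker)
            result.append(item)
    return result


names = ["Ann", "bob", "ann", "Bob", "Cy"]
print(unique_by_key(names))                 # ['Ann', 'bob', 'ann', 'Bob', 'Cy']
print(unique_by_key(names, key=str.lower))  # ['Ann', 'bob', 'Cy']
```

With key=str.lower, names that differ only in case are treated as duplicates and the first spelling is kept.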

Remove duplicates from list using list comprehension and the list.index() method

In this method, we use a list comprehension to iterate over the indices of the list. list.index() returns the index of the first occurrence of a value, so we keep an element only if that first index equals its current index; otherwise the element is a duplicate and is skipped.

Python3




# initializing list
arr = [1, 5, 3, 6, 3, 5, 6, 1]
print("The original list is : " + str(arr))

# using list comprehension + arr.index()
res = [arr[i] for i in range(len(arr)) if i == arr.index(arr[i])]

# printing list after removal of duplicates
print("The list after removing duplicates : " + str(res))


Output

The original list is : [1, 5, 3, 6, 3, 5, 6, 1]
The list after removing duplicates : [1, 5, 3, 6]


Time Complexity: O(n^2)
Space Complexity: O(n)
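
To see why the index comparison works, it can help to print each position next to the position where its value first occurs. A short trace on the same input; the variable name first_seen is an assumption for illustration:

```python
arr = [1, 5, 3, 6, 3, 5, 6, 1]

# Pair each position with the position of the first occurrence of its value.
first_seen = [(i, arr.index(value)) for i, value in enumerate(arr)]
print(first_seen)
# [(0, 0), (1, 1), (2, 2), (3, 3), (4, 2), (5, 1), (6, 3), (7, 0)]

# An element is kept only where the two positions agree.
res = [arr[i] for i, first in first_seen if i == first]
print(res)  # [1, 5, 3, 6]
```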

Remove duplicates from list using the Counter() method

In this method, we use Counter() to build a dictionary-like object that maps each element of the list to its count. Its keys are the unique values of the list, which we collect by unpacking the Counter into a new list (equivalent to reading its keys()).

Python3




from collections import Counter

# initializing list
arr = [1, 5, 3, 6, 3, 5, 6, 1]
print("The original list is : " + str(arr))

# using Counter() and unpacking its keys to remove duplicates from the list
temp = Counter(arr)
res = [*temp]

# printing list after removal of duplicates
print("The list after removing duplicates : " + str(res))


Output

The original list is : [1, 5, 3, 6, 3, 5, 6, 1]
The list after removing duplicates : [1, 5, 3, 6]


Time Complexity: O(n)
Space Complexity: O(n)
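
Counter is mainly useful when the counts themselves are needed as well, for instance to report which values were duplicated. A minimal sketch, assuming Python 3.7 or later so that the keys keep first-seen order:

```python
from collections import Counter

arr = ["a", "b", "a", "c", "a", "b"]
counts = Counter(arr)

# Iterating a Counter yields its keys, i.e. the unique values.
unique = list(counts)
# Values whose count is greater than one appeared more than once.
repeated = [x for x, n in counts.items() if n > 1]

print(unique)       # ['a', 'b', 'c']
print(repeated)     # ['a', 'b']
print(counts["a"])  # 3
```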

Remove duplicates from list using numpy unique method

This method is suitable when the list contains elements of the same type. It first converts the list into a NumPy array and then uses the numpy unique() method to remove the duplicate elements. The result is a NumPy array of the unique values in sorted order, so the original order is not preserved.

Note: Install the numpy module using the command “pip install numpy”.

Python3




# using numpy
import numpy as np

# initializing list
test_list = [1, 5, 3, 6, 3, 5, 6, 1]
print("The original list is : " + str(test_list))

# removing duplicates from the list; the result is a sorted NumPy array
res = np.unique(test_list)

# printing list after removal
print("The list after removing duplicates : " + str(res))


Output

The original list is : [1, 5, 3, 6, 3, 5, 6, 1]
The list after removing duplicates : [1 3 5 6]

Time Complexity: O(n log n), because unique() sorts the array by default
Space Complexity: O(n)
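
If the original order is needed, the return_index option of np.unique() reports where each unique value first occurs, and those positions can be sorted to rebuild the first-seen order. A sketch under the assumption that numpy is installed; the variable names are illustrative:

```python
import numpy as np

test_list = [1, 5, 3, 6, 3, 5, 6, 1]
arr = np.array(test_list)

# return_index=True also returns the index of the first occurrence of each unique value.
_, first_idx = np.unique(arr, return_index=True)

# Sorting those indices restores the original order of first occurrences.
res = arr[np.sort(first_idx)].tolist()
print(res)  # [1, 5, 3, 6]
```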

Remove duplicates from list using a pandas DataFrame

The pandas.DataFrame.drop_duplicates() method can also be used to remove duplicates from a list. By default it returns a new DataFrame with the duplicates removed and leaves the original unchanged; with inplace=True, as in the code below, it modifies the DataFrame in place instead.

Algorithm:

Create a pandas DataFrame from the list, call drop_duplicates() on the DataFrame, and then convert the resulting column back to a list.

Python3




import pandas as pd
 
# initializing list
test_list = [1, 5, 3, 6, 3, 5, 6, 1]
print("The original list is : " + str(test_list))
 
# creating DataFrame
df = pd.DataFrame({'col': test_list})
 
# using drop_duplicates() method
df.drop_duplicates(inplace=True)
 
# converting back to list
res = df['col'].tolist()
 
# printing list after removal
print("The list after removing duplicates : " + str(res))


Output

The original list is : [1, 5, 3, 6, 3, 5, 6, 1]
The list after removing duplicates : [1, 5, 3, 6]

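Time complexity: drop_duplicates() detects duplicates with hashing rather than by sorting, so it is expected to run in O(n) time on average. Building the DataFrame and converting the column back to a list each take O(n) time, so the overall time complexity of this method is O(n) on average, although with a noticeably larger constant overhead than the pure-Python methods for small lists.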

Space complexity: The space complexity of this method is O(n) because a new DataFrame and a list are created, each with n elements.
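
For a single list, a pandas Series is enough, and the keep parameter of drop_duplicates() controls which occurrence survives. A minimal sketch, assuming pandas is installed; the variable names are illustrative:

```python
import pandas as pd

test_list = [1, 5, 3, 6, 3, 5, 6, 1]
series = pd.Series(test_list)

# keep="first" (the default) keeps the first occurrence of each value.
first = series.drop_duplicates().tolist()
print(first)  # [1, 5, 3, 6]

# keep="last" keeps the last occurrence instead.
last = series.drop_duplicates(keep="last").tolist()
print(last)  # [3, 5, 6, 1]

# The original Series is unchanged because inplace was not requested.
print(len(series))  # 8
```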
