This article covers a common list operation: building a list of unique elements from a list that may contain duplicates. Removing duplicates comes up in many applications, so it is a useful technique to know in Python.
Ways to Remove duplicates from the list:
Below are the methods that we will cover in this article:
- Using set() method
- Using list comprehension
- Using list comprehension with enumerate()
- Using collections.OrderedDict.fromkeys()
- Using in, not in operators
- Using list comprehension and list.index() method
- Using Counter() method
- Using Numpy unique method
- Using a Pandas DataFrame
Remove Duplicates from the list using the set() Method
The set() function is the most popular way to remove duplicates from a list. Its main drawback is that the original ordering of the elements is not preserved.
Python3
# initializing list
test_list = [1, 5, 3, 6, 3, 5, 6, 1]
print("The original list is : " + str(test_list))

# using set() to remove duplicates from the list
test_list = list(set(test_list))

# printing list after removal
# note: the original ordering is not preserved
print("The list after removing duplicates : " + str(test_list))
Output:
The original list is : [1, 5, 3, 6, 3, 5, 6, 1]
The list after removing duplicates : [1, 3, 5, 6]
Time Complexity: O(n)
Space Complexity: O(n)
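If the original order matters but set() is still convenient, one possible workaround is to sort the unique values by the position of their first occurrence. The sketch below is only an illustration; the names unique_unordered and unique_ordered are made up for this example, and because list.index() is a linear scan it is best kept to short lists.

```python
test_list = [1, 5, 3, 6, 3, 5, 6, 1]

# set() drops duplicates but makes no promise about element order
unique_unordered = set(test_list)

# Sort the unique values by where each one first appears in the original list.
# list.index() scans the list each time, so this suits short lists only.
unique_ordered = sorted(unique_unordered, key=test_list.index)

print(unique_ordered)
```

Note that set() only works when every element is hashable, so a list of lists, for example, cannot be deduplicated this way.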
Remove duplicates from the list using list comprehension
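This method is a one-line shorthand, written as a list comprehension, for an explicit loop that appends each element not yet seen. Unlike set(), it preserves the order of first occurrence, but each "x not in res" check scans the result list, so it is slower on long lists.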
Python3
# initializing list
test_list = [1, 3, 5, 6, 3, 5, 6, 1]
print("The original list is : " + str(test_list))

# using list comprehension to remove duplicates from the list
# (the comprehension is used only for its side effect of filling res)
res = []
[res.append(x) for x in test_list if x not in res]

# printing list after removal
print("The list after removing duplicates : " + str(res))
Output:
The original list is : [1, 3, 5, 6, 3, 5, 6, 1]
The list after removing duplicates : [1, 3, 5, 6]
Time Complexity: O(n^2), because each membership test on the result list is a linear scan
Space Complexity: O(n)
Remove duplicates from the list using list comprehension with enumerate()
A list comprehension combined with the enumerate() function can also do this. For each element, it checks the slice of the list before the current index and skips the element if it has already occurred there. It preserves the list order.
Python3
# initializing list
test_list = [1, 5, 3, 6, 3, 5, 6, 1]
print("The original list is : " + str(test_list))

# using list comprehension + enumerate() to remove duplicates from the list
res = [i for n, i in enumerate(test_list) if i not in test_list[:n]]

# printing list after removal
print("The list after removing duplicates : " + str(res))
Output:
The original list is : [1, 5, 3, 6, 3, 5, 6, 1]
The list after removing duplicates : [1, 5, 3, 6]
Time Complexity: O(n^2)
Space Complexity: O(n)
Remove duplicates from the list in Python using collections.OrderedDict.fromkeys()
This is one of the fastest order-preserving ways to remove duplicates. OrderedDict.fromkeys() builds a dictionary whose keys are the list elements, which discards duplicates while keeping first-occurrence order; the keys are then converted back to a list. It works for any hashable elements, including strings.
Python3
# using collections.OrderedDict.fromkeys()
from collections import OrderedDict

# initializing list
test_list = [1, 5, 3, 6, 3, 5, 6, 1]
print("The original list is : " + str(test_list))

# using collections.OrderedDict.fromkeys() to remove duplicates from the list
res = list(OrderedDict.fromkeys(test_list))

# printing list after removal
print("The list after removing duplicates : " + str(res))
Output:
The original list is : [1, 5, 3, 6, 3, 5, 6, 1]
The list after removing duplicates : [1, 5, 3, 6]
Time Complexity: O(n)
Space Complexity: O(n)
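On Python 3.7 and later, the built-in dict also keeps insertion order, so the same idea should work without importing OrderedDict. The following is a minimal sketch under that version assumption; the words list is made-up sample data.

```python
# Assumes Python 3.7+, where the built-in dict preserves insertion order.
test_list = [1, 5, 3, 6, 3, 5, 6, 1]

# dict.fromkeys() keeps one key per distinct element, in first-seen order
res = list(dict.fromkeys(test_list))
print(res)

# The same call works for strings (or any other hashable elements)
words = ["pear", "apple", "pear", "fig", "apple"]
unique_words = list(dict.fromkeys(words))
print(unique_words)
```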
Remove duplicates from list using “in”, “not in” operators
In this method, we iterate through the input list and build a second list alongside it. Before appending an element to the second list, we check with "not in" whether it is already there. Only elements seen for the first time are appended, so the second list ends up holding the input list without its duplicates.
Python3
# initializing list
test_list = [1, 5, 3, 6, 3, 5, 6, 1]
print("The original list is : " + str(test_list))

# append each element only if it is not already in res
res = []
for i in test_list:
    if i not in res:
        res.append(i)

# printing list after removal
print("The list after removing duplicates : " + str(res))
Output:
The original list is : [1, 5, 3, 6, 3, 5, 6, 1]
The list after removing duplicates : [1, 5, 3, 6]
Time Complexity: O(n^2)
Space Complexity: O(n)
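The quadratic cost of this loop comes from testing membership in a list. A common variation, sketched below on the assumption that all elements are hashable, tracks already-seen elements in a set so that each check takes roughly constant time on average. The helper name dedupe_keep_order is hypothetical and chosen only for this example.

```python
def dedupe_keep_order(items):
    """Return the distinct elements of items in first-seen order.

    Hypothetical helper for illustration; requires hashable elements.
    """
    seen = set()      # elements encountered so far (fast membership test)
    result = []       # distinct elements in their original order
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


test_list = [1, 5, 3, 6, 3, 5, 6, 1]
print(dedupe_keep_order(test_list))
```

This trades a little extra memory for the set in exchange for roughly linear running time on average.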
Remove duplicates from list using list comprehension and list.index() method
In this method, a list comprehension iterates over the indices of the list, and list.index() returns the index of the first occurrence of each element. An element is kept only if that first-occurrence index equals the current index; otherwise it is a later duplicate and is skipped.
Python3
# initializing list
arr = [1, 5, 3, 6, 3, 5, 6, 1]
print('The original list is : ' + str(arr))

# keep arr[i] only when i is the index of its first occurrence
res = [arr[i] for i in range(len(arr)) if i == arr.index(arr[i])]

# printing list after removal of duplicates
print('The list after removing duplicates :', res)
Output:
The original list is : [1, 5, 3, 6, 3, 5, 6, 1]
The list after removing duplicates : [1, 5, 3, 6]
Time Complexity: O(n^2)
Space Complexity: O(n)
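Remove duplicates from list using the Counter() method
In this method, Counter() builds a dictionary-like object that maps each element of the list to the number of times it occurs. Its keys are the unique values, so unpacking the Counter into a list (or calling its keys() method) gives the list without duplicates. On Python 3.7 and later, the keys keep first-occurrence order.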
Python3
from collections import Counter

# initializing list
arr = [1, 5, 3, 6, 3, 5, 6, 1]
print('The original list is : ' + str(arr))

# using Counter() to remove duplicates from the list;
# unpacking a Counter yields its keys
temp = Counter(arr)
res = [*temp]

# printing list after removal of duplicates
print('The list after removing duplicates :', res)
Output:
The original list is : [1, 5, 3, 6, 3, 5, 6, 1]
The list after removing duplicates : [1, 5, 3, 6]
Time Complexity: O(n)
Space Complexity: O(n)
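Because a Counter also records how often each element occurs, it can be handy when you want to know which values were duplicated as well as the deduplicated list. A small sketch follows; the extra value 7 in the sample list and the names unique_values and repeated are assumptions made for the illustration.

```python
from collections import Counter

# Sample data: 7 appears once, every other value appears twice
arr = [1, 5, 3, 6, 3, 5, 6, 1, 7]

counts = Counter(arr)

# Keys of the Counter are the distinct values (first-seen order on Python 3.7+)
unique_values = list(counts)

# Values whose count is greater than 1 had duplicates in the input
repeated = [value for value, n in counts.items() if n > 1]

print(unique_values)
print(repeated)
```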
Remove duplicates from list using numpy unique method
This method is suited to lists whose elements are all of the same type. np.unique() converts the list to a NumPy array and returns its sorted unique elements, so the result is a NumPy array in sorted order rather than a list in the original order.
Note: Install the numpy module using the command “pip install numpy”.
Python3
# using numpy
import numpy as np

# initializing list
test_list = [1, 5, 3, 6, 3, 5, 6, 1]
print("The original list is : " + str(test_list))

# removing duplicates from the list (returns a sorted NumPy array)
res = np.unique(test_list)

# printing list after removal
print("The list after removing duplicates : " + str(res))
Output:
The original list is : [1, 5, 3, 6, 3, 5, 6, 1]
The list after removing duplicates : [1 3 5 6]
Time Complexity: O(n log n), because np.unique() sorts the elements to return them in sorted order
Space Complexity: O(n)
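Since np.unique() returns sorted values, the original order is lost. One way to recover it, sketched here, is to request the first-occurrence indices with the return_index=True argument and reorder the unique values by those indices. The variable names are illustrative only.

```python
import numpy as np

test_list = [1, 5, 3, 6, 3, 5, 6, 1]

# values: sorted unique elements
# first_idx: index of the first occurrence of each unique element
values, first_idx = np.unique(test_list, return_index=True)

# Reorder the unique values by where they first appeared,
# then convert the NumPy array back to a plain Python list
ordered = values[np.argsort(first_idx)].tolist()

print(ordered)
```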
Remove duplicates from list using a pandas DataFrame
The pandas.DataFrame.drop_duplicates() method can also be used to remove duplicates from a list. By default it returns a new DataFrame with duplicates removed and leaves the original DataFrame unchanged; with inplace=True, as in the code for this method, the DataFrame is modified in place instead.
Algorithm:
Create a pandas DataFrame from the list. Call the drop_duplicates() method on the DataFrame, and then convert the resulting column back to a list.
Python3
import pandas as pd

# initializing list
test_list = [1, 5, 3, 6, 3, 5, 6, 1]
print("The original list is : " + str(test_list))

# creating DataFrame
df = pd.DataFrame({'col': test_list})

# using drop_duplicates() method (modifies df in place)
df.drop_duplicates(inplace=True)

# converting back to list
res = df['col'].tolist()

# printing list after removal
print("The list after removing duplicates : " + str(res))
Output:
The original list is : [1, 5, 3, 6, 3, 5, 6, 1]
The list after removing duplicates : [1, 5, 3, 6]
Time complexity: drop_duplicates() identifies duplicates with a hash table rather than by sorting, so it is expected to take about O(n) time on average. The conversion from DataFrame to a list also takes O(n) time, so the overall time complexity of this method is about O(n) on average.
Space complexity: The space complexity of this method is O(n) because a new DataFrame and a list are created, each with n elements.