Problems that involve sorting and removing duplicates are quite common, both in development work and in general coding. Sorting by frequency has been discussed before, but sometimes we also want to remove the duplicates in as few lines of code as possible. Let's discuss certain ways in which this can be done.
Method #1 : Using count() + set() + sorted()
The sorted() function sorts the elements as desired, the frequency of each element can be computed with count(), and set() handles the removal of duplicates.
Python3
# Python3 code to demonstrate
# sorting and removal of duplicates
# Using sorted() + set() + count()

# initializing list
test_list = [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]

# printing original list
print("The original list : " + str(test_list))

# using sorted() + set() + count()
# sorting and removal of duplicates
res = sorted(set(test_list), key=lambda ele: test_list.count(ele))

# print result
print("The list after sorting and removal : " + str(res))
The original list : [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]
The list after sorting and removal : [2, 3, 6, 5]
Time Complexity: O(n*u), where n is the number of elements in “test_list” and u is the number of distinct elements. The lambda key runs test_list.count(), an O(n) scan, once for every unique element, which dominates the O(u log u) cost of the sort itself.
Auxiliary Space: O(n), where n is the number of elements in the list “test_list”.
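Because the count() call inside the key makes this approach quadratic in the worst case, the counts can instead be precomputed in a single pass. The following is a minimal sketch of that variant (not part of the original recipe), using collections.Counter as a lookup table for the sort key:

Python3

from collections import Counter

# initializing list
test_list = [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]

# count every element once, in a single O(n) pass
freq = Counter(test_list)

# iterate over the unique keys of the Counter and
# sort them by ascending frequency in O(u log u)
res = sorted(freq, key=freq.get)

print("The list after sorting and removal : " + str(res))

The output is the same as Method #1: [2, 3, 6, 5].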
Method #2 : Using Counter.most_common() + list comprehension
For the particular use case of sorting by decreasing order of frequency, the most_common() method of collections.Counter can be used to obtain the frequency part.
Python3
# Python3 code to demonstrate
# sorting and removal of duplicates
# Using Counter.most_common() + list comprehension
from collections import Counter

# initializing list
test_list = [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]

# printing original list
print("The original list : " + str(test_list))

# using Counter.most_common() + list comprehension
# sorting and removal of duplicates
res = [key for key, value in Counter(test_list).most_common()]

# print result
print("The list after sorting and removal : " + str(res))
The original list : [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]
The list after sorting and removal : [5, 6, 3, 2]
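Note that most_common() yields elements in decreasing order of frequency, while the other methods here produce increasing order. If the increasing order is needed, the result can simply be reversed; a small sketch on top of Method #2:

Python3

from collections import Counter

test_list = [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]

# most_common() gives [(5, 5), (6, 3), (3, 2), (2, 1)];
# keep the keys and reverse for ascending frequency
res = [key for key, count in Counter(test_list).most_common()][::-1]

print(res)  # [2, 3, 6, 5]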
Method #3 : Using itertools
To sort a list by frequency and remove duplicates using the itertools library in Python, you can do the following:
Python3
from itertools import groupby

# Initialize the list
test_list = [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]

# printing original list
print("The original list : " + str(test_list))

# Group the elements in the list by their frequency
groups = [(len(list(group)), key) for key, group in groupby(sorted(test_list))]

# Sort the groups by frequency in descending order
groups.sort(reverse=True)

# Create a new list with the elements in each group,
# starting with the group with the highest frequency
res = [key for count, key in groups]

# print result
print("The list after sorting and removal : " + str(res[::-1]))

# This code is contributed by Edula Vinay Kumar Reddy
The original list : [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]
The list after sorting and removal : [2, 3, 6, 5]
Explanation:
- First, we import the groupby function from the itertools library.
- Then, we initialize the list and group its elements using the groupby function. We pass the sorted list to groupby because it only groups runs of consecutive equal elements, so sorting first ensures that all occurrences of a value land in the same group.
- Next, we create a list of tuples where each tuple consists of the frequency of an element and the element itself. We use the len function to get the frequency of each element and the key variable to get the element itself.
- We sort the list of tuples by frequency in descending order using the sort function with the reverse parameter set to True.
- Finally, we create a new list with the elements in each group, starting with the group with the highest frequency. We use a list comprehension to iterate over the tuples in the sorted list and extract the element from each tuple. (The sketch below traces these intermediate values.)
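To make the walkthrough concrete, here is a small illustrative sketch that prints the intermediate values for the sample list:

Python3

from itertools import groupby

test_list = [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]

# (frequency, element) pairs built from the sorted list
groups = [(len(list(group)), key) for key, group in groupby(sorted(test_list))]
print(groups)  # [(1, 2), (2, 3), (5, 5), (3, 6)]

# descending by frequency (ties broken by element value)
groups.sort(reverse=True)
print(groups)  # [(5, 5), (3, 6), (2, 3), (1, 2)]

# drop the counts and reverse for ascending frequency
print([key for count, key in groups][::-1])  # [2, 3, 6, 5]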
The time complexity of this approach is O(n log n): the initial sorted() call costs O(n log n), the groupby pass runs in O(n), and sorting the (frequency, element) tuples costs O(u log u) for u distinct elements.
The auxiliary space of this approach is O(n), as we create a new list with the same number of elements as the original list.
Method #4 : Using operator.countOf() + set() + sorted()
The sorted() function sorts the elements as desired, the frequency can be computed with operator.countOf(), and set() handles the removal of duplicates.
Python3
# Python3 code to demonstrate
# sorting and removal of duplicates
# Using sorted() + set() + operator.countOf()
import operator as op

# initializing list
test_list = [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]

# printing original list
print("The original list : " + str(test_list))

# using sorted() + set() + operator.countOf()
# sorting and removal of duplicates
res = sorted(set(test_list), key=lambda ele: op.countOf(test_list, ele))

# print result
print("The list after sorting and removal : " + str(res))
The original list : [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]
The list after sorting and removal : [2, 3, 6, 5]
Time Complexity: O(n*u), where n is the number of elements in the list and u is the number of distinct elements. As in Method #1, the key function performs an O(n) scan (op.countOf) for every unique element.
Auxiliary Space: O(n)
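As a side note, operator.countOf(a, b) simply returns the number of occurrences of b in a, so for lists it behaves exactly like list.count(). A quick check:

Python3

import operator as op

data = [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]
print(op.countOf(data, 5), data.count(5))  # prints: 5 5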