Python | Aggregate values by tuple keys

26 July 2024

Sometimes, while working with records stored as (key, value) tuples, we need to group the tuples that share a key and aggregate the values of each group. This comes up in any kind of scoring, for example. Let’s discuss certain ways in which this task can be performed.

Method #1 : Using Counter() + generator expression

The combination of the above functions can be used to perform this task. The generator expression emits each key as many times as its value, and Counter() counts the occurrences, so the count for each key equals the sum of its values. This only works when the values are non-negative integers.

Python3

# Python3 code to demonstrate working of
# Aggregate values by tuple keys
# using Counter() + generator expression
from collections import Counter
 
# initialize list
test_list = [('gfg', 50), ('is', 30), ('best', 100), 
                          ('gfg', 20), ('best', 50)]
 
# printing original list
print("The original list is : " + str(test_list))
 
# Aggregate values by tuple keys
# using Counter() + generator expression
res = list(Counter(key for key, num in test_list 
                  for idx in range(num)).items())
 
# printing result
print("List after grouping : " + str(res))

Output :

The original list is : [('gfg', 50), ('is', 30), ('best', 100), ('gfg', 20), ('best', 50)]
List after grouping : [('gfg', 70), ('is', 30), ('best', 150)]

Time Complexity: O(n + S), where n is the number of tuples in “test_list” and S is the sum of their values, because the generator expression yields each key once per unit of its value.
Auxiliary Space: O(k), where k is the number of distinct keys stored in the Counter and in the result list.
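
Because the generator expression above repeats each key once per unit of its value, the work grows with the size of the values. A minimal sketch of an alternative, assuming the same input, adds each value to a Counter directly. The name `totals` is illustrative and not part of the original code.

```python
from collections import Counter

test_list = [('gfg', 50), ('is', 30), ('best', 100),
             ('gfg', 20), ('best', 50)]

# Counter returns 0 for missing keys, so += works without a membership check
totals = Counter()
for key, num in test_list:
    # add the value once instead of repeating the key num times
    totals[key] += num

res = list(totals.items())
print("List after grouping : " + str(res))
```

This variant makes one pass over the list regardless of how large the values are, and it also accepts values that are not non-negative integers.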

Method #2 : Using groupby() + map() + itemgetter() + sum()

The combination of the above functions can also be used to perform this task. The list is first sorted by key so that equal keys sit next to each other, and groupby() then groups them. itemgetter() selects the key (index 0) and the value (index 1), map() extracts the value from every tuple in a group, and sum() performs the addition (aggregation).

Python3

# Python3 code to demonstrate working of
# Aggregate values by tuple keys
# using groupby() + map() + itemgetter() + sum()
from itertools import groupby
from operator import itemgetter
 
# initialize list
test_list = [('gfg', 50), ('is', 30), ('best', 100),
                          ('gfg', 20), ('best', 50)]
 
# printing original list
print("The original list is : " + str(test_list))
 
# Aggregate values by tuple keys
# using groupby() + map() + itemgetter() + sum()
res = [(key, sum(map(itemgetter(1), ele)))
       for key, ele in groupby(sorted(test_list, key = itemgetter(0)), 
                                                key = itemgetter(0))]
 
# printing result
print("List after grouping : " + str(res))

Output :

The original list is : [('gfg', 50), ('is', 30), ('best', 100), ('gfg', 20), ('best', 50)]
List after grouping : [('best', 150), ('gfg', 70), ('is', 30)]

Time Complexity: O(n log n), where n is the length of the list test_list. The sorted() call dominates; the grouping and summing pass is O(n).
Auxiliary Space: O(n), for the sorted copy of the list and the res list.
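
The sorted() call matters because groupby() only merges consecutive items with equal keys. The following sketch, which reuses the same input, compares the result with and without sorting. The names `unsorted_res` and `sorted_res` are illustrative.

```python
from itertools import groupby
from operator import itemgetter

test_list = [('gfg', 50), ('is', 30), ('best', 100),
             ('gfg', 20), ('best', 50)]

# Without sorting, groupby() starts a new group every time the key changes,
# so the two 'gfg' entries and the two 'best' entries are never merged.
unsorted_res = [(key, sum(map(itemgetter(1), grp)))
                for key, grp in groupby(test_list, key=itemgetter(0))]

# Sorting by key first makes equal keys adjacent, so each key forms one group.
sorted_res = [(key, sum(map(itemgetter(1), grp)))
              for key, grp in groupby(sorted(test_list, key=itemgetter(0)),
                                      key=itemgetter(0))]

print("Without sorting : " + str(unsorted_res))
print("With sorting    : " + str(sorted_res))
```

For this input no two neighbouring tuples share a key, so the unsorted result is simply the original list.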

Method #3 : Using reduce()

Algorithm:

1. Import reduce from functools and itemgetter from operator.
2. Initialize the given list of tuples.
3. Use reduce() to walk through the list of tuples with an accumulator list that starts empty.
4. For each tuple whose key is not yet in the accumulator, filter the tuples of test_list that share that key, sum their second elements, and append the (key, sum) pair to the accumulator. If the key is already present, return the accumulator unchanged.
5. Finally, print the list after grouping.

Python3

from functools import reduce
from operator import itemgetter
 
# initialize list
test_list = [('gfg', 50), ('is', 30), ('best', 100),
             ('gfg', 20), ('best', 50)]
 
# printing original list
print("The original list is : " + str(test_list))
 
# use reduce() to aggregate values by tuple keys
res = reduce(lambda acc, x: acc + [(x[0],
        sum(map(itemgetter(1), filter(lambda y: y[0] ==
        x[0], test_list))))] if x[0] not in [elem[0] for elem in acc]
        else acc, test_list, [])
 
# printing result
print("List after grouping : " + str(res))
# This code is contributed by Jyothi pinjala.

Output

The original list is : [('gfg', 50), ('is', 30), ('best', 100), ('gfg', 20), ('best', 50)]
List after grouping : [('gfg', 70), ('is', 30), ('best', 150)]

Time Complexity: O(n * k), where n is the length of the input list and k is the number of distinct keys, which is O(n*n) in the worst case. Every step scans the accumulator for the key, and each new key triggers a full scan of test_list. No sorting is involved in this method.
Auxiliary Space: O(k), for the accumulator list built by reduce() to store the output tuples.
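
The lambda above rescans test_list for every new key. As a hedged alternative, reduce() can carry a dictionary as its accumulator so the list is traversed once. The helper name `add_pair` is hypothetical and not part of the original code; note that it mutates the accumulator in place, which is a deliberate shortcut in this sketch.

```python
from functools import reduce

test_list = [('gfg', 50), ('is', 30), ('best', 100),
             ('gfg', 20), ('best', 50)]

def add_pair(acc, pair):
    """Add one (key, value) pair to the running totals and return them."""
    key, value = pair
    acc[key] = acc.get(key, 0) + value
    return acc

# the empty dict is the initial accumulator
res = list(reduce(add_pair, test_list, {}).items())
print("List after grouping : " + str(res))
```

With average constant-time dictionary operations, this version does work proportional to the length of the list.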

Method #4 : Using a dictionary

Approach:

The program takes a list of tuples as input and aggregates the values by the tuple keys. In other words, it groups the tuples that share a key and sums their values.

Algorithm:

1. Initialize an empty dictionary d.
2. Loop through each tuple in the list:
   a. Check if the key of the tuple is already present in the dictionary.
   b. If the key is present, add the value of the tuple to the existing value of the key in the dictionary.
   c. If the key is not present, add the key-value pair to the dictionary.
3. Convert the dictionary to a list of tuples using the items() method.
4. Print the list.

Python3

# Input
lst = [('gfg', 50), ('is', 30), ('best', 100), ('gfg', 20), ('best', 50)]
 
# Aggregate values using a dictionary
d = {}
for key, value in lst:
    if key in d:
        d[key] += value
    else:
        d[key] = value
 
# Convert dictionary to list of tuples
result = list(d.items())
 
# Output
print("List after grouping :", result)

Output

List after grouping : [('gfg', 70), ('is', 30), ('best', 150)]

Time Complexity: O(n), where n is the length of the input list, assuming average constant-time dictionary lookups and updates.
Auxiliary Space: O(m), where m is the number of unique keys in the input list. This is because the program creates a dictionary to store the keys and their corresponding values.
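
The explicit membership check in the loop can also be left to the standard library. A minimal sketch, assuming the same input list, uses collections.defaultdict so that missing keys start at 0. The name `totals` is illustrative.

```python
from collections import defaultdict

lst = [('gfg', 50), ('is', 30), ('best', 100), ('gfg', 20), ('best', 50)]

# defaultdict(int) creates a 0 entry the first time a key is accessed
totals = defaultdict(int)
for key, value in lst:
    totals[key] += value

# Convert the dictionary to a list of tuples
result = list(totals.items())
print("List after grouping :", result)
```

The behaviour matches the if/else version; the choice between the two is mostly a matter of style.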
