Pandas Groupby and Computing Mean

27 July 2024

Pandas is an open-source Python library built on top of NumPy. It provides data structures and operations for manipulating numerical data and time series, and it is popular mainly because it makes importing and analyzing data much easier. Pandas is also fast, offering high performance and productivity to its users.

Groupby is a simple concept: we group rows into categories and apply a function to each category. Simple as it is, the technique is extremely valuable and widely used in data science. It lets us:

Compute summary statistics for every group
Perform group-specific transformations
Filter data
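The three uses listed above can be sketched with agg(), transform(), and filter(). This is a minimal illustration rather than a definitive recipe; the sample data and the threshold of 100 are assumptions chosen only for the example.

```python
import pandas as pd

# Sample data (an assumption, used only for illustration)
df = pd.DataFrame({
    "Animal": ["Falcon", "Falcon", "Parrot", "Parrot"],
    "Max Speed": [380.0, 370.0, 24.0, 26.0],
})

grouped = df.groupby("Animal")["Max Speed"]

# 1. Summary statistics: one row per group
summary = grouped.agg(["mean", "max"])

# 2. Group-specific transformation: subtract each group's mean from its rows
centered = df["Max Speed"] - grouped.transform("mean")

# 3. Filtration: keep only the rows of groups whose mean exceeds 100
fast = df.groupby("Animal").filter(lambda g: g["Max Speed"].mean() > 100)

print(summary)
print(centered.tolist())
print(fast)
```

Note that agg() returns one row per group, while transform() and filter() return rows aligned with the original dataframe.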

The groupby() operation combines three steps: splitting the object into groups, applying a function to each group, and combining the results. It can be used to group large amounts of data and compute operations on these groups.
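To make the split, apply, and combine steps concrete, the sketch below performs them by hand and compares the result with the built-in mean(). The sample data is an assumption used only for illustration, and the manual version is meant to show the idea, not how pandas implements it internally.

```python
import pandas as pd

# Sample data (an assumption, used only for illustration)
df = pd.DataFrame({
    "Animal": ["Falcon", "Falcon", "Parrot", "Parrot"],
    "Max Speed": [380.0, 370.0, 24.0, 26.0],
})

# Split: iterating over a groupby object yields (key, sub-dataframe) pairs
# Apply: compute the mean of each piece
means = {key: part["Max Speed"].mean() for key, part in df.groupby("Animal")}

# Combine: assemble the per-group results into a single Series
manual = pd.Series(means, name="Max Speed").rename_axis("Animal")

# The built-in aggregation does all three steps in one call
builtin = df.groupby("Animal")["Max Speed"].mean()

print(manual)
print(builtin)
```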

Example 1:

Python3

# import required module
import pandas as pd

# create dataframe
df = pd.DataFrame({'Animal': ['Falcon', 'Falcon', 'Parrot', 'Parrot'],
                   'Max Speed': [380., 370., 24., 26.]})

# use groupby() to compute the mean of 'Max Speed' for each animal
print(df.groupby(['Animal']).mean())

Output:

        Max Speed
Animal
Falcon      375.0
Parrot       25.0
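Example 1 works because the only column other than the grouping key is numeric. As a hedged sketch of what to do when the dataframe also holds text columns, the snippet below passes numeric_only=True to mean(); the Habitat column is a hypothetical addition made only for this illustration.

```python
import pandas as pd

# 'Habitat' is a hypothetical text column added for illustration
df = pd.DataFrame({
    "Animal": ["Falcon", "Falcon", "Parrot", "Parrot"],
    "Habitat": ["Cliff", "Cliff", "Forest", "Forest"],
    "Max Speed": [380.0, 370.0, 24.0, 26.0],
})

# numeric_only=True restricts the mean to numeric columns,
# so the text column 'Habitat' is left out of the result
result = df.groupby("Animal").mean(numeric_only=True)

print(result)
```

Selecting the column explicitly, as in df.groupby("Animal")["Max Speed"].mean(), is another way to get the same numbers.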

Example 2:

Python3

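# import required module
import pandas as pd

# assign list of rows; None becomes NaN in column 'b'
data = [[100, 200, 300], [10, None, 40], [20, 10, 30], [100, 200, 200]]

# create dataframe
df = pd.DataFrame(data, columns=["a", "b", "c"])

# use groupby() to compute the mean of 'a' and 'c' for each value of 'b';
# the row whose 'b' is NaN is excluded by default
print(df.groupby(by=["b"]).mean())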

Output:

           a      c
b
10.0    20.0   30.0
200.0  100.0  250.0
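In Example 2, one row has None in column b, and groupby() leaves that row out of the result by default. The sketch below, which assumes pandas 1.1 or later, uses the dropna=False argument to keep the missing key as its own group.

```python
import pandas as pd

data = [[100, 200, 300], [10, None, 40], [20, 10, 30], [100, 200, 200]]
df = pd.DataFrame(data, columns=["a", "b", "c"])

# Default behaviour: the row whose key is NaN is dropped
default = df.groupby("b").mean()

# dropna=False keeps NaN as a separate group
with_nan = df.groupby("b", dropna=False).mean()

print(len(default))   # number of groups without the NaN key
print(len(with_nan))  # number of groups including the NaN key
print(with_nan)
```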

Example 3:

Python3

# import required module
import pandas as pd

# assign data
# note: 'kings' (lower case) is a different key from 'Kings'
ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings', 'kings',
                     'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
            'Rank': [1, 2, 2, 3, 3, 4, 1, 1, 2, 4, 1, 2],
            'Year': [2014, 2015, 2014, 2015, 2014, 2015,
                     2016, 2017, 2016, 2014, 2015, 2017],
            'Points': [876, 789, 863, 673, 741, 812,
                       756, 788, 694, 701, 804, 690]}

# create dataframe
df = pd.DataFrame(ipl_data)

# use groupby() to compute the mean of every numeric column for each team
print(df.groupby(['Team']).mean())

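Output:

            Rank         Year      Points
Team
Devils  2.500000  2014.500000  768.000000
Kings   1.666667  2015.666667  761.666667
Riders  1.750000  2015.500000  762.250000
Royals  2.500000  2014.500000  752.500000
kings   4.000000  2015.000000  812.000000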
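The Team column in Example 3 contains both 'Kings' and 'kings', which groupby() treats as two different keys. If the lower-case spelling is unintentional (an assumption, since the data does not say so), the keys can be normalized before grouping. The sketch below uses str.title() for that and computes the mean of Points only.

```python
import pandas as pd

ipl_data = {'Team': ['Riders', 'Riders', 'Devils', 'Devils', 'Kings', 'kings',
                     'Kings', 'Kings', 'Riders', 'Royals', 'Royals', 'Riders'],
            'Rank': [1, 2, 2, 3, 3, 4, 1, 1, 2, 4, 1, 2],
            'Year': [2014, 2015, 2014, 2015, 2014, 2015,
                     2016, 2017, 2016, 2014, 2015, 2017],
            'Points': [876, 789, 863, 673, 741, 812,
                       756, 788, 694, 701, 804, 690]}
df = pd.DataFrame(ipl_data)

# Assumption: 'kings' refers to the same team as 'Kings'
normalized = df.assign(Team=df["Team"].str.title())

# Mean of 'Points' per team after normalizing the key
mean_points = normalized.groupby("Team")["Points"].mean()

print(mean_points)
```

With the keys merged, the result has four groups instead of five.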
