Count NaN or missing values in Pandas DataFrame

28 July 2024

3

In this article, we will see how to Count NaN or missing values in Pandas DataFrame using isnull() and sum() method of the DataFrame.

Dataframe.isnull() method

Pandas isnull() function detect missing values in the given object. It return a boolean same-sized object indicating if the values are NA. Missing values gets mapped to True and non-missing value gets mapped to False.

Syntax: DataFrame.isnull()

Parameters: None

Return Type: Dataframe of Boolean values which are True for NaN values otherwise False.

dataframe.sum() method

Pandas sum() function return the sum of the values for the requested axis. If the input is index axis then it adds all the values in a column and repeats the same for all the columns and returns a series containing the sum of all the values in each column. It also provides support to skip the missing values while calculating the.

Syntax: DataFrame.sum(axis=None, skipna=None, level=None, numeric_only=None, min_count=0, **kwargs)

Parameters :

axis : {index (0), columns (1)}

skipna : Exclude NA/null values when computing the result.

level : If the axis is a MultiIndex (hierarchical), count along a particular level, collapsing into a Series

numeric_only : Include only float, int, boolean columns. If None, will attempt to use everything, then use only numeric data. Not implemented for Series.

min_count : The required number of valid values to perform the operation. If fewer than min_count non-NA values are present the result will be NA.

Returns : sum of Series or DataFrame (if level specified).

Let’s create a pandas dataframe.

# import numpy library as np
import numpy as np
  
# import pandas library as pd
import pandas as pd
  
# List of Tuples
students = [('Ankit', 22, 'Up', 'Geu'),
           ('Ankita', np.NaN, 'Delhi', np.NaN),
           ('Rahul', 16, 'Tokyo', 'Abes'),
           ('Simran', 41, 'Delhi', 'Gehu'),
           ('Shaurya', np.NaN, 'Delhi', 'Geu'),
           ('Shivangi', 35, 'Mumbai', np.NaN ),
           ('Swapnil', 35, np.NaN, 'Geu'),
           (np.NaN, 35, 'Uk', 'Geu'),
           ('Jeet', 35, 'Guj', 'Gehu'),
           (np.NaN, np.NaN, np.NaN, np.NaN)
            ]
  
# Create a DataFrame object from
# list of tuples with columns
# and indices.
details = pd.DataFrame(students, columns =['Name', 'Age', 
                                           'Place', 'College'],
                        index =['a', 'b', 'c', 'd', 'e', 
                                'f', 'g', 'i', 'j', 'k'])
  
details

Output:

Example 1 : Count total NaN at each column in DataFrame.

# import numpy library as np
import numpy as np
  
# import pandas library as pd
import pandas as pd
  
  
# List of Tuples
students = [('Ankit', 22, 'Up', 'Geu'),
           ('Ankita', np.NaN, 'Delhi', np.NaN),
           ('Rahul', 16, 'Tokyo', 'Abes'),
           ('Simran', 41, 'Delhi', 'Gehu'),
           ('Shaurya', np.NaN, 'Delhi', 'Geu'),
           ('Shivangi', 35, 'Mumbai', np.NaN ),
           ('Swapnil', 35, np.NaN, 'Geu'),
           (np.NaN, 35, 'Uk', 'Geu'),
           ('Jeet', 35, 'Guj', 'Gehu'),
           (np.NaN, np.NaN, np.NaN, np.NaN)
            ]
  
# Create a DataFrame object from list of tuples 
# with columns and indices.
details = pd.DataFrame(students, columns =['Name', 'Age',
                                           'Place', 'College'],
                        index =['a', 'b', 'c', 'd', 'e', 
                                'f', 'g', 'i', 'j', 'k'])
  
# show the boolean dataframe            
print(" \nshow the boolean Dataframe : \n\n", details.isnull())
  
# Count total NaN at each column in a DataFrame
print(" \nCount total NaN at each column in a DataFrame : \n\n",
      details.isnull().sum())

Output:

Example 2 : Count total NaN at each row in DataFrame .

# import numpy library as np
import numpy as np
  
# import pandas library as pd
import pandas as pd
  
  
# List of Tuples
students = [('Ankit', 22, 'Up', 'Geu'),
           ('Ankita', np.NaN, 'Delhi', np.NaN),
           ('Rahul', 16, 'Tokyo', 'Abes'),
           ('Simran', 41, 'Delhi', 'Gehu'),
           ('Shaurya', np.NaN, 'Delhi', 'Geu'),
           ('Shivangi', 35, 'Mumbai', np.NaN ),
           ('Swapnil', 35, np.NaN, 'Geu'),
           (np.NaN, 35, 'Uk', 'Geu'),
           ('Jeet', 35, 'Guj', 'Gehu'),
           (np.NaN, np.NaN, np.NaN, np.NaN)
            ]
  
# Create a DataFrame object from
# list of tuples with columns
# and indices.
details = pd.DataFrame(students, columns =['Name', 'Age', 
                                           'Place', 'College'],
                        index =['a', 'b', 'c', 'd', 'e',
                                'f', 'g', 'i', 'j', 'k'])
  
# show the boolean dataframe            
print(" \nshow the boolean Dataframe : \n\n", details.isnull())
  
# index attribute of a dataframe
# gives index list 
  
# Count total NaN at each row in a DataFrame
for i in range(len(details.index)) :
    print(" Total NaN in row", i + 1, ":",
          details.iloc[i].isnull().sum())

Output:

Example 3 : Count total NaN in DataFrame .

# import numpy library as np
import numpy as np
  
# import pandas library as pd
import pandas as pd
  
  
# List of Tuples
students = [('Ankit', 22, 'Up', 'Geu'),
           ('Ankita', np.NaN, 'Delhi', np.NaN),
           ('Rahul', 16, 'Tokyo', 'Abes'),
           ('Simran', 41, 'Delhi', 'Gehu'),
           ('Shaurya', np.NaN, 'Delhi', 'Geu'),
           ('Shivangi', 35, 'Mumbai', np.NaN ),
           ('Swapnil', 35, np.NaN, 'Geu'),
           (np.NaN, 35, 'Uk', 'Geu'),
           ('Jeet', 35, 'Guj', 'Gehu'),
           (np.NaN, np.NaN, np.NaN, np.NaN)
            ]
  
# Create a DataFrame object from
# list of tuples with columns
# and indices.
details = pd.DataFrame(students, columns =['Name', 'Age', 
                                           'Place', 'College'],
                        index =['a', 'b', 'c', 'd', 'e',
                                'f', 'g', 'i', 'j', 'k'])
  
# show the boolean dataframe            
print(" \nshow the boolean Dataframe : \n\n", details.isnull())
  
# Count total NaN in a DataFrame
print(" \nCount total NaN in a DataFrame : \n\n",
       details.isnull().sum().sum())

Output:

Count NaN or missing values in Pandas DataFrame

Dataframe.isnull() method

dataframe.sum() method

Java Program for Longest Common Subsequence

Maximum height of Tree when any Node can be considered as Root

Print Fibonacci sequence using 2 variables

LEAVE A REPLY Cancel reply

Most Popular

8 Best VPNs for Apple TV in 2024: Fast & Secure by Penka Hristovska

Samsung offers free screen replacements for users still suffering green line issues

7 Best Free Antiviruses for Mac in 2024: Are They Any Good? by Katarina Glamoslija

Is Microsoft Teams Secure? Use Teams Safely in 2024 by Tyler Cross

Recent Comments

EDITOR PICKS

8 Best VPNs for Apple TV in 2024: Fast & Secure by Penka Hristovska

Samsung offers free screen replacements for users still suffering green line issues

7 Best Free Antiviruses for Mac in 2024: Are They Any Good? by Katarina Glamoslija

POPULAR POSTS

8 Best VPNs for Apple TV in 2024: Fast & Secure by Penka Hristovska

Samsung offers free screen replacements for users still suffering green line issues

7 Best Free Antiviruses for Mac in 2024: Are They Any Good? by Katarina Glamoslija

POPULAR CATEGORY

ABOUT US

FOLLOW US