Sunday, November 17, 2024

How to count the number of NaN values in Pandas?

We often need to count the NaN values in each feature of a dataset before deciding how to handle them. For example, if only a few values are missing, we may choose to drop those observations; if most entries in a column are missing, we may decide to exclude that variable altogether. This article shows how to count NaN values with Pandas in Python.

Count NaN values using isnull()

The Pandas isnull() function detects missing values in a Series or DataFrame. It returns a boolean object of the same size indicating whether each value is NA: missing values map to True and non-missing values map to False. Calling sum() on the result of isnull() returns the count of True values, which is the number of NaN values.
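The two steps described above (build the boolean mask, then sum it) can be sketched as follows. This is a minimal illustration on the same sample data used in the examples; the variable names mask and nan_per_column are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

# Same sample data as the examples in this article
data = pd.DataFrame({'A': [1, 4, 6, 9],
                     'B': [np.nan, 5, 8, np.nan],
                     'C': [7, 3, np.nan, 2],
                     'D': [1, np.nan, np.nan, np.nan]})

# Boolean mask with the same shape as data: True marks a missing value
mask = data.isnull()
print(mask)

# True counts as 1 when summed, so this gives the NaN count per column
nan_per_column = mask.sum()
print(nan_per_column)
```

Summing the mask of a whole DataFrame returns one count per column, which is usually the first thing to look at when deciding how to treat each feature.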

Example 1: Count NaN values of Columns

We can simply find the null values in the desired column, then get the sum.

Python3




import pandas as pd
import numpy as np
 
# dictionary of lists
# (np.nan is used because the np.NaN alias was removed in NumPy 2.0)
data_dict = {'A': [1, 4, 6, 9],
             'B': [np.nan, 5, 8, np.nan],
             'C': [7, 3, np.nan, 2],
             'D': [1, np.nan, np.nan, np.nan]}
 
# creating dataframe from the
# dictionary
data = pd.DataFrame(data_dict)
 
# total NaN values in column 'B'
print(data['B'].isnull().sum())


Output :

2

Example 2: Count NaN values of a row

The row can be selected using loc or iloc. Then we find the sum as before.

Python3




import pandas as pd
import numpy as np
 
# dictionary of lists
data_dict = {'A': [1, 4, 6, 9],
             'B': [np.nan, 5, 8, np.nan],
             'C': [7, 3, np.nan, 2],
             'D': [1, np.nan, np.nan, np.nan]}
 
# creating dataframe from the
# dictionary
data = pd.DataFrame(data_dict)
 
# total NaN values in the row with index label 1
print(data.loc[1, :].isnull().sum())


Output :

1

Example 3: Count NaN values of entire Pandas DataFrame

To count NaN in the entire dataset, we just need to call the sum() function twice – once for getting the count in each column and again for finding the total sum of all the columns. 

Python3




import pandas as pd
import numpy as np
 
# dictionary of lists
data_dict = {'A': [1, 4, 6, 9],
             'B': [np.nan, 5, 8, np.nan],
             'C': [7, 3, np.nan, 2],
             'D': [1, np.nan, np.nan, np.nan]}
 
# creating dataframe from the
# dictionary
data = pd.DataFrame(data_dict)
 
# total count of NaN values:
# first sum() counts per column, second sum() adds the columns
print(data.isnull().sum().sum())


Output :

6
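The same mask can also be summed across rows instead of down columns. The sketch below assumes the sample data from the examples above; nan_per_row and total_nan are illustrative names. It uses the axis=1 argument of sum() for per-row counts and, as one alternative to chaining sum() twice, converts the mask to a NumPy array before summing.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'A': [1, 4, 6, 9],
                     'B': [np.nan, 5, 8, np.nan],
                     'C': [7, 3, np.nan, 2],
                     'D': [1, np.nan, np.nan, np.nan]})

# axis=1 sums across the columns, giving one NaN count per row
nan_per_row = data.isnull().sum(axis=1)
print(nan_per_row)

# Alternative total: sum every cell of the mask as a NumPy array
total_nan = int(data.isnull().to_numpy().sum())
print(total_nan)
```

Per-row counts are handy when the decision is about dropping observations rather than dropping features.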

Count NaN values using isna()

The Pandas DataFrame.isna() function is used to detect missing values. It returns a boolean object of the same size indicating whether each value is NA. NA values, such as None or numpy.nan, get mapped to True; everything else gets mapped to False.
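As a small sketch of what counts as NA, the Series below (a made-up example, not part of the article's dataset) mixes a number, None, NaN, and an empty string. In pandas, isnull() is an alias of isna(), so both calls return the same mask.

```python
import numpy as np
import pandas as pd

# Hypothetical Series for illustration; dtype=object keeps None as None
s = pd.Series([1.0, None, np.nan, ''], dtype=object)

# None and NaN are flagged as missing; the empty string is not
na_mask = s.isna()
print(na_mask)

# isnull() is an alias of isna(), so the two masks are identical
print(na_mask.equals(s.isnull()))
```

Because the two methods are interchangeable, the choice between them is a matter of style; the examples in this section use isna().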

Example 1: Count NaN values of a column

We can simply find the null values in the desired column by passing the column name in df["column_name"].

Python3




import pandas as pd
import numpy as np
 
data = {'A': [1, 4, 6, 9],
        'B': [np.nan, 5, 8, np.nan],
        'C': [7, 3, np.nan, 2],
        'D': [1, np.nan, np.nan, np.nan]}
 
df = pd.DataFrame(data, columns=['A', 'B', 'C', 'D'])
 
# number of NaN values in column 'D'
count_nan = df['D'].isna().sum()
 
print('Count of NaN: ' + str(count_nan))


Output :

Count of NaN: 3

Example 2: Count NaN values of a row

We can simply find the null values in the desired row by passing its index label to loc[[row_index]].

Python3




import pandas as pd
import numpy as np
 
data = {'A': [1, 4, 6, 9],
        'B': [np.nan, 5, 8, np.nan],
        'C': [7, 3, np.nan, 2],
        'D': [1, np.nan, np.nan, np.nan]}
 
df = pd.DataFrame(data, columns=['A', 'B', 'C', 'D'])
 
# loc[[3]] returns a one-row DataFrame, so sum() is called twice
count_nan = df.loc[[3]].isna().sum().sum()
 
print('Count of NaN: ' + str(count_nan))


Output :

Count of NaN: 2

Example 3: Count NaN values of entire Pandas DataFrame

To count NaN in the entire dataset, we just need to call isna().sum().sum(). Here sum() is called twice: once to get the count in each column and again to add up the counts of all the columns.

Python3




import pandas as pd
import numpy as np
 
data = {'A': [1, 4, 6, 9],
        'B': [np.nan, 5, 8, np.nan],
        'C': [7, 3, np.nan, 2],
        'D': [1, np.nan, np.nan, np.nan]}
 
df = pd.DataFrame(data, columns=['A', 'B', 'C', 'D'])
 
# first sum() counts per column, second sum() adds the columns
count_nan = df.isna().sum().sum()
 
print('Count of NaN: ' + str(count_nan))


Output :

Count of NaN: 6
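When the goal is to decide whether a column is worth keeping, the share of missing values is often easier to read than the raw count. A minimal sketch, assuming the same sample data: taking the mean of the boolean mask gives the fraction of NaN per column. The 0.5 cutoff and the names missing_share and mostly_missing are arbitrary choices for illustration, not a recommendation.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 4, 6, 9],
                   'B': [np.nan, 5, 8, np.nan],
                   'C': [7, 3, np.nan, 2],
                   'D': [1, np.nan, np.nan, np.nan]})

# Mean of a boolean mask = fraction of True values, i.e. share of NaN
missing_share = df.isna().mean()
print(missing_share)

# Columns where more than half of the entries are missing
# (0.5 is an arbitrary threshold used only for this example)
mostly_missing = missing_share[missing_share > 0.5].index.tolist()
print(mostly_missing)
```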

Get Details about the Dataset

We can use the describe() method, which returns a table of summary statistics for the numeric columns of the dataset. The count row of that table gives the number of non-NaN values in each column. So the number of NaN values in a column is the total number of observations minus its count.
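The subtraction described above can be sketched as follows, assuming the same sample data as the rest of the article. The names non_nan and nan_counts are illustrative. The sketch also shows DataFrame.count(), which returns the same non-NaN counts directly without computing the other statistics.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'A': [1, 4, 6, 9],
                     'B': [np.nan, 5, 8, np.nan],
                     'C': [7, 3, np.nan, 2],
                     'D': [1, np.nan, np.nan, np.nan]})

# The 'count' row of describe() holds the non-NaN count per column
non_nan = data.describe().loc['count']

# Total observations minus non-NaN count = NaN count per column
nan_counts = (len(data) - non_nan).astype(int)
print(nan_counts)

# count() gives the non-NaN counts directly
print(len(data) - data.count())
```

Note that describe() only summarizes numeric columns by default, so count() or isna().sum() is the safer choice when the dataset also has text columns.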

Python3




import pandas as pd
import numpy as np
 
# dictionary of lists
data_dict = {'A': [1, 4, 6, 9],
             'B': [np.nan, 5, 8, np.nan],
             'C': [7, 3, np.nan, 2],
             'D': [1, np.nan, np.nan, np.nan]}
 
# creating dataframe from the
# dictionary
data = pd.DataFrame(data_dict)
 
# summary statistics; the 'count' row is the non-NaN count per column
print(data.describe())


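Output:

A summary table with one column per feature; its count row shows 4, 2, 3, and 1 non-NaN values for columns A, B, C, and D.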
