Python is a great language for data analysis, primarily because of its ecosystem of data-centric Python packages. Pandas is one of those packages and makes importing and analyzing data much easier.
The Pandas where() method checks a data frame against one or more conditions and returns a result of the same shape. By default, entries that do not satisfy the condition are replaced with NaN.
Syntax:
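DataFrame.where(cond, other=nan, inplace=False, axis=None, level=None, errors='raise', try_cast=False, raise_on_error=None)
This is the signature from older pandas releases; recent versions no longer accept errors, try_cast and raise_on_error.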
Parameters:
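cond: One or more conditions to check the data frame against. Entries where cond is True keep their original value.
other: Value used to replace entries that don't satisfy the condition. Default is NaN.
inplace: Boolean value. If True, the changes are made in the data frame itself instead of returning a new one.
axis: Axis (rows or columns) along which to align other, if alignment is needed.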
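To make the cond and other parameters concrete, here is a minimal sketch on a small in-memory data frame. The frame df and its columns a and b are hypothetical and are not taken from nba.csv.

```python
import pandas as pd

# Hypothetical toy data, not taken from nba.csv
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]})

# Element-wise condition: keep values greater than 2, NaN elsewhere
masked = df.where(df > 2)

# Same condition, but failing entries are filled with -1 instead of NaN
filled = df.where(df > 2, other=-1)

print(masked)
print(filled)
```

Neither call changes df itself, because inplace is left at its default of False.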
The examples below read their data from a CSV file named nba.csv.
Example #1: Single Condition operation
In this example, rows with a particular Team name are kept and all other rows are replaced by NaN using the .where() method.
# importing pandas package
import pandas as pd

# making data frame from csv file
data = pd.read_csv("nba.csv")

# sorting dataframe by team name
data.sort_values("Team", inplace=True)

# making boolean series for a team name
team_filter = data["Team"] == "Atlanta Hawks"

# filtering data: rows where the series is False become NaN
data.where(team_filter, inplace=True)

# display
print(data)
Output:
In the output, every row whose Team is not Atlanta Hawks is replaced with NaN.
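Because .where() keeps the shape of the data frame, the non-matching rows are still present, just filled with NaN. The sketch below contrasts this with plain boolean indexing, which drops those rows. The players frame is a hypothetical stand-in for nba.csv, with made-up names and ages.

```python
import pandas as pd

# Hypothetical stand-in for nba.csv (made-up rows)
players = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Team": ["Atlanta Hawks", "Boston Celtics", "Atlanta Hawks"],
    "Age": [23, 27, 30],
})
is_hawks = players["Team"] == "Atlanta Hawks"

# where(): same shape as the input, non-matching rows are all NaN
masked = players.where(is_hawks)

# Boolean indexing: only the matching rows are returned
selected = players[is_hawks]

# Dropping the all-NaN rows from the where() result leaves the same rows
also_selected = masked.dropna(how="all")

print(masked)
print(selected)
```

Which form is preferable depends on whether the original row positions need to be preserved.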
Example #2: Multi-condition Operations
Data is filtered on the basis of both Team and Age. Only the rows with Team name "Atlanta Hawks" and player age above 24 keep their values; every other row is replaced with NaN.
# importing pandas package
import pandas as pd

# making data frame from csv file
data = pd.read_csv("nba.csv")

# sorting dataframe by team name
data.sort_values("Team", inplace=True)

# making boolean series for a team name
filter1 = data["Team"] == "Atlanta Hawks"

# making boolean series for age
filter2 = data["Age"] > 24

# filtering data on basis of both filters
data.where(filter1 & filter2, inplace=True)

# display
print(data)
Output:
In the output, only the rows with Team name "Atlanta Hawks" and player age above 24 keep their values; all other rows are replaced with NaN.
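Boolean series can also be combined with | (or) and ~ (not), in addition to &. The sketch below uses a hypothetical players frame with made-up rows rather than nba.csv. Note that when the comparisons are written inline, each one needs its own parentheses, because & and | bind more tightly than == and >.

```python
import pandas as pd

# Hypothetical stand-in for nba.csv (made-up rows)
players = pd.DataFrame({
    "Name": ["P1", "P2", "P3", "P4"],
    "Team": ["Atlanta Hawks", "Atlanta Hawks", "Boston Celtics", "Boston Celtics"],
    "Age": [22, 28, 26, 21],
})

is_hawks = players["Team"] == "Atlanta Hawks"
over_24 = players["Age"] > 24

# Both conditions must hold
both = players.where(is_hawks & over_24)

# At least one condition must hold
either = players.where(is_hawks | over_24)

# Negated condition
not_hawks = players.where(~is_hawks)

# Inline form of "both": each comparison is wrapped in parentheses
inline = players.where(
    (players["Team"] == "Atlanta Hawks") & (players["Age"] > 24)
)

print(both)
print(either)
print(not_hawks)
```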