This article shows how to filter a pandas DataFrame based on its index using the filter() method. The method subsets the rows or columns of a DataFrame according to the labels on the specified axis. It looks only at labels, never at the values stored in the DataFrame. The syntax is given below.
Syntax: DataFrame.filter(items=None, like=None, regex=None, axis=None)
Parameters:
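- items : List of labels to keep on the chosen axis. Not every label has to be present; labels that are missing from the axis are ignored.
- like : Keep the labels for which "like in label == True", that is, labels that contain the given string.
- regex : Keep the labels for which re.search(regex, label) == True.
- axis : The axis to filter on: 0 or 'index' for rows, 1 or 'columns' for columns. By default this is the info axis, which is 'index' for a Series and 'columns' for a DataFrame, so pass axis=0 to filter a DataFrame by its index.
Returns : An object of the same type as the input object.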
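Before the full examples, it may help to see the three selectors side by side. The following is a minimal sketch on a made-up frame; the labels row_a, row_b and other are hypothetical and are not used elsewhere in this article. It also shows that pandas rejects a call that combines more than one of items, like and regex.

```python
import pandas as pd

# Hypothetical frame, used only to compare the three selectors
df = pd.DataFrame({"value": [1, 2, 3]}, index=["row_a", "row_b", "other"])

# items: keep rows whose label appears in the list
by_items = df.filter(items=["row_a"], axis=0)

# like: keep rows whose label contains the substring
by_like = df.filter(like="row", axis=0)

# regex: keep rows whose label matches the pattern under re.search
by_regex = df.filter(regex=r"_b$", axis=0)

# The selectors are mutually exclusive: combining them raises TypeError
try:
    df.filter(items=["row_a"], like="row", axis=0)
    combined_ok = True
except TypeError:
    combined_ok = False

print(by_items.index.tolist())
print(by_like.index.tolist())
print(by_regex.index.tolist())
print(combined_ok)
```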
Example 1:
The following program filters a DataFrame by numeric index labels.
Python
# import pandas
import pandas as pd

# define data
data = {
    "Name": ["Mukul", "Suraj", "Rohit", "Rahul", "Mohit",
             "Nishu", "Rishi", "Manoj", "Mukesh", "Rohan"],
    "Age": [22, 23, 25, 21, 27, 24, 26, 23, 21, 27],
    "Qualification": ["BBA", "BCA", "BBA", "BBA", "MBA",
                      "BCA", "MBA", "BBA", "BCA", "MBA"],
}

# define dataframe
df = pd.DataFrame(data, columns=['Name', 'Age', 'Qualification'])

# display original dataframe
print("\n Original Dataframe \n", df)

# filter 5 index value
df_1 = df.filter(items=[5], axis=0)

# display result
print("\n Display only 5 index value \n", df_1)

# filter only 5 and 8 index value
df_2 = df.filter(items=[5, 8], axis=0)

# display result
print("\n Display only 5 and 8 index value \n", df_2)
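Output: The program prints the original DataFrame, then df_1, which holds only the row with index label 5 (Nishu, 24, BCA), and then df_2, which holds the rows with index labels 5 and 8 (Nishu and Mukesh).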
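Because items does not require every label to exist, filter() behaves differently from label lookup with .loc when a label is missing. The sketch below illustrates this on a shortened, three-row version of the data; the label 99 is deliberately absent and is only there for illustration.

```python
import pandas as pd

# Shortened frame with the default integer index (0, 1, 2)
df = pd.DataFrame({"Name": ["Mukul", "Suraj", "Rohit"], "Age": [22, 23, 25]})

# filter() keeps the labels it finds and silently skips the missing label 99
found = df.filter(items=[1, 99], axis=0)

# .loc is stricter: a missing label in the list raises KeyError
try:
    df.loc[[1, 99]]
    loc_raised = False
except KeyError:
    loc_raised = True

print(found)
print(loc_raised)
```

The lenient behaviour is convenient when the label list comes from another source, but it can also hide typos in labels, so .loc may be the safer choice when every label is expected to exist.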
Example 2:
The following program filters a DataFrame by non-numeric (string) index labels.
Python
# import pandas
import pandas as pd

# define data
data = {
    "Name": ["Mukul", "Suraj", "Rohit", "Rahul", "Mohit"],
    "Age": [22, 23, 25, 21, 27],
    "Qualification": ["BBA", "BCA", "BBA", "BBA", "MBA"],
}

# define dataframe
df = pd.DataFrame(data, columns=['Name', 'Age', 'Qualification'],
                  index=['Person_A', 'Person_B', 'Person_C',
                         'Person_D', 'Person_E'])

# display original dataframe
print("\n Original Dataframe \n", df)

# filter Person_B index value
df_1 = df.filter(items=['Person_B'], axis=0)

# display result
print("\n Display only Person_B index value \n", df_1)

# filter only Person_B and Person_D index value
df_2 = df.filter(items=['Person_B', 'Person_D'], axis=0)

# display result
print("\n Display Person_B and Person_D index value \n", df_2)
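Output: The program prints the original DataFrame, then df_1, which holds only the row labelled Person_B (Suraj, 23, BCA), and then df_2, which holds the rows labelled Person_B and Person_D (Suraj and Rahul).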
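The axis=0 argument matters in these calls. As a hedged illustration on a two-row version of the data, the sketch below shows what happens when it is left out: the label is searched among the column names, nothing matches, and the result keeps every row but no columns.

```python
import pandas as pd

# Two-row version of the example frame with string index labels
df = pd.DataFrame(
    {"Name": ["Mukul", "Suraj"], "Age": [22, 23]},
    index=["Person_A", "Person_B"],
)

# Without axis=0 the label is looked up among the column names
no_axis = df.filter(items=["Person_B"])

# With axis=0 the label is looked up in the index
with_axis = df.filter(items=["Person_B"], axis=0)

print(no_axis.shape)    # rows are kept, but no column matched
print(with_axis.shape)  # one matching row with both columns
```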
Example 3:
The following program filters a DataFrame for index labels that contain a specific string.
Python
# import pandas
import pandas as pd

# define data
data = {
    "Name": ["Mukul", "Suraj", "Rohit", "Rahul", "Mohit"],
    "Age": [22, 23, 25, 21, 27],
    "Qualification": ["BBA", "BCA", "BBA", "BBA", "MBA"],
}

# define dataframe
df = pd.DataFrame(data, columns=['Name', 'Age', 'Qualification'],
                  index=['Person_A', 'Person_B', 'Person_AB',
                         'Person_c', 'Person_AC'])

# display original dataframe
print("\n Original Dataframe \n", df)

# keep the rows whose index label contains the string Person_A
df_1 = df.filter(like='Person_A', axis=0)

# display result
print("\n display index that contain Person_A string \n", df_1)
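Output: The program prints the original DataFrame, then df_1, which holds the rows labelled Person_A, Person_AB and Person_AC (Mukul, Rohit and Mohit).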
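Note that the like match is case-sensitive, which is relevant here because one label in this example is written Person_c with a lowercase c. One possible way to ignore case, sketched below under the assumption that a regular expression is acceptable, is to use the regex parameter with Python's inline (?i) flag.

```python
import pandas as pd

# Same index labels as the example above, with a single column for brevity
df = pd.DataFrame(
    {"Age": [22, 23, 25, 21, 27]},
    index=["Person_A", "Person_B", "Person_AB", "Person_c", "Person_AC"],
)

# like is case-sensitive: no label contains the exact text "Person_C"
exact = df.filter(like="Person_C", axis=0)

# The inline flag (?i) makes the regular-expression search ignore case
ignore_case = df.filter(regex="(?i)person_c", axis=0)

print(exact.empty)
print(ignore_case.index.tolist())
```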
Example 4:
The following program filters a DataFrame for index labels that contain a specific character.
Python
# import pandas
import pandas as pd

# define data
data = {
    "Name": ["Mukul", "Suraj", "Rohit", "Rahul", "Mohit"],
    "Age": [22, 23, 25, 21, 27],
    "Qualification": ["BBA", "BCA", "BBA", "BBA", "MBA"],
}

# define dataframe
df = pd.DataFrame(data, columns=['Name', 'Age', 'Qualification'],
                  index=['Person_A', 'Person_B', 'Person_AB',
                         'Person_C', 'Person_BC'])

# display original dataframe
print("\n Original Dataframe \n", df)

# keep the rows whose index label contains a specific character
df_1 = df.filter(like='B', axis=0)

# display result
print("\n display all indexes that contain Specific character \n", df_1)
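Output: The program prints the original DataFrame, then df_1, which holds the rows labelled Person_B, Person_AB and Person_BC (Suraj, Rohit and Mohit).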
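The like parameter matches the character anywhere in the label. When its position matters, the regex parameter can be used instead. The following sketch reuses the index labels from this example and applies two illustrative patterns: one for labels that end in B and one for labels where B directly follows the underscore.

```python
import pandas as pd

# Same index labels as the example above, with a single column for brevity
df = pd.DataFrame(
    {"Age": [22, 23, 25, 21, 27]},
    index=["Person_A", "Person_B", "Person_AB", "Person_C", "Person_BC"],
)

# "B" anywhere in the label
contains_b = df.filter(like="B", axis=0)

# label ends with "B" ($ anchors the match at the end)
ends_with_b = df.filter(regex="B$", axis=0)

# "B" immediately after the underscore
b_after_underscore = df.filter(regex="_B", axis=0)

print(contains_b.index.tolist())
print(ends_with_b.index.tolist())
print(b_after_underscore.index.tolist())
```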