Indexing in Pandas means selecting rows and columns of data from a DataFrame. You can select all the rows and a subset of the columns, a subset of the rows and all the columns, or a subset of both rows and columns. Indexing is also known as subset selection.
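The three kinds of selection described above can be sketched on a small frame. This is only an illustration: the three-row data and the variable names are hypothetical, and the .loc[] indexer it uses is covered later in this article.

```python
import pandas as pd

# Hypothetical three-row frame, used only for illustration
df = pd.DataFrame(
    {"Name": ["A", "B", "C"], "Age": [30, 25, 41], "City": ["X", "Y", "Z"]}
)

# All the rows, a subset of the columns
all_rows_some_cols = df.loc[:, ["Name", "City"]]

# A subset of the rows, all the columns
some_rows_all_cols = df.loc[[0, 2], :]

# A subset of both rows and columns
some_rows_some_cols = df.loc[[0, 2], ["Name", "Age"]]
```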
Creating a DataFrame to Select Rows & Columns in Pandas
The DataFrame is built from a list of tuples, with the column names ‘Name’, ‘Age’, ‘City’, and ‘Salary’.
Python3
# import pandas
import pandas as pd

# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

# Create a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# Show the dataframe
df
Output:
Name Age City Salary
0 Stuti 28 Varanasi 20000
1 Saumya 32 Delhi 25000
2 Aaditya 25 Mumbai 40000
3 Saumya 32 Delhi 35000
4 Saumya 32 Delhi 30000
5 Saumya 32 Mumbai 20000
6 Aaditya 40 Dehradun 24000
7 Seema 32 Delhi 70000
Select Columns by Name in Pandas DataFrame using [ ]
The [ ] operator selects a column by its name. Passing a list of names selects several columns at once.
Example 1:
Select a single column.
Python3
# import pandas
import pandas as pd

# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

# Create a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# Using the operator []
# to select a column
result = df["City"]

# Show the dataframe
result
Output:
0 Varanasi
1 Delhi
2 Mumbai
3 Delhi
4 Delhi
5 Mumbai
6 Dehradun
7 Delhi
Name: City, dtype: object
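The output above ends with "Name: City, dtype: object" because a single column name returns a pandas Series rather than a DataFrame. The sketch below, which uses only the first three rows of the data, contrasts single and double brackets; the variable names as_series and as_frame are illustrative and not part of the original example.

```python
import pandas as pd

# First three rows of the employee data, enough for the comparison
employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000)]
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# A single name inside [] gives a one-dimensional Series
as_series = df["City"]

# A one-element list inside [] gives a one-column DataFrame
as_frame = df[["City"]]

print(type(as_series).__name__)
print(type(as_frame).__name__)
```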
Example 2:
Select multiple columns.
Python3
# import pandas
import pandas as pd

# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

# Create a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# Using the operator [] to
# select multiple columns
result = df[["Name", "Age", "Salary"]]

# Show the dataframe
result
Output:
Name Age Salary
0 Stuti 28 20000
1 Saumya 32 25000
2 Aaditya 25 40000
3 Saumya 32 35000
4 Saumya 32 30000
5 Saumya 32 20000
6 Aaditya 40 24000
7 Seema 32 70000
Select Rows by Name in Pandas DataFrame using loc
The .loc[] indexer selects data by the labels of rows or columns. It can select a subset of rows, a subset of columns, or both, and it accepts a single label, a list of labels, or a slice.
Example 1:
Select a single row.
Python3
# import pandas
import pandas as pd

# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

# Create a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# Set 'Name' column as index
# on a Dataframe
df.set_index("Name", inplace=True)

# Using the operator .loc[]
# to select single row
result = df.loc["Stuti"]

# Show the dataframe
result
Output:
Age 28
City Varanasi
Salary 20000
Name: Stuti, dtype: object
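The label "Stuti" appears only once in the index, so the result above is a single Series. The 'Name' column in this data also contains repeated values, and as a hedged illustration, selecting a repeated label such as "Saumya" should return every matching row as a DataFrame. The variable names indexed and saumya_rows are introduced here for the sketch, and set_index is called without inplace so that df itself is left unchanged.

```python
import pandas as pd

employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# Returns a new frame indexed by 'Name'; df keeps its integer index
indexed = df.set_index("Name")

# "Saumya" occurs four times, so .loc[] returns all four rows
saumya_rows = indexed.loc["Saumya"]

print(saumya_rows)
```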
Example 2:
Select multiple rows.
Python3
# import pandas
import pandas as pd

# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

# Create a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# Set index on a Dataframe
df.set_index("Name", inplace=True)

# Using the operator .loc[]
# to select multiple rows
result = df.loc[["Stuti", "Seema"]]

# Show the dataframe
result
Output:
Age City Salary
Name
Stuti 28 Varanasi 20000
Seema 32 Delhi 70000
Example 3:
Select multiple rows and particular columns.
Syntax: DataFrame.loc[["row1", "row2", ...], ["column1", "column2", "column3", ...]]
Python3
# import pandas
import pandas as pd

# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

# Create a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# Set 'Name' column as index
# on a Dataframe
df.set_index("Name", inplace=True)

# Using the operator .loc[] to
# select multiple rows with some
# particular columns
result = df.loc[["Stuti", "Seema"],
                ["City", "Salary"]]

# Show the dataframe
result
Output:
City Salary
Name
Stuti Varanasi 20000
Seema Delhi 70000
Example 4:
Select all the rows with some particular columns. A single colon ( : ) in the row position selects all rows, and the list after the comma names the columns to select:
Syntax: DataFrame.loc[:, ["column1", "column2", "column3"]]
Python3
# import pandas
import pandas as pd

# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

# Creating a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# Set 'Name' column as index
# on a Dataframe
df.set_index("Name", inplace=True)

# Using the operator .loc[] to
# select all the rows with
# some particular columns
result = df.loc[:, ["City", "Salary"]]

# Show the dataframe
result
Output:
City Salary
Name
Stuti Varanasi 20000
Saumya Delhi 25000
Aaditya Mumbai 40000
Saumya Delhi 35000
Saumya Delhi 30000
Saumya Mumbai 20000
Aaditya Dehradun 24000
Seema Delhi 70000
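Besides labels and lists of labels, .loc[] also accepts a boolean mask in the row position. The sketch below is an addition to the examples above rather than part of them: it keeps only the rows whose salary exceeds 30000, a threshold chosen purely for illustration, and the names indexed and high_paid are hypothetical.

```python
import pandas as pd

employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])
indexed = df.set_index("Name")

# Boolean Series: True where the salary is strictly above 30000
mask = indexed["Salary"] > 30000

# Rows where the mask is True, restricted to two columns
high_paid = indexed.loc[mask, ["City", "Salary"]]

print(high_paid)
```

The mask must be aligned with the row index of the frame it filters, which is why it is built from indexed rather than from df.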
Select Rows by Index in Pandas DataFrame using iloc
The .iloc[] indexer selects data by position. It is similar to the .loc[] indexer, but it takes integer positions, counted from 0, instead of labels.
Example 1:
Select a single row.
Python3
# import pandas
import pandas as pd

# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

# Create a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# Using the operator .iloc[]
# to select single row
result = df.iloc[2]

# Show the dataframe
result
Output:
Name Aaditya
Age 25
City Mumbai
Salary 40000
Name: 2, dtype: object
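In the output above the position 2 happens to coincide with the label 2, because the frame uses the default integer index. To suggest that .iloc[] really counts positions and ignores labels, the following sketch sets 'Name' as the index first; the variable names indexed and third_row are assumptions made for this illustration.

```python
import pandas as pd

employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# The row labels are now names, not integers
indexed = df.set_index("Name")

# Position 2 is still the third row, whatever its label is
third_row = indexed.iloc[2]

print(third_row.name)
print(third_row["City"])
```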
Example 2:
Select multiple rows.
Python3
# import pandas
import pandas as pd

# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

# Create a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# Using the operator .iloc[]
# to select multiple rows
result = df.iloc[[2, 3, 5]]

# Show the dataframe
result
Output:
Name Age City Salary
2 Aaditya 25 Mumbai 40000
3 Saumya 32 Delhi 35000
5 Saumya 32 Mumbai 20000
Example 3:
Select multiple rows with some particular columns.
Python3
# import pandas
import pandas as pd

# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

# Creating a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# Using the operator .iloc[]
# to select multiple rows with
# some particular columns
result = df.iloc[[2, 3, 5], [0, 1]]

# Show the dataframe
result
Output:
Name Age
2 Aaditya 25
3 Saumya 32
5 Saumya 32
Example 4:
Select all the rows with some particular columns.
Python3
# import pandas
import pandas as pd

# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

# Create a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# Using the operator .iloc[]
# to select all the rows with
# some particular columns
result = df.iloc[:, [0, 1]]

# Show the dataframe
result
Output:
Name Age
0 Stuti 28
1 Saumya 32
2 Aaditya 25
3 Saumya 32
4 Saumya 32
5 Saumya 32
6 Aaditya 40
7 Seema 32
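The examples above pass lists of positions, but .iloc[] also accepts slices and negative positions. The sketch below is an addition for comparison: an .iloc[] slice excludes its end position, like ordinary Python slicing, whereas a .loc[] slice includes its end label. The variable names block, last_row, and by_label are illustrative.

```python
import pandas as pd

employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])

# Positions 1, 2, 3 and the first two columns; the end (4) is excluded
block = df.iloc[1:4, 0:2]

# A negative position counts from the end, so -1 is the last row
last_row = df.iloc[-1]

# With .loc[] the slice is by label and the end label (4) is included
by_label = df.loc[1:4]

print(len(block), len(by_label))
```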