Sunday, September 22, 2024

Select Rows & Columns by Name or Index in Pandas DataFrame using [ ], loc & iloc

Indexing in Pandas means selecting rows and columns of data from a DataFrame. A selection can take all the rows with particular columns, particular rows with all the columns, or particular rows and particular columns together. Indexing is also known as subset selection.

Creating a Dataframe to Select Rows & Columns in Pandas

The examples below build a DataFrame from a list of tuples, with the column names 'Name', 'Age', 'City', and 'Salary'.

Python3




# import pandas
import pandas as pd
 
# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
            ('Saumya', 32, 'Delhi', 25000),
            ('Aaditya', 25, 'Mumbai', 40000),
            ('Saumya', 32, 'Delhi', 35000),
            ('Saumya', 32, 'Delhi', 30000),
            ('Saumya', 32, 'Mumbai', 20000),
            ('Aaditya', 40, 'Dehradun', 24000),
            ('Seema', 32, 'Delhi', 70000)
            ]
 
# Create a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age',
                           'City', 'Salary'])
# Show the dataframe
df


Output:

      Name  Age      City  Salary
0    Stuti   28  Varanasi   20000
1   Saumya   32     Delhi   25000
2  Aaditya   25    Mumbai   40000
3   Saumya   32     Delhi   35000
4   Saumya   32     Delhi   30000
5   Saumya   32    Mumbai   20000
6  Aaditya   40  Dehradun   24000
7    Seema   32     Delhi   70000
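
Before selecting anything, it can help to look at the labels that the selection operators will work with. The following is a minimal sketch, assuming the same `employees` data as above; it only inspects the shape, the column labels, and the default row labels of the DataFrame.

```python
import pandas as pd

employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

df = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Salary'])

# Number of rows and columns
print(df.shape)

# Column labels, as used by [] and .loc[]
print(list(df.columns))

# Row labels; without an explicit index this is a RangeIndex 0..7
print(df.index)
```

Since no index was passed to the constructor, the row labels here are the integers 0 to 7, so labels and positions happen to coincide until the index is changed.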

Select Columns by Name in Pandas DataFrame using [ ]

The [ ] operator selects columns by name. Passing a single column name returns a Series; passing a list of column names returns a DataFrame.

Example 1: 

Select a single column.

Python3




# import pandas
import pandas as pd
 
# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)
             ]
 
# Create a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age',
                           'City', 'Salary'])
 
# Using the operator []
# to select a column
result = df["City"]
 
# Show the selected column (a Series)
result


Output:

0    Varanasi
1       Delhi
2      Mumbai
3       Delhi
4       Delhi
5      Mumbai
6    Dehradun
7       Delhi
Name: City, dtype: object

Example 2: 

Select multiple columns. 

Python3




# import pandas
import pandas as pd
 
# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
            ('Saumya', 32, 'Delhi', 25000),
            ('Aaditya', 25, 'Mumbai', 40000),
            ('Saumya', 32, 'Delhi', 35000),
            ('Saumya', 32, 'Delhi', 30000),
            ('Saumya', 32, 'Mumbai', 20000),
            ('Aaditya', 40, 'Dehradun', 24000),
            ('Seema', 32, 'Delhi', 70000)
            ]
 
# Create a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age',
                           'City', 'Salary'])
 
# Using the operator [] to
# select multiple columns
result = df[["Name", "Age", "Salary"]]
 
# Show the dataframe
result


Output:

      Name  Age  Salary
0    Stuti   28   20000
1   Saumya   32   25000
2  Aaditya   25   40000
3   Saumya   32   35000
4   Saumya   32   30000
5   Saumya   32   20000
6  Aaditya   40   24000
7    Seema   32   70000
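
The two examples above differ only in the brackets, but they return different types. The sketch below, assuming the same `employees` data (shortened here for brevity), shows that a single name gives a Series while a one-element list gives a one-column DataFrame. The variable names `single` and `as_frame` are illustrative only.

```python
import pandas as pd

employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Seema', 32, 'Delhi', 70000)]

df = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Salary'])

# A single name inside [] returns a Series
single = df["City"]

# A list with one name inside [] returns a one-column DataFrame
as_frame = df[["City"]]

print(type(single).__name__)    # Series
print(type(as_frame).__name__)  # DataFrame
```

The distinction matters when later code expects DataFrame methods or a two-dimensional shape.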

Select Rows by Name in Pandas DataFrame using loc 

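The .loc[] indexer (it is an indexer used with square brackets, not a function) selects data by the labels of rows or columns. It can select a subset of rows, a subset of columns, or both, and it accepts a single label, a list of labels, a slice of labels, or a boolean mask.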

Example 1: 

Select a single row.

Python3




# import pandas
import pandas as pd
 
# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
            ('Saumya', 32, 'Delhi', 25000),
            ('Aaditya', 25, 'Mumbai', 40000),
            ('Saumya', 32, 'Delhi', 35000),
            ('Saumya', 32, 'Delhi', 30000),
            ('Saumya', 32, 'Mumbai', 20000),
            ('Aaditya', 40, 'Dehradun', 24000),
            ('Seema', 32, 'Delhi', 70000)
            ]
 
# Create a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age',
                           'City', 'Salary'])
 
# Set 'Name' column as index
# on a Dataframe
df.set_index("Name", inplace=True)
 
# Using the operator .loc[]
# to select single row
result = df.loc["Stuti"]
 
# Show the selected row (a Series)
result


Output:

Age             28
City      Varanasi
Salary       20000
Name: Stuti, dtype: object
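
The label "Stuti" appears once in the index, so the result is a single row returned as a Series. The example data also contains repeated names, and a repeated label behaves differently. The sketch below, assuming the same data, shows that .loc[] with a label that occurs several times returns every matching row as a DataFrame. The variable names are illustrative only.

```python
import pandas as pd

employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

df = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Salary'])
df = df.set_index("Name")

# A label that occurs once returns a Series
stuti_row = df.loc["Stuti"]

# A label that occurs four times returns a DataFrame with four rows
saumya_rows = df.loc["Saumya"]

print(type(stuti_row).__name__)
print(saumya_rows)
```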

Example 2: 

Select multiple rows. 

Python3




# import pandas
import pandas as pd
 
# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
            ('Saumya', 32, 'Delhi', 25000),
            ('Aaditya', 25, 'Mumbai', 40000),
            ('Saumya', 32, 'Delhi', 35000),
            ('Saumya', 32, 'Delhi', 30000),
            ('Saumya', 32, 'Mumbai', 20000),
            ('Aaditya', 40, 'Dehradun', 24000),
            ('Seema', 32, 'Delhi', 70000)
            ]
 
# Create a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age',
                           'City', 'Salary'])
 
# Set 'Name' column as index
# on a Dataframe
df.set_index("Name", inplace=True)
 
# Using the operator .loc[]
# to select multiple rows
result = df.loc[["Stuti", "Seema"]]
 
# Show the dataframe
result


Output:

       Age      City  Salary
Name
Stuti   28  Varanasi   20000
Seema   32     Delhi   70000

Example 3: 

Select multiple rows and particular columns.

Syntax:  Dataframe.loc[["row1", "row2"...], ["column1", "column2", "column3"...]]

Python3




# import pandas
import pandas as pd
 
# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
            ('Saumya', 32, 'Delhi', 25000),
            ('Aaditya', 25, 'Mumbai', 40000),
            ('Saumya', 32, 'Delhi', 35000),
            ('Saumya', 32, 'Delhi', 30000),
            ('Saumya', 32, 'Mumbai', 20000),
            ('Aaditya', 40, 'Dehradun', 24000),
            ('Seema', 32, 'Delhi', 70000)
            ]
 
# Create a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age',
                           'City', 'Salary'])
 
# Set 'Name' column as index
# on a Dataframe
df.set_index("Name", inplace=True)
 
# Using the operator .loc[] to
# select multiple rows with some
# particular columns
result = df.loc[["Stuti", "Seema"],
                ["City", "Salary"]]
 
# Show the dataframe
result


Output:

           City  Salary
Name
Stuti  Varanasi   20000
Seema     Delhi   70000
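
The same row-and-column syntax also works with single labels instead of lists. As a minimal sketch, assuming the same data and labels that occur only once in the index, passing one row label and one column label returns the single value at that intersection.

```python
import pandas as pd

employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Seema', 32, 'Delhi', 70000)]

df = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Salary'])
df = df.set_index("Name")

# One row label and one column label give a single value
city = df.loc["Stuti", "City"]
salary = df.loc["Seema", "Salary"]

print(city)
print(salary)
```

If the row label were repeated in the index, the same expression would return a Series of values rather than a scalar.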

Example 4:

Select all the rows with some particular columns. A single colon (:) in the row position selects all rows, and the list after the comma names the columns to select:

Syntax: Dataframe.loc[:, ["column1", "column2", "column3"]]

Python3




# import pandas
import pandas as pd
 
# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
            ('Saumya', 32, 'Delhi', 25000),
            ('Aaditya', 25, 'Mumbai', 40000),
            ('Saumya', 32, 'Delhi', 35000),
            ('Saumya', 32, 'Delhi', 30000),
            ('Saumya', 32, 'Mumbai', 20000),
            ('Aaditya', 40, 'Dehradun', 24000),
            ('Seema', 32, 'Delhi', 70000)
            ]
 
# Creating a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age',
                           'City', 'Salary'])
 
# Set 'Name' column as index
# on a Dataframe
df.set_index("Name", inplace=True)
 
# Using the operator .loc[] to
# select all the rows with
# some particular columns
result = df.loc[:, ["City", "Salary"]]
 
# Show the dataframe
result


Output:

             City  Salary
Name
Stuti    Varanasi   20000
Saumya      Delhi   25000
Aaditya    Mumbai   40000
Saumya      Delhi   35000
Saumya      Delhi   30000
Saumya     Mumbai   20000
Aaditya  Dehradun   24000
Seema       Delhi   70000
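
Besides labels and lists of labels, the row position of .loc[] also accepts a boolean mask. The following is a rough sketch of that variant, assuming the same data; the 30000 threshold and the names `mask` and `high_paid` are chosen purely for illustration.

```python
import pandas as pd

employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

df = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Salary'])
df = df.set_index("Name")

# Boolean Series: True for rows whose salary exceeds the threshold
mask = df["Salary"] > 30000

# Keep only the rows where the mask is True, and two columns
high_paid = df.loc[mask, ["City", "Salary"]]

print(high_paid)
```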

Select Rows by Index in Pandas DataFrame using iloc

The .iloc[] indexer selects by integer position. It is similar to the .loc[] indexer, but it takes integer positions (counted from 0) instead of labels.

Example 1:

Select a single row.

Python3




# import pandas
import pandas as pd
 
# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
            ('Saumya', 32, 'Delhi', 25000),
            ('Aaditya', 25, 'Mumbai', 40000),
            ('Saumya', 32, 'Delhi', 35000),
            ('Saumya', 32, 'Delhi', 30000),
            ('Saumya', 32, 'Mumbai', 20000),
            ('Aaditya', 40, 'Dehradun', 24000),
            ('Seema', 32, 'Delhi', 70000)
            ]
 
# Create a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age',
                           'City', 'Salary'])
 
# Using the operator .iloc[]
# to select single row
result = df.iloc[2]
 
# Show the selected row (a Series)
result


Output:

Name      Aaditya
Age            25
City       Mumbai
Salary      40000
Name: 2, dtype: object
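
In the output above, position 2 and label 2 are the same row only because the DataFrame uses the default integer index. The sketch below, assuming the same data, sets 'Name' as the index to show that .iloc[] keeps counting positions regardless of the labels, and that negative positions count from the end. The name `indexed` is illustrative only.

```python
import pandas as pd

employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Seema', 32, 'Delhi', 70000)]

df = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Salary'])
indexed = df.set_index("Name")

# Third row by position; its label is now a name, not the integer 2
third = indexed.iloc[2]

# Negative positions count from the end, as with Python lists
last = indexed.iloc[-1]

print(third.name)
print(last.name)
```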

Example 2: 

Select multiple rows.

Python3




# import pandas
import pandas as pd
 
# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)
             ]
 
# Create a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age',
                           'City', 'Salary'])
 
# Using the operator .iloc[]
# to select multiple rows
result = df.iloc[[2, 3, 5]]
 
# Show the dataframe
result


Output:

      Name  Age    City  Salary
2  Aaditya   25  Mumbai   40000
3   Saumya   32   Delhi   35000
5   Saumya   32  Mumbai   20000

Example 3: 

Select multiple rows with some particular columns.

Python3




# import pandas
import pandas as pd
 
# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
            ('Saumya', 32, 'Delhi', 25000),
            ('Aaditya', 25, 'Mumbai', 40000),
            ('Saumya', 32, 'Delhi', 35000),
            ('Saumya', 32, 'Delhi', 30000),
            ('Saumya', 32, 'Mumbai', 20000),
            ('Aaditya', 40, 'Dehradun', 24000),
            ('Seema', 32, 'Delhi', 70000)
            ]
 
# Creating a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age',
                           'City', 'Salary'])
 
# Using the operator .iloc[]
# to select multiple rows with
# some particular columns
result = df.iloc[[2, 3, 5],
                 [0, 1]]
 
# Show the dataframe
result


Output:

      Name  Age
2  Aaditya   25
3   Saumya   32
5   Saumya   32
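
Hard-coded column positions such as [0, 1] can be hard to read. One possible approach, shown here as a sketch under the assumption that the column names are known, is to look the positions up from the names with `df.columns.get_indexer` and pass the result to .iloc[]. The name `positions` is illustrative only.

```python
import pandas as pd

employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

df = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Salary'])

# Translate column names into their integer positions
positions = df.columns.get_indexer(["Name", "Age"])

# Rows by position, columns by the looked-up positions
result = df.iloc[[2, 3, 5], positions]

print(result)
```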

Example 4: 

Select all the rows with some particular columns. 

Python3




# import pandas
import pandas as pd
 
# List of Tuples
employees = [('Stuti', 28, 'Varanasi', 20000),
            ('Saumya', 32, 'Delhi', 25000),
            ('Aaditya', 25, 'Mumbai', 40000),
            ('Saumya', 32, 'Delhi', 35000),
            ('Saumya', 32, 'Delhi', 30000),
            ('Saumya', 32, 'Mumbai', 20000),
            ('Aaditya', 40, 'Dehradun', 24000),
            ('Seema', 32, 'Delhi', 70000)
            ]
 
# Create a DataFrame object from list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age',
                           'City', 'Salary'])
 
# Using the operator .iloc[]
# to select all the rows with
# some particular columns
result = df.iloc[:, [0, 1]]
 
# Show the dataframe
result


Output:

      Name  Age
0    Stuti   28
1   Saumya   32
2  Aaditya   25
3   Saumya   32
4   Saumya   32
5   Saumya   32
6  Aaditya   40
7    Seema   32
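
Both indexers also accept slices, and they treat the end of the slice differently. The sketch below, assuming the same data with its default integer index, contrasts the two: .iloc[] excludes the stop position, as Python slicing does, while .loc[] includes the stop label. The variable names are illustrative only.

```python
import pandas as pd

employees = [('Stuti', 28, 'Varanasi', 20000),
             ('Saumya', 32, 'Delhi', 25000),
             ('Aaditya', 25, 'Mumbai', 40000),
             ('Saumya', 32, 'Delhi', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Saumya', 32, 'Mumbai', 20000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('Seema', 32, 'Delhi', 70000)]

df = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Salary'])

# Positions 2 and 3; the stop position 4 is excluded
by_position = df.iloc[2:4]

# Labels 2, 3 and 4; the stop label is included
by_label = df.loc[2:4]

# First three rows and first two columns, all by position
top_left = df.iloc[:3, :2]

print(len(by_position), len(by_label), top_left.shape)
```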
