In Pandas, an index is the set of labels that identifies the rows (or columns) of a DataFrame. These labels serve as a reference for the data and allow efficient, label-based retrieval and manipulation. Index labels are usually unique, although Pandas does not require them to be. One common operation when working with tabular data is setting a specific column as the index or a specific row as the column names. In this article, we will focus on how to set the first column as the index and the first row as the column names in a Pandas DataFrame.
Set the First Column and Row as Index in Pandas
Setting the First Column as the Index
Suppose we have a DataFrame with the data given below.
Python3
import pandas as pd

# Sample data: one dictionary key per column
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'San Francisco', 'Los Angeles']}
df = pd.DataFrame(data)
print(df)
Output
      Name  Age           City
0    Alice   25       New York
1      Bob   30  San Francisco
2  Charlie   35    Los Angeles
To set the 'Name' column as the index, you can use the set_index method. Passing inplace=True modifies df directly instead of returning a new DataFrame:
Python3
# Use the 'Name' column as the row index, modifying df in place
df.set_index('Name', inplace=True)
print(df)
Output
         Age           City
Name
Alice     25       New York
Bob       30  San Francisco
Charlie   35    Los Angeles
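The example above refers to the column by its name. When only the position is known, one option is to look up the first column's label with df.columns[0] and pass it to set_index. The sketch below assumes the same sample data; the variable names first_col and indexed are illustrative, and the result is assigned to a new variable rather than using inplace=True.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'San Francisco', 'Los Angeles']}
df = pd.DataFrame(data)

# Look up the label of the first column by position, then use it as the index.
first_col = df.columns[0]          # 'Name'
indexed = df.set_index(first_col)  # returns a new DataFrame; df is unchanged

# Rows can now be selected by label.
print(indexed.loc['Bob', 'Age'])   # 30
```

When the data comes from a CSV file, pd.read_csv accepts index_col=0, which sets the first column as the index at read time.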
Setting the First Row as the Column Names
Suppose we have a DataFrame in which the intended column names appear as the first row of data, as given below.
Python3
import pandas as pd

# The first inner list holds the intended column names
data = [['Name', 'Age', 'City'],
        ['Alice', 25, 'New York'],
        ['Bob', 30, 'San Francisco'],
        ['Charlie', 35, 'Los Angeles']]
df = pd.DataFrame(data)
print(df)
Output
         0    1              2
0     Name  Age           City
1    Alice   25       New York
2      Bob   30  San Francisco
3  Charlie   35    Los Angeles
To set the first row as the column names, we can use the following code:
Python3
# Use the first row as the column names, then drop that row from the data
df.columns = df.iloc[0]
df = df[1:]
print(df)
Output
0     Name  Age           City
1    Alice   25       New York
2      Bob   30  San Francisco
3  Charlie   35    Los Angeles
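After the first row is promoted, a few leftovers remain: the row labels start at 1, the column axis carries the name 0 (inherited from the row it was taken from), and every column still has the object dtype. One possible cleanup, sketched here with the same sample data, resets the row labels, clears the axis name with rename_axis, and lets Pandas infer better dtypes with infer_objects. These steps are optional and are not part of the original example.

```python
import pandas as pd

data = [['Name', 'Age', 'City'],
        ['Alice', 25, 'New York'],
        ['Bob', 30, 'San Francisco'],
        ['Charlie', 35, 'Los Angeles']]
df = pd.DataFrame(data)

# Promote the first row to column names and drop it from the data.
df.columns = df.iloc[0]
df = df[1:]

# Optional cleanup steps:
df = df.reset_index(drop=True)       # row labels restart at 0
df = df.rename_axis(columns=None)    # remove the leftover axis name (0)
df = df.infer_objects()              # 'Age' becomes an integer column

print(df)
print(df.dtypes)
```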
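Setting the First Row as the Column Names and the First Column as the Index
Suppose we have a DataFrame in which the first row holds the intended column names and the first column holds the intended row labels, as given below.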
Python3
import pandas as pd

# Each inner list is one row; its first element is the row label
data = [['Name', 'Alice', 'Bob', 'Charlie'],
        ['Age', 25, 30, 35],
        ['City', 'New York', 'San Francisco', 'Los Angeles']]
df = pd.DataFrame(data)
print(df)
Output
      0         1              2            3
0  Name     Alice            Bob      Charlie
1   Age        25             30           35
2  City  New York  San Francisco  Los Angeles
Now, let’s set the first row as column names and the first column as the index:
Python3
# Set the first row as column names
df.columns = df.iloc[0]
df = df[1:]

# Set the first column as the index
df.set_index('Name', inplace=True)
print(df)
Output
0        Alice            Bob      Charlie
Name
Age         25             30           35
City  New York  San Francisco  Los Angeles
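The 0 in the top-left corner of the output above is the name of the column axis, carried over from the row that supplied the labels. A minimal variation that avoids it, assuming the same sample data, converts the first row to a plain list before assigning it. The names raw, header, and tidy are illustrative only.

```python
import pandas as pd

data = [['Name', 'Alice', 'Bob', 'Charlie'],
        ['Age', 25, 30, 35],
        ['City', 'New York', 'San Francisco', 'Los Angeles']]
raw = pd.DataFrame(data)

# Take the first row as a plain list so no axis name comes along with it.
header = raw.iloc[0].tolist()

# Drop the header row; .copy() makes the result independent of raw.
tidy = raw.iloc[1:].copy()
tidy.columns = header

# The first column is now labelled 'Name'; use it as the index.
tidy = tidy.set_index(header[0])

print(tidy)
print(tidy.loc['Age', 'Bob'])   # 30
```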
Set Index Using a List
In this example, we first create a DataFrame named df using sample data. We then define lists row_labels and col_labels that contain the labels we want to use for rows and columns, respectively. Finally, we set the index and column names of the DataFrame using these lists.
Python3
import pandas as pd

# Create a sample DataFrame
data = {'A': [1, 2, 3],
        'B': [4, 5, 6],
        'C': [7, 8, 9]}
df = pd.DataFrame(data)

# Define a list of row labels and column labels to use as the index
row_labels = ['Row1', 'Row2', 'Row3']
col_labels = ['ColA', 'ColB', 'ColC']

# Set the index using the lists
df.index = row_labels
df.columns = col_labels

# Display the DataFrame with the custom index
print(df)
Output
      ColA  ColB  ColC
Row1     1     4     7
Row2     2     5     8
Row3     3     6     9
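Assigning to df.index and df.columns changes the DataFrame in place, and the list must have exactly one label per row or column. As a rough sketch of an alternative, set_axis applies the same labels but returns a new DataFrame, which can be convenient in method chains. The example also shows that a list of the wrong length is rejected with a ValueError; the variable name relabelled is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# set_axis returns a new, relabelled DataFrame and leaves df untouched.
relabelled = (
    df.set_axis(['Row1', 'Row2', 'Row3'], axis=0)
      .set_axis(['ColA', 'ColB', 'ColC'], axis=1)
)
print(relabelled.loc['Row2', 'ColB'])   # 5

# A list of the wrong length is rejected.
try:
    df.index = ['Row1', 'Row2']          # only two labels for three rows
except ValueError as exc:
    print('Rejected:', exc)
```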
Set Index Using a Range
In this example, we first create a DataFrame df using sample data. We then read the number of rows from the shape attribute and set the index using the range function, so that the row labels start from 1 instead of the default 0.
Python3
import pandas as pd

# Create a sample DataFrame
data = {'A': [1, 2, 3],
        'B': [4, 5, 6],
        'C': [7, 8, 9]}
df = pd.DataFrame(data)

# Number of rows in the DataFrame
num_rows = df.shape[0]

# Set the index using a range of numbers, starting from 1
df.index = range(1, num_rows + 1)

# Display the DataFrame with the numeric index
print(df)
Output
   A  B  C
1  1  4  7
2  2  5  8
3  3  6  9
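With a 1-based index, label-based and position-based lookups no longer coincide, which is worth keeping in mind when selecting rows. The short sketch below, using the same sample data, shows that Pandas stores the assigned range as a RangeIndex and contrasts .loc (by label) with .iloc (by position).

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
df.index = range(1, len(df) + 1)

# pandas stores the range as a RangeIndex rather than a list of integers.
print(type(df.index).__name__)   # RangeIndex

# .loc looks up by label, .iloc by position.
print(df.loc[1, 'A'])    # 1  (row labelled 1, the first row)
print(df.iloc[0, 0])     # 1  (row at position 0, the same row)
print(df.loc[3, 'C'])    # 9  (row labelled 3, the last row)
```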