Given a Pandas DataFrame, let’s see how to change its column names and row indexes.
About Pandas DataFrame
A Pandas DataFrame is a rectangular grid used to store data. Data stored in a DataFrame is easy to visualize and work with.
- It consists of rows and columns.
- Each row is a measurement of some instance, while each column is a vector containing the data for one specific attribute/variable.
- Each column holds homogeneous data (a single data type), whereas a row can contain homogeneous or heterogeneous data.
- Unlike a two-dimensional array, the axes of a pandas DataFrame are labeled.
A Pandas DataFrame has two attributes, ‘columns’ and ‘index’, which can be used to change the column names and the row indexes.
Create a DataFrame using a dictionary.
Python3
# first import the libraries
import pandas as pd
# Create a dataFrame using dictionary
df = pd.DataFrame({"Name": ['Tom', 'Nick', 'John', 'Peter'],
                   "Age": [15, 26, 17, 28]})
# Creates a dataFrame with
# 2 columns and 4 rows
df
Output:
Name Age
0 Tom 15
1 Nick 26
2 John 17
3 Peter 28
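Before renaming anything, it can help to inspect the current labels. The following is a minimal sketch that rebuilds the same DataFrame so it runs on its own; the variable names `column_names` and `row_labels` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Tom", "Nick", "John", "Peter"],
                   "Age": [15, 26, 17, 28]})

# Both attributes are pandas Index objects; list() turns them into plain lists
column_names = list(df.columns)
row_labels = list(df.index)

print(column_names)  # ['Name', 'Age']
print(row_labels)    # [0, 1, 2, 3]
```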
Method #1: Using df.columns and df.index
The column names and row indexes can be changed through the df.columns and df.index attributes. To change the column names, assign a Python list containing the new names: df.columns = ['First_col', 'Second_col', 'Third_col', ...]. To change the row indexes, assign a Python list in the same way: df.index = ['row1', 'row2', 'row3', ...]. In both cases the list must contain exactly as many entries as the DataFrame has columns or rows.
Python3
# Let's rename already created dataFrame.
# Check the current column names
# using "columns" attribute.
# df.columns
# Change the column names
df.columns =['Col_1', 'Col_2']
# Change the row indexes
df.index = ['Row_1', 'Row_2', 'Row_3', 'Row_4']
# printing the data frame
print(df)
Output:
Col_1 Col_2
Row_1 Tom 15
Row_2 Nick 26
Row_3 John 17
Row_4 Peter 28
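One thing to watch with direct assignment is the length of the list. The sketch below rebuilds the example DataFrame and deliberately passes one name too many; the flag `mismatch_rejected` is a hypothetical helper variable used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Tom", "Nick", "John", "Peter"],
                   "Age": [15, 26, 17, 28]})

try:
    # Three names for a two-column DataFrame: pandas rejects the assignment
    df.columns = ["Col_1", "Col_2", "Col_3"]
    mismatch_rejected = False
except ValueError as exc:
    mismatch_rejected = True
    print("Rejected:", exc)

# The labels are unchanged after the failed assignment
print(list(df.columns))
```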
Method #2: Using rename() function with dictionary
Let’s use the pandas rename() function to change a single column.
Python3
# We can change a single column name by
# passing a dictionary that maps the old
# name to the new name to the rename() function.
df = df.rename(columns={"Col_1": "Mod_col"})
print(df)
Output:
Mod_col Col_2
Row_1 Tom 15
Row_2 Nick 26
Row_3 John 17
Row_4 Peter 28
Change multiple column names simultaneously
Python3
# We can change multiple column names by
# passing a dictionary of old names and
# new names, to the rename() function.
# Keys that match no existing column (such as "B" here)
# are silently ignored by default.
df = df.rename({"Mod_col": "Col_1", "B": "Col_2"}, axis='columns')
print(df)
Output:
Col_1 Col_2
Row_1 Tom 15
Row_2 Nick 26
Row_3 John 17
Row_4 Peter 28
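By default, rename() skips dictionary keys that do not match any existing label, which can hide typos. As a sketch, assuming a pandas version that supports the `errors` parameter of `DataFrame.rename`, the call can be made to fail instead; the small DataFrame and the names `unchanged` and `missing_key_detected` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Col_1": ["Tom", "Nick"], "Col_2": [15, 26]})

# Default behaviour: the unknown key "B" is skipped without any warning
unchanged = df.rename(columns={"B": "Age"})
print(list(unchanged.columns))  # ['Col_1', 'Col_2']

# With errors="raise", the same mapping raises a KeyError instead
try:
    df.rename(columns={"B": "Age"}, errors="raise")
    missing_key_detected = False
except KeyError:
    missing_key_detected = True

print(missing_key_detected)  # True
```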
Method #3: Using Lambda Function to rename the columns
A lambda function is a small anonymous function that can take any number of arguments but can contain only a single expression. Using a lambda function, we can modify all of the column names at once. Let’s add ‘x’ at the end of each column name using a lambda function.
Python3
df = df.rename(columns=lambda x: x+'x')
# this will modify all the column names
print(df)
Output:
Col_1x Col_2x
Row_1 Tom 15
Row_2 Nick 26
Row_3 John 17
Row_4 Peter 28
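The function passed to rename() does not have to be a lambda; any callable that takes a label and returns a new one should work. The sketch below uses a small illustrative DataFrame, passes the built-in `str.lower`, and shows `add_suffix()` as an alternative way to get the same result as the lambda above.

```python
import pandas as pd

df = pd.DataFrame({"Col_1": ["Tom", "Nick"], "Col_2": [15, 26]})

# Any callable works: here every column name is lower-cased
lowered = df.rename(columns=str.lower)
print(list(lowered.columns))   # ['col_1', 'col_2']

# add_suffix() appends a string to every column label
suffixed = df.add_suffix("x")
print(list(suffixed.columns))  # ['Col_1x', 'Col_2x']
```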
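Method #4: Using values attribute to rename the columns.
We can use the values attribute of df.columns to overwrite the name of the column at a given position. The example starts again from the original DataFrame. Note that this writes directly into the array underlying the column Index and bypasses the pandas API, so rename() or assignment to df.columns is generally the safer choice.
Python3
# Start again from the original DataFrame
df = pd.DataFrame({"Name": ['Tom', 'Nick', 'John', 'Peter'], "Age": [15, 26, 17, 28]})
# Index 1 refers to the second column, so this renames "Age"
df.columns.values[1] = 'Student_Age'
print(df)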
Output:
Name Student_Age
0 Tom 15
1 Nick 26
2 John 17
3 Peter 28
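If writing into the underlying array feels too fragile, one alternative is to look up the label by position and pass it to rename(). This is a sketch rather than the method shown above; the names `old_label` and `renamed` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Tom", "Nick", "John", "Peter"],
                   "Age": [15, 26, 17, 28]})

# Read the current label at position 1, then rename through the public API
old_label = df.columns[1]
renamed = df.rename(columns={old_label: "Student_Age"})

print(list(renamed.columns))  # ['Name', 'Student_Age']
```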
Let’s change the row index using a lambda function.
Python3
# To change the row indexes
df = pd.DataFrame({"A": ['Tom', 'Nick', 'John', 'Peter'],
                   "B": [25, 16, 27, 18]})
# this will increase the row index value by 10 for each row
df = df.rename(index=lambda x: x + 10)
print(df)
Output:
A B
10 Tom 25
11 Nick 16
12 John 27
13 Peter 18
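The index argument of rename() also accepts a dictionary, which is convenient when only a few row labels should change. A minimal sketch, using a fresh copy of the same data and the illustrative name `relabeled`:

```python
import pandas as pd

df = pd.DataFrame({"A": ["Tom", "Nick", "John", "Peter"],
                   "B": [25, 16, 27, 18]})

# Only the listed row labels change; the others keep their original values
relabeled = df.rename(index={0: "first", 3: "last"})

print(list(relabeled.index))  # ['first', 1, 2, 'last']
```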
Now, if we want to change the row indexes and column names simultaneously, we can do so by calling the rename() function and passing both the index and columns parameters.
Python3
# increase all the row index labels by 5 and
# append 'x' at the end of each column name.
df = df.rename(index=lambda x: x + 5, columns=lambda x: x + 'x')
print(df)
Output:
Ax Bx
15 Tom 25
16 Nick 16
17 John 27
18 Peter 18