Saturday, November 16, 2024

Change column names and row indexes in Pandas DataFrame

Given a Pandas DataFrame, let’s see how to change its column names and row indexes.

About Pandas DataFrame

A Pandas DataFrame is a rectangular grid used to store data. Data stored in a DataFrame is easy to visualize and work with.

  • It consists of rows and columns.
  • Each row is a measurement of some instance, while each column is a vector that holds the data for one specific attribute/variable.
  • Each column has a single data type (dtype), whereas a row spans several columns and can therefore hold either homogeneous or heterogeneous data.
  • Unlike a two-dimensional array, the axes of a Pandas DataFrame are labeled.
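
The points in the list above can be checked directly. The following is a minimal sketch on a small made-up frame; the name `demo` and its values are chosen only for this illustration.

```python
import pandas as pd

# Illustrative data only
demo = pd.DataFrame({"Name": ["Tom", "Nick"], "Age": [15, 26]})

# Every column reports exactly one dtype
column_dtypes = demo.dtypes
print(column_dtypes)

# A row cuts across columns, so its values can be of different types
first_row = demo.iloc[0]
row_value_types = [type(value).__name__ for value in first_row]
print(row_value_types)
```

The exact dtype names that are printed can vary between pandas versions, so they are not listed here.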

A Pandas DataFrame has two attributes, ‘columns’ and ‘index’, which can be used to change the column names and the row indexes.
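
Before changing anything, it can help to look at what these two attributes hold. The sketch below reads them from a small throwaway frame (`demo` is a name assumed for this example only).

```python
import pandas as pd

# Illustrative data only
demo = pd.DataFrame({"Name": ["Tom", "Nick"], "Age": [15, 26]})

# Both attributes return pandas Index objects rather than plain lists
columns_type = type(demo.columns).__name__
index_type = type(demo.index).__name__
print(columns_type, index_type)

# Converting to a list gives the labels themselves
column_labels = list(demo.columns)
row_labels = list(demo.index)
print(column_labels, row_labels)
```

When no row labels are supplied, pandas numbers the rows from 0, which is why the row labels here are plain integers.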

Create a DataFrame using a dictionary.

Python3

# First, import the library
import pandas as pd

# Create a DataFrame from a dictionary
df = pd.DataFrame({"Name": ['Tom', 'Nick', 'John', 'Peter'],
                   "Age": [15, 26, 17, 28]})

# This creates a DataFrame with
# 2 columns and 4 rows
df

Output:

    Name  Age
0    Tom   15
1   Nick   26
2   John   17
3  Peter   28

Method #1: Using df.columns and df.index

We can change the column names and row indexes by assigning to the df.columns and df.index attributes. To change the column names, we assign a Python list containing the new names: df.columns = ['First_col', 'Second_col', 'Third_col', .....]. To change the row indexes, we likewise assign a Python list: df.index = ['row1', 'row2', 'row3', ......].

Python3

# Let's rename the DataFrame created above.

# To check the current column names,
# use the "columns" attribute:
# df.columns

# Change the column names
df.columns = ['Col_1', 'Col_2']

# Change the row indexes
df.index = ['Row_1', 'Row_2', 'Row_3', 'Row_4']

# Print the DataFrame
print(df)

Output:

       Col_1  Col_2
Row_1    Tom     15
Row_2   Nick     26
Row_3   John     17
Row_4  Peter     28
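
One detail worth knowing about Method #1: the list assigned to df.columns or df.index must contain exactly one label per column or row. The sketch below uses a small throwaway frame (`df_demo` is a name chosen for this illustration) to show that pandas raises a ValueError otherwise.

```python
import pandas as pd

# Illustrative data only
df_demo = pd.DataFrame({"Name": ["Tom", "Nick"], "Age": [15, 26]})

try:
    # Two columns exist, but only one new name is supplied
    df_demo.columns = ["Only_one"]
    mismatch_caught = False
except ValueError as exc:
    mismatch_caught = True
    print("Rejected:", exc)

# The failed assignment leaves the original labels in place
labels_after = list(df_demo.columns)
print(labels_after)
```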

Method #2: Using rename() function with dictionary

Let’s use the pandas rename function to change a single column.

Python3

# We can rename a single column by passing
# a dictionary that maps its old name to
# its new name to the rename() function.
df = df.rename(columns={"Col_1": "Mod_col"})

print(df)

Output:

       Mod_col  Col_2
Row_1      Tom     15
Row_2     Nick     26
Row_3     John     17
Row_4    Peter     28

Change multiple column names simultaneously

Python3

# We can change multiple column names by
# passing a dictionary of old names and
# new names to the rename() function.
# Keys that do not match an existing column
# name (such as "B" here) are ignored by default.
df = df.rename({"Mod_col": "Col_1", "B": "Col_2"}, axis='columns')

print(df)

Output:

       Col_1  Col_2
Row_1    Tom     15
Row_2   Nick     26
Row_3   John     17
Row_4  Peter     28
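
Because rename() ignores dictionary keys that do not match any existing label, a typo in an old name goes unnoticed. The sketch below, on a throwaway frame named `df_demo` for this example, shows the errors="raise" option of rename(), which turns such a mismatch into a KeyError.

```python
import pandas as pd

# Illustrative data only
df_demo = pd.DataFrame({"Col_1": ["Tom", "Nick"], "Col_2": [15, 26]})

# Default behaviour: the unknown key "B" is skipped without any warning
unchanged = df_demo.rename(columns={"B": "Col_9"})
print(list(unchanged.columns))

try:
    # With errors="raise", the unknown key is reported instead
    df_demo.rename(columns={"B": "Col_9"}, errors="raise")
    typo_caught = False
except KeyError as exc:
    typo_caught = True
    print("Rejected:", exc)
```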

Method #3: Using Lambda Function to rename the columns

A lambda function is a small anonymous function that can take any number of arguments but can contain only one expression. Using a lambda function, we can modify all of the column names at once. Let’s add ‘x’ at the end of each column name using a lambda function.

Python3

df = df.rename(columns=lambda x: x+'x')

# this will modify all the column names
print(df)

Output:

       Col_1x  Col_2x
Row_1     Tom      15
Row_2    Nick      26
Row_3    John      17
Row_4   Peter      28
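
The argument does not have to be a lambda: rename() accepts any callable that takes a label and returns a new one. As a sketch on a throwaway frame (`df_demo` is an assumed name), the example below passes the built-in str.upper, and also uses add_suffix(), a DataFrame method that covers the append-a-suffix case shown above.

```python
import pandas as pd

# Illustrative data only
df_demo = pd.DataFrame({"Col_1": ["Tom", "Nick"], "Col_2": [15, 26]})

# An existing function can be passed in place of a lambda
upper = df_demo.rename(columns=str.upper)
print(list(upper.columns))

# add_suffix appends the same string to every column label
suffixed = df_demo.add_suffix("x")
print(list(suffixed.columns))
```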

Method #4: Using values attribute to rename the columns.

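We can also assign to the values attribute of df.columns at the position of the column whose name we want to change. Note that this writes to the underlying array directly and bypasses the checks pandas normally applies to an Index, so the result may depend on the pandas version; rename() is the safer choice.

Python3

# Start again from the original DataFrame
df = pd.DataFrame({"Name": ['Tom', 'Nick', 'John', 'Peter'],
                   "Age": [15, 26, 17, 28]})

df.columns.values[1] = 'Student_Age'

# this will modify the name of the second column (position 1)
print(df)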

Output:

    Name  Student_Age
0    Tom           15
1   Nick           26
2   John           17
3  Peter           28
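
If the goal is to rename a column by its position, the same result can be reached without writing into the values array. The sketch below shows two such options on a throwaway frame; the names `df_demo`, `old_label` and `labels` are assumptions made for this example.

```python
import pandas as pd

# Illustrative data only
df_demo = pd.DataFrame({"Name": ["Tom", "Nick"], "Age": [15, 26]})

# Option 1: look up the label at position 1, then rename by label
old_label = df_demo.columns[1]
renamed = df_demo.rename(columns={old_label: "Student_Age"})
print(list(renamed.columns))

# Option 2: copy the labels into a list, edit the list, assign it back
labels = df_demo.columns.tolist()
labels[1] = "Student_Age"
df_demo.columns = labels
print(list(df_demo.columns))
```

Both options go through the regular pandas interface, so the Index is rebuilt as a whole rather than edited in place.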

Let’s change the row index using the Lambda function.

Python3

# To change the row indexes
df = pd.DataFrame({"A": ['Tom', 'Nick', 'John', 'Peter'],
                   "B": [25, 16, 27, 18]})

# this will increase the row index value by 10 for each row
df = df.rename(index=lambda x: x + 10)

print(df)

Output:

        A   B
10    Tom  25
11   Nick  16
12   John  27
13  Peter  18

Now, if we want to change the row indexes and column names simultaneously, we can call the rename() function and pass both the index and the columns parameters.

Python3

df = df.rename(index=lambda x: x + 5,
               columns=lambda x: x + 'x')

# increase every row index label by 5 and
# append 'x' at the end of each column name.
print(df)

Output:

       Ax  Bx
15    Tom  25
16   Nick  16
17   John  27
18  Peter  18
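
The index and columns parameters also accept dictionaries, and rename() returns a new DataFrame instead of modifying the one it is called on. The sketch below illustrates both points on a throwaway frame; `df_demo` and the new labels are made up for this example.

```python
import pandas as pd

# Illustrative data only
df_demo = pd.DataFrame({"A": ["Tom", "Nick"], "B": [25, 16]})

# Dictionaries map old labels to new labels on each axis
renamed = df_demo.rename(index={0: "first", 1: "second"},
                         columns={"A": "Name", "B": "Age"})
print(renamed)

# The original frame keeps its labels unless the result is assigned back
original_columns = list(df_demo.columns)
original_index = list(df_demo.index)
print(original_columns, original_index)
```

This is why the examples above write df = df.rename(...): without the assignment, the renamed result would be discarded.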
