How to Convert Categorical Variable to Numeric in Pandas?

28 July 2024

In this article, we will learn how to convert a categorical variable into a numeric one using pandas.

When we look at categorical data, the first question that arises is how to handle it, because most machine learning algorithms expect numeric input. We cannot feed raw text labels directly to most models. So, to build predictive models we have to convert categorical data into numeric form.

Method 1: Using replace() method

Replacing values is one way to convert categorical terms into numbers. For example, we will take a dataset of people’s salaries based on their level of education. Education level is an ordinal categorical variable, meaning its categories have a natural order. We will convert the education levels into numbers.

Syntax:

replace(to_replace=None, value=None, inplace=False, limit=None, regex=False, method='pad')

Consider a CSV file, data.csv, with an Education column that holds values such as Under-Graduate and Diploma.

Python3

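# import pandas
import pandas as pd

# read csv file
df = pd.read_csv('data.csv')

# replace each education level with a number and assign the
# result back to the column (avoids inplace=True on a column
# selection, which may not update df in newer pandas versions)
df['Education'] = df['Education'].replace(
    ['Under-Graduate', 'Diploma'], [0, 1])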

In the above program, we have replaced “Under-Graduate” with 0 and “Diploma” with 1.
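When a column has several ordered levels, a dictionary keeps the mapping explicit. The sketch below is only an illustration: it uses Series.map() rather than replace(), and it builds a small in-memory DataFrame instead of reading data.csv. The rows, the Salary values, and the third level 'Graduate' are made up for the example.

```python
import pandas as pd

# Hypothetical sample data standing in for data.csv
df = pd.DataFrame({
    "Education": ["Under-Graduate", "Diploma", "Graduate", "Diploma"],
    "Salary": [15000, 20000, 30000, 22000],
})

# Assumed ordering of the levels; 'Graduate' is not in the original example
education_order = {"Under-Graduate": 0, "Diploma": 1, "Graduate": 2}

# map() looks up every value in the dictionary; unlisted values become NaN
df["Education_code"] = df["Education"].map(education_order)

print(df)
```

Unlike replace(), which leaves unmatched values unchanged, map() turns any value missing from the dictionary into NaN, so unmapped categories are easy to spot.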

Method 2: Using get_dummies()

Replacing values by hand is not the most efficient way to convert them, especially when a column has many categories. pandas provides a function called get_dummies() that returns dummy variable columns, one per category.

Syntax: pandas.get_dummies(data, prefix=None, prefix_sep='_', dummy_na=False, columns=None, sparse=False, drop_first=False, dtype=None)

Stepwise Implementation

Step 1: Create dummy columns

The get_dummies() function is called with the column to encode as its argument. It returns the dummy variable columns, one for each category. In this case, the Education column has three categories, so three columns are returned.
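A minimal sketch of this step on its own, assuming a small hypothetical DataFrame in place of the CSV file (the rows and the third category 'Graduate' are invented for illustration):

```python
import pandas as pd

# Hypothetical sample data; the third category 'Graduate' is an assumption
df = pd.DataFrame({
    "Education": ["Under-Graduate", "Diploma", "Graduate", "Diploma"],
    "Salary": [15000, 20000, 30000, 22000],
})

# One column per category; dtype=int gives 0/1 instead of True/False
dummies = pd.get_dummies(df["Education"], dtype=int)

print(dummies)
```

Each row has a 1 in exactly one of the new columns, the one that matches its original category.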

Step 2: Concatenate

Syntax: pandas.concat(objs, axis=0, join='outer', ignore_index=False, keys=None, levels=None, names=None, verify_integrity=False, sort=False, copy=True)

The next step is to concatenate the dummy columns with the original data frame. pandas has a concat() function that joins data frames. Pass it a list containing the two data frames, along with the axis to join on. It returns the merged data frame.
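The concatenation step can be sketched in isolation as follows. The sample rows are hypothetical, and the point of interest is that axis="columns" places the frames side by side and lines up rows by index label:

```python
import pandas as pd

# Hypothetical sample data; values are invented for illustration
df = pd.DataFrame({
    "Education": ["Under-Graduate", "Diploma", "Graduate", "Diploma"],
    "Salary": [15000, 20000, 30000, 22000],
})
dummies = pd.get_dummies(df["Education"], dtype=int)

# Join side by side; rows are matched on their index labels
merged = pd.concat([df, dummies], axis="columns")

print(merged.columns.tolist())
```

Because both frames share the same index here, no rows are added or lost by the join.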

Step 3: Drop columns

We have to drop the original ‘Education’ column, because the dummy variable columns now carry the same information and the text column is no longer needed. We should also drop one of the dummy variable columns to avoid the dummy variable trap: since the dummy columns always sum to 1, any one of them can be predicted from the others, which can cause problems for models such as linear regression. After dropping the columns, the desired dataframe is obtained.

We will now implement these steps in code.

Python3

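# import pandas
import pandas as pd

# read csv
df = pd.read_csv('salary.csv')

# get the dummies and store them in a variable
# (dtype=int gives 0/1 columns; newer pandas defaults to True/False)
dummies = pd.get_dummies(df.Education, dtype=int)

# concatenate the dummies to the original dataframe
merged = pd.concat([df, dummies], axis='columns')

# drop the text column and one dummy column; drop() returns a
# new dataframe, so the result must be assigned back
merged = merged.drop(['Education', 'Under-Graduate'], axis='columns')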
 
# print the dataframe
print(merged)
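The three steps can also be collapsed into a single call, using the columns and drop_first parameters listed in the get_dummies() syntax above. This is a sketch on hypothetical data (the rows and the 'Graduate' category are assumptions), not a replacement for the stepwise version:

```python
import pandas as pd

# Hypothetical sample data standing in for salary.csv
df = pd.DataFrame({
    "Education": ["Under-Graduate", "Diploma", "Graduate", "Diploma"],
    "Salary": [15000, 20000, 30000, 22000],
})

# columns=["Education"] replaces the text column with its dummy columns;
# drop_first=True omits the first category in sorted order ('Diploma' here)
encoded = pd.get_dummies(df, columns=["Education"], drop_first=True, dtype=int)

print(encoded)
```

Note that drop_first removes the first category in sorted order, so this version drops the 'Diploma' column rather than 'Under-Graduate'. Either choice avoids the dummy variable trap.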
