How to convert Categorical features to Numerical Features in Python?

26 July 2024

Most machine learning models cannot work directly with categorical features, because these features hold string values rather than numbers. We therefore have to convert the string values to numbers before training. This can be accomplished by creating new features based on the categories and assigning numeric values to them. In this article, we are going to see how to convert categorical features to numerical features in Python.
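The idea of creating new features from the categories can be sketched with pandas' get_dummies(), which adds one indicator column per category. This is only an illustration of that idea and is not the method used in the steps that follow; the sample frame and its values are invented for the example.

```python
import pandas as pd

# Hypothetical sample data; the values are made up for illustration.
sample = pd.DataFrame({"origin": ["usa", "japan", "europe", "usa"]})

# Replace the "origin" column with one 0/1 indicator column per category.
one_hot = pd.get_dummies(sample, columns=["origin"], dtype=int)
print(one_hot)
```

Each row has a 1 in the column of its own category and 0 elsewhere, so no ordering between categories is implied.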

Stepwise Implementation

Step 1: Import the necessary packages and modules

Python3

# import packages and modules 
import numpy as np 
import pandas as pd 
from sklearn import preprocessing 

Step 2: Import the CSV file

We will use the pandas read_csv() function to import the CSV file, cluster_mpg.csv.

Python3

# import the CSV file 
df = pd.read_csv('cluster_mpg.csv') 
print(df.head()) 

Output:

The first five rows of the dataset are printed.
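If cluster_mpg.csv is not at hand, the remaining steps can still be tried on a small stand-in frame. The sketch below builds one by hand; only the column names "origin" and "name" come from this article, while the "mpg" column, the variable name df_demo and every value are assumptions made up for illustration.

```python
import pandas as pd

# Stand-in for cluster_mpg.csv; every value below is invented for illustration.
df_demo = pd.DataFrame({
    "mpg": [18.0, 31.5, 24.0, 27.2],                  # assumed numeric column
    "origin": ["usa", "japan", "europe", "japan"],    # categorical column
    "name": ["car a", "car b", "car c", "car d"],     # categorical column
})

print(df_demo.head())
print(df_demo.shape)
```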

Step 3: Get all features with categorical values

We use df.info() to find the categorical features. Columns that hold string values are listed with the Dtype “object”.

Python3

df.info()

Output:

The summary shows that, in this dataset, the columns “origin” and “name” are of object type.
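Reading the df.info() summary by eye works for a small table; the same check can also be done programmatically. The sketch below lists the non-numeric columns with select_dtypes(). It runs on an invented stand-in frame (the "mpg" column and all values are assumptions), and note that excluding numeric dtypes would also pick up boolean or datetime columns if a dataset had any.

```python
import pandas as pd

# Invented stand-in frame; only the column names "origin" and "name"
# come from the article.
demo = pd.DataFrame({
    "mpg": [18.0, 31.5, 24.0],
    "origin": ["usa", "japan", "europe"],
    "name": ["car a", "car b", "car c"],
})

# Keep every column whose dtype is not numeric.
categorical_cols = demo.select_dtypes(exclude=["number"]).columns.tolist()
print(categorical_cols)
```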

Step 4: Convert string values of origin column to numerical values

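We will fit a label encoder on the “origin” column using the fit() method of preprocessing.LabelEncoder. Fitting only learns the distinct categories in the column; it does not change the data yet.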

Python3

label_encoder = preprocessing.LabelEncoder() 
label_encoder.fit(df["origin"]) 

Step 5: Get the unique values out of the categorical features

We will use the label_encoder.classes_ attribute for this purpose:

classes_: ndarray of shape (n_classes,)
Holds the label for each class.

Python3

# finding the unique classes 
print(list(label_encoder.classes_)) 
print() 

Output:

['europe', 'japan', 'usa']
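The position of each label in classes_ is the integer it will be encoded as, so the mapping can be written out explicitly. The sketch below fits an encoder on a few invented sample values (not the real CSV) and builds that mapping; the names encoder and mapping are illustrative.

```python
from sklearn.preprocessing import LabelEncoder

# Fit on a few made-up sample values that mirror the "origin" categories.
encoder = LabelEncoder()
encoder.fit(["usa", "japan", "europe", "usa"])

# classes_ is sorted; a label's index in it is the code transform() assigns.
mapping = {str(label): code for code, label in enumerate(encoder.classes_)}
print(mapping)
```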

Step 6: Transforming the categorical values

Python3

# values after transforming the categorical column. 
print(label_encoder.transform(df["origin"])) 

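Output:

An array of integer codes, one per row of the “origin” column: 'europe' becomes 0, 'japan' becomes 1 and 'usa' becomes 2, matching the order of classes_.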
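Printing the transformed values does not change the DataFrame. To actually use the numbers, they can be stored in a column, and inverse_transform() recovers the original strings when needed. The sketch below shows one way to do this on an invented sample frame; the column name "origin_encoded" is an assumption chosen for the example.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Invented sample frame standing in for the real dataset.
frame = pd.DataFrame({"origin": ["usa", "japan", "europe", "usa"]})

# fit_transform() learns the classes and encodes the column in one call.
encoder = LabelEncoder()
frame["origin_encoded"] = encoder.fit_transform(frame["origin"])

# inverse_transform() maps the integer codes back to the original labels.
decoded = encoder.inverse_transform(frame["origin_encoded"])

print(frame)
print(list(decoded))
```

Integer codes imply an order between categories that may not exist in the data. The scikit-learn documentation describes LabelEncoder as intended for target labels and offers OrdinalEncoder and OneHotEncoder for input features, so those may be a better fit depending on the model.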
