How To Convert Sklearn Dataset To Pandas Dataframe In Python

27 July 2024


In this article, we look at how to convert a sklearn dataset to a pandas DataFrame in Python.

Sklearn (scikit-learn) and pandas are Python libraries that are widely used for data science and machine learning. Pandas focuses mainly on data processing, manipulation, cleaning, and visualization, whereas sklearn provides a wide range of tools and functions for training machine learning models.

Example 1: Convert Sklearn Dataset (iris) To Pandas Dataframe

Here we import the load_iris() function from sklearn.datasets, call it to load the iris dataset, and store the result in a variable named iris_data. This variable has the type sklearn.utils._bunch.Bunch, a dictionary-like object whose keys can also be read as attributes. The iris_data object exposes the following attributes: data, target, frame, target_names, DESCR, feature_names, filename, and data_module. We will use the data and feature_names attributes. The data attribute holds the complete feature matrix of the iris dataset, one row per sample. The feature_names attribute holds the list of column names for that matrix.

Python3

# IMPORT THE PANDAS LIBRARY 
# TO USE THE DATAFRAME TOOL 
import pandas as pd 
  
# IMPORT THE IRIS DATA FROM THE  
# SKLEARN MODULE 
from sklearn.datasets import load_iris 
  
# LOAD THE IRIS DATASET BY CALLING 
# THE FUNCTION 
iris_data = load_iris() 
  
# PLACE THE IRIS DATA IN A PANDAS 
# DATAFRAME 
df = pd.DataFrame(data=iris_data.data,  
                  columns=iris_data.feature_names) 
  
# DISPLAY FIRST 5 RECORDS OF THE 
# DATAFRAME (print() SO THAT IT ALSO
# SHOWS WHEN RUN AS A SCRIPT)
print(df.head())

Output:

   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
0                5.1               3.5                1.4               0.2
1                4.9               3.0                1.4               0.2
2                4.7               3.2                1.3               0.2
3                4.6               3.1                1.5               0.2
4                5.0               3.6                1.4               0.2
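
The DataFrame above contains only the feature columns. As a minimal sketch, the target and target_names attributes listed earlier could be used to attach the class labels as well. The column names "target" and "species" are chosen here purely for illustration and are not part of the original example.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris_data = load_iris()

# Build the feature DataFrame in the same way as Example 1
df = pd.DataFrame(data=iris_data.data,
                  columns=iris_data.feature_names)

# target holds one integer class code (0, 1 or 2) per row
df["target"] = iris_data.target

# target_names is an array of class names; indexing it with the
# integer codes gives the species name for every row
# ("species" is a hypothetical column name chosen for this sketch)
df["species"] = iris_data.target_names[iris_data.target]

print(df.shape)  # (150, 6): four features plus the two label columns
print(df["species"].value_counts())
```

Keeping both the integer code and the readable name is a design choice: the code is convenient for model training, while the name is easier to read when exploring the data.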

Example 2: Convert Sklearn Dataset (diabetes) To Pandas Dataframe

In this example, we create a function named convert_to_dataframe that converts a sklearn dataset to a pandas DataFrame. The function takes one parameter, sk_data, which is the sklearn dataset (a Bunch object), and returns its feature data as a pandas DataFrame. It raises an error if the argument is not a Bunch. We use sklearn's diabetes dataset in this example.

Python3

# IMPORT THE PANDAS LIBRARY 
# TO USE THE DATAFRAME TOOL 
import pandas as pd 
  
# IMPORT THE DIABETES DATA FROM THE  
# SKLEARN MODULE 
from sklearn.datasets import load_diabetes 
  
# IMPORT THE BUNCH CONTAINER TYPE FROM 
# ITS PUBLIC LOCATION 
from sklearn.utils import Bunch 
  
  
# CREATE THE `convert_to_dataframe()` 
# FUNCTION 
def convert_to_dataframe(sk_data: Bunch) -> pd.DataFrame: 
    # REJECT ANYTHING THAT IS NOT A BUNCH 
    if not isinstance(sk_data, Bunch): 
        raise TypeError("Not a sklearn dataset") 
    return pd.DataFrame(data=sk_data.data,  
                        columns=sk_data.feature_names) 
  
  
# LOAD THE DIABETES DATA USING  
# THE `convert_to_dataframe()` FUNCTION 
diabetes_data = convert_to_dataframe(sk_data=load_diabetes()) 
  
# DISPLAY FIRST 5 RECORDS OF THE DATAFRAME 
print(diabetes_data.head()) 

Output:

        age       sex       bmi        bp        s1        s2        s3  \
0  0.038076  0.050680  0.061696  0.021872 -0.044223 -0.034821 -0.043401
1 -0.001882 -0.044642 -0.051474 -0.026328 -0.008449 -0.019163  0.074412
2  0.085299  0.050680  0.044451 -0.005670 -0.045599 -0.034194 -0.032356
3 -0.089063 -0.044642 -0.011595 -0.036656  0.012191  0.024991 -0.036038
4  0.005383 -0.044642 -0.036385  0.021872  0.003935  0.015596  0.008142

         s4        s5        s6
0 -0.002592  0.019907 -0.017646
1 -0.039493 -0.068332 -0.092204
2 -0.002592  0.002861 -0.025930
3  0.034309  0.022688 -0.009362
4 -0.002592 -0.031988 -0.046641
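
Both examples build the DataFrame by hand from the data and feature_names attributes. As an alternative sketch, and assuming scikit-learn 0.23 or later with pandas installed, the dataset loaders accept an as_frame=True argument that fills the frame attribute mentioned in Example 1. The variable names below are chosen for illustration only.

```python
from sklearn.datasets import load_diabetes

# With as_frame=True the loader returns pandas objects directly
# (assumes scikit-learn 0.23 or later)
diabetes = load_diabetes(as_frame=True)

# frame: the feature columns plus a final "target" column
diabetes_df = diabetes.frame

# data and target are also pandas objects in this mode
features = diabetes.data   # DataFrame with the feature columns
target = diabetes.target   # Series with the regression target

print(diabetes_df.shape)
print(diabetes_df.columns.tolist())
```

This avoids the manual conversion, at the cost of depending on a newer scikit-learn release. The hand-written conversion shown in the examples works on older versions too.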
