Friday, December 27, 2024

Select Columns with Specific Data Types in Pandas Dataframe

This article shows how to select columns of specific data types from a pandas DataFrame. The operation is performed with the DataFrame.select_dtypes() method.

Syntax: DataFrame.select_dtypes(include=None, exclude=None)
Parameters:
include, exclude : A scalar or list-like of dtypes or strings to be included/excluded. At least one of these parameters must be supplied, and the two must not share any element.
Return : A DataFrame containing only the columns whose dtypes are in include and not in exclude.
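
To make the two parameters concrete, here is a minimal sketch on a small in-memory DataFrame. The data and column names are made up for illustration and do not come from the article's dataset.

```python
import pandas as pd

# Hypothetical toy data; the column names are made up for illustration
df = pd.DataFrame({
    "id": [1, 2, 3],
    "price": [9.5, 12.0, 7.25],
    "in_stock": [True, False, True],
    "label": pd.Categorical(["a", "b", "a"]),
})

# include: keep only the columns whose dtype matches ('number' covers ints and floats)
numeric_cols = df.select_dtypes(include="number").columns.tolist()

# exclude: keep every column except those with the listed dtypes
non_float_cols = df.select_dtypes(exclude=["float64"]).columns.tolist()

# include and exclude can be combined as long as they do not overlap
int_only_cols = df.select_dtypes(include="number", exclude="float64").columns.tolist()

print(numeric_cols)
print(non_float_cols)
print(int_only_cols)

# Calling the method with neither argument raises ValueError
try:
    df.select_dtypes()
except ValueError as err:
    print("ValueError:", err)
```

Note that boolean columns are not treated as numeric here, so "in_stock" is left out of the 'number' selection.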

Step-by-step Approach:

  • First, import the required module, then load the dataset.

Python3




# import required module
import pandas as pd
  
# assign dataset
df = pd.read_csv("train.csv")


  • Then, inspect the data types present in the dataset using the DataFrame.info() method.

Python3




# display description
# of the dataset
df.info()


Output:

  • Now, use DataFrame.select_dtypes() to select the columns of each data type.

Python3




# store columns with specific data type
integer_columns = df.select_dtypes(include=['int64']).columns
float_columns = df.select_dtypes(include=['float64']).columns
object_columns = df.select_dtypes(include=['object']).columns


  • Finally, display the columns found for each data type.

Python3




# display columns
print('\nint64 columns:\n', integer_columns)
print('\nfloat64 columns:\n', float_columns)
print('\nobject columns:\n', object_columns)


Output:

Below is the complete program based on the above approach:

Python3




# import required module
import pandas as pd
  
# assign dataset
df = pd.read_csv("train.csv")
  
# store columns with specific data type
integer_columns = df.select_dtypes(include=['int64']).columns
float_columns = df.select_dtypes(include=['float64']).columns
object_columns = df.select_dtypes(include=['object']).columns
  
# display columns
print('\nint64 columns:\n',integer_columns)
print('\nfloat64 columns:\n',float_columns)
print('\nobject columns:\n',object_columns)


Output:
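
The program above names each data type by hand. As a possible variation, the sketch below groups the column names by whatever dtypes the frame actually contains. The helper name columns_by_dtype and the in-memory data are assumptions made for illustration; the data stands in for train.csv so the example runs without any file.

```python
import pandas as pd

# Hypothetical stand-in for train.csv, built in memory so the sketch runs anywhere
df = pd.DataFrame({
    "id": [1, 2, 3],
    "height": [1.62, 1.80, 1.75],
    "weight": [55.0, 82.5, 70.1],
    "visits": [4, 0, 7],
})


def columns_by_dtype(frame):
    """Map each dtype name found in the frame to the columns that have it."""
    groups = {}
    for dtype in frame.dtypes.unique():
        # select_dtypes accepts dtype objects as well as strings such as 'int64'
        groups[str(dtype)] = frame.select_dtypes(include=[dtype]).columns.tolist()
    return groups


groups = columns_by_dtype(df)
for name, cols in groups.items():
    print(name, cols)
```

This avoids hard-coding 'int64', 'float64' and 'object', at the cost of one select_dtypes() call per distinct dtype.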

Example:

Here we extract columns from the Seattle weather dataset that ships with the vega_datasets package:

Python3




# import required module
import pandas as pd
from vega_datasets import data
  
# assign dataset
df = data.seattle_weather()
  
# display 10 random rows of the dataset
# (rendered automatically in a notebook; wrap in print() in a script)
df.sample(10)


Output:

Now, we display all the columns whose data type is float64.

Python3




# import required module
import pandas as pd
from vega_datasets import data
  
# assign dataset
df = data.seattle_weather()
  
# display description
# of dataset
df.info()
  
# store columns with specific data type
columns = df.select_dtypes(include=['float64']).columns
  
# display columns
print('\nColumns:\n', columns)


Output:
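
Besides listing column names, the frame returned by select_dtypes() can be used directly. The sketch below assumes a small hand-built frame with made-up rows in place of data.seattle_weather(), so that it runs without vega_datasets installed. The column names are hypothetical and only loosely mirror a weather table.

```python
import pandas as pd

# Hypothetical stand-in rows; the real data would come from data.seattle_weather()
df = pd.DataFrame({
    "date": pd.to_datetime(["2012-01-01", "2012-01-02", "2012-01-03"]),
    "precipitation": [0.0, 10.9, 0.8],
    "temp_max": [12.8, 10.6, 11.7],
    "weather_code": [1, 2, 2],
})

# Keep the float64 sub-frame itself rather than only its column labels
float_df = df.select_dtypes(include=["float64"])

# Column-wise means computed over the float columns only
float_means = float_df.mean()
print(float_means)

# Datetime columns can be selected with the generic 'datetime' alias
datetime_cols = df.select_dtypes(include=["datetime"]).columns.tolist()
print(datetime_cols)
```

Working on the sub-frame keeps non-float columns, such as the date and the integer code, out of the calculation without listing column names by hand.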
