A dataset may contain various types of values, and some of them are categorical. To use categorical values efficiently in a program, we create dummy variables. A dummy variable is a binary variable that indicates whether a separate categorical variable takes on a specific value.
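To make the definition concrete, here is a minimal sketch that builds a single dummy variable by hand. The sample data and the column name is_hot are assumptions made up for this illustration.

```python
import pandas as pd

# Illustrative data (assumed for this sketch)
df = pd.DataFrame({'Temperature': ['Hot', 'Cold', 'Warm', 'Cold']})

# Dummy variable for the value 'Hot': 1 where the row is 'Hot', otherwise 0
df['is_hot'] = (df['Temperature'] == 'Hot').astype(int)

print(df)
```

Repeating the comparison for every distinct value would give one dummy variable per category.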
Explanation:
For example, a temperature attribute with three categorical values (Hot, Cold, Warm) gives three dummy variables, one for each value. In Python, dummy variables can be created with the pandas get_dummies() function.
Syntax: pandas.get_dummies(data, prefix=None, prefix_sep='_', ...)
Parameters:
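- data: the input data, for example a pandas DataFrame, a Series, or an array-like object such as a list or NumPy array.
- prefix: the string placed at the start of each dummy column name. For a DataFrame, the original column name is used by default.
- prefix_sep: the separator placed between the prefix and the category value. The default is '_'.
Return Type: a DataFrame containing the dummy variables.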
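The effect of prefix and prefix_sep is easiest to see in the resulting column names. The following is a small sketch; the sample data, the prefix 'Temp', and the separator '-' are arbitrary choices made for illustration.

```python
import pandas as pd

# Illustrative data (assumed for this sketch)
df = pd.DataFrame({'Temperature': ['Hot', 'Cold', 'Warm', 'Cold']})

# prefix replaces the original column name at the start of each dummy column;
# prefix_sep is the string placed between the prefix and the category value
dummies = pd.get_dummies(df, prefix='Temp', prefix_sep='-')

print(list(dummies.columns))
```

With these arguments the columns are named Temp-Cold, Temp-Hot and Temp-Warm instead of the default Temperature_Cold, Temperature_Hot and Temperature_Warm.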
Step-by-step Approach:
- Import necessary modules
- Consider the data
- Perform operations on data to get dummies
Example 1:
Python3
# import required modules
import pandas as pd
# create dataset
df = pd.DataFrame({'Temperature': ['Hot', 'Cold', 'Warm', 'Cold']})
# display dataset
print(df)
# create dummy variables and display them
print(pd.get_dummies(df))
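Output: the dataset, followed by a table with one dummy column per category: Temperature_Cold, Temperature_Hot and Temperature_Warm.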
Example 2:
Here, a pandas Series built from a list is used to get dummies.
Python3
# import required modules
import pandas as pd
# create dataset
s = pd.Series(list('abca'))
# display dataset
print(s)
# create dummy variables and display them
print(pd.get_dummies(s))
Output: the Series, followed by a table with one dummy column for each distinct value: a, b and c.
Example 3:
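Here is another example, using a DataFrame with two categorical columns (A and B) and one numeric column (C). Only the categorical columns are converted to dummy variables; the numeric column is kept as is.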
Python3
# import required modules
import pandas as pd
# create dataset
df = pd.DataFrame({'A': ['hello', 'vignan', 'Lazyroar'],
                   'B': ['vignan', 'hello', 'hello'],
                   'C': [1, 2, 3]})
# display dataset
print(df)
# create dummy variables and display them
print(pd.get_dummies(df))
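Output: the dataset, followed by a table with the column C and the dummy columns A_Lazyroar, A_hello, A_vignan, B_hello and B_vignan.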
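As a rough check of Example 3, the sketch below inspects the resulting column names and requests integer dummies. It assumes a pandas version in which get_dummies() accepts the dtype parameter; newer releases return boolean dummy columns when dtype is not given.

```python
import pandas as pd

# Same dataset as in Example 3
df = pd.DataFrame({'A': ['hello', 'vignan', 'Lazyroar'],
                   'B': ['vignan', 'hello', 'hello'],
                   'C': [1, 2, 3]})

# dtype=int asks for 0/1 integers instead of the default dummy dtype
dummies = pd.get_dummies(df, dtype=int)

# The numeric column C is passed through; A and B are expanded into dummies
print(list(dummies.columns))
print(dummies)
```

The numeric column comes first, followed by the dummy columns for A and then B, with the categories of each column in sorted order.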

