A pandas Series is a one-dimensional ndarray with axis labels. The labels need not be unique but must be of a hashable type. The object supports both integer- and label-based indexing and provides a host of methods for performing operations involving the index.
The pandas Series.factorize() function encodes the object as an enumerated type or categorical variable. This method is useful for obtaining a numeric representation of an array when all that matters is identifying distinct values.
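Syntax: Series.factorize(sort=False, na_sentinel=-1)
Parameters :
sort : Sort uniques and shuffle labels to maintain the relationship.
na_sentinel : Value to mark “not found”. Note that pandas 2.0 and later removed this parameter in favor of use_na_sentinel (default True), which still encodes missing values as -1.
Returns :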
labels : ndarray
uniques : ndarray, Index, or Categorical
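The effect of the sort parameter is easiest to see on a small series. The following is a minimal sketch; the sample values and variable names are made up for illustration and are not part of the original examples.

```python
import pandas as pd

# Illustrative data (an assumption, not taken from the examples in this article)
sr = pd.Series(['b', 'a', 'c', 'b'])

# Default (sort=False): codes are assigned in order of first appearance
codes, uniques = sr.factorize()
print(codes.tolist())     # [0, 1, 2, 0]
print(list(uniques))      # ['b', 'a', 'c']

# sort=True: uniques are sorted and the codes are renumbered to match
codes_sorted, uniques_sorted = sr.factorize(sort=True)
print(codes_sorted.tolist())    # [1, 0, 2, 1]
print(list(uniques_sorted))     # ['a', 'b', 'c']
```

In both cases uniques[codes[i]] gives back the value at position i, which is the relationship the sort description refers to.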
Example #1: Use the Series.factorize() function to encode the underlying data of the given series object.
    # importing pandas as pd
    import pandas as pd

    # Creating the Series
    sr = pd.Series(['New York', 'Chicago', 'Toronto', None, 'Rio'])

    # Create the Index
    index_ = ['City 1', 'City 2', 'City 3', 'City 4', 'City 5']

    # set the index
    sr.index = index_

    # Print the series
    print(sr)
Output :
Now we will use the Series.factorize() function to encode the underlying data of the given series object.
    # encode the values
    result = sr.factorize()

    # Print the result
    print(result)
Output :
As we can see in the output, the Series.factorize() function has successfully encoded the underlying data of the given series object. Notice that the missing value has been assigned a code of -1 and does not appear among the uniques.
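To make the handling of missing values concrete, here is a small sketch that unpacks the result of Example #1 into its two parts and maps the codes back to the original values. The names codes, uniques and restored are chosen here for illustration only.

```python
import pandas as pd

# Same values as Example #1 (the index labels do not affect the encoding)
sr = pd.Series(['New York', 'Chicago', 'Toronto', None, 'Rio'])

# factorize() returns a tuple: integer codes and the distinct values
codes, uniques = sr.factorize()
print(codes.tolist())    # [0, 1, 2, -1, 3]
print(list(uniques))     # ['New York', 'Chicago', 'Toronto', 'Rio']

# Map each code back to its value; -1 marks a missing entry
restored = [uniques[c] if c != -1 else None for c in codes]
print(restored)          # ['New York', 'Chicago', 'Toronto', None, 'Rio']
```

Because -1 is not a valid position in uniques, it has to be checked explicitly when decoding, as the list comprehension does.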
Example #2 : Use the Series.factorize() function to encode the underlying data of the given series object.
    # importing pandas as pd
    import pandas as pd

    # Creating the Series
    sr = pd.Series([80, 25, 3, 80, 24, 25])

    # Create the Index
    index_ = ['Coca Cola', 'Sprite', 'Coke', 'Fanta', 'Dew', 'ThumbsUp']

    # set the index
    sr.index = index_

    # Print the series
    print(sr)
Output :
Now we will use the Series.factorize() function to encode the underlying data of the given series object.
    # encode the values
    result = sr.factorize()

    # Print the result
    print(result)
Output :
As we can see in the output, the Series.factorize() function has successfully encoded the underlying data of the given series object. Repeated values (80 and 25) receive the same code each time they occur.
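The returned codes are a plain array and do not carry the series index. If the labels are still needed, one possible approach (a sketch, not the only way to do it) is to wrap the codes in a new Series that reuses the original index. The name encoded is hypothetical.

```python
import pandas as pd

# Same data and labels as Example #2
sr = pd.Series([80, 25, 3, 80, 24, 25],
               index=['Coca Cola', 'Sprite', 'Coke', 'Fanta', 'Dew', 'ThumbsUp'])

codes, uniques = sr.factorize()
print(codes.tolist())    # [0, 1, 2, 0, 3, 1]
print(list(uniques))     # [80, 25, 3, 24]

# Re-attach the original labels to the integer codes
encoded = pd.Series(codes, index=sr.index)
print(encoded['Fanta'])      # 0, the same code as 'Coca Cola' (both are 80)
print(encoded['ThumbsUp'])   # 1, the same code as 'Sprite' (both are 25)
```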