Friday, December 27, 2024

Python | Pandas Series.factorize()

A pandas Series is a one-dimensional ndarray with axis labels. The labels need not be unique but must be of a hashable type. The object supports both integer- and label-based indexing and provides a host of methods for performing operations involving the index.

The pandas Series.factorize() function encodes the object as an enumerated type or categorical variable: each distinct value is assigned an integer code. This method is useful for obtaining a numeric representation of an array when all that matters is identifying distinct values.

Syntax: Series.factorize(sort=False, na_sentinel=-1)

Parameter :
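sort : bool, default False. If True, sort the uniques and remap the labels so that the relationship between labels and values is maintained.
na_sentinel : int, default -1. Value used to mark missing (“not found”) entries. This parameter was deprecated in pandas 1.5 and removed in pandas 2.0 in favor of use_na_sentinel.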
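The effect of the sort parameter is easiest to see side by side. The snippet below is a minimal sketch; the sample values and variable names are illustrative assumptions, not part of the original examples.

```python
import pandas as pd

# Hypothetical sample data, chosen so that first-appearance order
# differs from sorted order
sr = pd.Series(['b', 'a', 'c', 'b'])

# Default (sort=False): uniques keep their order of first appearance
codes, uniques = sr.factorize()
print(codes.tolist(), uniques.tolist())
# [0, 1, 2, 0] ['b', 'a', 'c']

# sort=True: uniques are sorted and the codes are remapped to match
codes_sorted, uniques_sorted = sr.factorize(sort=True)
print(codes_sorted.tolist(), uniques_sorted.tolist())
# [1, 0, 2, 1] ['a', 'b', 'c']
```

In both cases each code still points at the same value; only the numbering changes.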

Returns :
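labels : ndarray of integers (called codes in current pandas documentation), one per element, each indexing into uniques.
uniques : ndarray, Index, or Categorical. The distinct values; when called on a Series, this is an Index.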
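The two return values arrive as a tuple and are usually unpacked directly. The following sketch, with made-up sample data, checks the types and shapes described above for the Series case.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration
sr = pd.Series(['low', 'high', 'low', 'mid'])

# factorize() returns a (labels, uniques) tuple
labels, uniques = sr.factorize()

print(isinstance(labels, np.ndarray))   # True
print(isinstance(uniques, pd.Index))    # True
print(len(labels) == len(sr))           # True: one label per element
print(len(uniques))                     # 3 distinct values
```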

Example #1: Use Series.factorize() function to encode the underlying data of the given series object.




# importing pandas as pd
import pandas as pd
  
# Creating the Series
sr = pd.Series(['New York', 'Chicago', 'Toronto', None, 'Rio'])
  
# Create the Index
index_ = ['City 1', 'City 2', 'City 3', 'City 4', 'City 5']
  
# set the index
sr.index = index_
  
# Print the series
print(sr)


Output :


Now we will use Series.factorize() function to encode the underlying data of the given series object.




# encode the values
result = sr.factorize()
  
# Print the result
print(result)


Output :

As we can see in the output, the Series.factorize() function has successfully encoded the underlying data of the given series object. Notice that the missing value (None) has been assigned a code of -1 and does not appear among the uniques.
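The handling of the missing value can be checked programmatically. This is a small sketch using the same city data as Example #1; the variable names codes, uniques, and is_missing are illustrative.

```python
import pandas as pd

sr = pd.Series(['New York', 'Chicago', 'Toronto', None, 'Rio'])
codes, uniques = sr.factorize()

print(codes.tolist())     # [0, 1, 2, -1, 3]
print(uniques.tolist())   # ['New York', 'Chicago', 'Toronto', 'Rio']

# Positions coded -1 coincide with the missing entries of the Series
is_missing = codes == -1
print(is_missing.tolist() == sr.isna().tolist())   # True
```

Because -1 is not a valid position in uniques, it should be filtered out (for example with the is_missing mask) before using the codes to look values up.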
 
Example #2 : Use Series.factorize() function to encode the underlying data of a series object that contains repeated numeric values.




# importing pandas as pd
import pandas as pd
  
# Creating the Series
sr = pd.Series([80, 25, 3, 80, 24, 25])
  
# Create the Index
index_ = ['Coca Cola', 'Sprite', 'Coke', 'Fanta', 'Dew', 'ThumbsUp']
  
# set the index
sr.index = index_
  
# Print the series
print(sr)


Output :

Now we will use Series.factorize() function to encode the underlying data of the given series object.




# encode the values
result = sr.factorize()
  
# Print the result
print(result)


Output :

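As we can see in the output, the Series.factorize() function has successfully encoded the underlying data of the given series object. Repeated values share a code: both occurrences of 80 are encoded as 0 and both occurrences of 25 as 1.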
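Since each code is a position in the uniques, the original values can be recovered from the two return values. The sketch below uses the data from Example #2; the name restored is an illustrative assumption.

```python
import pandas as pd

sr = pd.Series([80, 25, 3, 80, 24, 25])
codes, uniques = sr.factorize()

print(codes.tolist())     # [0, 1, 2, 0, 3, 1]
print(uniques.tolist())   # [80, 25, 3, 24]

# Look up each code in uniques to rebuild the original values
restored = uniques.take(codes)
print(restored.tolist() == sr.tolist())   # True
```

This round trip is only valid when the data has no missing values, because a code of -1 does not correspond to any entry in uniques.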
