A perfectly homogeneous clustering is one in which every cluster contains only data points that belong to a single class label. Homogeneity (computed by homogeneity_score) describes how close a clustering comes to this ideal.
This metric is independent of the absolute values of the labels. A permutation of the class or cluster label values won't change the score in any way.
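The permutation property can be checked directly. The following is a minimal sketch, not part of the original examples; the label lists are made-up illustrative data, and the second prediction list is the first one with its cluster ids renamed.

```python
from sklearn.metrics import homogeneity_score

# Illustrative (made-up) ground truth and cluster assignments
labels_true = [0, 0, 1, 1, 2, 2]
labels_pred = [0, 0, 1, 1, 1, 2]

# Same clustering with the cluster ids renamed: 0 -> 2, 1 -> 0, 2 -> 1
renamed_pred = [2, 2, 0, 0, 0, 1]

original = homogeneity_score(labels_true, labels_pred)
renamed = homogeneity_score(labels_true, renamed_pred)

# Both calls describe the same grouping, so the scores should agree
print(original, renamed)
```

Only the grouping of the samples matters here, not the integers used to name the groups.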
Syntax : sklearn.metrics.homogeneity_score(labels_true, labels_pred)
The metric is not symmetric: switching labels_true with labels_pred returns the completeness_score instead.
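The asymmetry can be illustrated with a short sketch. It assumes completeness_score, which is imported from sklearn.metrics alongside homogeneity_score; the label lists are illustrative only.

```python
from sklearn.metrics import completeness_score, homogeneity_score

labels_true = [0, 0, 1, 1]
labels_pred = [0, 1, 2, 3]  # every class is split into two clusters

# Homogeneity with the arguments in the documented order
h = homogeneity_score(labels_true, labels_pred)

# Homogeneity with the two arguments swapped
h_swapped = homogeneity_score(labels_pred, labels_true)

# Completeness with the arguments in the documented order
c = completeness_score(labels_true, labels_pred)

print(h, h_swapped, c)
```

Here the first value should be close to 1.0, while the swapped call and the completeness score should both be close to 0.5, since each class is spread over two clusters.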
Parameters :
- labels_true: <array-like of shape (n_samples,)>: It accepts the ground truth class labels to be used as a reference.
- labels_pred: <array-like of shape (n_samples,)>: It accepts the cluster labels to evaluate.
Returns:
homogeneity: <float>: It returns a score between 0.0 and 1.0, where 1.0 stands for perfectly homogeneous labeling.
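To make the returned value more concrete, the sketch below recomputes it by hand. It assumes the usual definition of homogeneity, h = 1 - H(C|K) / H(C), where H(C) is the entropy of the classes and H(C|K) is the entropy of the classes given the cluster assignments. The helper homogeneity_from_entropy is a hypothetical name introduced for illustration and is not part of scikit-learn.

```python
import numpy as np
from sklearn.metrics import homogeneity_score


def homogeneity_from_entropy(labels_true, labels_pred):
    """Hypothetical helper: 1 - H(C|K) / H(C), using natural logarithms."""
    labels_true = np.asarray(labels_true)
    labels_pred = np.asarray(labels_pred)
    n = labels_true.size

    # H(C): entropy of the class distribution
    _, class_counts = np.unique(labels_true, return_counts=True)
    p_class = class_counts / n
    h_c = -np.sum(p_class * np.log(p_class))
    if h_c == 0.0:
        # A single class is treated as perfectly homogeneous
        return 1.0

    # H(C|K): class entropy inside each cluster, weighted by cluster size
    h_c_given_k = 0.0
    for k in np.unique(labels_pred):
        in_cluster = labels_true[labels_pred == k]
        _, counts = np.unique(in_cluster, return_counts=True)
        p = counts / in_cluster.size
        cluster_entropy = -np.sum(p * np.log(p))
        h_c_given_k += (in_cluster.size / n) * cluster_entropy

    return 1.0 - h_c_given_k / h_c


# Compare the hand-written version with scikit-learn on made-up labels
labels_true = [0, 0, 1, 1, 2, 2]
labels_pred = [0, 0, 1, 1, 1, 2]
manual = homogeneity_from_entropy(labels_true, labels_pred)
reference = homogeneity_score(labels_true, labels_pred)
print(manual, reference)
```

The two printed values should agree up to floating-point rounding. A cluster that mixes classes raises H(C|K) and therefore lowers the score.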
Example 1:
Python3
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.metrics import homogeneity_score

# Changing the location file
# cd C:\Users\Dev\Desktop\Credit Card Fraud

# Loading the data
df = pd.read_csv('creditcard.csv')

# Separating the dependent and independent variables
y = df['Class']
X = df.drop('Class', axis=1)

# Building the clustering model
kmeans = KMeans(n_clusters=2)

# Training the clustering model
kmeans.fit(X)

# Storing the predicted clustering labels
labels = kmeans.predict(X)

# Evaluating the performance (print so the score is shown when run as a script)
print(homogeneity_score(y, labels))
Output:
0.00496764949717645
Example 2: Perfectly homogeneous:
Python3
from sklearn.metrics.cluster import homogeneity_score

# Evaluate the score
hscore = homogeneity_score([0, 1, 0, 1], [1, 0, 1, 0])
print(hscore)
Output:
1.0
Example 3: Non-perfect labelings that further split classes into more clusters can be perfectly homogeneous:
Python3
from sklearn.metrics.cluster import homogeneity_score

# Evaluate the score
hscore = homogeneity_score([0, 0, 1, 1], [0, 1, 2, 3])
print(hscore)
Output:
0.9999999999999999
Example 4: Clusters that include samples from different classes do not make for a homogeneous labeling:
Python3
from sklearn.metrics.cluster import homogeneity_score

# Evaluate the score
hscore = homogeneity_score([0, 0, 1, 1], [0, 1, 0, 1])
print(hscore)
Output:
0.0