Scikit-learn is a widely used Python machine learning library that provides tools and algorithms for building machine learning models. It also offers a framework for evaluating those models with a range of metrics and scoring functions. When the built-in metrics do not capture what matters for a particular problem, you can define your own scoring function. Scikit-learn makes this possible, and in this article, we'll go over how to create and customize one.
In scikit-learn, a metric function accepts two arguments, the ground truth (actual values) and the model's predicted values, and returns a single score that measures the quality of the predictions. A scorer wraps such a function so that it can be called with an estimator and data, which is how cross-validation and hyperparameter search utilities use it. Scikit-learn ships with predefined metrics such as accuracy, precision, recall, and F1-score, and it also lets you wrap your own metric function as a scorer.
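The difference between a metric function and a scorer can be seen in a small sketch. The dataset (iris) and the estimator (LogisticRegression) are assumptions chosen only for illustration; `accuracy_score` and `get_scorer` are scikit-learn functions.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, get_scorer

# Illustrative data and model (not part of the original article)
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Metric function: compares true labels with predictions made beforehand
metric_value = accuracy_score(y, model.predict(X))

# Scorer: receives the fitted estimator and the data, and predicts internally
accuracy_scorer = get_scorer("accuracy")
scorer_value = accuracy_scorer(model, X, y)

print(metric_value, scorer_value)
```

Both calls report the same accuracy. Only the calling convention differs, and the scorer form is the one that cross-validation utilities expect.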
Custom scorer for a regression problem
To create a custom scorer in scikit-learn, follow these steps:
Step 1: Create a custom function that evaluates the predictions
Create a Python function that accepts two arguments: the ground truth (actual values) and the model's predicted values, in that order. The function should return a single score that measures the quality of the predictions.
Here we define the coefficient of determination (R²).
The coefficient of determination (R²) is a statistical measure of how well a model predicts an outcome. It measures the proportion of the variance in the dependent variable that is explained by the independent variable(s) in a regression model.
R² = 1 - RSS / TSS
Here,
- RSS = residual sum of squares, also called the sum of squared errors. It measures the variation that the regression model does not explain, and is the sum of squared differences between the actual target values and the predicted values.
- TSS = total sum of squares. It represents the total variation in the dependent variable, and is the sum of squared differences between the actual values and their mean.
R² is at most 1, with higher values indicating a better fit. A value of 1 indicates a perfect fit, and a value of 0 means the model does no better than always predicting the mean of the target. R² can also be negative, for example on held-out data, when the model predicts worse than the mean.
Python3
import numpy as np


def r_squared(y_true, y_pred):
    # Calculate the mean of the true values
    mean_y_true = np.mean(y_true)

    # Calculate the sum of squares of residuals and total sum of squares
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - mean_y_true) ** 2)

    # Calculate R²
    r2 = 1 - (ss_res / ss_tot)
    return r2
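As a sanity check, the hand-written function can be compared with scikit-learn's built-in `r2_score` on a few values. This is a minimal sketch: the sample numbers are made up for illustration, and the function is repeated here (with inputs converted to arrays so that plain lists work) to keep the block self-contained.

```python
import numpy as np
from sklearn.metrics import r2_score as sklearn_r2


def r_squared(y_true, y_pred):
    # Convert inputs so that plain Python lists are accepted as well
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - ss_res / ss_tot


# Illustrative values only
y_true = [3.0, -0.5, 2.0, 7.0]
y_good = [2.5, 0.0, 2.0, 8.0]   # close to the true values
y_bad = [7.0, 2.0, -0.5, 3.0]   # worse than predicting the mean

custom = r_squared(y_true, y_good)
reference = sklearn_r2(y_true, y_good)
negative = r_squared(y_true, y_bad)

print(custom, reference, negative)
```

The first two numbers agree, and the third is below zero, which shows that R² is not bounded below by 0 when predictions are poor.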
Step 2: Create a scorer object
Once the scoring function has been written, a scorer object must be created with scikit-learn's make_scorer() function. The scoring function is passed as an argument to make_scorer(), which returns a scorer object.
Python3
from sklearn.metrics import make_scorer

# Create a scorer object using the r_squared function
r2_score = make_scorer(r_squared)
r2_score
Output:
make_scorer(r_squared)
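make_scorer() assumes by default that a higher value is better, which suits R². For an error metric, where lower is better, the `greater_is_better=False` argument can be passed so that the scorer negates the value. The sketch below assumes a hypothetical `rmse` function and synthetic data from `make_regression`; neither comes from the original example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import make_scorer


def rmse(y_true, y_pred):
    # Root mean squared error: lower values mean better predictions
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))


# greater_is_better=False makes the scorer return the negated error,
# so that "higher is better" still holds for model selection tools
rmse_scorer = make_scorer(rmse, greater_is_better=False)

# Synthetic data and a simple model, used only for illustration
X, y = make_regression(n_samples=100, n_features=3, noise=10.0, random_state=0)
model = LinearRegression().fit(X, y)

score = rmse_scorer(model, X, y)
print(score)
```

The printed score is negative because the scorer flips the sign of the error. The sign has to be flipped back when reporting the error itself.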
Step 3: Use the scorer object
After creating the scorer object, we can use it to assess a machine learning model's performance with scikit-learn's cross-validation functions, which evaluate the model on different subsets of the dataset, or with other model evaluation tools.
Python3
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Load the California Housing Price dataset
X, y = fetch_california_housing(return_X_y=True)

# Create a Random Forest regression model
model = RandomForestRegressor()

# Evaluate the performance of the model using
# cross-validation with the custom R² scorer
scores = cross_val_score(model, X, y, cv=5, scoring=r2_score)

# Print the mean and standard deviation of the scores
print(f"R2 Squared: {scores.mean():.2f} +/- {scores.std():.2f}")
Output:
R2 Squared: 0.65 +/- 0.08
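A scorer built with make_scorer() can also be passed to other model evaluation tools, such as hyperparameter search. The following is a minimal sketch that assumes a Ridge model, the bundled diabetes dataset, and an arbitrary grid of `alpha` values. These choices are illustrative and replace the Random Forest setup above only to keep the example small.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.metrics import make_scorer
from sklearn.model_selection import GridSearchCV


def r_squared(y_true, y_pred):
    # Same R² definition as in Step 1, repeated to keep this block standalone
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - ss_res / ss_tot


# Small bundled dataset, so no download is needed
X, y = load_diabetes(return_X_y=True)

# Illustrative grid; the candidate values are arbitrary
param_grid = {"alpha": [0.01, 0.1, 1.0, 10.0]}

search = GridSearchCV(
    Ridge(),
    param_grid=param_grid,
    scoring=make_scorer(r_squared),  # custom scorer ranks the candidates
    cv=5,
)
search.fit(X, y)

print(search.best_params_, search.best_score_)
```

The search selects the `alpha` whose mean cross-validated R², as computed by the custom function, is highest.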
Custom scorer for a multi-class classification problem
Steps:
- Import the necessary libraries
- Load the iris dataset
- Define multiple metrics like accuracy_score, precision_score, recall_score, f1_score with make_scorer.
- Create an XGBClassifier model
- Evaluate the model using cross-validation and the custom scorer
- Print the mean scores for each metric
Python3
from sklearn.metrics import make_scorer, accuracy_score
from sklearn.metrics import precision_score, recall_score, f1_score
from sklearn.model_selection import cross_validate
from sklearn.datasets import load_iris
from xgboost import XGBClassifier

# Load the iris dataset
iris = load_iris()

# Define multiple metrics
scoring = {
    'accuracy': make_scorer(accuracy_score),
    'precision': make_scorer(precision_score, average='macro'),
    'recall': make_scorer(recall_score, average='macro'),
    'f1-score': make_scorer(f1_score, average='macro'),
}

# Create an XGBClassifier
clf = XGBClassifier(n_estimators=2, max_depth=3, learning_rate=0.1)

# Evaluate the model using cross-validation and the custom scorers
scores = cross_validate(clf, iris.data, iris.target, cv=5, scoring=scoring)

# Print the mean scores for each metric
print("Accuracy mean score:", scores['test_accuracy'].mean())
print("Precision mean score:", scores['test_precision'].mean())
print("Recall mean score:", scores['test_recall'].mean())
print("f1-score:", scores['test_f1-score'].mean())
Output:
Accuracy mean score: 0.9666666666666668
Precision mean score: 0.9707070707070707
Recall mean score: 0.9666666666666668
f1-score: 0.9664818612187034
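The metrics above are all built into scikit-learn, but a hand-written classification metric can be wrapped in the same way. The sketch below assumes a hypothetical `worst_class_recall` function, which reports the recall of the weakest class. It uses LogisticRegression in place of XGBClassifier purely to avoid the extra dependency.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, recall_score
from sklearn.model_selection import cross_validate


def worst_class_recall(y_true, y_pred):
    # Recall per class, then keep the lowest one (hypothetical custom metric)
    per_class = recall_score(y_true, y_pred, average=None)
    return float(np.min(per_class))


X, y = load_iris(return_X_y=True)

scoring = {
    # Built-in metric with an extra keyword argument forwarded by make_scorer
    "macro_recall": make_scorer(recall_score, average="macro"),
    # Hand-written metric wrapped the same way
    "worst_recall": make_scorer(worst_class_recall),
}

clf = LogisticRegression(max_iter=1000)
scores = cross_validate(clf, X, y, cv=5, scoring=scoring)

print("Macro recall:", scores["test_macro_recall"].mean())
print("Worst-class recall:", scores["test_worst_recall"].mean())
```

Because the worst-class recall is a minimum over classes, it can never exceed the macro-averaged recall on the same fold, which makes it a stricter view of the same predictions.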