ML | Chi-square Test for feature selection

27 July 2024


Feature selection, also known as attribute selection, is the process of choosing the most relevant features from a dataset before applying machine learning algorithms, with the aim of improving model performance. A large number of irrelevant features increases training time and raises the risk of overfitting.

Chi-square Test for Feature Selection:
The chi-square test is used for categorical features in a dataset. We calculate the chi-square statistic between each feature and the target, then select the desired number of features with the highest scores. The test assesses whether an association observed between two categorical variables in the sample is likely to reflect a real association in the population rather than chance.
The chi-square score is given by:

$$\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}$$

where

O_i (observed frequency) = number of observations of class i
E_i (expected frequency) = number of observations of class i that would be expected if there were no relationship between the feature and the target.
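To make the observed and expected frequencies concrete, here is a minimal sketch that computes the statistic by hand from a contingency table: for every (feature value, class) cell it takes the squared gap between the observed and expected counts, divides by the expected count, and sums the results. The function name `chi_square_score` and the toy arrays are illustrative assumptions, not part of any library. scikit-learn's `chi2` derives its statistic from per-class sums of non-negative feature values, so its numbers need not match this function.

```python
import numpy as np

def chi_square_score(feature, target):
    """Chi-square statistic between two categorical arrays (hypothetical helper)."""
    feature = np.asarray(feature)
    target = np.asarray(target)

    # Map each distinct category to an integer index
    f_vals, f_idx = np.unique(feature, return_inverse=True)
    t_vals, t_idx = np.unique(target, return_inverse=True)

    # Observed frequencies: count of samples in each (feature value, class) cell
    observed = np.zeros((len(f_vals), len(t_vals)))
    np.add.at(observed, (f_idx, t_idx), 1)

    # Expected frequencies if feature and target were unrelated:
    # row total * column total / grand total
    row_totals = observed.sum(axis=1, keepdims=True)
    col_totals = observed.sum(axis=0, keepdims=True)
    expected = row_totals * col_totals / observed.sum()

    return ((observed - expected) ** 2 / expected).sum()

# Toy data (assumed for illustration): the first feature tracks the target
# perfectly, the second is unrelated to it
target = [0, 0, 1, 1]
dependent = ["a", "a", "b", "b"]
independent = ["a", "b", "a", "b"]

print(chi_square_score(dependent, target))    # 4.0
print(chi_square_score(independent, target))  # 0.0
```

A higher score means the observed counts sit further from what independence would predict, which is why features are ranked by this value.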

Python Implementation of Chi-Square feature selection:

# Load libraries 
from sklearn.datasets import load_iris 
from sklearn.feature_selection import SelectKBest 
from sklearn.feature_selection import chi2 
  
# Load iris data 
iris_dataset = load_iris() 
  
# Create features and target 
X = iris_dataset.data 
y = iris_dataset.target 
  
# Discretize the continuous measurements by truncating them to integers
# (chi2 expects non-negative values such as counts or categories)
X = X.astype(int) 
  
# Two features with highest chi-squared statistics are selected 
chi2_features = SelectKBest(chi2, k = 2) 
X_kbest_features = chi2_features.fit_transform(X, y) 
  
# Reduced features 
print('Original feature number:', X.shape[1]) 
print('Reduced feature number:', X_kbest_features.shape[1]) 

Output:

Original feature number: 4
Reduced feature number: 2
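The shapes alone do not show which columns were kept. As a hedged follow-up sketch, the fitted selector exposes the per-feature statistics through its `scores_` and `pvalues_` attributes and the kept columns through `get_support()`. The variable names `selector` and `selected` are assumptions introduced here, and the same integer truncation as above is applied.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

iris = load_iris()
X = iris.data.astype(int)  # same integer truncation as in the example above
y = iris.target

# Fit the selector without transforming, so the scores can be inspected
selector = SelectKBest(chi2, k=2).fit(X, y)

# Chi-square statistic and p-value for every original feature
for name, score, p_value in zip(iris.feature_names, selector.scores_, selector.pvalues_):
    print(f"{name}: chi2={score:.2f}, p={p_value:.3g}")

# get_support() returns a boolean mask over the original columns
selected = [name for name, keep in zip(iris.feature_names, selector.get_support()) if keep]
print("Selected features:", selected)
```

Printing the scores next to the feature names makes it easier to judge whether the chosen value of `k` is reasonable or whether the cutoff falls between two nearly equal scores.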
