Friday, November 15, 2024

Chi-square distance in Python

The chi-square distance is a statistical measure of how dissimilar two feature arrays (for example, two histograms) are: the smaller the distance, the more similar the inputs. It is commonly used in applications such as similar-image retrieval, image texture analysis, and feature extraction. The chi-square distance between two arrays ‘x’ and ‘y’ of dimension ‘n’ is calculated using the following formula:

$$\chi^2(x, y) = \frac{1}{2} \sum_{i=1}^{n} \frac{(x_i - y_i)^2}{x_i + y_i}$$

In this article, we will learn how to calculate the chi-square distance using Python. Two different methods are given below; let’s see both of them with examples. 
Method #1: Calculating the chi-square distance manually using the formula above. 
 

Python3




# importing numpy library
import numpy as np
 
# Function to calculate the chi-square distance
def chi2_distance(A, B):
 
    # compute the chi-squared distance using the formula above,
    # skipping pairs where a + b == 0 to avoid division by zero
    chi = 0.5 * np.sum([((a - b) ** 2) / (a + b)
                      for (a, b) in zip(A, B) if a + b != 0])
 
    return chi
 
# example usage
if __name__ == "__main__":
    a = [1, 2, 13, 5, 45, 23]
    b = [67, 90, 18, 79, 24, 98]
 
    result = chi2_distance(a, b)
    print("The Chi-square distance is :", result)


Input : a = [1, 2, 13, 5, 45, 23]
        b = [67, 90, 18, 79, 24, 98] 
Output : The Chi-square distance is : 133.55428601494035

Input : a = [91, 900, 78, 30, 602, 813]
        b = [57, 49, 36, 759, 234, 928]
Output :  The Chi-square distance is : 814.776999405035
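
For longer inputs such as image histograms, the same formula can be evaluated without a Python-level loop. The following is a minimal sketch rather than part of the method above: the function name `chi2_distance_vectorized` and the `normalize` option are assumptions introduced here for illustration. It assumes non-negative inputs with non-zero totals, and it skips bins that are empty in both arrays.

```python
import numpy as np

def chi2_distance_vectorized(x, y, normalize=False):
    """Chi-square distance between two 1-D arrays (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("x and y must have the same shape")
    if normalize:
        # Scale each array to sum to 1 so that differing totals do not dominate.
        x = x / x.sum()
        y = y / y.sum()
    total = x + y
    nonzero = total != 0  # bins that are empty in both inputs contribute nothing
    return 0.5 * np.sum((x[nonzero] - y[nonzero]) ** 2 / total[nonzero])

a = [1, 2, 13, 5, 45, 23]
b = [67, 90, 18, 79, 24, 98]
print(chi2_distance_vectorized(a, b))                  # about 133.554, as in Method #1
print(chi2_distance_vectorized(a, a))                  # 0.0 for identical inputs
print(chi2_distance_vectorized(a, b, normalize=True))  # distance between normalized arrays
```

When both inputs are normalized to sum to 1, the distance lies between 0 and 1, which makes values comparable across array pairs with different totals.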

  
 
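Method #2: Using the scipy.stats.chisquare() function. Note that this function performs a one-way chi-square goodness-of-fit test: it returns Pearson's statistic, the sum of (f_obs - f_exp)^2 / f_exp over all categories, together with a p-value. This is related to, but not the same as, the symmetric distance computed in Method #1.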
 

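Syntax: scipy.stats.chisquare(f_obs, f_exp=None, ddof=0, axis=0) 
Parameters: 
==> f_obs : array_like, the observed frequencies in each category 
==> f_exp : array_like, optional, the expected frequencies in each category; if omitted, all categories are assumed to be equally likely 
==> ddof (“delta degrees of freedom”, an adjustment to the degrees of freedom used for the p-value) : int, optional 
==> axis : int or None, optional, the axis along which the test is applied; if None, all values in f_obs are treated as a single data set 
The default value of both ddof and axis is 0.
Returns: 
==> chisq : float or ndarray, the chi-square test statistic 
==> p-value of the test : float or ndarray 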
 

 

Python3




# importing scipy
from scipy.stats import chisquare
 
k = [3, 4, 6, 2, 9, 5, 2]
print(chisquare(k))


Output : 
 

Power_divergenceResult(statistic=8.516129032258064, pvalue=0.20267440425509237)
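
To see what the call above computes, the statistic and p-value can be reproduced by hand. This is a minimal sketch under stated assumptions: the names `observed`, `expected`, `manual_stat`, and `manual_p` are introduced here for illustration, and the expected counts are set to the mean of the observed counts, which is what chisquare() assumes when f_exp is omitted.

```python
import numpy as np
from scipy.stats import chisquare, chi2

observed = np.array([3, 4, 6, 2, 9, 5, 2], dtype=float)
# With f_exp omitted, every category is expected to hold the mean count.
expected = np.full_like(observed, observed.mean())

# Pearson's statistic computed by hand: sum of (observed - expected)^2 / expected.
manual_stat = np.sum((observed - expected) ** 2 / expected)
# p-value from the chi-square distribution with (number of categories - 1)
# degrees of freedom, matching the default ddof=0.
manual_p = chi2.sf(manual_stat, df=len(observed) - 1)

result = chisquare(observed, f_exp=expected)
print(manual_stat, manual_p)
print(result.statistic, result.pvalue)
```

Both lines should print the same pair of values as the output shown above. Unlike the distance in Method #1, this statistic divides by the expected count only, so swapping the two arrays generally changes the result.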

 
