Skewness is a statistical measure of the asymmetry of a distribution. It summarizes in a single number how far, and in which direction, the shape of a distribution departs from symmetry, without requiring inspection of the full frequency distribution. With respect to skewness, a distribution can be of two types:
- Symmetrical: A distribution is called symmetric if it looks the same to the left and to the right of the center point.
- Asymmetrical: A distribution is called asymmetric if it doesn’t look the same to the left and to the right of the center point.
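Interpreting a distribution on the basis of its skewness value:
- Skewness = 0: The two tails balance out. This holds for any symmetric distribution, including the normal distribution, so a value of 0 alone does not imply that the data is normally distributed.
- Skewness > 0: The distribution is positively (right) skewed. The right tail is longer or heavier, and the bulk of the data lies to the left.
- Skewness < 0: The distribution is negatively (left) skewed. The left tail is longer or heavier, and the bulk of the data lies to the right.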
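The sign of the skewness can be checked on small samples. The sketch below is only an illustration: the three lists and their variable names are made up for this purpose and are not data from this article.

```python
from scipy.stats import skew

# Hypothetical toy samples chosen only to illustrate the sign of skewness.
symmetric = [1, 2, 3, 4, 5]          # mirror image around 3
right_skewed = [1, 2, 3, 4, 20]      # one large value stretches the right tail
left_skewed = [-20, -4, -3, -2, -1]  # one small value stretches the left tail

print(skew(symmetric))     # zero for a perfectly symmetric sample
print(skew(right_skewed))  # positive
print(skew(left_skewed))   # negative
```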
Kurtosis:
Kurtosis is another statistical measure that describes the shape of a frequency distribution. It indicates whether a distribution is heavy-tailed or light-tailed relative to the normal distribution.
- The kurtosis of a normal distribution is equal to 3 (Pearson’s definition).
- A distribution with kurtosis < 3 is called platykurtic. It has lighter tails than the normal distribution and tends to produce fewer extreme values.
- A distribution with kurtosis > 3 is called leptokurtic. It has heavier tails than the normal distribution and tends to produce more outliers.
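To make the comparison against 3 concrete, the following sketch draws random samples from three well-known distributions and prints their Pearson kurtosis. The seed, the sample size and the choice of distributions are assumptions made for illustration only.

```python
import numpy as np
from scipy.stats import kurtosis

# Fixed seed so the sketch is repeatable; the sample size is an arbitrary choice.
rng = np.random.default_rng(0)
n = 100_000

samples = {
    "uniform (light tails)": rng.uniform(-1, 1, n),
    "normal": rng.normal(0, 1, n),
    "laplace (heavy tails)": rng.laplace(0, 1, n),
}

# fisher=False requests Pearson's definition, where a normal distribution gives 3.
pearson = {name: kurtosis(x, fisher=False) for name, x in samples.items()}
for name, value in pearson.items():
    print(f"{name}: {value:.2f}")
```

The theoretical values are 1.8 for the uniform distribution (platykurtic), 3 for the normal distribution and 6 for the Laplace distribution (leptokurtic), so the printed estimates should land close to those numbers.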
This article focuses on how to calculate skewness and kurtosis in Python.
How to Calculate Skewness & Kurtosis in Python?
Calculating Skewness and Kurtosis is a step-by-step process. The steps are discussed below.
Step 1: Importing SciPy library.
SciPy is an open-source scientific library. It provides inbuilt functions to calculate Skewness and Kurtosis. We can import this library by using the below code.
Python3
# Importing scipy
import scipy
Step 2: Create a dataset.
Before calculating Skewness and Kurtosis we need to create a dataset.
Python3
# Creating a dataset
dataset = [10, 25, 14, 26, 35, 45, 67, 90, 40, 50, 60, 10, 16, 18, 20]
Step 3: Computing skewness of the dataset.
We can calculate the skewness of the dataset by using the inbuilt skew() function. Its syntax is given below:
Syntax:
scipy.stats.skew(array, axis=0, bias=True)
Parameters:
- array: It represents the input array (or object) containing elements.
- axis: It signifies the axis along which we want to find the skewness value (By default axis = 0).
- bias = False: Calculations are corrected for statistical bias (by default bias = True, which applies no correction).
Return Type:
Skewness value of the data set, along the axis.
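The effect of the axis and bias parameters can be sketched on a small 2-D array. The array below is hypothetical and exists only for illustration. The last lines rest on the assumption that the bias correction rescales the biased estimate by sqrt(n(n-1))/(n-2), the usual adjusted Fisher-Pearson form.

```python
import numpy as np
from scipy.stats import skew

# Hypothetical 2-D data: 4 observations (rows) of 2 variables (columns).
data = np.array([
    [1.0, 9.0],
    [2.0, 8.0],
    [3.0, 7.0],
    [10.0, 1.0],
])

# axis=0 computes one skewness value per column.
biased = skew(data, axis=0, bias=True)
# axis=None flattens the array and returns a single value.
overall = skew(data, axis=None)
# bias=False applies the small-sample correction.
unbiased = skew(data, axis=0, bias=False)

n = data.shape[0]
correction = np.sqrt(n * (n - 1)) / (n - 2)  # assumed form of the correction factor

print(biased, overall)
print(unbiased, biased * correction)
```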
Example:
Python3
# Importing library
from scipy.stats import skew

# Creating a dataset
dataset = [88, 85, 82, 97, 67, 77, 74, 86, 81, 95, 77, 88, 85, 76, 81]

# Calculate the skewness
print(skew(dataset, axis=0, bias=True))
Output:
The printed value is positive (approximately 0.03), which signifies that the distribution is slightly positively skewed and close to symmetric.
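To see what skew() computes with bias = True, the value can be reproduced by hand from the central moments as m3 / m2^(3/2), where m2 and m3 are the second and third central moments of the sample. The sketch below uses NumPy for the arithmetic; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import skew

dataset = [88, 85, 82, 97, 67, 77, 74, 86, 81, 95, 77, 88, 85, 76, 81]

x = np.asarray(dataset, dtype=float)
deviations = x - x.mean()
m2 = np.mean(deviations ** 2)  # second central moment (biased variance)
m3 = np.mean(deviations ** 3)  # third central moment
manual_skewness = m3 / m2 ** 1.5

print(manual_skewness)
print(skew(dataset, bias=True))  # should agree with the manual value
```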
Step 4: Computing kurtosis of the dataset.
We can calculate the kurtosis of the dataset by using the inbuilt kurtosis() function. Its syntax is given below:
Syntax:
scipy.stats.kurtosis(array, axis=0, fisher=True, bias=True)
Parameters:
- array: Input array or object having the elements.
- axis: It represents the axis along which the kurtosis value is to be measured. By default axis = 0.
- fisher = True: Fisher’s definition is used (a normal distribution gives 0.0). This is the default.
- fisher = False: Pearson’s definition is used (a normal distribution gives 3.0).
- bias = False: Calculations are corrected for statistical bias (by default bias = True, which applies no correction).
Return Type:
Kurtosis value of the data set, along the axis.
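The two definitions differ only by a constant offset of 3, which the sketch below checks on a small made-up sample (the sample values are an assumption for illustration). It also shows that bias = False returns a different, corrected estimate.

```python
from scipy.stats import kurtosis

# Hypothetical sample used only to compare the parameter settings.
sample = [2, 4, 4, 5, 7, 9, 15]

excess = kurtosis(sample, fisher=True)    # Fisher: normal distribution -> 0
pearson = kurtosis(sample, fisher=False)  # Pearson: normal distribution -> 3
unbiased_excess = kurtosis(sample, fisher=True, bias=False)

print(excess, pearson)   # pearson equals excess + 3
print(unbiased_excess)   # bias-corrected excess kurtosis
```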
Example:
Python3
# Importing library
from scipy.stats import kurtosis

# Creating a dataset
dataset = [88, 85, 82, 97, 67, 77, 74, 86, 81, 95, 77, 88, 85, 76, 81]

# Calculate the kurtosis
print(kurtosis(dataset, axis=0, bias=True))
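Output:
The printed value is negative (approximately -0.29). Because fisher = True by default, the result is excess kurtosis measured relative to a normal distribution, so a negative value signifies that the distribution is platykurtic: it has lighter tails than a normal distribution.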
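Putting the steps together, the sketch below applies both functions to the dataset created in Step 2. The helper describe_shape is a hypothetical convenience function written for this example; it is not part of SciPy.

```python
from scipy.stats import kurtosis, skew

# Dataset from Step 2 of this article.
dataset = [10, 25, 14, 26, 35, 45, 67, 90, 40, 50, 60, 10, 16, 18, 20]


def describe_shape(values):
    """Hypothetical helper: skewness plus both kurtosis conventions."""
    return {
        "skewness": skew(values),
        "excess_kurtosis": kurtosis(values, fisher=True),
        "pearson_kurtosis": kurtosis(values, fisher=False),
    }


shape = describe_shape(dataset)
for name, value in shape.items():
    print(f"{name}: {value:.4f}")
```

For this dataset the skewness comes out positive, which is consistent with the few large values (67 and 90) stretching the right tail.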