NumPy is a Python library for numerical computing. It provides a multidimensional array object along with a wide range of mathematical functions. In this article, we go through the essential NumPy functions used in the descriptive analysis of an array. Let's start by initializing a sample array for our analysis.
The following code initializes a NumPy array:
Python3
import numpy as np

# sample array
arr = np.array([4, 5, 8, 5, 6, 4, 9, 2, 4, 3, 6])
print(arr)
Output:
[4 5 8 5 6 4 9 2 4 3 6]
In order to describe our NumPy array, we need to find two types of statistics:
- Measures of central tendency.
- Measures of dispersion.
Measures of central tendency
The following methods are used to find measures of central tendency in NumPy:
- mean()- takes a NumPy array as an argument and returns the arithmetic mean of the data.
np.mean(arr)
- median()- takes a NumPy array as an argument and returns the median of the data.
np.median(arr)
The following example illustrates the usage of the mean() and median() methods.
Example:
Python3
import numpy as np

arr = np.array([4, 5, 8, 5, 6, 4, 9, 2, 4, 3, 6])

# measures of central tendency
mean = np.mean(arr)
median = np.median(arr)

print("Array =", arr)
print("Mean =", mean)
print("Median =", median)
Output:
Array = [4 5 8 5 6 4 9 2 4 3 6]
Mean = 5.09090909091
Median = 5.0
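To make explicit what these two functions compute, the following sketch recomputes both values by hand and compares them with NumPy's results. The names manual_mean, sorted_arr and manual_median are illustrative only and are not part of NumPy; the median shortcut assumes an array with an odd number of elements, as is the case here.

```python
import numpy as np

arr = np.array([4, 5, 8, 5, 6, 4, 9, 2, 4, 3, 6])

# arithmetic mean: sum of the elements divided by their count
manual_mean = arr.sum() / arr.size

# median: middle element of the sorted data (valid because the length, 11, is odd)
sorted_arr = np.sort(arr)
manual_median = sorted_arr[arr.size // 2]

# both comparisons should print True
print(np.isclose(manual_mean, np.mean(arr)))
print(manual_median == np.median(arr))
```

For an array with an even number of elements, np.median() averages the two middle values instead, so the indexing shortcut above would not apply.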
Measures of dispersion
The following methods are used to find measures of dispersion in NumPy:
- amin()- it takes a NumPy array as an argument and returns the minimum.
np.amin(arr)
- amax()- it takes a NumPy array as an argument and returns the maximum.
np.amax(arr)
- ptp()- it takes a NumPy array as an argument and returns the range of the data.
np.ptp(arr)
- var()- it takes a NumPy array as an argument and returns the variance of the data. By default this is the population variance (ddof=0).
np.var(arr)
- std()- it takes a NumPy array as an argument and returns the standard deviation of the data.
np.std(arr)
Example: The following code illustrates amin(), amax(), ptp(), var() and std() methods.
Python3
import numpy as np

arr = np.array([4, 5, 8, 5, 6, 4, 9, 2, 4, 3, 6])

# measures of dispersion
# (names chosen so that the built-ins min, max and range are not shadowed)
minimum = np.amin(arr)
maximum = np.amax(arr)
value_range = np.ptp(arr)
variance = np.var(arr)
sd = np.std(arr)

print("Array =", arr)
print("Measures of Dispersion")
print("Minimum =", minimum)
print("Maximum =", maximum)
print("Range =", value_range)
print("Variance =", variance)
print("Standard Deviation =", sd)
Output:
Array = [4 5 8 5 6 4 9 2 4 3 6]
Measures of Dispersion
Minimum = 2
Maximum = 9
Range = 7
Variance = 3.90082644628
Standard Deviation = 1.9750509984
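The relationship between these measures can be checked directly. The sketch below rebuilds the range, variance and standard deviation from their definitions and compares them with the NumPy results. The manual_* and sample_variance names are illustrative, not NumPy API. It also shows the ddof parameter of np.var(), which switches from the default population variance (divide by n) to the sample variance (divide by n - 1).

```python
import numpy as np

arr = np.array([4, 5, 8, 5, 6, 4, 9, 2, 4, 3, 6])

# range: difference between the maximum and the minimum
manual_range = np.amax(arr) - np.amin(arr)

# population variance: mean of the squared deviations from the mean
manual_variance = np.mean((arr - np.mean(arr)) ** 2)

# standard deviation: square root of the variance
manual_sd = np.sqrt(manual_variance)

# sample variance: ddof=1 divides by n - 1 instead of n
sample_variance = np.var(arr, ddof=1)

# each comparison should print True
print(manual_range == np.ptp(arr))
print(np.isclose(manual_variance, np.var(arr)))
print(np.isclose(manual_sd, np.std(arr)))
print(sample_variance > np.var(arr))
```

Which variant is appropriate depends on whether the array is treated as the whole population or as a sample drawn from a larger one; np.std() accepts the same ddof argument.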
Example: Now we can combine the above-mentioned examples to get a complete descriptive analysis of our array.
Python3
import numpy as np

arr = np.array([4, 5, 8, 5, 6, 4, 9, 2, 4, 3, 6])

# measures of central tendency
mean = np.mean(arr)
median = np.median(arr)

# measures of dispersion
# (names chosen so that the built-ins min, max and range are not shadowed)
minimum = np.amin(arr)
maximum = np.amax(arr)
value_range = np.ptp(arr)
variance = np.var(arr)
sd = np.std(arr)

print("Descriptive analysis")
print("Array =", arr)
print("Measures of Central Tendency")
print("Mean =", mean)
print("Median =", median)
print("Measures of Dispersion")
print("Minimum =", minimum)
print("Maximum =", maximum)
print("Range =", value_range)
print("Variance =", variance)
print("Standard Deviation =", sd)
Output:
Descriptive analysis
Array = [4 5 8 5 6 4 9 2 4 3 6]
Measures of Central Tendency
Mean = 5.09090909091
Median = 5.0
Measures of Dispersion
Minimum = 2
Maximum = 9
Range = 7
Variance = 3.90082644628
Standard Deviation = 1.9750509984
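If the same summary is needed for several arrays, the calls can be wrapped in a small helper. The following is a minimal sketch, not a NumPy function: the name describe and the dictionary keys are hypothetical and chosen here for illustration only.

```python
import numpy as np


def describe(data):
    """Return common descriptive statistics of array-like data as a dict.

    Hypothetical helper for illustration; not part of NumPy.
    """
    arr = np.asarray(data)
    return {
        "mean": np.mean(arr),
        "median": np.median(arr),
        "minimum": np.amin(arr),
        "maximum": np.amax(arr),
        "range": np.ptp(arr),
        "variance": np.var(arr),  # population variance (ddof=0)
        "standard deviation": np.std(arr),  # population standard deviation
    }


stats = describe([4, 5, 8, 5, 6, 4, 9, 2, 4, 3, 6])
for name, value in stats.items():
    print(name, "=", value)
```

Returning a dictionary rather than printing inside the function keeps the values available for further computation.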