How to rank Python NumPy arrays with ties?

26 July 2024


In this article, we are going to see how to rank NumPy arrays that contain ties in Python.

Ranking is an essential statistical operation used in many fields, such as data science and sociology. A brute-force approach is to sort the indices of the array by their corresponding values. That works when all values are distinct, but it assigns different ranks to equal values. This article goes one step further and explores the rankdata() function from the SciPy library, illustrating its use on arrays that contain ties.
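The brute-force idea can be sketched with NumPy alone. The helper name ordinal_rank and the sample inputs are illustrative assumptions; the point is only to show that sorting indices gives equal values different ranks.

```python
import numpy as np

def ordinal_rank(values):
    """Rank values from 1..n by sorting their indices (hypothetical helper)."""
    values = np.asarray(values)
    # Indices that would sort the array; a stable sort keeps ties in input order
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=int)
    # The element at sorted position k receives rank k + 1
    ranks[order] = np.arange(1, len(values) + 1)
    return ranks

print(ordinal_rank([30, 10, 20]))  # all values distinct
print(ordinal_rank([30, 10, 10]))  # the two 10s still get different ranks
```

In the second call the tied values are separated only by their position in the input, which is the situation the tie-breaking strategies of rankdata() are meant to handle.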

rankdata() function

To compute the ranks, we'll use the rankdata() function from the scipy.stats module. It supports five tie-breaking strategies, and its syntax is as follows:

Syntax: scipy.stats.rankdata(a, method='average', axis=None)

Parameters:

a: An n-dimensional array (or array-like) of values to rank.

method: A string naming the tie-breaking strategy. There are 5 options:

'average': Each tied value is assigned the average of the ranks that the tied values would otherwise have received.

'min': Each tied value is assigned the minimum of the ranks that the tied values would otherwise have received.

'max': Each tied value is assigned the maximum of the ranks that the tied values would otherwise have received.

'dense': Like 'min', but the next distinct value receives the rank immediately after the one assigned to the tied values, so no ranks are skipped.

'ordinal': All values are given a distinct rank, corresponding to the order in which the values occur in a.

axis: Axis along which to perform the ranking. If None, the data array is first flattened.

Returns: A NumPy array of the same size as a, containing the ranks.
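To make the tie-breaking rules concrete, the following sketch reproduces four of the strategies by hand with np.unique() and checks them against rankdata(). The sample array values and the variable names are illustrative assumptions, and this is not how SciPy implements the function internally.

```python
import numpy as np
from scipy.stats import rankdata

# Hypothetical sample with one tie (the two 20s)
values = np.array([20, 10, 30, 20])

# Distinct values in sorted order, each element's group index, and group sizes
uniq, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)

max_rank = np.cumsum(counts)        # last ordinal position occupied by each group
min_rank = max_rank - counts + 1    # first ordinal position occupied by each group

manual = {
    "min": min_rank[inverse],
    "max": max_rank[inverse],
    "average": ((min_rank + max_rank) / 2)[inverse],
    "dense": np.arange(1, len(uniq) + 1)[inverse],
}

for method, ranks in manual.items():
    # Each hand-computed result should match SciPy's result for the same method
    assert np.array_equal(ranks, rankdata(values, method=method))
    print(method, ranks)
```

Here the two 20s occupy ordinal positions 2 and 3, so they receive 2 under 'min', 3 under 'max', 2.5 under 'average', and 2 under 'dense'.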

Example 1: Ranking on a 1-D Numpy Array

In this example, we’ll explore all the tie-breaking strategies over a 1-dimensional Numpy array.

Python3

import numpy as np
from scipy.stats import rankdata

arr = np.array([-20, -10, -10, -10, 10,
                20, 20, 50, 50, 60, 60,
                60, 60, 60])

# Ordinal ranking: every value gets a distinct rank,
# ties are broken by order of appearance
print(f"Ordinal ranking: {rankdata(arr, method='ordinal')}")

# Average ranking: each tied value gets the
# average of the ranks spanned by its tie group
print(f"Average ranking: {rankdata(arr, method='average')}")

# Max ranking: each tied value gets the
# largest ordinal rank in its tie group
print(f"Max ranking: {rankdata(arr, method='max')}")

# Min ranking: each tied value gets the
# smallest ordinal rank in its tie group
print(f"Min ranking: {rankdata(arr, method='min')}")

# Dense ranking: like min, but ranks are
# consecutive with no gaps after a tie group
print(f"Dense ranking: {rankdata(arr, method='dense')}")

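Output:

Ordinal ranking: [ 1  2  3  4  5  6  7  8  9 10 11 12 13 14]
Average ranking: [ 1.   3.   3.   3.   5.   6.5  6.5  8.5  8.5 12.  12.  12.  12.  12. ]
Max ranking: [ 1  4  4  4  5  7  7  9  9 14 14 14 14 14]
Min ranking: [ 1  2  2  2  5  6  6  8  8 10 10 10 10 10]
Dense ranking: [1 2 2 2 3 4 4 5 5 6 6 6 6 6]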

Example 2: Ranking on a 2-D Numpy Array along a particular axis using the ‘axis’ argument

In this example, we’ll apply all the tie-breaking strategies to a 2-dimensional NumPy array with axis=0, which ranks the values within each column.

Python3

import numpy as np
from scipy.stats import rankdata

arr = np.array([[-20, -10, -10, -10, 10, 20, 20],
                [50, 50, 60, -20, 60, 60, 60],
                [-20, 50, -10, -30, 60, 20, 60]])

# Ordinal ranking: every value in a column gets a distinct rank,
# ties are broken by order of appearance
print(f"Ordinal ranking:\n {rankdata(arr, method='ordinal', axis=0)}")

# Average ranking: each tied value gets the
# average of the ranks spanned by its tie group
print(f"Average ranking:\n {rankdata(arr, method='average', axis=0)}")

# Max ranking: each tied value gets the
# largest ordinal rank in its tie group
print(f"Max ranking:\n {rankdata(arr, method='max', axis=0)}")

# Min ranking: each tied value gets the
# smallest ordinal rank in its tie group
print(f"Min ranking:\n {rankdata(arr, method='min', axis=0)}")

# Dense ranking: like min, but ranks are
# consecutive with no gaps after a tie group
print(f"Dense ranking:\n {rankdata(arr, method='dense', axis=0)}")

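Output:

Ordinal ranking:
 [[1 1 1 3 1 1 1]
 [3 2 3 2 2 3 2]
 [2 3 2 1 3 2 3]]
Average ranking:
 [[1.5 1.  1.5 3.  1.  1.5 1. ]
 [3.  2.5 3.  2.  2.5 3.  2.5]
 [1.5 2.5 1.5 1.  2.5 1.5 2.5]]
Max ranking:
 [[2 1 2 3 1 2 1]
 [3 3 3 2 3 3 3]
 [2 3 2 1 3 2 3]]
Min ranking:
 [[1 1 1 3 1 1 1]
 [3 2 3 2 2 3 2]
 [1 2 1 1 2 1 2]]
Dense ranking:
 [[1 1 1 3 1 1 1]
 [2 2 2 2 2 2 2]
 [1 2 1 1 2 1 2]]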

With axis=0, each column of the 2-D array ‘arr’ is ranked independently: every entry is compared only with the other entries in the same column, so the ranks in each column run from 1 to 3.
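To contrast the possible values of the axis argument, here is a small sketch that ranks the same array by column, by row, and flattened. The array grid and the variable names are illustrative assumptions; 'min' is used only to keep the ranks as integers.

```python
import numpy as np
from scipy.stats import rankdata

# Hypothetical 2 x 3 array with ties inside rows and across rows
grid = np.array([[10, 20, 20],
                 [30, 10, 30]])

# axis=0: each column is ranked on its own (2 entries per column)
by_column = rankdata(grid, method="min", axis=0)

# axis=1: each row is ranked on its own (3 entries per row)
by_row = rankdata(grid, method="min", axis=1)

# axis=None (the default): the array is flattened and all 6 entries compete
flattened = rankdata(grid, method="min")

print("axis=0:\n", by_column)
print("axis=1:\n", by_row)
print("axis=None:\n", flattened)
```

With axis=0 or axis=1 the result keeps the shape of the input, whereas with axis=None the ranks describe the flattened data.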
