Thursday, July 4, 2024
HomeLanguagesPythonr-Nearest neighbors

r-Nearest neighbors

r-Nearest neighbors are a modified version of the k-nearest neighbors. The issue with k-nearest neighbors is the choice of k. With a smaller k, the classifier would be more sensitive to outliers. If the value of k is large, then the classifier would be including many points from other classes. It is from this logic that we get the r near neighbors algorithm. 

Intuition: 

Consider the following data, as the training set.

  

The green color points belong to class 0 and the red color points belong to class 1. Consider the white point P as the query point whose

 

If we take the radius of the circle as 2.2 units and if a circle is drawn using the point P as the center of the circle, the plot would be as follows

  

As the number of points in the circle belonging to class 1 (5 points) is greater than the number of points belonging to class 0 (2 points) 

Algorithm:

Step 1: Given the point P, determine the sub-set of data that lies in the ball of radius r centered at P,

Br (P) = { Xi ∊ X | dist( P, Xi ) ≤ r }

Step 2: If Br (P) is empty, then output the majority class of the entire data set. 

Step 3: If Br (P) is not empty, output the majority class of the data points in it.

Implementation of the r radius neighbors algorithm is as follows:: 

C++




// C++ program to implement the
// r nearest neighbours algorithm.
#include <bits/stdc++.h>
using namespace std;
 
struct Point
{
    // Class of point
    int val;
     
    // Co-ordinate of point
    double x, y;
};
 
// This function classifies the point p using
// r k nearest neighbour algorithm. It assumes only
// two groups and returns 0 if p belongs to class 0, else
// 1 (belongs to class 1).
int rNN(Point arr[], int n, float r, Point p)
{
    // frequency of group 0
    int freq1 = 0;
    // frequency of group 1
    int freq2 = 0;
 
    // Check if the distance is less than r
    for (int i = 0; i < n; i++)
    {
 
        if ((sqrt((arr[i].x - p.x) * (arr[i].x - p.x) +
        (arr[i].y - p.y) * (arr[i].y - p.y))) <= r)
        {
            if (arr[i].val == 0)
                freq1++;
            else if (arr[i].val == 1)
                freq2++;
        }
    }
    return (freq1 > freq2 ? 0 : 1);
}
 
// Driver code
int main()
{
    // Number of data points
    int n = 10;
    Point arr[n];
 
    arr[0].x = 1.5;
    arr[0].y = 4;
    arr[0].val = 0;
 
    arr[1].x = 1.8;
    arr[1].y = 3.8;
    arr[1].val = 0;
 
    arr[2].x = 1.65;
    arr[2].y = 5;
    arr[2].val = 0;
 
    arr[3].x = 2.5;
    arr[3].y = 3.8;
    arr[3].val = 0;
 
    arr[4].x = 3.8;
    arr[4].y = 3.8;
    arr[4].val = 0;
 
    arr[5].x = 5.5;
    arr[5].y = 3.5;
    arr[5].val = 1;
 
    arr[6].x = 5.6;
    arr[6].y = 4.5;
    arr[6].val = 1;
 
    arr[7].x = 6;
    arr[7].y = 5.4;
    arr[7].val = 1;
 
    arr[8].x = 6.2;
    arr[8].y = 4.8;
    arr[8].val = 1;
 
    arr[9].x = 6.4;
    arr[9].y = 4.4;
    arr[9].val = 1;
 
    // Query point
    Point p;
    p.x = 4.5;
    p.y = 4;
 
    // Parameter to decide the class of the query point
    float r = 2.2;
    printf("The value classified to query point"
        " is: %d.\n", rNN(arr, n, r, p));
    return 0;
}


Python3




# Python3 program to implement the
# r nearest neighbours algorithm.
import math
 
def rNN(points, p, r = 2.2):
        '''
        This function classifies the point p using
        r k nearest neighbour algorithm. It assumes only 
        two groups and returns 0 if p belongs to class 0, else
        1 (belongs to class 1).
 
        Parameters -
                points : Dictionary of training points having two
                         keys - 0 and 1. Each class have a list of
                         training data points belonging to them
 
                p : A tuple, test data point of form (x, y)
                k : radius of the r nearest neighbors
        '''
 
        freq1 = 0
        freq2 = 0
        for group in points:
                for feature in points[group]:
                        if math.sqrt((feature[0]-p[0])**2 +
                                     (feature[1]-p[1])**2) <= r:
                                if group == 0:
                                        freq1 += 1                     
                                elif group == 1:
                                        freq2 += 1
                         
        return 0 if freq1>freq2 else 1
 
# Driver function
def main():
 
        # Dictionary of training points having two keys - 0 and 1
        # key 0 have points belong to class 0
        # key 1 have points belong to class 1
 
        points = {0:[(1.5, 4), (1.8, 3.8), (1.65, 5), (2.5, 3.8), (3.8, 3.8)],
                  1:[(5.5, 3.5), (5.6, 4.5), (6, 5.4), (6.2, 4.8), (6.4, 4.4)]}
 
        # query point p(x, y)
        p = (4.5, 4)
 
        # Parameter to decide the class of the query point
        r = 2.2
 
        print("The value classified to query point is: {}".format(
                rNN(points, p, r)))
 
if __name__ == '__main__':
        main()


Output:

The value classified to query point is: 1.

Other techniques like kd-tree and locality-sensitive hashing can be used to reduce the time complexity of finding the neighbors. 

Applications: This algorithm can be used to identify outliers. If a pattern does not have any similarity with the patterns within the radius chosen, it can be identified as an outlier.

Shaida Kate Naidoo
am passionate about learning the latest technologies available to developers in either a Front End or Back End capacity. I enjoy creating applications that are well designed and responsive, in addition to being user friendly. I thrive in fast paced environments. With a diverse educational and work experience background, I excel at collaborating with teams both local and international. A versatile developer with interests in Software Development and Software Engineering. I consider myself to be adaptable and a self motivated learner. I am interested in new programming technologies, and continuous self improvement.
RELATED ARTICLES

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Most Popular

Recent Comments