How to Calculate Cosine Similarity in Python?

28 July 2024


In this article, we calculate the cosine similarity between two non-zero vectors. Here, a vector is a one-dimensional NumPy array. Cosine similarity measures how similar two vectors are by the cosine of the angle between them, and it is often used to measure document similarity in text analysis. We use the formula below to compute the cosine similarity.

Similarity = (A.B) / (||A||.||B||)

where A and B are vectors:

A.B is the dot product of A and B: it is computed as the sum of the element-wise products of A and B.
||A|| is the L2 norm of A: it is computed as the square root of the sum of the squares of the elements of A.
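To make the two definitions above concrete, here is a minimal sketch that computes the formula with plain Python, without NumPy. The function name `cosine_similarity` and the small test vectors are illustrative assumptions, not part of the original examples; the sketch assumes both inputs are non-zero and have the same length.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero sequences of numbers.

    Hypothetical helper for illustration only.
    """
    # Dot product: sum of element-wise products
    dot = sum(x * y for x, y in zip(a, b))
    # L2 norms: square root of the sum of squares
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

A = [1, 2, 2]   # ||A|| = sqrt(1 + 4 + 4) = 3
B = [2, 4, 4]   # ||B|| = sqrt(4 + 16 + 16) = 6, same direction as A
C = [2, -1, 0]  # A.C = 2 - 2 + 0 = 0, so C is orthogonal to A

print(cosine_similarity(A, B))  # 18 / (3 * 6) = 1.0
print(cosine_similarity(A, C))  # 0.0
```

Vectors pointing in the same direction give a similarity of 1, and orthogonal vectors give 0, regardless of their lengths.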

Example 1:

In the example below, we compute the cosine similarity between two vectors (1-D NumPy arrays). Python lists can also be used to define the vectors.

Python

# import required libraries
import numpy as np
from numpy.linalg import norm
 
# define two vectors as 1-D arrays
A = np.array([2,1,2,3,2,9])
B = np.array([3,4,2,4,5,5])
 
print("A:", A)
print("B:", B)
 
# compute cosine similarity
cosine = np.dot(A,B)/(norm(A)*norm(B))
print("Cosine Similarity:", cosine)

Output:

Example 2:

In the example below, we compute the cosine similarity between a batch of three vectors (a 2-D NumPy array) and a single vector (a 1-D NumPy array).

Python

# import required libraries
import numpy as np
from numpy.linalg import norm
 
# define a batch of three vectors (2-D array) and a single vector (1-D array)
A = np.array([[2,1,2],[3,2,9], [-1,2,-3]])
B = np.array([3,4,2])
print("A:\n", A)
print("B:\n", B)
 
# compute cosine similarity
cosine = np.dot(A,B)/(norm(A, axis=1)*norm(B))
print("Cosine Similarity:\n", cosine)

Output:

Notice that A holds three vectors (one per row) and B is a single vector, so the cosine similarity array has three elements. The first element is the cosine similarity between the first row of A and B. The second element is the cosine similarity between the second row of A and B, and the third element is the cosine similarity between the third row of A and B.
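As a quick sanity check of the row-by-row reading above, the following sketch recomputes the result of Example 2 one row at a time and compares it with the batched expression. The variable names `batch` and `per_row` are introduced here only for illustration.

```python
import numpy as np
from numpy.linalg import norm

A = np.array([[2, 1, 2], [3, 2, 9], [-1, 2, -3]])
B = np.array([3, 4, 2])

# Batched form from Example 2: one norm per row of A (axis=1)
batch = np.dot(A, B) / (norm(A, axis=1) * norm(B))

# The same computation, one row of A at a time
per_row = np.array([np.dot(row, B) / (norm(row) * norm(B)) for row in A])

print(batch.shape)                  # (3,)
print(np.allclose(batch, per_row))  # True
```

The `axis=1` argument is what makes `norm` return one value per row instead of a single norm for the whole 2-D array.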

Example 3:

In the example below, we compute the cosine similarity between two 2-D arrays, each holding three vectors (one per row). Here, the dot product of each pair of rows is computed as the sum of their element-wise product.

Python

# import required libraries
import numpy as np
from numpy.linalg import norm
 
# define two arrays
A = np.array([[1,2,2],
               [3,2,2],
               [-2,1,-3]])
B = np.array([[4,2,4],
               [2,-2,5],
               [3,4,-4]])
 
print("A:\n", A)
print("B:\n", B)
 
# compute cosine similarity
cosine = np.sum(A*B, axis=1)/(norm(A, axis=1)*norm(B, axis=1))
 
print("Cosine Similarity:\n", cosine)

Output:

The first element of the cosine similarity array is the similarity between the first rows of A and B. Likewise, the second element is the similarity between the second rows of A and B, and the third element is the similarity between the third rows.
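The row-wise computation can be wrapped in a small reusable function. The sketch below is one possible way to do this, not a library API: the name `rowwise_cosine` is hypothetical, and the explicit error for zero vectors is an added assumption that reflects the article's restriction to non-zero vectors (dividing by a zero norm would otherwise produce `nan`).

```python
import numpy as np
from numpy.linalg import norm

def rowwise_cosine(A, B):
    """Cosine similarity between matching rows of two 2-D arrays.

    Hypothetical helper for illustration only.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    if A.ndim != 2 or A.shape != B.shape:
        raise ValueError("A and B must be 2-D arrays of the same shape")
    # Product of the row norms; zero means at least one zero vector
    denom = norm(A, axis=1) * norm(B, axis=1)
    if np.any(denom == 0):
        raise ValueError("cosine similarity is undefined for zero vectors")
    # Row-wise dot product as the sum of the element-wise product
    return np.sum(A * B, axis=1) / denom

A = [[1, 2, 2], [3, 2, 2], [-2, 1, -3]]
B = [[4, 2, 4], [2, -2, 5], [3, 4, -4]]
result = rowwise_cosine(A, B)
print("Cosine Similarity:\n", result)
```

For the first pair of rows, the dot product is 4 + 4 + 8 = 16 and the norms are 3 and 6, so the first element is 16/18, roughly 0.889.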
