
Introduction to Manifold Learning

This article was published as a part of the Data Science Blogathon.

Introduction

This article introduces the concept of Manifold Learning. It also discusses one of its techniques, LLE (Locally Linear Embedding), and its implementation in scikit-learn.

The curse of dimensionality

Many machine learning datasets involve thousands, and sometimes millions, of features, which can make training very slow. In addition, high-dimensional space is so vast that high-dimensional datasets are typically very sparse: most training instances are likely to be far away from each other. This increases the risk of overfitting, since predictions have to be based on much larger extrapolations than in low-dimensional data. This is called the curse of dimensionality.
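To make this concrete, here is a small illustrative sketch: it samples random points in the unit hypercube and shows that as the dimension grows, pairwise distances concentrate, i.e. their spread shrinks relative to their mean, so every point ends up roughly equally far from every other point.

import numpy as np
from scipy.spatial.distance import pdist

# Sample 500 random points in the unit hypercube for several dimensions
# and look at the spread of pairwise distances relative to their mean.
rng = np.random.default_rng(0)
for d in (2, 10, 100, 1000):
    X = rng.random((500, d))
    dists = pdist(X)  # all pairwise Euclidean distances
    print("d=%4d  mean=%.2f  std/mean=%.3f"
          % (d, dists.mean(), dists.std() / dists.mean()))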

There are two main approaches for dimensionality reduction: Projection and Manifold Learning. Here, we will focus on the latter.

 

What is Manifold Learning?

What is a manifold?

Loosely speaking, a two-dimensional manifold is any 2-D shape that can be bent and twisted to fit in a higher-dimensional space.

What is the Manifold Hypothesis?

“The Manifold Hypothesis states that real-world high-dimensional data lie on low-dimensional manifolds embedded within the high-dimensional space.”

In simpler terms, it means that high-dimensional data usually lies close to a much lower-dimensional manifold. The process of modeling the manifold on which the training instances lie is called Manifold Learning.

[Figure: An illustration of manifold learning]

Ref: Figure 2, Gu, Rui-jun, and Wenbo Xu. “An Improved Manifold Learning Algorithm for Data Visualization.” 2006 International Conference on Machine Learning and Cybernetics (2006): 1170-1173.
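For intuition, consider the Swiss roll: a dataset that lives in 3-D but is intrinsically two-dimensional, since every point is fully determined by two coordinates. A minimal sketch, mirroring the construction used by scikit-learn's make_swiss_roll:

import numpy as np

# Each 3-D point on the Swiss roll is determined by just two intrinsic
# coordinates (t, height), so the data lies on a 2-D manifold in 3-D space.
rng = np.random.default_rng(0)
t = 1.5 * np.pi * (1 + 2 * rng.random(1000))
height = 21 * rng.random(1000)
X = np.column_stack([t * np.cos(t), height, t * np.sin(t)])
print(X.shape)  # (1000, 3): 3-D ambient space, 2-D intrinsic manifold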

 

Locally Linear Embedding (LLE)

Locally Linear Embedding (LLE) is a Manifold Learning technique used for non-linear dimensionality reduction. It is an unsupervised learning algorithm that produces low-dimensional embeddings of high-dimensional inputs, preserving the relationship between each training instance and its nearest neighbors.

How does LLE work?

For each training instance x(i), the algorithm first finds its k nearest neighbors and then tries to express x(i) as a linear function of them. In general, if there are m training instances in total, then it tries to find the set of weights w which minimizes the squared distance between x(i) and its linear representation.

So, the cost function is given by

E(W) = \sum_{i=1}^{m} \left\| x^{(i)} - \sum_{j=1}^{m} w_{i,j}\, x^{(j)} \right\|^2

where w_{i,j} = 0 if j is not among the k closest neighbors of i.

Also, the weights for each training instance x(i) are normalized so that they sum to one:

\sum_{j=1}^{m} w_{i,j} = 1, \qquad i = 1, 2, \ldots, m
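This constrained least-squares problem has a closed-form solution based on the local Gram matrix of each instance's neighbors. Below is a minimal NumPy sketch for a single instance (the function name lle_weights and the regularization constant are illustrative choices; scikit-learn performs an equivalent regularized solve internally):

import numpy as np

def lle_weights(x, neighbors, reg=1e-3):
    # neighbors: (k, D) array of the k nearest neighbors of x
    Z = neighbors - x                     # shift x to the origin
    C = Z @ Z.T                           # local (k, k) Gram matrix
    C += reg * np.trace(C) * np.eye(len(neighbors))  # regularize for stability
    w = np.linalg.solve(C, np.ones(len(neighbors)))
    return w / w.sum()                    # enforce the sum-to-one constraint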

Finally, each high-dimensional training instance x(i) is mapped to a low-dimensional (say, d dimensions) vector y(i) while preserving the neighborhood relationships. This is done by choosing d-dimensional coordinates which minimize the cost function,

\Phi(Y) = \sum_{i=1}^{m} \left\| y^{(i)} - \sum_{j=1}^{m} w_{i,j}\, y^{(j)} \right\|^2

Here the weights w_{i,j} are kept fixed while we solve for the optimal coordinates y(i). With the weights fixed, minimizing this cost reduces to an eigenvalue problem whose bottom eigenvectors give the embedding coordinates.
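As a rough illustration of that second step (a dense sketch under simplified assumptions; scikit-learn uses sparse eigensolvers), the embedding is read off from the bottom eigenvectors of M = (I - W)^T (I - W), discarding the constant eigenvector that corresponds to eigenvalue zero:

import numpy as np

def lle_embed(W, d):
    # W: (m, m) matrix of reconstruction weights, rows summing to one
    m = W.shape[0]
    I = np.eye(m)
    M = (I - W).T @ (I - W)
    # eigh returns eigenvalues in ascending order; skip the first
    # (constant) eigenvector and keep the next d as the coordinates y.
    eigvals, eigvecs = np.linalg.eigh(M)
    return eigvecs[:, 1:d + 1]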

 

Implementation using scikit-learn

The following code implements LLE using scikit-learn's LocallyLinearEmbedding class. The import and the lle instance (with n_neighbors=10 and n_components=2, chosen here for illustration) are included so that the snippet runs on its own:

from sklearn.manifold import LocallyLinearEmbedding

# Reduce X (an (n_samples, n_features) array) to 2 dimensions
lle = LocallyLinearEmbedding(n_neighbors=10, n_components=2)
X_transformed = lle.fit_transform(X)

Now, we will compare LLE with PCA and t-SNE on the Swiss roll dataset.

#Importing the required libraries
from collections import OrderedDict
from functools import partial
from time import time
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.ticker import NullFormatter
from sklearn import manifold, datasets
from sklearn.decomposition import PCA
#This import is needed to silence pyflakes
Axes3D
#Then we load the swiss roll dataset
n_points = 1000
X, color = datasets.make_swiss_roll(n_points, random_state=0)
n_neighbors = 10
n_components = 2
# Creating the plot
fig = plt.figure(figsize=(15, 8))
fig.suptitle("Manifold Learning with %i points, %i neighbors"
             % (n_points, n_neighbors), fontsize=14)
# Adding 3d scatter plot
ax = fig.add_subplot(231, projection='3d')
ax.scatter(X[:, 0], X[:, 1], X[:, 2], c=color, cmap=plt.cm.Spectral)
ax.view_init(4, -72)
# Making a dictionary 'methods' containing LLE, t-SNE and PCA
LLE = partial(manifold.LocallyLinearEmbedding,
              n_neighbors=n_neighbors, n_components=n_components,
              eigen_solver='auto')
methods = OrderedDict()
methods['LLE'] = LLE(method='standard')
methods['t-SNE'] = manifold.TSNE(n_components=n_components, init='pca',
                                 random_state=0)
methods['PCA'] = PCA(n_components=n_components)
# Plotting the results
for i, (label, method) in enumerate(methods.items()):
    t0 = time()
    Y = method.fit_transform(X)
    t1 = time()
    print("%s: %.2g sec" % (label, t1 - t0))
    # Place the 2-D embeddings at positions 2, 3 and 5 of the 2x3 grid
    ax = fig.add_subplot(2, 3, 2 + i + (i > 1))
    ax.scatter(Y[:, 0], Y[:, 1], c=color, cmap=plt.cm.Spectral)
    ax.set_title("%s (%.2g sec)" % (label, t1 - t0))
    ax.xaxis.set_major_formatter(NullFormatter())
    ax.yaxis.set_major_formatter(NullFormatter())
    ax.axis('tight')
plt.show()

Output: We get the following plot:

[Figure: The 3-D Swiss roll and its 2-D embeddings produced by LLE, t-SNE, and PCA]
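The same LocallyLinearEmbedding class also supports other locally linear variants through its method parameter, which can be swapped into the comparison above. A brief sketch (an illustrative extension, not part of the comparison above):

from sklearn.manifold import LocallyLinearEmbedding

# 'modified' (MLLE) uses multiple weight vectors per point; 'hessian'
# (HLLE) requires n_neighbors > n_components * (n_components + 3) / 2.
mlle = LocallyLinearEmbedding(n_neighbors=10, n_components=2, method='modified')
hlle = LocallyLinearEmbedding(n_neighbors=10, n_components=2, method='hessian')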

End Notes

Manifold learning relies on the assumption that our dataset, or the task at hand, becomes much simpler when expressed in lower dimensions. But this may not always be true. Dimensionality reduction usually speeds up training, but whether it leads to a better solution depends on the dataset.

Conclusion

In this article, we introduced the concept of Manifold Learning and discussed one of its techniques, LLE. We then walked through its Python implementation and compared LLE with t-SNE and PCA on the Swiss roll dataset.

References:

  1. Roweis, S. T., & Saul, L. K. (2000). Nonlinear dimensionality reduction by locally linear embedding. Science (New York, N.Y.), 290(5500), 2323–2326.
  2. Manifold Hypothesis Definition | Deep AI
  3. https://scikit-learn.org/stable/modules/generated/sklearn.manifold.LocallyLinearEmbedding.html#sklearn.manifold.LocallyLinearEmbedding
  4. https://scikit-learn.org/stable/auto_examples/manifold/plot_compare_methods.html#sphx-glr-auto-examples-manifold-plot-compare-methods-py

Hope you enjoyed reading this article! 🙂

About the author

I am Isha Sharma, currently pursuing B.Tech from the Indian Institute of Technology Kharagpur. I am passionate about data science and deep learning.

The media shown in this article are not owned by Analytics Vidhya and are used at the Author's discretion.

Isha Sharma

22 Sep 2022
