Facebook Instagram Twitter Vimeo Youtube
Sign in
  • Home
  • About
  • Team
  • Buy now!
Sign in
Welcome!Log into your account
Forgot your password?
Privacy Policy
Password recovery
Recover your password
Search
Logo
Sign in
Welcome! Log into your account
Forgot your password? Get help
Privacy Policy
Password recovery
Recover your password
A password will be e-mailed to you.
Friday, August 29, 2025
Sign in / Join
  • Contact Us
  • Our Team
Facebook
Instagram
Twitter
Vimeo
Youtube
Logo
  • Home
  • News
    • News

      North Korean Hacking Tools Leak Online, Including Advanced Linux Rootkit by Paige Henley

      28 August 2025
      News

      iiNet Cyberattack Exposes Data of 280,000 Customers by Husain Parvez

      28 August 2025
      News

      ScreenConnect Super Admins Hit by Credential Harvesting Campaign by Husain Parvez

      28 August 2025
      News

      AT&T Reaches $177 Million Settlement After Major 2024 Data Breaches by Paige Henley

      28 August 2025
      News

      US Authorities Dismantle Rapper Bot, One of the Largest DDoS-for-Hire Networks by Husain Parvez

      28 August 2025
  • Data Modelling & AI
    • AllBig dataBusiness AnalyticsData ScienceData Structure & AlgorithmDatabasesVector DatabaseDeep LearningEthical HackingGenerative AIMachine Learning
      Big data

      Stop Your AI Assistant from Writing Outdated Code with Milvus SDK Code Helper

      26 August 2025
      Big data

      A Practical Guide for Choosing the Right Vector Database for Your AI Applications

      26 August 2025
      Big data

      Why I’m Against Claude Code’s Grep-Only Retrieval? It Just Burns Too Many Tokens

      26 August 2025
      Big data

      Hands-On with VDBBench: Benchmarking Vector Databases for POCs That Match Production

      16 August 2025
    • Big data
    • Business Analytics
    • Databases
    • Data Structure & Algorithm
    • Data Science
    • Deep Learning
    • Ethical Hacking
    • Generative AI
    • Machine Learning
    • Security & Testing
  • Mobile
    • AllAndroidIOS
      Android

      Gboard settings menu is getting some much-needed spring cleaning

      29 August 2025
      Android

      Samsung Galaxy S24, Fold 6, and Flip 6 pick up second One UI 8 beta

      29 August 2025
      Android

      The world’s other best foldable is now available in new major regions

      29 August 2025
      Android

      My phone became my memory system (and it actually works)

      29 August 2025
    • Android
    • IOS
  • Languages
    • AllAjaxAngularDynamic ProgrammingGolangJavaJavascriptPhpPythonReactVue
      Languages

      Working with Titles and Heading – Python docx Module

      25 June 2025
      Languages

      Creating a Receipt Calculator using Python

      25 June 2025
      Languages

      One Liner for Python if-elif-else Statements

      25 June 2025
      Languages

      Add Years to datetime Object in Python

      25 June 2025
    • Java
    • Python
  • Guest Blogs
  • Discussion
  • Our Team
HomeData Modelling & AIBig dataConsistent Hashing Explained
Big dataGuest Blogs

Consistent Hashing Explained

Algomaster
By Algomaster
19 July 2025
0
0
Share
Facebook
Twitter
Pinterest
WhatsApp

    Consistent Hashing Explained

    Why traditional hashing doesn’t work in distributed systems

    Ashish Pratap Singh's avatar

    Ashish Pratap Singh
    Feb 18, 2025

    In a distributed system where nodes (servers) are frequently added or removed, efficiently routing requests becomes challenging.

    A common approach is to use hash the request and assign it to a server using Hash(key) mod N, where N is the number of servers.

    However, this method is highly dependent on the number of servers, and any change in N can lead to significant rehashing, causing a major redistribution of keys (requests).

    Consistent hashing addresses this issue by ensuring that only a small subset of keys need to be reassigned when nodes are added or removed.

    Popularized by Amazon’s Dynamo paper, it has now become a fundamental technique in distributed databases like DynamoDB, Cassandra and ScyllaDB.

    In this article, we’ll explore what consistent hashing is, why it’s needed, how it works, and how to implement it in code.


    If you’re finding this newsletter valuable and want to deepen your learning, consider becoming a paid subscriber.

    As a paid subscriber, you’ll receive an exclusive deep-dive article every week, access to a structured System Design Resource (100+ topics and interview questions), and other premium perks.

    Unlock Full Access


    1. The Problem with Traditional Hashing

    Imagine you’re building a high-traffic web application that serves millions of users daily. To handle the load efficiently, you distribute incoming requests across multiple backend servers using a hash-based load balancer.

    Your system consists of 5 backend servers (S0, S1, S2, S3, S4), and requests are assigned using a hash function that maps each user’s IP address to a specific server.

    The process works like this:

    1. The load balancer takes a user’s IP address (or session ID).

    2. A hash function maps the IP to one of the backend servers by taking the sum of bytes in the IP address and computing mod 5 (since we have 5 servers).

    3. The request is routed to the assigned server, ensuring that the same user is always directed to the same server for session consistency.

    Example:

    Everything Works Fine… Until You Scale

    This approach works as long as the number of servers remains constant. But what happens when you add or remove a server?

    Scenario 1: Adding a New Server (S5)

    As traffic increases, you decide to scale up by adding a new backend server (S5). Now, the hash function must be modified to use mod 6 instead of mod 5 since we have 6 servers now.

    This seemingly simple change completely disrupts the existing mapping, causing most users to be reassigned to different servers.

    This results into massive rehashing, leading to high overhead, and potential downtime.

    Scenario 2: Removing a Server (S4)

    Now, let’s say one of the servers (S4) fails or is removed. The number of servers drops to 4, forcing the hash function to switch from mod 5 to mod 4.

    Even though only one server was removed, most users are reassigned to different servers. This can cause:

    • Session Loss: Active users may be logged out or disconnected.

    • Cache invalidation: Cached data becomes irrelevant, increasing database load.

    • Severe performance degradation: The system may struggle to run efficiently.

    The Solution: Consistent Hashing

    Consistent hashing offers a more scalable and efficient solution by ensuring that only a small fraction of users are reassigned when scaling up or down.

    It performs really well when operated in dynamic environments, where the system scales up and down frequently.

    Share


    2. How Consistent Hashing Works

    Consistent hashing is a distributed hashing technique used to efficiently distribute data across multiple nodes (servers, caches, etc.).

    It uses a circular hash space (hash ring) with a large and constant hash space.

    Both nodes (servers, caches, or databases) and keys (data items) are mapped to positions on this hash ring using a hash function.

    Unlike modulo-based hashing, where changes in the number of nodes cause large-scale remapping, consistent hashing ensures that only a small fraction of keys are reassigned when a node is added or removed, making it highly scalable and efficient.

    In consistent hashing, when the number of nodes changes, only k/n keys need to be reassigned, where k is the total number of keys and n is the total number of nodes.

    2.1 Constructing the Hash Ring

    Instead of distributing keys based on Hash(key) mod N, consistent hashing places both servers and keys on a circular hash ring.

    Defining the Hash Space

    • We use a large, fixed hash space ranging from 0 to 2^32 - 1 (assuming a 32-bit hash function).

    • This creates a circular structure, where values wrap around after reaching the maximum limit.

    Placing Servers on the Ring

    • Each server (node) is assigned a position on the hash ring by computing Hash(server_id).

    • Using the above example with 5 servers (S0, S1, S2, S3, S4), the hash function distributes them at different positions around the ring.

    Mapping Keys to Servers

    • When a key is added, its position is determined by computing Hash(key).

      • Example: a user’s request is assigned a position on the ring based on the hash of their IP address: Hash(IP Address)

    • We then move clockwise around the ring until we find the next available server.

    • The key (or request) is assigned to this server for storage or retrieval.

    Note: In case a key’s hash falls directly on a node’s position, it belongs to that node.

    2.2 Adding a New Server

    Suppose we add a new server (S5) to the system.

    • The position of S5 falls between S1 and S2 in the hash ring.

    • S5 takes over all keys (requests) that fall between S1 and S5, which were previously handled by S2.

      • Example: User D’s requests which were originally assigned to S2, will now be redirected to S5.

    This demonstrates how consistent hashing efficiently redistributes keys with minimal disruption, ensuring that only a small subset of keys are reassigned when new servers are added.

    2.3 Removing a Node

    When a server, such as S4, fails or is removed from the system:

    • All keys previously assigned to S4 are reassigned to the next available server in the ring (S3).

    • Only the keys (requests) that were mapped to S4 need to move, while all other keys remain unaffected.

    This results in minimal data movement, unlike traditional hashing where removing a node would require reassigning most keys.


    3. Virtual Nodes

    In basic consistent hashing, each server is assigned a single position on the hash ring. However, this can lead to uneven data distribution, especially when:

    • The number of servers is small.

    • Some servers accidentally get clustered together, creating hot spots.

    • A server is removed, causing a sudden load shift to its immediate neighbor.

    Virtual nodes (VNodes) are a technique used in consistent hashing to improve load balancing and fault tolerance by distributing data more evenly across servers.

    How Virtual Nodes Work

    Instead of assigning one position per server, each physical server is assigned multiple positions (virtual nodes) on the hash ring.

    • Each server is hashed multiple times to different locations on the ring.

    • When a request (key) is hashed, it is assigned to the next virtual node in a clockwise direction.

    • The request is then routed to the actual server associated with the virtual node.

    Example

    Assume we have three physical servers (S1, S2, S3). Without virtual nodes, their positions might be:

    S1 → Position 10
    S2 → Position 50
    S3 → Position 90

    If S1 fails, all its keys must be reassigned to S2, which can create an overload.

    With virtual nodes, each server is hashed multiple times:

    S1-1 → Position 10
    S1-2 → Position 70
    S1-3 → Position 120
    S2-1 → Position 50
    S2-2 → Position 80
    S2-3 → Position 160
    S3-1 → Position 30
    S3-2 → Position 90
    S3-3 → Position 140

    Now, instead of just one point, S1 is represented at multiple positions, making the distribution more even.

    If S1 fails, its keys are more evenly redistributed among S2 and S3, rather than all going to S2.


    4. Code Implementation (Python)

    Code Links: Python, Java

    Explanation:

    1. Key Components

      • Hash Ring (self.ring): Stores hash values → server mappings. Uses virtual nodes (replicas) for better load balancing.

      • Sorted Keys (self.sorted_keys): Maintains a sorted list of hash values for efficient lookups.

      • Server Set (self.servers): Tracks active physical servers.

    2. Initialization (__init__)

      • Calls add_server() for each server, hashing it multiple times (based on num_replicas) to ensure even distribution.

    3. Hashing Function (_hash)

      • Uses MD5 hashing to convert strings into large integers for consistent placement on the hash ring.

    4. Adding a Server (add_server)

      • Generates multiple hash values for each server (server-0, server-1, etc.).

      • Stores these in self.ring and maintains a sorted order in self.sorted_keys for fast lookup.

    5. Removing a Server (remove_server)

      • Deletes the server’s hash values and its virtual nodes from self.ring and self.sorted_keys.

    6. Getting a Server (get_server)

      • Hashes the input key and finds the closest clockwise server using bisect.bisect(). Wraps around to the first node if necessary.


    Thank you for reading!

    If you found it valuable, hit a like ❤️ and consider subscribing for more such content every week.

    If you have any questions or suggestions, leave a comment.

    This post is public so feel free to share it.

    Share


    P.S. If you’re enjoying this newsletter and want to get even more value, consider becoming a paid subscriber.

    As a paid subscriber, you’ll receive an exclusive deep dive every Thursday, access to a structured system design resource, and other premium perks.

    Unlock Full Access

    There are group discounts, gift options, and referral bonuses available.


    Checkout my Youtube channel for more in-depth content.

    Follow me on LinkedIn and X to stay updated.

    Checkout my GitHub repositories for free interview preparation resources.

    I hope you have a lovely day!

    See you soon,

    Ashish

    Share
    Facebook
    Twitter
    Pinterest
    WhatsApp
      Previous article
      Load Balancer vs Reverse Proxy vs API Gateway
      Next article
      This smartwatch flies under the radar but delivers big and now costs just $95
      Algomaster
      Algomasterhttps://blog.algomaster.io
      RELATED ARTICLES
      Guest Blogs

      Interview With Chip Witt – Principal Security Evangelist at Radware by Shauli Zacks

      28 August 2025
      Guest Blogs

      ChatGPT Leaks: We Analyzed 1,000 Public AI Conversations—Here’s What We Found by Shipra Sanganeria

      28 August 2025
      Guest Blogs

      Interview With Itai Goldman – Co-Founder and CTO at Miggo Security by Shauli Zacks

      28 August 2025

      LEAVE A REPLY Cancel reply

      Log in to leave a comment

      Most Popular

      Gboard settings menu is getting some much-needed spring cleaning

      29 August 2025

      Samsung Galaxy S24, Fold 6, and Flip 6 pick up second One UI 8 beta

      29 August 2025

      The world’s other best foldable is now available in new major regions

      29 August 2025

      My phone became my memory system (and it actually works)

      29 August 2025
      Load more
      Algomaster
      Algomaster
      202 POSTS0 COMMENTS
      https://blog.algomaster.io
      Calisto Chipfumbu
      Calisto Chipfumbu
      6621 POSTS0 COMMENTS
      http://cchipfumbu@gmail.com
      Dominic
      Dominic
      32247 POSTS0 COMMENTS
      http://wardslaus.com
      Milvus
      Milvus
      80 POSTS0 COMMENTS
      https://milvus.io/
      Nango Kala
      Nango Kala
      6615 POSTS0 COMMENTS
      neverop
      neverop
      0 POSTS0 COMMENTS
      https://geeksforgeeks.org
      Nicole Veronica
      Nicole Veronica
      11788 POSTS0 COMMENTS
      Nokonwaba Nkukhwana
      Nokonwaba Nkukhwana
      11835 POSTS0 COMMENTS
      Safety Detectives
      Safety Detectives
      2588 POSTS0 COMMENTS
      https://www.safetydetectives.com/
      Shaida Kate Naidoo
      Shaida Kate Naidoo
      6731 POSTS0 COMMENTS
      Ted Musemwa
      Ted Musemwa
      7011 POSTS0 COMMENTS
      Thapelo Manthata
      Thapelo Manthata
      6686 POSTS0 COMMENTS
      Umr Jansen
      Umr Jansen
      6699 POSTS0 COMMENTS

      EDITOR PICKS

      Gboard settings menu is getting some much-needed spring cleaning

      29 August 2025

      Samsung Galaxy S24, Fold 6, and Flip 6 pick up second One UI 8 beta

      29 August 2025

      The world’s other best foldable is now available in new major regions

      29 August 2025

      POPULAR POSTS

      Gboard settings menu is getting some much-needed spring cleaning

      29 August 2025

      Samsung Galaxy S24, Fold 6, and Flip 6 pick up second One UI 8 beta

      29 August 2025

      The world’s other best foldable is now available in new major regions

      29 August 2025

      POPULAR CATEGORY

      • Languages45985
      • Data Modelling & AI17565
      • Java15156
      • Android13932
      • Mobile12983
      • Javascript12713
      • Guest Blogs12665
      • Data Structure & Algorithm10077
      Logo

      ABOUT US

      We provide you with the latest breaking news and videos straight from the technology industry.

      Contact us: hello@geeksforgeeks.org

      FOLLOW US

      Blogger
      Facebook
      Flickr
      Instagram
      VKontakte

      © NeverOpen 2022

      • Home
      • News
      • Data Modelling & AI
      • Mobile
      • Languages
      • Guest Blogs
      • Discussion
      • Our Team