
Designing a Distributed Rate Limiter

By Algomaster
28 June 2025
    Ashish Pratap Singh
    Jun 15, 2025

    A rate limiter is a mechanism used to control the number of requests or operations a user, client, or system can perform within a specific time window.

    Its primary purpose is to ensure fair usage of resources, prevent abuse, and protect backend systems from being overwhelmed by sudden spikes in traffic.

    Example: If a system allows a maximum of 100 requests per minute, any request beyond that limit within the same minute would either be throttled (delayed) or rejected outright, often with an HTTP 429 Too Many Requests response.

    In this article, we will dive into the system design of a distributed rate limiter and explore the five most commonly used rate limiting algorithms with examples, pros, and cons.


    1. Requirements

    Before diving into the architecture, let's outline the functional and non-functional requirements:

    1.1 Functional Requirements

    • Per-User Rate Limiting: Enforce a fixed number of requests per user or API key within a defined time window (e.g., 100 requests per minute). Excess requests should be rejected with an HTTP 429 Too Many Requests response.

    • Global Enforcement: Limits must be enforced consistently across all nodes in a distributed environment. Users shouldn’t bypass limits by switching servers.

    • Multi-Window Support: Apply limits across multiple time granularities simultaneously (e.g., per second, per minute, per hour) to prevent abuse over short and long bursts.
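
    As a rough illustration of multi-window support, the sketch below checks one request against several fixed windows at once. The `LIMITS` values, the `allow` function, and the key layout are assumptions made for this example, and the counters live in a plain in-process dictionary rather than a shared store.

```python
import time
from collections import defaultdict

# Hypothetical limits: (window length in seconds, max requests in that window).
LIMITS = [(1, 5), (60, 100), (3600, 1000)]

# counters[(client_id, window_seconds, window_index)] -> requests seen so far.
# Old window entries are never removed here; a real system would expire them.
counters = defaultdict(int)


def allow(client_id, now=None):
    """Return True only if the request fits within every configured window."""
    now = time.time() if now is None else now
    keys = [(client_id, w, int(now // w)) for w, _ in LIMITS]

    # Check all windows first so a rejected request consumes no quota.
    for key, (_, limit) in zip(keys, LIMITS):
        if counters[key] >= limit:
            return False

    for key in keys:
        counters[key] += 1
    return True
```

    With these example limits, a client that sends six requests within the same second is stopped by the per-second window even though the per-minute and per-hour budgets are far from exhausted.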

    1.2 Non-Functional Requirements

    To be usable at scale, our distributed rate limiter must meet several critical non-functional goals:

    • Scalability: The system should scale horizontally to handle massive request volumes and growing user counts.

    • Low Latency: Rate limit checks should be fast, ideally adding no more than a few milliseconds per request.

    • High Availability: The rate limiter should continue working even under heavy load or node failures. There should be no single point of failure.

    • Strong Consistency: All nodes should have a consistent view of each user’s request counts. This prevents a client from bypassing limits by routing requests through different servers.

    • High Throughput: The system should support a large number of operations per second and serve many concurrent clients without significant performance degradation.


    2. High-Level Architecture

    The rate limiter acts as a middleware layer between the client and the backend servers. Its job is to inspect incoming requests and enforce predefined usage limits (e.g., 100 requests per minute per user or IP).

    To apply these limits effectively, the rate limiter must track request counts for each client. These counts are often maintained across multiple time windows, such as per second, per minute, or per hour.

    Using a traditional relational database for this purpose is generally unsuitable due to:

    • High latency: Relational databases typically persist every write to disk and carry transactional overhead, which adds delay to each read/write.

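    • Concurrency bottlenecks: Handling thousands of concurrent updates (e.g., one per incoming request) can cause lock contention on hot rows, and read-then-write counter updates are prone to race conditions unless they are carefully serialized.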

    • Limited throughput: RDBMSs are not optimized for high-frequency, real-time counter updates.

    An in-memory data store like Redis is a far better fit for rate limiting use cases because it offers:

    • Sub-millisecond latency for both reads and writes

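    • Atomic operations like INCR, INCRBY, and EXPIRE, ensuring safe concurrent updates without race conditions (each command is atomic on its own; a sequence such as INCR followed by EXPIRE needs a transaction or Lua script to be atomic as a unit)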

    • TTL (Time-to-Live) support, allowing counters to reset automatically at the end of each time window (e.g., after 60 seconds for a per-minute limit)
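
    To make the INCR plus TTL idea concrete, here is a minimal fixed-window sketch. It assumes a `store` object exposing `incr(key)` and `expire(key, seconds)`, which is the shape of redis-py's `Redis.incr` and `Redis.expire`. The key format, the `allow_request` name, and the `FakeStore` stand-in are hypothetical and exist only so the example runs without a Redis server.

```python
import time


def allow_request(store, client_id, limit=100, window_seconds=60, now=None):
    """Fixed-window check using an atomic increment plus a TTL.

    Note: this increments first and compares afterwards, so rejected
    requests are still counted in the current window.
    """
    now = time.time() if now is None else now
    window_index = int(now // window_seconds)
    key = f"ratelimit:{client_id}:{window_index}"  # hypothetical key format

    count = store.incr(key)  # a single atomic command on the Redis side
    if count == 1:
        # First hit in this window: let the key clean itself up afterwards.
        # The key embeds the window index, so a missed EXPIRE only leaks a
        # stale key; it does not carry counts into the next window.
        store.expire(key, window_seconds)
    return count <= limit


class FakeStore:
    """In-memory stand-in so the sketch runs without a Redis server."""

    def __init__(self):
        self.values = {}
        self.ttls = {}

    def incr(self, key):
        self.values[key] = self.values.get(key, 0) + 1
        return self.values[key]

    def expire(self, key, seconds):
        self.ttls[key] = seconds
        return True
```

    Against a real server, `store` could be a redis-py client instead of `FakeStore`; the function body would not need to change under the assumptions above.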

    Request Lifecycle

    Here’s how the rate limiter fits into the flow of an incoming request:

    1. The client sends a request to an endpoint of the application.

    2. The rate limiter middleware performs several checks:

      • Identifies the client (via IP, token, or API key)

      • Looks up the current request count in Redis (or in-memory cache)

      • Applies any tier-specific rules (e.g., free vs premium users)

    3. If the count exceeds the allowed threshold, the request is rejected with HTTP 429 Too Many Requests.

    4. If the count is within the limit, the counter is incremented and the request proceeds to the backend service.

    5. Periodically, counters expire via TTL or are reset based on window granularity.
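
    The lifecycle above can be sketched as a small middleware function. This is only an illustration under stated assumptions: the `Request` type, the `TIER_LIMITS` values, and the `rate_limit_middleware` name are hypothetical, the counts are held in a plain dictionary, and the window reset from step 5 is left to the caller.

```python
from dataclasses import dataclass

# Hypothetical per-tier limits (requests per window); not from the article.
TIER_LIMITS = {"free": 100, "premium": 1000}


@dataclass
class Request:
    api_key: str
    tier: str = "free"


def rate_limit_middleware(request, counts, backend):
    """Return (status_code, body) for one request.

    `counts` maps client id -> requests seen in the current window.
    The read-then-write below is only safe within a single thread; a
    shared store would need an atomic increment instead.
    """
    client_id = request.api_key  # step 2: identify the client
    limit = TIER_LIMITS.get(request.tier, TIER_LIMITS["free"])  # tier rule
    current = counts.get(client_id, 0)  # look up the current count

    if current >= limit:
        # Step 3: over the threshold, reject without touching the backend.
        return 429, "Too Many Requests"

    # Step 4: within the limit, count the request and forward it.
    counts[client_id] = current + 1
    return 200, backend(request)
```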

    Many modern applications delegate rate limiting to edge components such as API gateways or reverse proxies, which can efficiently enforce limits before traffic reaches backend services. However, for this discussion, we will focus on designing a standalone rate limiter that is integrated into or called by application servers directly.


    3. Design Deep Dive

    3.1 Single-Node Rate Limiting

    For small-scale applications with low traffic and a single application server, rate limiting can be implemented entirely in-memory, without relying on external systems like Redis. This approach is lightweight, fast, and easy to set up.

    You maintain a simple hash map (dictionary) in the application process where:

    • Keys represent client identifiers (e.g., user ID, API key, or IP address)

    • Values represent request counts within the current time window

    For each incoming request:

    1. Check whether the user exists in the map

    2. If not, create a new entry with a count of 1

    3. If the user exists, increment their counter

    4. Compare the count against the defined rate limit

    5. If the count is within the limit, allow the request; otherwise, reject it

    You can also add a time-based mechanism (e.g., timestamps or TTL logic) to reset counters after each time window.
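
    A minimal sketch of this single-node approach, including a timestamp-based reset, might look as follows. The class name, the default limits, and the injectable `clock` parameter are assumptions for illustration, and the code is not thread-safe as written.

```python
import time


class InMemoryRateLimiter:
    """Single-process fixed-window limiter backed by a plain dict."""

    def __init__(self, limit=100, window_seconds=60, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the sketch can be tested without sleeping
        self.entries = {}   # client_id -> (window_start, count)

    def allow(self, client_id):
        now = self.clock()
        # Unknown clients start a new window with a count of zero.
        window_start, count = self.entries.get(client_id, (now, 0))

        # Start a fresh window once the previous one has elapsed.
        if now - window_start >= self.window_seconds:
            window_start, count = now, 0

        # Increment, store, then compare against the limit.
        count += 1
        self.entries[client_id] = (window_start, count)
        return count <= self.limit
```

    Here each client's window starts at their first request rather than on a wall-clock boundary, which is one of several reasonable choices for a simple counter.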

    Despite its simplicity, this approach comes with critical drawbacks that make it unsuitable for production environments at scale:

    1. Single Point of Failure (SPOF): If the server crashes, all in-memory counters are lost. After a restart, the system “forgets” users’ recent request history, potentially allowing them to exceed their limits until the counters rebuild.

    2. No Horizontal Scalability: The rate limiter lives on a single node, so it cannot scale out as traffic grows.

    3. Unbounded Memory Growth: Without proper eviction or TTL logic, memory usage can grow unbounded over time, especially if you’re tracking many users or long-duration windows.
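
    One simple way to bound memory in such a map is a periodic sweep that removes entries whose window has fully elapsed. The helper below is a hedged sketch: the `evict_expired` name is hypothetical, and it assumes each map value is a `(window_start, count)` tuple.

```python
def evict_expired(entries, now, window_seconds):
    """Drop clients whose window has fully elapsed; return how many were removed.

    `entries` is assumed to map client_id -> (window_start, count).
    """
    # Collect stale keys first so the dict is not mutated while iterating.
    stale = [cid for cid, (start, _) in entries.items()
             if now - start >= window_seconds]
    for cid in stale:
        del entries[cid]
    return len(stale)
```

    A sweep like this costs time proportional to the number of tracked clients, so it would typically run on a timer rather than on every request.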

    Now, let's explore two common strategies to implement rate limiting in a distributed environment.

    3.2 Distributed Rate Limiting
