
15 Data Structures that Power Distributed Databases

By Algomaster
15 June 2025
    Ashish Pratap Singh
    Mar 06, 2025

    Distributed databases are the backbone of modern large-scale applications, powering everything from real-time analytics to global e-commerce platforms.

    Behind the scenes, these systems rely on specialized data structures to enable fast lookups, efficient storage, and high-throughput operations, even when managing terabytes of data.

    In this article, we’ll explore 15 key data structures that power modern distributed databases.


    1. Hash Indexes

    A hash index is a data structure that efficiently maps keys to values using a hash function.

    The hash function converts a given key into an integer, which is used as an index in a hash table (buckets) to store and retrieve values.

    This indexing technique is optimized for fast lookups and insertions, making it ideal for operations like:

    • Inserting or finding a record with id = 123

    Hash indexes provide O(1) average-time complexity for insertions, deletions, and lookups. Because hashing does not preserve key order, they suit exact-match lookups rather than range queries.

    Hash indexes are commonly used in key-value stores (e.g., DynamoDB) and caching systems (e.g., Redis) where quick access to data is crucial.
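
    The mapping from key to bucket can be sketched in a few lines of Python. This is a minimal illustration, not how any particular database implements it: the class name HashIndex, the fixed bucket count, and the use of chaining to resolve collisions are all assumptions made for the example.

```python
class HashIndex:
    """Toy hash index: a fixed array of buckets, with chaining for collisions."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket_for(self, key):
        # The hash function turns the key into an integer; the modulo
        # maps that integer onto one of the bucket slots.
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket_for(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite the existing entry
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        for existing_key, value in self._bucket_for(key):
            if existing_key == key:
                return value
        return default

    def delete(self, key):
        bucket = self._bucket_for(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                del bucket[i]
                return True
        return False


index = HashIndex()
index.put(123, {"name": "example record"})  # insert a record with id = 123
record = index.get(123)                     # find it again by id
missing = index.get(999)                    # unknown id falls back to the default
```

    Lookups stay close to O(1) only while the buckets remain short, which is why real implementations usually grow the table as it fills. The sketch omits resizing to stay brief.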


    2. Bloom Filters

    A Bloom filter is a space-efficient, probabilistic data structure used to test set membership.

    It answers the question: “Does this element exist in a set?”

    Unlike traditional data structures, a Bloom filter does not store actual elements, making it extremely memory-efficient.

    It starts as a bit array of size m, initialized with 0s, and relies on k independent hash functions, each of which maps an element to one of the m positions in the bit array.

    How It Works

    • Insertion: When an element is added, it is passed through the k hash functions, each mapping it to an index in the bit array. The bits at these positions are set to 1.

    • Lookup: To check if an element is present, it is again passed through the same k functions.

      • If all corresponding bits are 1, the element is probably in the set (though false positives can occur).

      • If any bit is 0, the element is definitely not in the set.
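
    The insertion and lookup steps above can be sketched as follows. This is an illustrative version only: the class and method names are hypothetical, and the k hash functions are simulated by salting SHA-256 with the function number, which is an assumption of the example and not something a given database necessarily does.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter over a bit array of size m with k hash functions."""

    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.bits = [0] * m  # all bits start at 0

    def _positions(self, item):
        # Assumption: derive k positions by salting one hash with the
        # function number i, standing in for k independent hash functions.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        # Insertion: set the bit at each of the k positions to 1.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # Lookup: any 0 bit means "definitely absent";
        # all 1 bits means "probably present".
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter(m=1024, k=3)
bf.add("user:123")
present = bf.might_contain("user:123")           # added items always report True
empty_filter = BloomFilter(m=1024, k=3)
absent = empty_filter.might_contain("user:123")  # no bits set, so definitely absent
```

    Note that there is no way to remove an element: clearing a bit could also erase evidence of other elements that hash to the same position.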

    Bloom filters let a database cheaply check whether a key might exist in a dataset, so it can skip disk lookups in places where the key is guaranteed to be absent. They are widely used in LSM-tree storage engines (e.g., Apache Cassandra), where a filter kept for each SSTable lets reads skip files that cannot contain the key, and more generally for fast key checks across database partitions.
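
    How often false positives occur depends on the bit array size m, the number of hash functions k, and the number of inserted elements n. A rough estimate, assuming the hash functions are independent and uniformly distributed, is the standard approximation (1 - e^(-kn/m))^k. The helper names below are hypothetical and the functions are a sketch for sizing intuition, not a tuning recipe.

```python
import math


def estimated_false_positive_rate(m, k, n):
    """Approximate false positive rate after n insertions (idealized hashes)."""
    return (1.0 - math.exp(-k * n / m)) ** k


def suggested_num_hashes(m, n):
    """Number of hash functions that roughly minimizes the estimate above."""
    return max(1, round((m / n) * math.log(2)))


m, n = 1000, 100                 # 10 bits per stored element
k = suggested_num_hashes(m, n)
rate = estimated_false_positive_rate(m, k, n)
```

    With these inputs the estimate comes out below one percent, which illustrates why a filter far smaller than the data itself can still rule out most unnecessary lookups.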


    3. LSM Trees (Log-Structured Merge Trees)

    A Log-Structured Merge (LSM) Tree is a write-optimized data structure designed to handle high-throughput workloads efficiently.

    Unlike B-Trees, which update disk pages in place, LSM Trees buffer writes in memory and periodically flush them to disk in sequential batches, reducing random I/O.

    This makes them ideal for write-heavy workloads.

    How LSM Trees Work

    Writes (Inserts, Updates, Deletes)

    • New writes are first stored in an in-memory structure called a MemTable (typically a Red-Black Tree or Skip List).

    • Once the MemTable reaches a certain size, it is flushed to disk as an immutable SSTable (Sorted String Table).

    • This sequential write pattern ensures fast insertions while avoiding costly disk seeks.

    Reads

    • Reads first check the MemTable (fast in-memory lookups).

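    • If not found, the search moves to the SSTables on disk, starting with the most recent.

    • A Bloom Filter is often used to quickly rule out SSTables that cannot contain the key.

    • If the key may be present, it is located within the SSTable via binary search, which works because the SSTable is sorted.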
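
    The write and read paths described so far can be sketched in miniature. Everything here is a simplification for illustration: TinyLSM is a hypothetical name, a plain dict sorted at flush time stands in for the Red-Black Tree or Skip List, SSTables are kept as in-memory lists instead of files, and the Bloom Filter check is omitted.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: a MemTable plus a list of immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit
        self.memtable = {}   # stand-in for a sorted in-memory structure
        self.sstables = []   # oldest first; each entry is (sorted keys, values)

    def put(self, key, value):
        # New writes always land in the MemTable first.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Write the MemTable out as one immutable, sorted SSTable.
        items = sorted(self.memtable.items())
        keys = [k for k, _ in items]
        values = [v for _, v in items]
        self.sstables.append((keys, values))
        self.memtable = {}

    def get(self, key):
        # 1. Check the MemTable.
        if key in self.memtable:
            return self.memtable[key]
        # 2. Check SSTables from newest to oldest, binary-searching each.
        for keys, values in reversed(self.sstables):
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return values[i]
        return None


db = TinyLSM(memtable_limit=2)
db.put("a", 1)
db.put("b", 2)     # MemTable is full, so it is flushed to an SSTable
db.put("a", 10)    # a newer value for "a" now sits in the MemTable
latest_a = db.get("a")
from_sstable = db.get("b")
not_found = db.get("z")
```

    Because the newest data is always consulted first, an older value for the same key can remain in an earlier SSTable without affecting reads.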

    Compaction (Merging SSTables)

    • Over time, multiple SSTables accumulate, increasing read overhead.

    • To optimize storage and retrieval, the system merges smaller SSTables into larger ones.

    • Compaction discards duplicate, obsolete, and deleted records, keeping only the latest version of each key and reclaiming disk space.
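
    A compaction pass can be sketched as a merge of several sorted runs into one. This is a simplified illustration under stated assumptions: the function name compact and the TOMBSTONE marker for deleted keys are hypothetical, and the merge is done in memory with a dict, whereas real engines typically stream a multi-way merge from disk.

```python
TOMBSTONE = object()  # hypothetical marker recording that a key was deleted


def compact(sstables):
    """Merge sorted runs (ordered oldest first) into a single sorted run."""
    merged = {}
    for table in sstables:
        for key, value in table:
            merged[key] = value  # later (newer) runs overwrite older values
    # All runs are merged here, so deleted keys can be dropped entirely.
    live = [(k, v) for k, v in merged.items() if v is not TOMBSTONE]
    return sorted(live, key=lambda kv: kv[0])


older = [("a", 1), ("b", 2), ("c", 3)]
newer = [("a", 10), ("b", TOMBSTONE)]
compacted = compact([older, newer])
```

    In this example the result keeps the newer value for "a", drops the deleted "b", and carries "c" over unchanged, so two runs holding five records shrink to one run holding two.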

    LSM Trees are widely used in high-scale storage systems such as Apache Cassandra, Google Bigtable, and RocksDB.


    4. Merkle Trees

    This post is for paid subscribers
