Friday, January 3, 2025
Google search engine
HomeData Modelling & AIWhy Arrays have better cache locality than Linked list?

Why Arrays have better cache locality than Linked list?

What is a Cache locality?

Cache locality refers to the property of computer programs where they tend to access a small, specific set of data repeatedly in a short period of time. By keeping this frequently accessed data in a small, fast memory region called a cache, the program can speed up its execution time. This is because accessing data from the cache is faster than accessing data from the main memory because cache memory is faster and has smaller memory that is closer to the CPU.

By designing programs that take advantage of cache locality, we can significantly improve the performance of any software. This is particularly important in areas where performance is critical, such as scientific computing, gaming, and real-time systems.

What are the factors that affect cache locality?

There are several factors that can affect cache locality, including:

  • Spatial locality: This refers to the property where data that is located near each other in memory have a higher probability to be accessed together. Programs with good spatial locality exhibit better cache performance.
  • Temporal locality: This refers to the property where data that is accessed repeatedly in a short duration of time is more likely to be accessed again in the near future. Programs that exhibit temporal locality tend to benefit from caching.
  • Data size: The cache might not be able to store large data structures, which results in cache misses and reduced cache performance.
  • Cache size: Cache performance can be impacted due to cache size. Caches that are smaller in size may not be able to store all of the data that a program needs to access, whereas caches that are larger in size may have more cache misses due to a higher probability of storing data that is not accessed frequently.

Why do arrays have a better cache locality than a linked list?

  • The array is a collection of elements of the same data type stored at contiguous memory locations, whereas LinkedList is a collection of objects called nodes that are randomly stored in the memory. 
  • In easy words, Imagine you have a bunch of toys that you want to play with. If the toys are all arranged next to you, you can grab them quickly and easily. But if the toys are scattered all over the room, you have to get up and walk around to find each one. This takes more time and is harder to do.
  • Similarly, when a computer needs to work with a bunch of data, it’s faster and easier for it to access the data stored near each other in memory, like an array. But if the data is scattered all over memory, like in a linked list, the computer has to work harder to find each piece of data. This slows down the computer and makes it work less efficiently.

Representation of arrays in memory

Representation of linked list in memory

  • Arrays have better cache locality than linked lists because they store their elements contiguously in memory, while linked lists store their elements in a non-contiguous manner. This means that when an array is accessed, the processor can retrieve multiple elements that are located next to each other in memory and store them in its cache for faster access later. In contrast, with a linked list, the processor needs to access each element separately, which can result in cache misses and slower access times.

Let us understand with the help of an example. In the following code we will create an array of one million integers and a linked list of one million integers, then compare the time it takes to sum all the values in both data structures.

C++




// C++ code
#include <chrono>
#include <iostream>
#include <numeric>
 
using namespace std;
using namespace std::chrono;
 
int main()
{
 
    // Create an array of 1000000 integers
    int arr[1000000];
    for (int i = 0; i < 1000000; i++) {
        arr[i] = i;
    }
 
    // Create a linked list of
    // 1000000 integers
    struct Node {
        int value;
        Node* next;
    };
 
    Node* head = new Node;
    head->value = 0;
    Node* prev = head;
 
    for (int i = 1; i < 1000000; i++) {
        Node* node = new Node;
        node->value = i;
        prev->next = node;
        prev = node;
    }
 
    // Measure the time it takes to sum
    // all the values in the array
    auto start_time = high_resolution_clock::now();
    int arr_sum = accumulate(arr, arr + 1000000, 0);
    auto end_time = high_resolution_clock::now();
    auto duration = duration_cast<microseconds>(
        end_time - start_time);
    cout << "Sum of array: " << arr_sum
         << ". Time taken: " << duration.count() / 1000000.0
         << " seconds." << endl;
 
    // Measure the time it takes to sum
    // all the values in the linked list
    start_time = high_resolution_clock::now();
    Node* curr = head;
    int ll_sum = 0;
    while (curr != nullptr) {
        ll_sum += curr->value;
        curr = curr->next;
    }
    end_time = high_resolution_clock::now();
    duration = duration_cast<microseconds>(end_time
                                           - start_time);
    cout << "Sum of linked list: " << ll_sum
         << ". Time taken: " << duration.count() / 1000000.0
         << " seconds." << endl;
 
    return 0;
}


Java




import java.time.Duration;
import java.time.Instant;
 
public class Main {
 
    public static void main(String[] args) {
 
        // Create an array of 1000000 integers using a generator expression
        int[] arr = new int[1000000];
        for (int i = 0; i < 1000000; i++) {
            arr[i] = i;
        }
 
        // Create a linked list of 1000000 integers
        Node head = new Node(0);
        Node prev = head;
 
        for (int i = 1; i < 1000000; i++) {
            Node node = new Node(i);
            prev.next = node;
            prev = node;
        }
 
        // Measure the time it takes to sum all the values in the array
        Instant startTime = Instant.now();
        int arrSum = 0;
        for (int i = 0; i < 1000000; i++) {
            arrSum += arr[i];
        }
        Instant endTime = Instant.now();
        Duration duration = Duration.between(startTime, endTime);
        System.out.printf("Sum of array: %d. Time taken: %f seconds.%n", arrSum, duration.toMillis() / 1000.0);
 
        // Measure the time it takes to sum all the values in the linked list
        startTime = Instant.now();
        int llSum = 0;
        Node curr = head;
        while (curr != null) {
            llSum += curr.value;
            curr = curr.next;
        }
        endTime = Instant.now();
        duration = Duration.between(startTime, endTime);
        System.out.printf("Sum of linked list: %d. Time taken: %f seconds.%n", llSum, duration.toMillis() / 1000.0);
    }
}
 
class Node {
    int value;
    Node next;
 
    public Node(int value) {
        this.value = value;
    }
}


Python3




from datetime import datetime
 
# Create an array of 1000000 integers using a generator expression
arr = (i for i in range(1000000))
 
# Create a linked list of 1000000 integers
 
 
class Node:
    def __init__(self, value=0, next=None):
        self.value = value
        self.next = next
 
 
head = Node(0)
prev = head
 
for i in range(1, 1000000):
    node = Node(i)
    prev.next = node
    prev = node
 
# Measure the time it takes to sum all the values in the array
start_time = datetime.now()
arr_sum = sum(arr)
end_time = datetime.now()
duration = end_time - start_time
print(
    f"Sum of array: {arr_sum}. Time taken: {duration.total_seconds()} seconds.")
 
# Measure the time it takes to sum all the values in the linked list
start_time = datetime.now()
curr = head
ll_sum = 0
while curr is not None:
    ll_sum += curr.value
    curr = curr.next
end_time = datetime.now()
duration = end_time - start_time
print(
    f"Sum of linked list: {ll_sum}. Time taken: {duration.total_seconds()} seconds.")


C#




using System;
using System.Diagnostics;
 
namespace CodeTranslationAssistant
{
    class Program
    {
           
        class Node
        {
            public int value;
            public Node next;
        }
 
        static void Main(string[] args)
        {
              // Create an array of 1000000 integers
            int[] arr = new int[1000000];
            for (int i = 0; i < 1000000; i++)
            {
                arr[i] = i;
            }
 
           
              // Create a linked list of
                // 1000000 integers
            Node head = new Node();
            head.value = 0;
            Node prev = head;
            for (int i = 1; i < 1000000; i++)
            {
                Node node = new Node();
                node.value = i;
                prev.next = node;
                prev = node;
            }
             
           
              // Measure the time it takes to sum
                // all the values in the array
            var start_time = Stopwatch.GetTimestamp();
            int arr_sum = 0;
            for (int i = 0; i < 1000000; i++)
            {
                arr_sum += arr[i];
            }
            var end_time = Stopwatch.GetTimestamp();
            var duration = (end_time - start_time) / (double)Stopwatch.Frequency;
            Console.WriteLine("Sum of array: " + arr_sum + ". Time taken: " + duration + " seconds.");
 
           
              // Measure the time it takes to sum
            // all the values in the linked list
            start_time = Stopwatch.GetTimestamp();
            Node curr = head;
            int ll_sum = 0;
            while (curr != null)
            {
                ll_sum += curr.value;
                curr = curr.next;
            }
            end_time = Stopwatch.GetTimestamp();
            duration = (end_time - start_time) / (double)Stopwatch.Frequency;
            Console.WriteLine("Sum of linked list: " + ll_sum + ". Time taken: " + duration + " seconds.");
        }
    }
}


Javascript




const { performance } = require('perf_hooks');
 
// Create an array of 1000000 integers using a generator expression
const arr = Array.from({length: 1000000}, (_, i) => i);
 
// Create a linked list of 1000000 integers
class Node {
  constructor(value = 0, next = null) {
    this.value = value;
    this.next = next;
  }
}
 
let head = new Node(0);
let prev = head;
 
for (let i = 1; i < 1000000; i++) {
  let node = new Node(i);
  prev.next = node;
  prev = node;
}
 
// Measure the time it takes to sum all the values in the array
let startTime = performance.now();
let arrSum = arr.reduce((acc, curr) => acc + curr, 0);
let endTime = performance.now();
let duration = (endTime - startTime) / 1000;
console.log(`Sum of array: ${arrSum}. Time taken: ${duration} seconds.`);
 
// Measure the time it takes to sum all the values in the linked list
startTime = performance.now();
let curr = head;
let llSum = 0;
while (curr !== null) {
  llSum += curr.value;
  curr = curr.next;
}
endTime = performance.now();
duration = (endTime - startTime) / 1000;
console.log(`Sum of linked list: ${llSum}. Time taken: ${duration} seconds.`);


Output

Sum of array: 1783293664. Time taken: 0.000854 seconds.
Sum of linked list: 1783293664. Time taken: 0.004918 seconds.

Conclusion:

  • Since the array stores its elements contiguously in memory, it has a better cache locality than the linked list, which stores its elements non-contiguously. As a result, summing the values in the array takes less time than summing the values in the linked list, as shown by the output of the program.
  • Therefore, if you need to access elements in sequential order or perform any other operation that requires accessing data in a contiguous manner, it is more efficient to use an array than a linked list.
Feeling lost in the world of random DSA topics, wasting time without progress? It’s time for a change! Join our DSA course, where we’ll guide you on an exciting journey to master DSA efficiently and on schedule.
Ready to dive in? Explore our Free Demo Content and join our DSA course, trusted by over 100,000 neveropen!

RELATED ARTICLES

Most Popular

Recent Comments