Saturday, December 28, 2024
Google search engine
HomeData Modelling & AICount of occurrences of each prefix in a string using modified KMP...

Count of occurrences of each prefix in a string using modified KMP algorithm

Given a string S of size N, the task is to count the occurrences of all the prefixes of the given string S.

Examples:  

Input: S = “AAAA” 
Output: 
A occurs 4 times 
AA occurs 3 times. 
AAA occurs 2 times. 
AAAA occurs 1 times. 
Explanation: 
Below is the illustration of all the prefix: 
 

Input: S = “ABACABA” 
Output: 
A occurs 4 times 
AB occurs 2 times 
ABA occurs 2 times 
ABAC occurs 1 times 
ABACA occurs 1 times 
ABACAB occurs 1 times 
ABACABA occurs 1 times 
 

Naive Approach:  

  1. Traverse over all the prefixes in set P. Let the x be the prefix.
  2. Do a sliding window approach of size |x|.
  3. Check if the current sliding window on S is equal to x. If yes then increase the count[x] by 1.

Time complexity: O(N3
Auxiliary Space: O(N)

Efficient Approach: 
Use the LPS array (also called prefix_function) from the KMP algorithm
The prefix function for this string is defined as an array LPS of length N, where LPS[i] is the length of the longest proper prefix of the substring S[0…i] which is also a suffix of this substring. Let occ[i] denote the number of occurrences of the prefix of length i.

Below are the steps to implement this approach:  

  1. Compute the LPS array or prefix_function.
  2. For each value of the prefix function, first, count how many times it occurs in the LPS array.
  3. The length prefix i appears exactly ans[i] times, then this number must be added to the number of occurrences of its longest suffix that is also a prefix.
  4. In the end, add 1 to all the values of occ array, because of the original prefix that should be counted as well.

For example: 
LPS[i] denotes that in position i, a prefix of length = LPS[i] appears. And this is the longest prefix possible. But shorter prefixes can occur. 
For String S = “AAAA”, following are the prefixes: 
 

S[0..0] = A 
S[0..1] = AA 
S[0..2] = AAA 
S[0..3] = AAAA 

Initially:  

occ[A] = 0 
occ[AA] = 0 
occ[AAA] = 0 
occ[AAAA] = 0 

Step1: LPS Array of the following string denotes the length of the longest prefix which is also a suffix: 

LPS[1] denotes in string AA, A is a suffix and also a prefix as LPS[1] = 1 
LPS[2] denotes in string AAA, AA is a suffix and also a prefix as LPS[2] = 2 
LPS[3] denotes in string AAAA, AAA is a suffix and also a prefix as LPS[3] = 3

Step 2:Add these occurrences of prefixes as suffixes to the answer in the occ[] array:  

Values : Counted substrings 
occ[A] = 1 : S[1] 
occ[AA] = 1 : S[1..2] 
occ[AAA] = 1 : S[1..3] 
occ[AAAA] = 0 : NULL(as there is not a prefix “AAAA” which is also a suffix. 
 

Step 3: Now traverse the string in reverse order starting from “AAA” (as the last value will always be 0 since the complete string is not a proper prefix). 

Since, string “AAA” S[1..3] contains “AA” S[2..3] as well, which was not counted yet, therefore increment the occurrence of string “AA” in occ[“AA”] as occ[“AA”] += occ[“AAA”]. Below is the count for the same: 
Values : Counted substrings 
occ[A] = 1 : S[1] 
occ[AA] = 2 : S[1..2], S[2..3] 
occ[AAA] = 1 : S[1..3] 
occ[AAAA] = 0 : NULL 

Now string “AA” contains “A” as well, which was not counted yet, therefore increment the occurrence of string “A” in occ[“A”] as occ[“A”] += occ[“AA”]. Below is the count for the same: 
 

Values : Counted substrings 
occ[A] = 3 : S[1], S[2], S[3] 
occ[AA] = 2 : S[1..2], S[2..3] 
occ[AAA] = 1 : S[1..3] 
occ[AAAA] = 0 : NULL 

Step 4: At last add one to all occurrences for the original prefixes, which are not counted yet.  

Values : Counted substrings 
occ[A] = 4 : S[1], S[2], S[3], S[0] 
occ[AA] = 3 : S[1..2], S[2..3], S[0..1] 
occ[AAA] = 2 : S[1..3], S[0..2] 
occ[AAAA] = 1 : S[0..3]  

Below is the implementation of the above approach:

C++




// C++ program for the above approach
#include <bits/stdc++.h>
using namespace std;
 
// Function to print the count of all
// prefix in the given string
void print(vector<int>& occ, string& s)
{
    // Iterate over string s
    for (int i = 1; i <= int(s.size());
         i++) {
 
        // Print the prefix and their
        // frequency
        cout << s.substr(0, i)
             << " occurs "
             << occ[i]
             << " times."
             << endl;
    }
}
 
// Function to implement the LPS
// array to store the longest prefix
// which is also a suffix for every
// substring of the string S
vector<int> prefix_function(string& s)
{
    // Array to store LPS values
    vector<int> LPS(s.size());
 
    // Value of lps[0] is 0
    // by definition
    LPS[0] = 0;
 
    // Find the values of LPS[i] for
    // the rest of the string using
    // two pointers and DP
    for (int i = 1;
         i < int(s.size());
         i++) {
 
        // Initially set the value
        // of j as the longest
        // prefix that is also a
        // suffix for i as LPS[i-1]
        int j = LPS[i - 1];
 
        // Check if the suffix of
        // length j+1 is also a prefix
        while (j > 0 && s[i] != s[j]) {
            j = LPS[j - 1];
        }
 
        // If s[i] = s[j] then, assign
        // LPS[i] as j+1
        if (s[i] == s[j]) {
            LPS[i] = j + 1;
        }
 
        // If we reached j = 0, assign
        // LPS[i] as 0 as there was no
        // prefix equal to suffix
        else {
            LPS[i] = 0;
        }
    }
 
    // Return the calculated
    // LPS array
    return LPS;
}
 
// Function to count the occurrence
// of all the prefix in the string S
void count_occurrence(string& s)
{
    int n = s.size();
 
    // Call the prefix_function
    // to get LPS
    vector<int> LPS
        = prefix_function(s);
 
    // To store the occurrence of
    // all the prefix
    vector<int> occ(n + 1);
 
    // Count all the suffixes that
    // are also prefix
    for (int i = 0; i < n; i++) {
        occ[LPS[i]]++;
    }
 
    // Add the occurrences of
    // i to smaller prefixes
    for (int i = n - 1;
         i > 0; i--) {
        occ[LPS[i - 1]] += occ[i];
    }
 
    // Adding 1 to all occ[i] for all
    // the original prefix
    for (int i = 0; i <= n; i++)
        occ[i]++;
 
    // Function Call to print the
    // occurrence of all the prefix
    print(occ, s);
}
 
// Driver Code
int main()
{
    // Given String
    string A = "ABACABA";
 
    // Function Call
    count_occurrence(A);
    return 0;
}


Java




// Java program for
// the above approach
import java.util.*;
class GFG{
 
// Function to print the count
// of all prefix in the
// given String
static void print(int[] occ,
                  String s)
{
  // Iterate over String s
  for (int i = 1;
           i <= s.length() - 1; i++)
  {
    // Print the prefix and their
    // frequency
    System.out.print(s.substring(0, i) +
                     " occurs " + occ[i] +
                     " times." + "\n");
  }
}
 
// Function to implement the LPS
// array to store the longest prefix
// which is also a suffix for every
// subString of the String S
static int[] prefix_function(String s)
{
  // Array to store LPS values
  int []LPS = new int[s.length()];
 
  // Value of lps[0] is 0
  // by definition
  LPS[0] = 0;
 
  // Find the values of LPS[i] for
  // the rest of the String using
  // two pointers and DP
  for (int i = 1;
       i < s.length(); i++)
  {
    // Initially set the value
    // of j as the longest
    // prefix that is also a
    // suffix for i as LPS[i-1]
    int j = LPS[i - 1];
 
    // Check if the suffix of
    // length j+1 is also a prefix
    while (j > 0 &&
           s.charAt(i) != s.charAt(j))
    {
      j = LPS[j - 1];
    }
 
    // If s[i] = s[j] then, assign
    // LPS[i] as j+1
    if (s.charAt(i) == s.charAt(j))
    {
      LPS[i] = j + 1;
    }
 
    // If we reached j = 0, assign
    // LPS[i] as 0 as there was no
    // prefix equal to suffix
    else
    {
      LPS[i] = 0;
    }
  }
 
  // Return the calculated
  // LPS array
  return LPS;
}
 
// Function to count the occurrence
// of all the prefix in the String S
static void count_occurrence(String s)
{
  int n = s.length();
 
  // Call the prefix_function
  // to get LPS
  int[] LPS = prefix_function(s);
 
  // To store the occurrence of
  // all the prefix
  int []occ = new int[n + 1];
 
  // Count all the suffixes that
  // are also prefix
  for (int i = 0; i < n; i++)
  {
    occ[LPS[i]]++;
  }
 
  // Add the occurrences of
  // i to smaller prefixes
  for (int i = n - 1;
           i > 0; i--)
  {
    occ[LPS[i - 1]] += occ[i];
  }
 
  // Adding 1 to all occ[i] for all
  // the original prefix
  for (int i = 0; i <= n; i++)
    occ[i]++;
 
  // Function Call to print the
  // occurrence of all the prefix
  print(occ, s);
}
 
// Driver Code
public static void main(String[] args)
{
  // Given String
  String A = "ABACABA";
 
  // Function Call
  count_occurrence(A);
}
}
 
// This code is contributed by Princi Singh


Python3




# Python3 program for the above approach
 
# Function to print the count of all
# prefix in the given string
def Print(occ, s):
     
    # Iterate over string s
    for i in range(1, len(s) + 1):
 
        # Print the prefix and their
        # frequency
        print(s[0 : i], "occur", occ[i], "times.")
 
# Function to implement the LPS
# array to store the longest prefix
# which is also a suffix for every
# substring of the string S
def prefix_function(s):
 
    # Array to store LPS values
    # Value of lps[0] is 0
    # by definition
    LPS = [0 for i in range(len(s))]
     
    # Find the values of LPS[i] for
    # the rest of the string using
    # two pointers and DP
    for i in range(1, len(s)):
 
        # Initially set the value
        # of j as the longest
        # prefix that is also a
        # suffix for i as LPS[i-1]
        j = LPS[i - 1]
 
        # Check if the suffix of
        # length j+1 is also a prefix
        while (j > 0 and s[i] != s[j]):
            j = LPS[j - 1]
 
        # If s[i] = s[j] then, assign
        # LPS[i] as j+1
        if (s[i] == s[j]):
            LPS[i] = j + 1
             
        # If we reached j = 0, assign
        # LPS[i] as 0 as there was no
        # prefix equal to suffix
        else:
            LPS[i] = 0
 
    # Return the calculated
    # LPS array
    return LPS
 
# Function to count the occurrence
# of all the prefix in the string S
def count_occurrence(s):
     
    n = len(s)
 
    # Call the prefix_function
    # to get LPS
    LPS = prefix_function(s)
 
    # To store the occurrence of
    # all the prefix
    occ = [0 for i in range(n + 1)]
 
    # Count all the suffixes that
    # are also prefix
    for i in range(n):
        occ[LPS[i]] += 1
 
    # Add the occurrences of
    # i to smaller prefixes
    for i in range(n - 1, 0, -1):
        occ[LPS[i - 1]] += occ[i]
     
    # Adding 1 to all occ[i] for all
    # the original prefix
    for i in range(n + 1):
        occ[i] += 1
         
    # Function Call to print the
    # occurrence of all the prefix
    Print(occ, s)
 
# Driver Code
 
# Given String
A = "ABACABA"
 
# Function Call
count_occurrence(A)
 
# This code is contributed by avanitrachhadiya2155


C#




// C# program for
// the above approach
using System;
class GFG{
 
// Function to print the
// count of all prefix
// in the given String
static void print(int[] occ,
                  String s)
{
  // Iterate over String s
  for (int i = 1;
           i <= s.Length - 1; i++)
  {
    // Print the prefix and their
    // frequency
    Console.Write(s.Substring(0, i) + 
                  " occurs " + occ[i] + 
                  " times." + "\n");
  }
}
 
// Function to implement the LPS
// array to store the longest prefix
// which is also a suffix for every
// subString of the String S
static int[] prefix_function(String s)
{
  // Array to store LPS values
  int []LPS = new int[s.Length];
 
  // Value of lps[0] is 0
  // by definition
  LPS[0] = 0;
 
  // Find the values of LPS[i] for
  // the rest of the String using
  // two pointers and DP
  for (int i = 1;
           i < s.Length; i++)
  {
    // Initially set the value
    // of j as the longest
    // prefix that is also a
    // suffix for i as LPS[i-1]
    int j = LPS[i - 1];
 
    // Check if the suffix of
    // length j+1 is also a prefix
    while (j > 0 && s[i] != s[j])
    {
      j = LPS[j - 1];
    }
 
    // If s[i] = s[j] then,
    // assign LPS[i] as j+1
    if (s[i] == s[j])
    {
      LPS[i] = j + 1;
    }
 
    // If we reached j = 0, assign
    // LPS[i] as 0 as there was no
    // prefix equal to suffix
    else
    {
      LPS[i] = 0;
    }
  }
 
  // Return the calculated
  // LPS array
  return LPS;
}
 
// Function to count the occurrence
// of all the prefix in the String S
static void count_occurrence(String s)
{
  int n = s.Length;
 
  // Call the prefix_function
  // to get LPS
  int[] LPS = prefix_function(s);
 
  // To store the occurrence of
  // all the prefix
  int []occ = new int[n + 1];
 
  // Count all the suffixes that
  // are also prefix
  for (int i = 0; i < n; i++)
  {
    occ[LPS[i]]++;
  }
 
  // Add the occurrences of
  // i to smaller prefixes
  for (int i = n - 1;
           i > 0; i--)
  {
    occ[LPS[i - 1]] += occ[i];
  }
 
  // Adding 1 to all occ[i] for all
  // the original prefix
  for (int i = 0; i <= n; i++)
    occ[i]++;
 
  // Function Call to print the
  // occurrence of all the prefix
  print(occ, s);
}
 
// Driver Code
public static void Main(String[] args)
{
  // Given String
  String A = "ABACABA";
 
  // Function Call
  count_occurrence(A);
}
}
 
// This code is contributed by Amit Katiyar


Javascript




<script>
    // JavaScript program for the above approach
 
    // Function to print the count of all
    // prefix in the given string
    const print = (occ, s) => {
        // Iterate over string s
        for (let i = 1; i <= s.length; i++) {
 
            // Print the prefix and their
            // frequency
            document.write(`${s.substr(0, i)} occurs ${occ[i]} times.<br/>`);
        }
    }
 
    // Function to implement the LPS
    // array to store the longest prefix
    // which is also a suffix for every
    // substring of the string S
    const prefix_function = (s) => {
        // Array to store LPS values
        let LPS = new Array(s.length).fill(0);
 
        // Value of lps[0] is 0
        // by definition
        LPS[0] = 0;
 
        // Find the values of LPS[i] for
        // the rest of the string using
        // two pointers and DP
        for (let i = 1; i < s.length; i++) {
 
            // Initially set the value
            // of j as the longest
            // prefix that is also a
            // suffix for i as LPS[i-1]
            let j = LPS[i - 1];
 
            // Check if the suffix of
            // length j+1 is also a prefix
            while (j > 0 && s[i] != s[j]) {
                j = LPS[j - 1];
            }
 
            // If s[i] = s[j] then, assign
            // LPS[i] as j+1
            if (s[i] == s[j]) {
                LPS[i] = j + 1;
            }
 
            // If we reached j = 0, assign
            // LPS[i] as 0 as there was no
            // prefix equal to suffix
            else {
                LPS[i] = 0;
            }
        }
 
        // Return the calculated
        // LPS array
        return LPS;
    }
 
    // Function to count the occurrence
    // of all the prefix in the string S
    const count_occurrence = (s) => {
        let n = s.length;
 
        // Call the prefix_function
        // to get LPS
        let LPS = prefix_function(s);
 
        // To store the occurrence of
        // all the prefix
        let occ = new Array(n + 1).fill(0);
 
        // Count all the suffixes that
        // are also prefix
        for (let i = 0; i < n; i++) {
            occ[LPS[i]]++;
        }
 
        // Add the occurrences of
        // i to smaller prefixes
        for (let i = n - 1;
            i > 0; i--) {
            occ[LPS[i - 1]] += occ[i];
        }
 
        // Adding 1 to all occ[i] for all
        // the original prefix
        for (let i = 0; i <= n; i++)
            occ[i]++;
 
        // Function Call to print the
        // occurrence of all the prefix
        print(occ, s);
    }
 
    // Driver Code
 
    // Given String
    let A = "ABACABA";
 
    // Function Call
    count_occurrence(A);
 
    // This code is contributed by rakeshsahni
 
</script>


Output: 

A occurs 4 times.
AB occurs 2 times.
ABA occurs 2 times.
ABAC occurs 1 times.
ABACA occurs 1 times.
ABACAB occurs 1 times.
ABACABA occurs 1 times.

 

Time Complexity: O(N2) 
Auxiliary Space: O(N)

Feeling lost in the world of random DSA topics, wasting time without progress? It’s time for a change! Join our DSA course, where we’ll guide you on an exciting journey to master DSA efficiently and on schedule.
Ready to dive in? Explore our Free Demo Content and join our DSA course, trusted by over 100,000 neveropen!

RELATED ARTICLES

Most Popular

Recent Comments