Friday, December 27, 2024

Python – Successive Characters Frequency

Sometimes, while working with Python strings, we need to find the frequency of the characters that immediately follow each occurrence of a particular word in a string. This is a fairly unusual problem, but it can come up in day-to-day programming and web development. Let’s discuss several ways in which this task can be performed.

Input : test_str = 'geeks are for geeksforgeeks', que_word = "geek" 
Output : {'s': 3} 
Input : test_str = 'geek', que_word = "geek" 
Output : {}

Method #1 : Using loop + count() + re.findall() 

The combination of the above methods constitutes the brute-force way to perform this task. Here, re.findall() finds every occurrence of the word together with the character after it, a loop collects those following characters, and count() computes the frequency of each one.

Python3




# Python3 code to demonstrate working of
# Successive Characters Frequency
# Using count() + loop + re.findall()
import re
     
# initializing string
test_str = 'geeksforgeeks is best for geeks. A geek should take interest.'
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing word
que_word = "geek"
 
# Successive Characters Frequency
# Using count() + loop + re.findall()
# collect the character that follows each occurrence of the word
temp = []
for sub in re.findall(que_word + '.', test_str):
    temp.append(sub[-1])
 
# count each collected character (ch avoids reusing the name que_word)
res = {ch: temp.count(ch) for ch in temp}
 
# printing result
print("The Characters Frequency is : " + str(res))


Output : 

The original string is : geeksforgeeks is best for geeks. A geek should take interest.
The Characters Frequency is : {'s': 3, ' ': 1}
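
One caveat with building the pattern by string concatenation is that any regex metacharacters in the query word are interpreted as pattern syntax. The following is a minimal sketch of one way to guard against that with re.escape(); the helper name next_char_counts and the sample string are hypothetical and not part of the original article.

```python
import re


def next_char_counts(text, word):
    """Count the characters that directly follow `word` in `text` (hypothetical helper)."""
    # re.escape() makes characters such as '.' or '+' in `word` match literally
    pattern = re.escape(word) + '(.)'
    counts = {}
    for ch in re.findall(pattern, text):
        counts[ch] = counts.get(ch, 0) + 1
    return counts


# 'axb' is not counted, because the '.' in the query word is matched literally
print(next_char_counts('a.b a.c axb a.b', 'a.'))  # {'b': 2, 'c': 1}
```

Without the escaping, the '.' in the query word would match any character, so 'axb' would also be counted.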

Method #2 : Using Counter() + list comprehension + re.findall() 

The combination of the above functions can also be used to perform this task. In this, we use Counter() instead of count() to tally the characters captured by re.findall(). The pattern is built with an f-string, so this version requires Python 3.6 or later. 

Python3




# Python3 code to demonstrate working of
# Successive Characters Frequency
# Using Counter() + list comprehension + re.findall()
from collections import Counter
import re
 
# initializing string
test_str = 'geeksforgeeks is best for geeks. A geek should take interest.'
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing word
que_word = "geek"
 
# Successive Characters Frequency
# Using Counter() + list comprehension + re.findall()
# the group (.) captures only the character after the word;
# re.IGNORECASE also matches variants such as 'Geek' or 'GEEK'
res = dict(Counter(re.findall(f'{que_word}(.)', test_str,
                              flags=re.IGNORECASE)))
 
# printing result
print("The Characters Frequency is : " + str(res))


Output : 

The original string is : geeksforgeeks is best for geeks. A geek should take interest.
The Characters Frequency is : {'s': 3, ' ': 1}

Time Complexity: O(n)
Auxiliary Space: O(n)
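
Unlike the other methods, this one passes flags=re.IGNORECASE, which can change the result when the string contains the word in a different case. The sketch below illustrates the difference on a hypothetical sample string (not taken from the article).

```python
import re
from collections import Counter

sample = 'Geeks and geeky GEEK!'  # hypothetical sample string
word = 'geek'

# default matching is case-sensitive: only 'geeky' contains 'geek'
case_sensitive = dict(Counter(re.findall(f'{word}(.)', sample)))

# with re.IGNORECASE, 'Geeks' and 'GEEK!' are matched as well
case_insensitive = dict(Counter(re.findall(f'{word}(.)', sample,
                                           flags=re.IGNORECASE)))

print(case_sensitive)    # {'y': 1}
print(case_insensitive)  # {'s': 1, 'y': 1, '!': 1}
```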

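Method #3 : Using operator.countOf() 

This approach is the same as Method #1, except that operator.countOf() is used in place of the list count() method to count each collected character.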

Python3




# Python3 code to demonstrate working of
# Successive Characters Frequency
# Using operator.countOf() + loop + re.findall()
import re
import operator as op
 
# initializing string
test_str = 'geeksforgeeks is best for geeks. A geek should take interest.'
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing word
que_word = "geek"
 
# Successive Characters Frequency
# Using operator.countOf() + loop + re.findall()
# collect the character that follows each occurrence of the word
temp = []
for sub in re.findall(que_word + '.', test_str):
    temp.append(sub[-1])
 
# count each collected character (ch avoids reusing the name que_word)
res = {ch: op.countOf(temp, ch) for ch in temp}
 
# printing result
print("The Characters Frequency is : " + str(res))


Output

The original string is : geeksforgeeks is best for geeks. A geek should take interest.
The Characters Frequency is : {'s': 3, ' ': 1}

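Time Complexity: O(n + m²), where n is the length of the input string and m is the number of matches, since countOf() rescans the list of matches once for each matched character.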
Auxiliary Space: O(n)

Method #4 : Using a loop and dictionary

  1. Initialize the input string and the queried word.
  2. Initialize an empty dictionary to store the frequency of successive characters.
  3. Loop through the input string, checking if each substring of length len(que_word) starting at each index of the string is equal to the queried word.
  4. If a substring is equal to the queried word, extract the character immediately following the substring.
  5. If the character is already a key in the dictionary, increment its value by 1. Otherwise, add the character as a key with a value of 1.
  6. Once the loop completes, print the dictionary with the character frequencies.

Example:

Python3




# initializing string
test_str = 'geeksforgeeks is best for geeks. A geek should take interest.'
 
# initializing word
que_word = 'geek'
 
# initializing dictionary to store character frequencies
freq_dict = {}
 
# loop through the string and count successive character frequencies;
# stopping len(que_word) short of the end guarantees that a following
# character exists, so a word at the very end does not raise IndexError
for i in range(len(test_str) - len(que_word)):
    if test_str[i:i+len(que_word)] == que_word:
        char = test_str[i+len(que_word)]
        if char in freq_dict:
            freq_dict[char] += 1
        else:
            freq_dict[char] = 1
 
# print the result
print('The Characters Frequency is:', freq_dict)


Output

The Characters Frequency is: {'s': 3, ' ': 1}

Time complexity: O(n · m), where n is the length of the input string and m is the length of the queried word, since a slice of length m is compared at each index. 
Auxiliary space: O(k), where k is the number of distinct characters that follow the queried word, as the dictionary holds one key per distinct character.
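
As a variation on the same loop-and-dictionary idea, the search can jump directly from one occurrence to the next with str.find() instead of slicing at every index. This is only a sketch under that assumption; the function name successive_char_freq and the second sample string are hypothetical.

```python
def successive_char_freq(text, word):
    """Count characters that immediately follow `word` in `text` (hypothetical helper)."""
    freq = {}
    start = text.find(word)
    while start != -1:
        nxt = start + len(word)
        # skip an occurrence at the very end of the string: nothing follows it
        if nxt < len(text):
            freq[text[nxt]] = freq.get(text[nxt], 0) + 1
        # resume one position later so overlapping occurrences are still found
        start = text.find(word, start + 1)
    return freq


print(successive_char_freq('geek', 'geek'))            # {}
print(successive_char_freq('a geek, a geek', 'geek'))  # {',': 1}
```

The first call reproduces the empty-dictionary case from the examples at the top: the word ends the string, so there is no following character to count.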

Method #5 : Using re.finditer() and defaultdict()

Step-by-step approach:

  1. Initialize the input string test_str.
  2. Initialize the query word que_word to the value 'geek'.
  3. Initialize an empty dictionary freq_dict using defaultdict(int), which gives every missing key a default value of 0.
  4. Loop through all the matches of the regular expression pattern que_word + '(.)' in the input string test_str, using re.finditer().
  5. For each match, retrieve the character following the query word, which is captured in the first group of the regular expression pattern. Increment the count of that character in the freq_dict dictionary.
  6. After processing all the matches, print the frequency dictionary as a regular dictionary using the dict() constructor.
  7. The output is a dictionary whose keys are the characters following the query word in the input string and whose values are their respective frequencies.

Example:

Python3




import re
from collections import defaultdict
 
test_str = 'geeksforgeeks is best for geeks. A geek should take interest.'
que_word = 'geek'
 
freq_dict = defaultdict(int)
 
for match in re.finditer(que_word + '(.)', test_str):
    freq_dict[match.group(1)] += 1
 
print('The Characters Frequency is:', dict(freq_dict))


Output

The Characters Frequency is: {'s': 3, ' ': 1}

Time Complexity: O(n), where n is the length of the input string.
Auxiliary Space: O(k), where k is the number of distinct characters following the query word.
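
Note that re.finditer() returns non-overlapping matches, so the regex-based methods can give a different answer from the index loop of Method #4 when occurrences of the word overlap. A possible workaround, sketched below, wraps the pattern in a lookahead assertion; the function name freq_overlapping and the sample input are hypothetical.

```python
import re
from collections import defaultdict


def freq_overlapping(text, word):
    """Count following characters, including overlapping occurrences (hypothetical helper)."""
    freq = defaultdict(int)
    # the lookahead (?=...) consumes no characters, so the scan restarts
    # one position later and can find occurrences that overlap
    for match in re.finditer('(?=' + re.escape(word) + '(.))', text):
        freq[match.group(1)] += 1
    return dict(freq)


# plain pattern: the first match consumes 'aaa', so only one occurrence is seen
print(re.findall('aa(.)', 'aaab'))     # ['a']
# lookahead pattern: 'aa' at index 0 and 'aa' at index 1 are both counted
print(freq_overlapping('aaab', 'aa'))  # {'a': 1, 'b': 1}
```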

Method #6 : Using itertools.groupby() and Counter()

Step-by-step approach:

  • Import re, itertools, and Counter from collections.
  • Use the re.findall() function to capture the character that follows each occurrence of the query word in the input string test_str.
  • Use the itertools.groupby() function to group consecutive identical characters in that list of matches.
  • Flatten the groups back into single characters and pass them to Counter() to count the frequency of each character.
  • Print the result.

Python3




import re
import itertools
from collections import Counter
 
test_str = 'geeksforgeeks is best for geeks. A geek should take interest.'
que_word = 'geek'
 
matches = re.findall(que_word + '(.)', test_str)
groups = itertools.groupby(matches)
 
freq_dict = Counter([char for _, char_group in groups for char in char_group])
 
print('The Characters Frequency is:', freq_dict)


Output

The Characters Frequency is: Counter({'s': 3, ' ': 1})

Time complexity: O(n), where n is the length of the input string test_str.
Auxiliary space: O(k), where k is the number of unique characters following the query word.
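
It is worth keeping in mind that itertools.groupby() only groups runs of consecutive equal items, so in this method the actual counting is done by Counter(). The sketch below illustrates this on a hypothetical, deliberately unsorted list of matches.

```python
import itertools
from collections import Counter

matches = ['s', 's', ' ', 's']  # hypothetical match list, deliberately unsorted

# groupby() yields one group per consecutive run, so 's' appears twice
runs = [(key, len(list(group))) for key, group in itertools.groupby(matches)]
print(runs)  # [('s', 2), (' ', 1), ('s', 1)]

# flattening the groups and counting gives the same result as counting directly
flattened = Counter(ch for _, group in itertools.groupby(matches) for ch in group)
print(flattened == Counter(matches))  # True
```

In other words, Counter(matches) alone would produce the same frequencies here.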

Dominic Rubhabha-Wardslaus