When working with Python strings, we sometimes need a custom kind of split: given a string built from repetitions of a substring, we want to split it into a list of those repetitions. Let us discuss several ways in which this task can be performed.
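Method #1: Using * operator + len()
This is one of the ways in which we can perform this task. We compute how many times the target fits into the string by integer-dividing the length of the string by the length of the target, then construct the result list using the * operator. Note that this approach only compares lengths, so it assumes the string consists entirely of repetitions of the target.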
Python3
# Python3 code to demonstrate working of
# Split by repeating substring
# Using * operator + len()

# initializing string
test_str = "gfggfggfggfggfggfggfggfg"

# printing original string
print("The original string is : " + test_str)

# initializing target
K = 'gfg'

# Split by repeating substring
# Using * operator + len()
temp = len(test_str) // len(str(K))
res = [K] * temp

# printing result
print("The split string is : " + str(res))

Output:
The original string is : gfggfggfggfggfggfggfggfg
The split string is : ['gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg']
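Because Method #1 relies only on lengths, it returns the same list for any string of the same length, whatever its content. A minimal sketch of a guarded version follows; the helper name `split_if_pure_repetition` is hypothetical and not part of the original method.

```python
def split_if_pure_repetition(text, k):
    """Return [k, k, ...] only if text is exactly k repeated; otherwise None.

    Hypothetical helper: it adds a content check to the length-based idea.
    """
    if not k:
        return None
    count, remainder = divmod(len(text), len(k))
    # The lengths must divide evenly AND the rebuilt string must match.
    if remainder == 0 and k * count == text:
        return [k] * count
    return None


print(split_if_pure_repetition("gfggfggfg", "gfg"))  # ['gfg', 'gfg', 'gfg']
print(split_if_pure_repetition("abcabcabc", "gfg"))  # None
```

Returning None for a mismatch is just one possible design; raising an exception would work equally well.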
Method #2: Using re.findall()
This is another way in which this problem can be solved. re.findall() scans the string and returns every non-overlapping match of the target as a list, so no explicit splitting step is needed.
Python3
# Python3 code to demonstrate working of
# Split by repeating substring
# Using re.findall()
import re

# initializing string
test_str = "gfggfggfggfggfggfggfggfg"

# printing original string
print("The original string is : " + test_str)

# initializing target
K = 'gfg'

# Split by repeating substring
# Using re.findall()
# re.escape() makes sure K is matched literally, not as a regex pattern
res = re.findall(re.escape(K), test_str)

# printing result
print("The split string is : " + str(res))

Output:
The original string is : gfggfggfggfggfggfggfggfg
The split string is : ['gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg']
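re.findall() treats its first argument as a regular expression, so a target containing metacharacters such as `.` or `*` can match more than the literal substring. The sketch below illustrates the difference; `find_repeats` is a hypothetical helper name and the sample string is made up for the example.

```python
import re


def find_repeats(text, k):
    """Return all non-overlapping literal occurrences of k in text."""
    # re.escape() neutralises regex metacharacters in the target.
    return re.findall(re.escape(k), text)


sample = "a.ba.baxb"

print(find_repeats(sample, "a.b"))   # ['a.b', 'a.b']
print(re.findall("a.b", sample))     # ['a.b', 'a.b', 'axb']
```

For a plain alphanumeric target such as 'gfg', both calls give the same result.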
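Method #3: Using count() method and * operator
In this, count() returns the number of non-overlapping occurrences of the target in the string, and the * operator builds the result list from that count.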
Python3
# Python3 code to demonstrate working of
# Split by repeating substring
# Using count() + * operator

# initializing string
test_str = "gfggfggfggfggfggfggfggfg"

# printing original string
print("The original string is : " + test_str)

# initializing target
K = 'gfg'

# Split by repeating substring
# (named count rather than re, to avoid shadowing the re module)
count = test_str.count(K)
res = [K] * count

# printing result
print("The split string is : " + str(res))

Output:
The original string is : gfggfggfggfggfggfggfggfg
The split string is : ['gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg']
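One detail worth knowing about str.count() is that it counts non-overlapping occurrences, and it ignores any characters between matches. A short illustration, with sample strings chosen only for this example:

```python
# Overlapping candidates: "aa" could start at index 0, 1 or 2,
# but count() only reports the non-overlapping ones.
text = "aaaa"
k = "aa"
occurrences = text.count(k)
res = [k] * occurrences
print(res)  # ['aa', 'aa']

# Characters that are not part of a match are simply dropped.
mixed = "gfgxxgfg"
mixed_res = ["gfg"] * mixed.count("gfg")
print(mixed_res)  # ['gfg', 'gfg']
```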
The time and space complexity are the same for all of the above methods:
Time Complexity: O(n)
Auxiliary Space: O(n)
Method #4: Using loop and slicing
Python3
# initializing string
test_str = "gfggfggfggfggfggfggfggfg"

# printing original string
print("The original string is : " + test_str)

# initializing target
K = 'gfg'

# Split by repeating substring using loop and slicing
res = []
start = 0
while start < len(test_str):
    end = start + len(K)
    if test_str[start:end] == K:
        res.append(K)
        start = end
    else:
        start += 1

# printing result
print("The split string is : " + str(res))
# This code is contributed by Vinay Pinjala.

Output:
The original string is : gfggfggfggfggfggfggfggfg
The split string is : ['gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg']
Time complexity: O(n) for a target of fixed length. The loop passes over the input string once; each step slices and compares up to len(K) characters, so the cost is O(n * len(K)) in general.
Auxiliary Space: O(n). The result list holds one entry per match, so its length is at most proportional to the length of the input string.
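The same loop can be sketched as a reusable function. This variant uses str.startswith() with a start offset instead of slicing, which avoids creating a temporary substring on each step; the function name `split_by_repeats` is hypothetical, and the empty-target check is an added assumption to prevent an endless loop.

```python
def split_by_repeats(text, k):
    """Collect every non-overlapping occurrence of k, scanning left to right."""
    if not k:
        # Assumption: an empty target is treated as an error, since the
        # scan could never advance past a zero-length match.
        raise ValueError("k must be a non-empty string")
    res = []
    start = 0
    step = len(k)
    while start < len(text):
        if text.startswith(k, start):
            res.append(k)
            start += step   # jump past the matched occurrence
        else:
            start += 1      # skip one non-matching character
    return res


print(split_by_repeats("gfggfgxgfg", "gfg"))  # ['gfg', 'gfg', 'gfg']
```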
Method #5: Using the regular expression module re
- Import the ‘re’ module, which provides regular expression support in Python.
- Initialize a string ‘test_str’ that consists of a repeated substring.
- Initialize the target string ‘K’ with the substring we want to split by.
- Call the ‘re.findall()’ method with the target ‘K’ as the pattern and ‘test_str’ as the input. This method returns a list of all non-overlapping matches of the pattern in the string.
- Store the result of the ‘re.findall()’ method in a variable named ‘res’.
- Concatenate the string “The original string is : ” with ‘test_str’ using the ‘+’ operator and print the resulting string.
- Convert the ‘res’ list to a string using the ‘str()’ function, concatenate it with the string “The split string is : ” using the ‘+’ operator, and print the resulting string.
Python3
import re

# initializing string
test_str = "gfggfggfggfggfggfggfggfg"

# initializing target
K = 'gfg'

# Split by repeating substring using re.findall() method
# re.escape() makes sure K is matched literally, not as a regex pattern
res = re.findall(re.escape(K), test_str)

# printing result
print("The original string is : " + test_str)
print("The split string is : " + str(res))

Output:
The original string is : gfggfggfggfggfggfggfggfg
The split string is : ['gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg', 'gfg']
The time complexity of this approach is O(n), where n is the length of the input string.
The auxiliary space required is O(k), where k is the number of occurrences of the target substring in the input string.