Friday, December 27, 2024

Python – Extract string between two substrings

Given a string and two substrings, write a Python program to extract the text that lies between them. Python offers several ways to do this.

Input: "Gfg is best for Lazyroar and CS"
sub1 = "is"
sub2 = "and"
Output: best for Lazyroar
Explanation: The output is the part of the input string that lies between sub1 and sub2.

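Get String Between Two Substrings in Python

This article covers the following methods: index() with a for loop, index() with string slicing, find() with string slicing, replace() with split(), a regular expression, and split() with join().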

Extract the String using the index function and a for loop

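In this method, index() returns the starting position of each substring. A for loop then walks over the positions between the end of sub1 and the start of sub2 and appends each character to the result. The loop starts one position past the end of sub1, which skips the space that follows it in this example. Note that index() raises a ValueError if a substring is not present.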

Python3




test_str = "Gfg is best for Lazyroar and CS"
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing substrings
sub1 = "is"
sub2 = "and"
 
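# getting index of substrings; sub2 is searched for after the end of sub1
idx1 = test_str.index(sub1)
idx2 = test_str.index(sub2, idx1 + len(sub1))
 
res = ''
# getting elements in between; the extra + 1 skips
# the space that follows sub1 in this string
for idx in range(idx1 + len(sub1) + 1, idx2):
    res = res + test_str[idx]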
 
# printing result
print("The extracted string : " + res)


Output:

The original string is : Gfg is best for Lazyroar and CS
The extracted string : best for Lazyroar 

Time Complexity: O(n)
Auxiliary Space: O(n)
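
The same idea can be wrapped in a reusable function. The following is a minimal sketch, not part of the original program: the name between_loop and its parameters are hypothetical, and it returns None when either substring is missing instead of letting index() raise a ValueError. Unlike the listing above, it does not skip the character after the first substring.

```python
def between_loop(text, start_marker, end_marker):
    """Return the text between the two markers, or None if either is missing.

    between_loop is a hypothetical helper name used for illustration.
    """
    try:
        start = text.index(start_marker) + len(start_marker)
        # Look for the end marker only after the start marker.
        end = text.index(end_marker, start)
    except ValueError:
        # index() raises ValueError when the substring is not found.
        return None

    # Collect the characters one by one, as in the loop-based method.
    chars = []
    for idx in range(start, end):
        chars.append(text[idx])
    return "".join(chars)


print(repr(between_loop("Gfg is best for Lazyroar and CS", "is", "and")))
print(between_loop("Gfg is best", "is", "and"))
```

The first call keeps the spaces around the extracted text, and the second call prints None because "and" does not occur in the input.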

Extract the string using the index function with string slicing

This method is similar to the one above, but string slicing replaces the loop, which gives a much more compact solution. We first find the index at which the extraction should start and the index at which it should end, and then slice the string between those two positions.

Python3




test_str = "Gfg is best for Lazyroar and CS"
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing substrings
sub1 = "is"
sub2 = "and"
 
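# getting index of substrings; sub2 is searched for after the end of sub1
idx1 = test_str.index(sub1)
idx2 = test_str.index(sub2, idx1 + len(sub1))
 
# length of substring 1 is added to get the string from the
# next character; the extra + 1 skips the space after sub1
res = test_str[idx1 + len(sub1) + 1: idx2]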
 
# printing result
print("The extracted string : " + res)


Output:

The original string is : Gfg is best for Lazyroar and CS
The extracted string : best for Lazyroar 

Time Complexity: O(n)
Auxiliary Space: O(n)
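
The + 1 in the slice assumes that exactly one space follows the first substring. As a hedged alternative, the sketch below slices from the end of the first substring and optionally trims whitespace with strip(). The function name between_slice and the strip parameter are assumptions made for this illustration.

```python
def between_slice(text, start_marker, end_marker, strip=True):
    """Slice out the text between two markers (hypothetical helper).

    Raises ValueError, like str.index(), if either marker is missing.
    """
    start = text.index(start_marker) + len(start_marker)
    # Search for the end marker only after the start marker.
    end = text.index(end_marker, start)
    result = text[start:end]
    # Trim surrounding whitespace instead of assuming a single space.
    return result.strip() if strip else result


print(between_slice("Gfg is best for Lazyroar and CS", "is", "and"))
print(between_slice("id=[42]", "[", "]"))
```

With the markers "[" and "]" there is no space to skip, so a fixed + 1 offset would drop the first character of the value; slicing from the end of the marker avoids that.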

Extract substring between two markers using find() and slice()

The find() method returns the index of the first occurrence of the substring passed as an argument, or -1 if the substring is not found. Unlike index(), it does not raise an exception. Once we have the start and end indices, we can extract the text between the two substrings with string slicing.

Python3




test_str = "Gfg is best for Lazyroar and CS"
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing substrings
sub1 = "is"
sub2 = "and"
 
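# getting index of substrings; find() returns -1 if one is missing
idx1 = test_str.find(sub1)
idx2 = test_str.find(sub2, idx1 + len(sub1))
 
# length of substring 1 is added to get the string from the
# next character; the extra + 1 skips the space after sub1
res = test_str[idx1 + len(sub1) + 1: idx2]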
 
# printing result
print("The extracted string : " + res)


Output:

The original string is : Gfg is best for Lazyroar and CS
The extracted string : best for Lazyroar 

Time Complexity: O(n)
Auxiliary Space: O(n)
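
Because find() signals a miss with -1 rather than an exception, a missing substring silently produces a wrong slice. The sketch below illustrates the pitfall and one way to guard against it; the helper name between_find is hypothetical.

```python
text = "Gfg is best for Lazyroar"   # note: "and" does not occur here
sub1, sub2 = "is", "and"

idx1 = text.find(sub1)
idx2 = text.find(sub2)               # -1, because sub2 is missing

# The slice treats -1 as "one before the end", so the last
# character is dropped instead of an error being reported.
naive = text[idx1 + len(sub1) + 1: idx2]
print(repr(naive))


def between_find(text, start_marker, end_marker):
    """Return the text between the markers, or None if either is missing."""
    start = text.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)
    end = text.find(end_marker, start)
    if end == -1:
        return None
    return text[start:end]


print(between_find(text, sub1, sub2))
print(repr(between_find("Gfg is best for Lazyroar and CS", sub1, sub2)))
```

The naive slice yields 'best for Lazyroa', while the guarded version reports the missing substring by returning None.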

Extract string between two substrings with split()

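Here, replace() turns both substrings into a common delimiter ("*"), and split() then breaks the string on that delimiter. The second element of the resulting list is the text between the two substrings. Because replace() substitutes every occurrence, this works only when each substring appears once and the delimiter does not already occur in the string.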

Python3




test_str = "Gfg is best for Lazyroar and CS"
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing substrings
sub1 = "is"
sub2 = "and"
 
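# replacing both substrings with a common delimiter
test_str = test_str.replace(sub1, "*")
test_str = test_str.replace(sub2, "*")
# splitting on the delimiter; the middle part is element 1
parts = test_str.split("*")
res = parts[1]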
 
# printing result
print("The extracted string : " + res)


Output:

The original string is : Gfg is best for Lazyroar and CS
The extracted string : best for Lazyroar 

Time Complexity: O(n)
Auxiliary Space: O(n)
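
The delimiter trick breaks when the input already contains "*". The sketch below shows that failure and, as a different technique from the one in this section, uses the built-in str.partition(), which splits at the first occurrence only and needs no placeholder character. The name between_partition is hypothetical.

```python
text = "2 * 3 is six and done"
sub1, sub2 = "is", "and"

# replace() + split(): the "*" already in the text adds an extra
# split point, so element 1 is no longer the text between sub1 and sub2.
parts = text.replace(sub1, "*").replace(sub2, "*").split("*")
print(repr(parts[1]))


def between_partition(text, start_marker, end_marker):
    """Return the text between the markers, or None if either is missing."""
    # partition() returns (before, separator, after); the separator
    # is an empty string when the marker was not found.
    _, found_start, rest = text.partition(start_marker)
    if not found_start:
        return None
    middle, found_end, _ = rest.partition(end_marker)
    if not found_end:
        return None
    return middle


print(repr(between_partition(text, sub1, sub2)))
```

The first print shows ' 3 ', which is not the wanted text, while the partition-based version returns ' six '.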

Extract string between two substrings using regex

The re.findall() function finds all non-overlapping matches of a regular expression in the test string. The pattern used is s+"(.*)"+e, where s and e are the escaped versions of sub1 and sub2, respectively, so that any special characters in them are matched literally. The (.*) part is a capturing group that matches any character (except a newline) zero or more times. Because the pattern has one capturing group, findall() returns the captured text, and the first element of that list is the extracted string.

Python3




import re
test_str = "Gfg is best for Lazyroar and CS"
 
# printing original string
print("The original string is : " +
      str(test_str))
 
# initializing substrings
sub1 = "is"
sub2 = "and"
 
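# escaping the substrings so they are matched literally
s = re.escape(sub1)
 
e = re.escape(sub2)
 
# extracting the text captured between the substrings
res = re.findall(s + "(.*)" + e, test_str)[0]
 
# printing result
print("The extracted string : " + res)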


Output:

The original string is : Gfg is best for Lazyroar and CS
The extracted string :  best for Lazyroar 

Time Complexity: O(n)
Auxiliary Space: O(n)
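
The (.*) group is greedy: when the second substring occurs more than once, it extends to the last occurrence. The sketch below contrasts this with the non-greedy (.*?) form, which stops at the first occurrence. The helper name between_regex is hypothetical, and it uses re.search() so that a missing match returns None instead of raising an IndexError.

```python
import re


def between_regex(text, start_marker, end_marker):
    """Return the shortest text between the markers, or None if no match."""
    # re.escape() makes the markers match literally; (.*?) is non-greedy.
    pattern = re.escape(start_marker) + r"(.*?)" + re.escape(end_marker)
    match = re.search(pattern, text)
    return match.group(1) if match else None


text = "Gfg is best and CS and more"

# Greedy group, as in the method above: runs up to the last "and".
greedy = re.findall(re.escape("is") + "(.*)" + re.escape("and"), text)[0]
print(repr(greedy))

# Non-greedy group: stops at the first "and".
print(repr(between_regex(text, "is", "and")))
print(between_regex("Gfg is best", "is", "and"))
```

Here the greedy pattern captures ' best and CS ', the non-greedy one captures ' best ', and the last call prints None.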

Using split() and join() to extract string between two substrings

Initialize the input string "test_str" and the two substrings "sub1" and "sub2". Calling split(sub1) breaks the string at every occurrence of sub1, and element [1] of the result is the text that follows the first occurrence (up to the next occurrence of sub1, if there is one). Splitting that text on sub2 and taking element [0] keeps only the part before sub2. The result is already a string, so the ''.join() call returns it unchanged and could be omitted. Finally, print the extracted string.

Below is the implementation of the above approach:

Python3




test_str = "Gfg is best for Lazyroar and CS"
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing substrings
sub1 = "is"
sub2 = "and"
 
# getting elements in between using split() and join()
res = ''.join(test_str.split(sub1)[1].split(sub2)[0])
 
# printing result
print("The extracted string : " + res)


Output:

The original string is : Gfg is best for Lazyroar and CS
The extracted string :  best for Lazyroar 

Time complexity: O(n), where n is the length of the input string.
Auxiliary space: O(n), where n is the length of the input string, as we create a new string “res” to store the extracted string.
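
When the first substring occurs more than once, split(sub1)[1] stops at its second occurrence. Passing a maxsplit of 1 to split() limits the split to the first occurrence. The sketch below, with the hypothetical helper name between_split, shows the difference and also avoids the IndexError that [1] raises when sub1 is missing.

```python
def between_split(text, start_marker, end_marker):
    """Return the text between the markers, or None if either is missing."""
    # maxsplit=1: split only at the first occurrence of the marker.
    after_start = text.split(start_marker, 1)
    if len(after_start) < 2:
        return None
    before_end = after_start[1].split(end_marker, 1)
    if len(before_end) < 2:
        return None
    return before_end[0]


text = "Gfg is best, it is for Lazyroar and CS"

# Without maxsplit, the second "is" cuts the result short.
print(repr(text.split("is")[1].split("and")[0]))

# With maxsplit=1, everything up to the first "and" is kept.
print(repr(between_split(text, "is", "and")))
```

The first print shows ' best, it ', while between_split returns ' best, it is for Lazyroar '.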
