Friday, December 27, 2024

Python – Wildcard Substring search

Sometimes, while working with Python strings, we need to search for a substring in which some characters are unknown and still find the match. This kind of wildcard search has applications in many domains. Let's discuss some ways in which this task can be performed.

Method #1 : Using re.search() This is one way to perform the task. We compile the substring as a regular expression, in which each "." matches any single character (except a newline), and look for it in the main string with search().

Python3




# Python3 code to demonstrate working of
# Wildcard Substring search
# Using re.search()
import re
     
# initializing string
test_str = 'neveropen is best for Lazyroar'
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing Substring
sub_str = '..st'
 
# Wildcard Substring search
# Using re.search()
temp = re.compile(sub_str)
res = temp.search(test_str)
 
# printing result
print("The substring match is : " + str(res.group(0)))


Output : 

The original string is : neveropen is best for Lazyroar                                                                
The substring match is : best     
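The pattern in Method #1 is handed to re unchanged, so every regex metacharacter in it is interpreted, not only ".", and res.group(0) raises AttributeError when nothing matches. The following is a minimal sketch of one way to guard against both, assuming that only "." should act as a wildcard. The helper name wildcard_search is hypothetical and not part of the original article.

```python
import re


def wildcard_search(text, pattern):
    """Return the first substring of text matching pattern, or None.

    Assumption: only "." is a wildcard (any single character except a
    newline); every other character is escaped and matched literally.
    """
    regex = "".join("." if ch == "." else re.escape(ch) for ch in pattern)
    match = re.search(regex, text)
    # re.search() returns None when there is no match, so check first
    return match.group(0) if match else None


print(wildcard_search("neveropen is best for Lazyroar", "..st"))  # best
print(wildcard_search("neveropen is best for Lazyroar", "..zz"))  # None
print(wildcard_search("cost: a+b or a*b", "a+."))                 # a+b
```

Escaping with re.escape() keeps a character such as "+" literal, which is why the last call finds "a+b" instead of treating "+" as a quantifier.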

Method #2 : Using re.finditer() This is another way to solve the problem. finditer() yields a match object for every non-overlapping match, so the position of each match can also be extracted if required.

Python3




# Python3 code to demonstrate working of
# Wildcard Substring search
# Using re.finditer()
import re
     
# initializing string
test_str = 'neveropen is best for Lazyroar'
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing Substring
sub_str = '..st'
 
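# Wildcard Substring search
# Using re.finditer()
temp = re.compile(sub_str)
 
# printing result
# finditer() yields one match object per non-overlapping match;
# match.start() and match.end() give its position if required
for match in temp.finditer(test_str):
    print("The substring match is : " + match.group(0))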


Output : 

The original string is : neveropen is best for Lazyroar                                                                
The substring match is : best     
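As a small illustration of the position extraction mentioned for Method #2, the sketch below collects the start index, end index, and text of every match. The variable name positions is only illustrative.

```python
import re

test_str = "neveropen is best for Lazyroar"
sub_str = "..st"

# Collect (start, end, text) for every non-overlapping match.
# end is exclusive, so test_str[start:end] is the matched text.
positions = [(m.start(), m.end(), m.group(0))
             for m in re.finditer(sub_str, test_str)]

print(positions)  # [(13, 17, 'best')]
```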

The time and space complexity is the same for both methods above:

Time Complexity: O(n), where n is the length of test_str (the pattern length is treated as a constant)

Auxiliary Space: O(n)

Method #3 : Using re.findall() This approach also uses the re library. It is similar to re.search(), but re.findall() returns a list of all non-overlapping matches as strings.

Python3




import re
 
# initializing string
test_str = 'neveropen is best for Lazyroar'
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing Substring
sub_str = '..st'
 
# Wildcard Substring search using re.findall()
temp = re.compile(sub_str)
res = temp.findall(test_str)
 
# printing result
print("The substring match(es) are: " + str(res))
#This code is contributed by Edula Vinay Kumar Reddy


Output

The original string is : neveropen is best for Lazyroar
The substring match(es) are: ['best']

The time and space complexity for this approach is the same as the previous methods:

Time Complexity: O(n)
Auxiliary Space: O(n)
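Because re.findall() reports only non-overlapping matches, a wildcard pattern can miss occurrences that share characters. One common workaround, sketched here under the assumption that the pattern contains no capture groups of its own, is to wrap the pattern in a lookahead. The helper name find_overlapping is hypothetical.

```python
import re


def find_overlapping(text, pattern):
    """Return every match of pattern in text, including overlapping ones."""
    # A lookahead consumes no characters, so the scan advances one
    # position at a time; the capture group holds the matched text.
    return re.findall("(?=(" + pattern + "))", text)


print(re.findall("a.a", "aaaa"))         # ['aaa']
print(find_overlapping("aaaa", "a.a"))   # ['aaa', 'aaa']
print(find_overlapping("neveropen is best for Lazyroar", "..st"))  # ['best']
```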

Method #4 : Using split(), replace(), endswith(), join() methods

Approach 

This approach matches whole words only, and it assumes that all wildcards come at the start of the pattern.

  1. Split the given string into words using split()
  2. Remove every . from sub_str using replace(), which leaves the literal suffix
  3. Loop over the list of words
  4. Check whether the length of the word equals the original pattern length and whether the word ends with the suffix (using endswith())
  5. If both conditions hold, append the word to the output list
  6. Finally, join the output list using join() and display it

Python3




# Python3 code to demonstrate working of
# Wildcard Substring search
 
# initializing string
test_str = 'neveropen is best for Lazyroar'
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing Substring
sub_str = '..st'
 
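# Wildcard Substring search
words = test_str.split()
res = []
pattern_len = len(sub_str)
# Drop the wildcards; this assumes they all lead the pattern
suffix = sub_str.replace(".", "")
for word in words:
    # A word matches if it has the pattern's length and ends with the suffix
    if len(word) == pattern_len and word.endswith(suffix):
        res.append(word)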
 
# printing result
print("The substring match is : " + "".join(res))


Output

The original string is : neveropen is best for Lazyroar
The substring match is : best

Time Complexity: O(n)
Auxiliary Space: O(n)
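Method #4 depends on the wildcards leading the pattern. If whole-word matching with wildcards in any position is wanted, one possible alternative (a swapped-in technique, not the article's method) is the standard library's fnmatch module, which uses "?" for a single character. The sketch below assumes the pattern contains no other glob characters such as "*" or "[", and the helper name match_words is hypothetical.

```python
from fnmatch import fnmatchcase


def match_words(text, pattern):
    """Return the whitespace-separated words that match pattern in full."""
    # fnmatch uses "?" for "exactly one character", so translate ".".
    # Assumption: pattern has no other glob characters ("*", "[", "]").
    glob = pattern.replace(".", "?")
    return [word for word in text.split() if fnmatchcase(word, glob)]


print(match_words("neveropen is best for Lazyroar", "..st"))  # ['best']
print(match_words("neveropen is best for Lazyroar", "b.s."))  # ['best']
print(match_words("neveropen is best for Lazyroar", ".s"))    # ['is']
```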

Method #5 : Using string slicing and a list comprehension

1. Initialize the string and the pattern to search for.
2. Print the original string.
3. Use a list comprehension to create a list of all substrings that match the pattern.
  a. For each possible starting index i in the original string, extract a substring of length equal to the length of the pattern.
  b. Check if each character in the substring matches the corresponding character in the pattern, or if the pattern character is a wildcard “.”.
  c. If all characters match or are wildcards, add the substring to the list of matches.
4. If there is at least one match, print the first one.
 

Python3




test_str = 'neveropen is best for Lazyroar'
sub_str = '..st'
 
# printing original string
print("The original string is : " + str(test_str))
 
# Use a list comprehension to create a list of all substrings that match the pattern
# The substring must be the same length as the pattern, and each character must
# match the pattern character unless the pattern character is the wildcard "."
m = len(sub_str)
matches = [
    test_str[i:i + m]
    for i in range(len(test_str) - m + 1)
    if all(a == b or b == '.' for a, b in zip(test_str[i:i + m], sub_str))
]
 
# If there is at least one match, print the first one
if len(matches) > 0:
    print("The substring match is : " + matches[0])
#This code is contributed by Jyothi pinjala.


Output

The original string is : neveropen is best for Lazyroar
The substring match is : best

Time Complexity: O(nm), where n is the length of the original string test_str and m is the length of the pattern sub_str. The list comprehension visits n-m+1 starting indices, and at each one the slice and the zip() comparison take O(m) time.

Auxiliary Space: O(km), where k is the number of matches found, because every matching substring of length m is stored in the matches list. Since k is at most n-m+1, this is O(nm) in the worst case, or O(n) when m is treated as a constant.
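When only the first match is needed, the list of all matches does not have to be built. A minimal sketch of that variation, using generator expressions so the scan stops at the first hit, might look like this. The function name first_wildcard_match is hypothetical.

```python
def first_wildcard_match(text, pattern):
    """Return the first window of text matching pattern, or None.

    "." in pattern matches any single character.
    """
    m = len(pattern)
    # Lazily produce each window of length m; nothing is stored
    windows = (text[i:i + m] for i in range(len(text) - m + 1))
    return next(
        (w for w in windows
         if all(p == "." or p == c for c, p in zip(w, pattern))),
        None,  # default when no window matches
    )


print(first_wildcard_match("neveropen is best for Lazyroar", "..st"))  # best
print(first_wildcard_match("neveropen is best for Lazyroar", "zz.."))  # None
```

The worst-case time is still O(nm), but apart from the current window no extra storage is kept.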

Method #6: Using list comprehension and string comparison without zip()

Steps:

  • Use a list comprehension to iterate through the input string and generate a list of substrings that match the given wildcard pattern. Each slice has the same length as the pattern and is compared with the endswith() method, so this approach only suits patterns whose wildcards all come first.
  • Extract the matched substring from the list of matches. We take the first match from the list, if any; otherwise the result is set to None.
  • Print the result.

Python3




# Python3 code to demonstrate working of
# Wildcard Substring search
# Using list comprehension and string comparison
 
# initializing string
test_str = 'neveropen is best for Lazyroar'
 
# printing original string
print("The original string is : " + str(test_str))
 
# initializing Substring
sub_str = '..st'
 
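# Wildcard Substring search
# Using list comprehension and string comparison
# Compare against the literal tail of the pattern (everything after the
# leading wildcards), not just its last character
literal = sub_str.lstrip('.')
matches = [test_str[i:i+len(sub_str)] for i in range(len(test_str)-len(sub_str)+1) if test_str[i:i+len(sub_str)].endswith(literal)]
res = matches[0] if matches else None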
 
# printing result
print("The substring match is : " + str(res))


Output

The original string is : neveropen is best for Lazyroar
The substring match is : best

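Time Complexity: O(nm), where n is the length of the test string and m is the length of the pattern, since each of the n-m+1 slices costs O(m).
Auxiliary Space: O(km), where k is the number of matches stored in the matches list.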
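The endswith() comparison in Method #6 only looks at the tail of each slice. For patterns with wildcards in any position, a zip-free check can index into the pattern directly. The sketch below is one possible way to do that; the helper names matches_at and wildcard_find_all are hypothetical.

```python
def matches_at(text, pattern, start):
    """True if pattern matches text at start; "." matches any character."""
    return all(
        pattern[j] == "." or pattern[j] == text[start + j]
        for j in range(len(pattern))
    )


def wildcard_find_all(text, pattern):
    """Return every window of text that matches pattern, overlaps included."""
    m = len(pattern)
    return [text[i:i + m] for i in range(len(text) - m + 1)
            if matches_at(text, pattern, i)]


text = "neveropen is best for Lazyroar"
print(wildcard_find_all(text, "..st"))  # ['best']
print(wildcard_find_all(text, "b..t"))  # ['best']
print(wildcard_find_all(text, ".o."))   # ['rop', 'for', 'roa']
```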
