Given a list of strings and a list of substrings, the task is to extract every string that contains all of the substrings.
Examples:
Input : test_list = ["gfg is best", "gfg is good for CS", "gfg is recommended for CS"]
subs_list = ["gfg", "CS"]
Output : ['gfg is good for CS', 'gfg is recommended for CS']
Explanation : Result strings have both "gfg" and "CS".
Input : test_list = ["gfg is best", "gfg is recommended for CS"]
subs_list = ["gfg"]
Output : ['gfg is best', 'gfg is recommended for CS']
Explanation : Result strings have "gfg".
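The two examples above can be checked with a small helper. This is only a minimal sketch: the function name strings_with_all_subs is illustrative and does not come from the original examples.

```python
def strings_with_all_subs(strings, subs):
    """Return the strings that contain every substring in subs."""
    return [s for s in strings if all(sub in s for sub in subs)]


# First example: both "gfg" and "CS" must be present
example_1 = strings_with_all_subs(
    ["gfg is best", "gfg is good for CS", "gfg is recommended for CS"],
    ["gfg", "CS"],
)

# Second example: only "gfg" must be present
example_2 = strings_with_all_subs(
    ["gfg is best", "gfg is recommended for CS"],
    ["gfg"],
)

print(example_1)
print(example_2)
```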
Method #1 : Using loop + in operator
A nested loop combined with the in operator can be used to solve this problem. The outer loop iterates over the strings and the inner loop iterates over the substrings. The in operator checks whether each substring occurs in the current string, and the inner loop stops as soon as one substring is missing.
Python3
# Python3 code to demonstrate working of
# Strings with all Substring Matches
# Using loop + in operator

# initializing list
test_list = ["gfg is best", "gfg is good for CS", "gfg is recommended for CS"]

# printing original list
print("The original list is : " + str(test_list))

# initializing Substring List
subs_list = ["gfg", "CS"]

res = []
for sub in test_list:
    flag = 0
    for ele in subs_list:

        # checking for non existence of
        # any string
        if ele not in sub:
            flag = 1
            break
    if flag == 0:
        res.append(sub)

# printing result
print("The extracted values : " + str(res))
Output:
The original list is : ['gfg is best', 'gfg is good for CS', 'gfg is recommended for CS']
The extracted values : ['gfg is good for CS', 'gfg is recommended for CS']
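Time Complexity: O(n * m) substring checks, where n is the number of strings and m is the number of substrings. This is O(n^2) when both lists have similar length.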
Auxiliary Space: O(n)
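To make the cost of the nested loop concrete, the sketch below counts how many in checks are performed. The helper count_checks is a hypothetical name introduced here for illustration. It mirrors the loop above, including the early break, so the count depends on the order of the substrings.

```python
def count_checks(strings, subs):
    """Count the `in` checks made by the nested loop with early exit."""
    checks = 0
    for s in strings:
        for sub in subs:
            checks += 1
            if sub not in s:
                # stop at the first missing substring
                break
    return checks


test_list = ["gfg is best", "gfg is good for CS", "gfg is recommended for CS"]

# "gfg" is in every string, so both substrings are always checked: 3 * 2
print(count_checks(test_list, ["gfg", "CS"]))  # 6

# "CS" is missing from the first string, so its loop stops after one check
print(count_checks(test_list, ["CS", "gfg"]))  # 5
```

The worst case is n * m checks. Placing the substring that is least likely to match first can reduce the number of checks in practice.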
Method #2 : Using all() + list comprehension
This is a one-liner approach to the same task. all() checks that every substring occurs in a string, and a list comprehension iterates over the strings and keeps those that pass the check.
Python3
# Python3 code to demonstrate working of
# Strings with all Substring Matches
# Using all() + list comprehension

# initializing list
test_list = ["gfg is best", "gfg is good for CS", "gfg is recommended for CS"]

# printing original list
print("The original list is : " + str(test_list))

# initializing Substring List
subs_list = ["gfg", "CS"]

# using all() to check for all values
res = [sub for sub in test_list if all(ele in sub for ele in subs_list)]

# printing result
print("The extracted values : " + str(res))
Output:
The original list is : ['gfg is best', 'gfg is good for CS', 'gfg is recommended for CS']
The extracted values : ['gfg is good for CS', 'gfg is recommended for CS']
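Time Complexity: O(n * m) substring checks, where n is the number of strings and m is the number of substrings. This is O(n^2) when both lists have similar length.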
Auxiliary Space: O(n)
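Two edge cases of the all() approach are worth knowing. The sketch below wraps the comprehension in a helper, filter_all, which is a name assumed here for illustration. It shows that an empty substring list keeps every string, because all() of an empty iterable is True, and that one missing substring rejects a string.

```python
def filter_all(strings, subs):
    """Keep the strings that contain every substring in subs."""
    return [s for s in strings if all(sub in s for sub in subs)]


test_list = ["gfg is best", "gfg is good for CS"]

# all() over an empty iterable is True, so nothing is filtered out
no_filter = filter_all(test_list, [])

# "Java" occurs in neither string, so the result is empty
no_match = filter_all(test_list, ["gfg", "Java"])

print(no_filter)  # ['gfg is best', 'gfg is good for CS']
print(no_match)   # []
```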
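Method #3: Using Counter() function
In this approach, each string is split into words and the words are counted with Counter(). A string is kept if every element of subs_list is one of its words. Note that this matches whole words only, so it is not a true substring test: "gf" would not match "gfg is best".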
Python3
# Python3 code to demonstrate working of
# Strings with all Substring Matches
# Using Counter() function
from collections import Counter

# initializing list
test_list = ["gfg is best", "gfg is good for CS", "gfg is recommended for CS"]

# printing original list
print("The original list is : " + str(test_list))

# initializing Substring List
subs_list = ["gfg", "CS"]

res = []
for sub in test_list:
    flag = 0
    freq = Counter(sub.split())
    for ele in subs_list:

        # checking for non existence of
        # any string
        if ele not in freq.keys():
            flag = 1
            break
    if flag == 0:
        res.append(sub)

# printing result
print("The extracted values : " + str(res))
Output:
The original list is : ['gfg is best', 'gfg is good for CS', 'gfg is recommended for CS']
The extracted values : ['gfg is good for CS', 'gfg is recommended for CS']
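Time Complexity: O(n * (L + m)), where n is the number of strings, L is the length of a string and m is the number of substrings. Splitting a string and building its Counter takes O(L), and each of the m lookups is O(1) on average.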
Auxiliary Space: O(n)
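Because the Counter is built from sub.split(), this method compares whole words rather than arbitrary substrings. The short sketch below illustrates the difference from the in operator on a string. The sample values are chosen here for illustration only.

```python
from collections import Counter

text = "gfg is best"
words = Counter(text.split())  # Counter({'gfg': 1, 'is': 1, 'best': 1})

# A partial word is a substring of the text but not one of its words
partial_in_text = "gf" in text
partial_in_words = "gf" in words

# A phrase with a space is a substring but can never be a single word
phrase_in_text = "is best" in text
phrase_in_words = "is best" in words

print(partial_in_text, partial_in_words)  # True False
print(phrase_in_text, phrase_in_words)    # True False
```

If partial matches or multi-word phrases must be found, the in operator from Methods #1 and #2 is the appropriate test.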
Method #4: Using set() and intersection() function
Step by step Algorithm:
- Initialize the original list test_list and the list of substrings subs_list.
- Create a new empty list res to store the extracted values.
- Loop through each string sub in test_list.
- Convert sub into a set of words, sub_words, using the split() function and set().
- Check if subs_list is a subset of the set sub_words using the issubset() function.
- If subs_list is a subset of sub_words, append sub to res.
- After looping through all the strings in test_list, print the extracted values in res.
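The steps above can be sketched as an explicit loop. This is an illustrative version under the assumption that whole-word matching is acceptable. The function name extract_by_word_subset and the variable required are introduced here and are not part of the original steps.

```python
def extract_by_word_subset(test_list, subs_list):
    """Keep strings whose set of words contains every item of subs_list."""
    res = []
    required = set(subs_list)  # issubset() is a set method
    for sub in test_list:
        # split on whitespace and remove duplicate words
        sub_words = set(sub.split())
        if required.issubset(sub_words):
            res.append(sub)
    return res


test_list = ["gfg is best", "gfg is good for CS", "gfg is recommended for CS"]
result = extract_by_word_subset(test_list, ["gfg", "CS"])
print("The extracted values : " + str(result))
```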
Python3
test_list = ["gfg is best", "gfg is good for CS", "gfg is recommended for CS"]
subs_list = {"gfg", "CS"}

# printing original list
print("The original list is : " + str(test_list))

# keep strings whose set of words contains every element of subs_list
res = [sub for sub in test_list if subs_list.issubset(set(sub.split()))]

print("The extracted values : " + str(res))
Output:
The original list is : ['gfg is best', 'gfg is good for CS', 'gfg is recommended for CS']
The extracted values : ['gfg is good for CS', 'gfg is recommended for CS']
Time Complexity: O(n * (m + k))
The loop through test_list runs n times, where n is the length of test_list. For each string, converting it into a set of words with split() takes O(m) time, where m is the length of the string, and issubset() takes O(k) time on average, where k is the number of substrings in subs_list. These two steps run one after the other for each string rather than nested, so the overall time complexity is O(n * (m + k)).
Space Complexity:
The space used by the res list is O(n*m), where n is the length of test_list and m is the average length of the strings in the list.