Python – Split strings ignoring the space formatting characters

27 July 2024

Given a string, split it into words, ignoring whitespace formatting characters such as \n, \t, and \r.

Input : test_str = 'neveropen\n\r\n\t\t\n\t\tbest\r\tfor\f\vLazyroar'
Output : ['neveropen', 'best', 'for', 'Lazyroar']
Explanation : Every whitespace character acts as a delimiter for the split.

Input : test_str = 'neveropen\n\r\n\t\t\n\t\tbest'
Output : ['neveropen', 'best']
Explanation : Every whitespace character acts as a delimiter for the split.

Method 1: Using re.split()

This method builds a regex character class from the whitespace characters and passes it to re.split(), which splits the string on every run of one or more of those characters.

Python3

# Python3 code to demonstrate working of 
# Split Strings ignoring Space characters
# Using re.split()
import re
 
# initializing string
test_str = 'neveropen\n\r\t\t\nis\t\tbest\r\tfor Lazyroar'
 
# printing original string
print("The original string is : " + str(test_str))
 
# split on runs of one or more whitespace characters
res = re.split(r'[\n\t\f\v\r ]+', test_str)
     
# printing result 
print("The split string : " + str(res))

Output:

The original string is : neveropen

        
is        best
    for Lazyroar
The split string : ['neveropen', 'is', 'best', 'for', 'Lazyroar']

Time Complexity: O(n)

Auxiliary Space: O(n)
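
One caveat with re.split() is that leading or trailing whitespace produces empty strings at the ends of the result. The sketch below illustrates this. It assumes the shorthand class \s in place of the explicit character set, and the sample string padded_str is hypothetical; neither comes from the example above.

```python
import re

# Hypothetical sample with leading and trailing whitespace (an assumption for illustration)
padded_str = '\t neveropen\n\r\tis\t\tbest \n'

# \s+ matches one or more whitespace characters; for str patterns it also
# covers Unicode whitespace beyond the six characters listed explicitly above
raw = re.split(r'\s+', padded_str)
print(raw)      # ['', 'neveropen', 'is', 'best', '']

# Drop the empty strings produced at the boundaries
words = [w for w in raw if w]
print(words)    # ['neveropen', 'is', 'best']
```

Filtering after the split keeps the regex simple; stripping the string before splitting would likely work as well.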

Method 2: Using split()

Called with no arguments, str.split() splits on runs of consecutive whitespace and ignores leading and trailing whitespace, so the result contains no empty strings.

Python3

# Python3 code to demonstrate working of 
# Split Strings ignoring Space characters
# Using split()
 
# initializing string
test_str = 'neveropen\n\r\t\t\nis\t\tbest\r\tfor Lazyroar'
 
# printing original string
print("The original string is : " + str(test_str))
     
# printing result 
print("The split string : " + str(test_str.split()))

Output:

The original string is : neveropen

        
is        best
    for Lazyroar
The split string : ['neveropen', 'is', 'best', 'for', 'Lazyroar']

Time Complexity: O(n)

Auxiliary Space: O(n)
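
The difference between split() with no arguments and split(' ') with an explicit separator is easy to overlook. The following sketch contrasts the two on a hypothetical sample string (an assumption, not part of the original example) that contains a double space and a non-breaking space.

```python
# Hypothetical sample containing a double space and a non-breaking space (U+00A0)
sample = 'neveropen  is\u00a0best'

# No argument: any run of whitespace counts as a single delimiter
default_split = sample.split()
print(default_split)    # ['neveropen', 'is', 'best']

# Explicit ' ' separator: each single space is its own delimiter,
# and other whitespace characters are not treated as delimiters
space_split = sample.split(' ')
print(space_split)      # ['neveropen', '', 'is\xa0best']
```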

Method 3: Using split() with filter()

Use the str.split() method to split the input string into substrings.
Use the filter() function to remove any empty strings from the resulting list of substrings. With no arguments, split() already omits empty strings, so here the filter acts only as a safeguard.
Return the filtered list of substrings.

Python3

# Python program for the above approach
 
# Function to split the string
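def split_string(test_str):
    # split() with no arguments splits on runs of whitespace
    substrings = test_str.split()
    # drop any empty or whitespace-only strings (a safeguard here)
    substrings = list(filter(lambda s: s.strip(), substrings))
    return substrings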
 
# Driver Code
test_str = 'neveropen\n\r\t\t\nis\t\tbest\r\tfor Lazyroar'
print(split_string(test_str))

Output:

['neveropen', 'is', 'best', 'for', 'Lazyroar']

Time Complexity: O(n), where n is the length of the input string. The split() method takes linear time in the length of the string.

Space Complexity: O(n), where n is the length of the input string. The space used by the resulting list of substrings is proportional to the length of the input string.
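
The filter() step matters most when split() is given an explicit separator, because empty strings then appear between consecutive separators. Here is a minimal sketch of that case; the helper name split_on_spaces and the sample string are hypothetical and not part of the original approach.

```python
def split_on_spaces(text):
    # Hypothetical helper: split on single spaces, then drop empty pieces
    pieces = text.split(' ')
    # filter(None, ...) keeps only truthy items, i.e. non-empty strings
    return list(filter(None, pieces))

# Hypothetical sample with runs of two and three spaces
sample = 'neveropen  is   best'
print(sample.split(' '))        # ['neveropen', '', 'is', '', '', 'best']
print(split_on_spaces(sample))  # ['neveropen', 'is', 'best']
```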

Method 4: Using itertools.groupby()

This method uses the itertools module to group contiguous non-whitespace characters and then joins each group into a separate substring.

Steps :

Import the itertools module to work with iterators and grouping functions.
Use the itertools.groupby() function with an isspace() key to group consecutive characters of the input string into whitespace and non-whitespace runs.
Use a list comprehension to join the characters of each non-whitespace group into a substring.
Print the resulting list of substrings.

Python3

import itertools
 
# initializing string
test_str = 'neveropen\n\r\t\t\nis\t\tbest\r\tfor Lazyroar'
 
 
# splitting string using itertools module
result = [''.join(group) for is_space, group in itertools.groupby(test_str, lambda x: x.isspace()) if not is_space]
 
# printing result
print("The split string : " + str(result))

Output:

The split string : ['neveropen', 'is', 'best', 'for', 'Lazyroar']

Time Complexity: O(n), where n is the length of the input string, since itertools.groupby() makes a single pass over it.

Auxiliary Space: O(n), where n is the length of the input string, for the list that stores the resulting substrings.
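
Because groupby() is lazy, the same idea can be written as a generator that yields words one at a time instead of building the whole list up front. The sketch below assumes a hypothetical helper named iter_words and uses str.isspace directly as the key in place of the lambda.

```python
import itertools

def iter_words(text):
    """Yield whitespace-separated words one at a time (hypothetical helper)."""
    # groupby() alternates between runs of whitespace and runs of other characters
    for is_space, group in itertools.groupby(text, key=str.isspace):
        if not is_space:
            yield ''.join(group)

test_str = 'neveropen\n\r\t\t\nis\t\tbest\r\tfor Lazyroar'

# Take only the first two words; the rest of the string is never processed
first_two = list(itertools.islice(iter_words(test_str), 2))
print(first_two)    # ['neveropen', 'is']
```

This form may be useful for very long strings when only the first few words are needed.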
