Sunday, November 17, 2024

Python Regex Metacharacters

Metacharacters are the building blocks of regular expressions. A regular expression is a pattern used to match character combinations in strings. A metacharacter has a special meaning inside a pattern instead of standing for itself, and metacharacters are used to define search criteria and to drive text manipulation.
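As a small illustration of what "special meaning" implies, the sketch below contrasts an unescaped `+` with an escaped one. The sample string is made up for this example.

```python
import re

text = "1+1=2 and 11=11"  # hypothetical sample text

# Unescaped, '+' is a metacharacter: '1+' means "one or more '1' characters".
repeated = re.findall(r"1+", text)
print(repeated)   # ['1', '1', '11', '11']

# Escaped with a backslash, '\+' matches a literal plus sign.
literal = re.findall(r"1\+1", text)
print(literal)    # ['1+1']
```

Prefixing a metacharacter with a backslash is the usual way to match it literally.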

Some of the most commonly used metacharacters, along with their uses, are as follows:

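Metacharacter     Description                                          Example
\d                any single digit (0-9)                               \d matches 7; \d\d matches 77
\w                any word character (letter, digit, or underscore)    \w\w\w\w matches geek; \w\w\w does not match the whole of geek
*                 zero or more of the preceding element                s* matches the empty string, s, ss, sss, ssss, ...
+                 one or more of the preceding element                 s+ matches s, ss, sss, ssss, ...
?                 zero or one of the preceding element                 s? matches the empty string or s
{m}               exactly m of the preceding element                   sd{3} matches sddd
{m,n}             at least m and at most n of the preceding element    sd{2,3} matches sdd or sddd
\W                any non-word character, such as a symbol             \W matches %
[a-z] or [0-9]    character set: any one character from the set        geek[sy] matches geeky but not geeki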
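The examples in the table can be checked directly. The following is a minimal sketch that uses `re.fullmatch`, which succeeds only when the pattern covers the entire string; the `checks` list is just a helper structure for this illustration.

```python
import re

# (pattern, candidate string, whether the whole string should match)
checks = [
    (r"\d", "7", True),
    (r"\d\d", "77", True),
    (r"\w\w\w\w", "geek", True),
    (r"\w\w\w", "geek", False),    # three word characters cannot cover four
    (r"\W", "%", True),
    (r"geek[sy]", "geeky", True),
    (r"geek[sy]", "geeki", False),  # 'i' is not in the set [sy]
]

results = [re.fullmatch(p, s) is not None for p, s, _ in checks]
for (pattern, string, expected), matched in zip(checks, results):
    print(f"{pattern!r:14} vs {string!r:8} -> {matched}")
```

Note that `re.search` would report a match for `\w\w\w` against `geek`, because it accepts a match on part of the string; `re.fullmatch` is used here so that the table's "does not match" rows hold.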

Regular expressions are built from metacharacters, and Python processes them with its built-in regular expression library, "re".

import re  # imports the regular expression module

This built-in library can be used to compile patterns, search for matches, and more.
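A short sketch of compiling a pattern and then searching with it, assuming a made-up sample string:

```python
import re

sample = "xsdddd sd sdd"  # hypothetical sample text

# Compile once, then reuse the resulting pattern object.
pattern = re.compile(r"sd{2,3}")

# search() returns a match object for the first match, or None.
first = pattern.search(sample)
print(first.group(), first.span())   # sddd (1, 5)

# findall() returns every non-overlapping match as a list of strings.
all_matches = pattern.findall(sample)
print(all_matches)                   # ['sddd', 'sdd']
```

Compiling is optional for one-off searches, since module-level functions such as `re.findall(pattern, string)` accept the pattern string directly.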

Example: The code below finds every match of each regular expression in a test string.

Python3

import re
  
  
'''
Meta characters - 
* - 0 or more
+ - 1 or more
? - 0 or 1
{m} - m times
{m,n} - min m and max n
'''
  
test_phrase = 'sddsd..sssddd...sdddsddd...dsds...dsssss...sdddd'
test_patterns = [r'sd*',        # s followed by zero or more d's
                 r'sd+',          # s followed by one or more d's
                 r'sd?',          # s followed by zero or one d's
                 r'sd{3}',        # s followed by three d's
                 r'sd{2,3}',      # s followed by two to three d's
                 ]
  
  
def multi_re_find(test_patterns, test_phrase):
    # Compile each pattern, then print every non-overlapping match in the phrase.
    for pattern in test_patterns:
        compiled_pattern = re.compile(pattern)
        print('finding {} in test_phrase'.format(pattern))
        print(compiled_pattern.findall(test_phrase))
  
  
multi_re_find(test_patterns, test_phrase)


Output:

finding sd* in test_phrase
['sdd', 'sd', 's', 's', 'sddd', 'sddd', 'sddd', 'sd', 's', 's', 's', 's', 's', 's', 'sdddd']
finding sd+ in test_phrase
['sdd', 'sd', 'sddd', 'sddd', 'sddd', 'sd', 'sdddd']
finding sd? in test_phrase
['sd', 'sd', 's', 's', 'sd', 'sd', 'sd', 'sd', 's', 's', 's', 's', 's', 's', 'sd']
finding sd{3} in test_phrase
['sddd', 'sddd', 'sddd', 'sddd']
finding sd{2,3} in test_phrase
['sdd', 'sddd', 'sddd', 'sddd', 'sddd']
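To see where these matches come from, `re.finditer` can report each match's starting position. The sketch below reuses the test phrase with the `sd{3}` pattern; the variable name `positions` is chosen only for this illustration.

```python
import re

test_phrase = 'sddsd..sssddd...sdddsddd...dsds...dsssss...sdddd'

# Record the start index and text of each match of "s followed by three d's".
positions = [(m.start(), m.group()) for m in re.finditer(r'sd{3}', test_phrase)]
print(positions)   # [(9, 'sddd'), (16, 'sddd'), (20, 'sddd'), (43, 'sddd')]
```

The last match starts at index 43, inside the final 'sdddd': the pattern consumes exactly three d's, so the fourth d is left unmatched.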
