Metacharacters are the building blocks of regular expressions. Regular expressions are patterns used to match character combinations in strings. A metacharacter has a special meaning inside a pattern instead of matching itself literally, and metacharacters are used to define search criteria and text manipulations.
Some of the most commonly used metacharacters, along with their uses, are as follows:
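Metacharacter | Description | Example |
---|---|---|
\d | any single digit (0-9) | \d matches 7; \d\d matches 77 |
\w | any single word character (letter, digit, or underscore) | \w\w\w\w matches geek; \w\w\w does not match the whole of geek |
* | 0 or more repetitions of the preceding element | s* matches the empty string, s, ss, sss, ssss, ... |
+ | 1 or more repetitions of the preceding element | s+ matches s, ss, sss, ssss, ... |
? | 0 or 1 repetition of the preceding element | s? matches the empty string or s |
{m} | exactly "m" repetitions of the preceding element | sd{3} matches sddd |
{m,n} | at least "m" and at most "n" repetitions of the preceding element | sd{2,3} matches sdd or sddd |
\W | any single non-word character (anything \w does not match, such as symbols and whitespace) | \W matches % |
[a-z] or [0-9] | character set: any one character from the listed set or range | geek[sy] matches geeks or geeky, but not geeki |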
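The rows of the table can be checked directly in Python. The following is a minimal sketch, assuming that "matches" in the table means the pattern covers the entire string, which is what `re.fullmatch` tests. The `checks` list and the helper `matches_fully` are hypothetical names introduced only for this illustration.

```python
import re

# Each entry: (pattern, candidate string, whether the whole string should match).
# The cases are taken from the table above.
checks = [
    (r"\d", "7", True),
    (r"\d\d", "77", True),
    (r"\w\w\w\w", "geek", True),
    (r"\w\w\w", "geek", False),    # three word characters cannot cover four
    (r"\W", "%", True),
    (r"geek[sy]", "geeky", True),
    (r"geek[sy]", "geeki", False),
    (r"s*", "", True),             # zero repetitions are allowed
    (r"s+", "", False),            # at least one repetition is required
    (r"sd{2,3}", "sddd", True),
]


def matches_fully(pattern, text):
    """Return True when the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


for pattern, text, expected in checks:
    print(f"{pattern!r} against {text!r}: {matches_fully(pattern, text)}")
```

`re.fullmatch` is used here instead of `re.search` because `re.search` would also report a hit when the pattern matches only part of the string, for example \w\w\w inside geek.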
Regular expressions are built from metacharacters, and patterns can be processed in Python with the standard library module for regular expressions, known as "re".
```python
import re  # import the regular expression module
```
This built-in module can be used to compile patterns, search for matches, and so on.
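As a rough illustration of compiling a pattern and then searching with it, the sketch below uses `re.compile` together with the `search`, `findall`, and `sub` methods of the compiled pattern. The sample string `text` is a made-up value chosen only for this example.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "order 66 shipped, order 7 pending"

number = re.compile(r"\d+")         # one or more digits; compiled once, reused below

first = number.search(text)         # first match anywhere in the string
all_numbers = number.findall(text)  # every non-overlapping match, as strings
masked = number.sub("#", text)      # replace each match with "#"

print(first.group())
print(all_numbers)
print(masked)
```

Compiling is optional for a one-off search, since the module-level functions such as `re.findall(pattern, text)` accept a pattern string directly. A compiled pattern is mainly convenient when the same pattern is reused several times.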
Example: In the code below, we find every match in a test phrase for each of the given regular expressions.
```python
import re

'''
Meta characters -
* - 0 or more
+ - 1 or more
? - 0 or 1
{m} - m times
{m,n} - min m and max n
'''

test_phrase = 'sddsd..sssddd...sdddsddd...dsds...dsssss...sdddd'

test_patterns = [
    r'sd*',      # s followed by zero or more d's
    r'sd+',      # s followed by one or more d's
    r'sd?',      # s followed by zero or one d
    r'sd{3}',    # s followed by exactly three d's
    r'sd{2,3}',  # s followed by two to three d's
]


def multi_re_find(test_patterns, test_phrase):
    # Compile each pattern and print every non-overlapping match it finds.
    for pattern in test_patterns:
        compiledPattern = re.compile(pattern)
        print('finding {} in test_phrase'.format(pattern))
        print(re.findall(compiledPattern, test_phrase))


multi_re_find(test_patterns, test_phrase)
```
Output:
```
finding sd* in test_phrase
['sdd', 'sd', 's', 's', 'sddd', 'sddd', 'sddd', 'sd', 's', 's', 's', 's', 's', 's', 'sdddd']
finding sd+ in test_phrase
['sdd', 'sd', 'sddd', 'sddd', 'sddd', 'sd', 'sdddd']
finding sd? in test_phrase
['sd', 'sd', 's', 's', 'sd', 'sd', 'sd', 'sd', 's', 's', 's', 's', 's', 's', 'sd']
finding sd{3} in test_phrase
['sddd', 'sddd', 'sddd', 'sddd']
finding sd{2,3} in test_phrase
['sdd', 'sddd', 'sddd', 'sddd', 'sddd']
```
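Two properties of the matching explain these lists: the quantifiers are greedy (each takes as many d's as it is allowed to), and `re.findall` scans left to right and returns non-overlapping matches. The sketch below makes this visible by using `re.finditer` to record where each match starts and ends. The helper `matches_with_spans` is a hypothetical name introduced for this illustration.

```python
import re

test_phrase = 'sddsd..sssddd...sdddsddd...dsds...dsssss...sdddd'


def matches_with_spans(pattern, text):
    """Return (matched text, start index, end index) for each match."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(pattern, text)]


spans = matches_with_spans(r'sd{2,3}', test_phrase)
for matched, start, end in spans:
    print(matched, start, end)
```

For the trailing run sdddd, the pattern sd{2,3} takes three d's and stops, because {2,3} allows at most three repetitions. The fourth d is left unmatched, since no match can begin with d.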