A lookahead is an assertion in Python regular expressions that succeeds or fails depending on whether a pattern matches immediately ahead of, i.e. to the right of, the parser’s current position. Lookaheads do not consume any characters, which is why they are called zero-width assertions.
Syntax:
# Positive lookahead
(?=<lookahead_regex>)
Example 1:
Python3
# importing regex
import re
# lookahead example
example = re.search(r'Lazyroar(?=[a-z])', "LazyroarforLazyroar")
# display output
print("Pattern:", example.group())
print("Pattern found from index:", example.start(), "to", example.end())
Output:
Pattern: Lazyroar
Pattern found from index: 0 to 8
The lookahead assertion (?=[a-z]) specifies that what follows Lazyroar must be a lowercase alphabetic character. In this case, that character is f, so a match is found.
Example 2:
Python3
# importing regex
import re
# lookahead example
example = re.search(r'Lazyroar(?=[a-z])', "Lazyroar123")
# output
print(example)
Output:
None
In the above example, the output is None because the character after Lazyroar is 1, which is not a lowercase alphabetic character.
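Because re.search returns None when the lookahead fails, calling .group() on the result directly would raise an AttributeError. A minimal sketch of guarding against that is shown here; the helper name find_before_lowercase is hypothetical and used only for illustration.

```python
import re

def find_before_lowercase(text):
    """Return 'Lazyroar' if it is followed by a lowercase letter, else None.

    Hypothetical helper, used only to illustrate the None check.
    """
    match = re.search(r'Lazyroar(?=[a-z])', text)
    # re.search returns None when nothing matches, so check before calling .group()
    return match.group() if match else None

print(find_before_lowercase("LazyroarforLazyroar"))  # Lazyroar
print(find_before_lowercase("Lazyroar123"))          # None
```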
The lookahead portion is not part of the match, which is why it is called a zero-width assertion. Lookaheads are useful when you want to match a pattern only if it is followed by particular text, without including that text in the result. The example below makes this clear.
Example 3:
Python3
# import required module
import re
# using lookahead
example1 = re.search(r'Lazyroar(?=[a-z])', "LazyroarforLazyroar")
print('Using lookahead:', example1.group())
# without using lookahead
example2 = re.search(r'Lazyroar([a-z])', "LazyroarforLazyroar")
print('Without using lookahead:', example2.group())
Output:
Using lookahead: Lazyroar
Without using lookahead: Lazyroarf
With the lookahead, the match is ‘Lazyroar’; without it, the match is ‘Lazyroarf’. In the second case the f is consumed by the regex and becomes part of the match.
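The zero-width behaviour can also be seen in the match positions. The following is a small sketch, assuming the sample string LazyroarforLazyroar (chosen here only for illustration), that compares the span of each match.

```python
import re

text = "LazyroarforLazyroar"  # illustrative sample string

lookahead = re.search(r'Lazyroar(?=[a-z])', text)
consuming = re.search(r'Lazyroar([a-z])', text)

# The lookahead match ends right after 'Lazyroar'; the 'f' is left unconsumed
print(lookahead.span())    # (0, 8)
# The consuming pattern also takes the 'f', so its match is one character longer
print(consuming.span())    # (0, 9)
# The consumed character is available separately through the capturing group
print(consuming.group(1))  # f
```

Since the lookahead leaves the f unconsumed, any pattern placed after the assertion would start matching at that f.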
Negative lookahead is the opposite of positive lookahead. It asserts that the text matched so far is not followed by <lookahead_regex>.
Syntax:
# Negative lookahead
(?!<lookahead_regex>)
Example 4:
Python3
# import required module
import re
# positive lookahead
example1 = re.search(r'Lazyroar(?=[a-z])', 'LazyroarforLazyroar')
print('Positive Lookahead:', example1.group())
# negative lookahead
example2 = re.search(r'Lazyroar(?![a-z])', 'Lazyroar123')
print('Negative Lookahead:', example2.group())
Output:
Positive Lookahead: Lazyroar
Negative Lookahead: Lazyroar
In the above example, the negative lookahead returns Lazyroar because, in the search string Lazyroar123, Lazyroar is followed by 1, which is not a lowercase letter.
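A negative lookahead also succeeds when nothing at all follows the match, since the end of the string is not a lowercase letter. The sketch below, using illustrative sample strings that are not from the original examples, shows where the negative lookahead pattern matches in each case.

```python
import re

negative = re.compile(r'Lazyroar(?![a-z])')

# Illustrative inputs: followed by a letter, by a digit, and by nothing
samples = ["LazyroarforLazyroar", "Lazyroar123", "Lazyroar"]

results = {}
for s in samples:
    m = negative.search(s)
    # Store the (start, end) span of the match, or None if there is no match
    results[s] = m.span() if m else None
    print(s, '->', results[s])

# LazyroarforLazyroar -> (11, 19)
# Lazyroar123 -> (0, 8)
# Lazyroar -> (0, 8)
```

In the first string the occurrence at index 0 is rejected because it is followed by f, so the search moves on and matches the second occurrence, which sits at the end of the string.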