Let us see how to extract IP addresses from a file using Python.
Algorithm :
- Import the re module for regular expressions.
- Open the file using the open() function.
- Read all the lines in the file and store them in a list.
- Declare the pattern for IP addresses. The regex pattern is :
r'(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})'
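- For every element of the list, search for the pattern using the search() function and store each match in a list. search() returns only the first match in a line, and None if the line contains no match, so lines without a match must be skipped.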
- Display the list containing the IP addresses.
The file to be processed is test.txt :
python3
# importing the module
import re

# opening and reading the file
with open('test.txt') as fh:
    lines = fh.readlines()

# declaring the regex pattern for IP addresses
pattern = re.compile(r'(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})')

# initializing the list object
lst = []

# extracting the IP addresses
for line in lines:
    match = pattern.search(line)
    # search() returns None for a line without an IP-like string, so skip it
    if match:
        lst.append(match[0])

# displaying the extracted IP addresses
print(lst)
Output :
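The above Python program extracts any string that looks like an IP address, including out-of-range ones such as 999.999.999.999, because \d{1,3} accepts any group of 1 to 3 digits. We can also restrict the result to valid IP addresses.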
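As a quick check of what the simple pattern accepts, the sketch below runs it on a few made-up lines held in memory. The sample strings are assumptions for illustration, not the contents of test.txt. It uses findall() rather than search(), since search() reports only the first match in a line.

```python
import re

# same simple pattern: four groups of 1-3 digits separated by dots
pattern = re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')

# hypothetical sample lines, for illustration only
sample_lines = [
    "client 192.168.1.10 connected to 10.0.0.1",
    "bad address 999.300.1.1 seen",
    "no address on this line",
]

found = []
for line in sample_lines:
    # findall() returns every match in the line (an empty list if none)
    found.extend(pattern.findall(line))

print(found)
# ['192.168.1.10', '10.0.0.1', '999.300.1.1']
```

The out-of-range string 999.300.1.1 is extracted along with the others, which is why a stricter pattern is needed to keep only valid addresses.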
Rules for a valid IP Address :
- Each number should be in the range 0-255
- The address should consist of 4 numbers separated by ‘.’
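Before turning to a regular expression, the two rules can be checked directly with string operations. This is a minimal sketch, not part of the original program; the helper name is_valid_ipv4 is hypothetical, and limiting each part to at most 3 digits is an assumption.

```python
def is_valid_ipv4(text):
    """Hypothetical helper: apply the two rules to a candidate string."""
    parts = text.split('.')
    # the address must consist of exactly 4 parts separated by '.'
    if len(parts) != 4:
        return False
    for part in parts:
        # each part must be 1 to 3 ASCII digits (rejects '', '-1', ' 1', 'a')
        if not (part.isascii() and part.isdigit() and len(part) <= 3):
            return False
        # each number must be in the range 0-255
        if int(part) > 255:
            return False
    return True


print(is_valid_ipv4("192.168.1.1"))   # True
print(is_valid_ipv4("256.1.1.1"))     # False
print(is_valid_ipv4("1.2.3"))         # False
```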
The regular expression for valid IP addresses is :
((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
Explanation of Regular Expression used for valid IP:
Since a regular expression matches characters rather than numeric values, it cannot express the range 0-255 directly, so we split the range into 3 alternatives:
- 25[0-5] – represents numbers from 250 to 255
- 2[0-4][0-9] – represents numbers from 200 to 249
- [01]?[0-9][0-9]? – represents numbers from 0 to 199 (forms with leading zeros, such as 007, are also accepted)
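The claim that these three alternatives together cover exactly 0 to 255 can be checked by brute force. The sketch below compiles the alternation on its own (the variable name octet is an assumption) and tests every integer from 0 to 999.

```python
import re

# the three alternatives from the expression above, compiled on their own
octet = re.compile(r'25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?')

# fullmatch() succeeds only if the pattern matches the whole string
accepted = [n for n in range(1000) if octet.fullmatch(str(n))]

print(accepted == list(range(256)))     # True
print(bool(octet.fullmatch("007")))     # True: leading zeros are accepted
```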
The file to be processed is test2.txt :
python3
# importing the module
import re

# opening and reading the file
with open('test2.txt') as fh:
    lines = fh.readlines()

# declaring the regex pattern for valid IP addresses
pattern = re.compile(
    r'((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}'
    r'(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
)

# initializing the list objects
valid = []
invalid = []

# classifying each line of the file
for line in lines:
    line = line.rstrip()
    # fullmatch() requires the whole line to be a valid address;
    # search() would accept '256.1.1.1' by matching '56.1.1.1' inside it
    result = pattern.fullmatch(line)
    # valid IP addresses
    if result:
        valid.append(line)
    # invalid IP addresses
    else:
        invalid.append(line)

# displaying the IP addresses
print("Valid IPs")
print(valid)
print("Invalid IPs")
print(invalid)
Output :