Given a text file, read its content line by line and print only those lines that do not start with a defined prefix. Also, store the printed lines in another text file. Python offers several ways to do this.
Program to remove lines starting with any prefix using Python
Below are the methods that we will cover in this article:
- Using loop and startswith() Method
- Using Regex Module
- Using find() Method
Remove lines Starting with a Prefix using the loop and startswith() method
In this method, we read the contents of the file line by line. While reading, we check whether the line begins with the given prefix. If it does, we skip that line; otherwise, we print it and also store it in another text file.
Suppose the text file from which lines should be read is given below:
Example: In the example, we open a file and read its content line by line. We check if that line begins with a given prefix using the startswith() method. If that line begins with “TextGenerator” we skip that line, else we print the line and store it in another file. In this way, we could remove lines starting with the specified prefix.
Python3
# open Lazyroar file in read mode
file1 = open('Lazyroar.txt', 'r')

# defining object file2 to open LazyroarUpdated file in write mode
file2 = open('LazyroarUpdated.txt', 'w')

# reading each line from original text file
for line in file1.readlines():

    # reading all lines that do not begin with "TextGenerator"
    if not (line.startswith('TextGenerator')):

        # printing those lines
        print(line)

        # storing only those lines that
        # do not begin with "TextGenerator"
        file2.write(line)

# close and save the files
file2.close()
file1.close()
Output:
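Since startswith() also accepts a tuple of strings, the same loop can skip lines that begin with any of several prefixes. The following is a minimal sketch under that assumption, not part of the original program: the helper names keep_unprefixed and remove_prefixed_lines and the sample lines are hypothetical, and the files are opened with a with statement so they are closed automatically.

```python
def keep_unprefixed(lines, prefixes):
    """Yield only the lines that do not start with any of the given prefixes."""
    # Accept a single prefix as a plain string as well as several prefixes.
    if isinstance(prefixes, str):
        prefixes = (prefixes,)
    # str.startswith accepts a tuple, so all prefixes are checked in one call.
    prefixes = tuple(prefixes)
    for line in lines:
        if not line.startswith(prefixes):
            yield line


def remove_prefixed_lines(src_path, dst_path, prefixes):
    """Copy src_path to dst_path without the prefixed lines; return kept lines."""
    kept = []
    # The with statement closes both files even if an error occurs.
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        for line in keep_unprefixed(src, prefixes):
            print(line, end="")  # the line already ends with a newline
            dst.write(line)
            kept.append(line)
    return kept


# Hypothetical sample lines, used instead of a real file.
sample = ["TextGenerator is dummy\n", "Hi Hello\n", "Geeks for all\n"]
result = list(keep_unprefixed(sample, ("TextGenerator", "Geeks")))
print(result)
```

With the files from this article, the call would look like remove_prefixed_lines('Lazyroar.txt', 'LazyroarUpdated.txt', 'TextGenerator').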
Remove lines Starting with a Prefix using Regex Module
In this method, we use Python's re module, which supports metacharacters, that is, characters with a special meaning inside a pattern. To detect lines that start with a specified prefix, we use the “^” metacharacter, which anchors the match at the start of the string. We also use the re.findall() function, which returns a list containing all matches; the list is empty when the line does not start with the prefix.
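To see how “^” and findall() behave before applying them to a file, here is a small sketch on a few hypothetical sample strings. Wrapping the prefix in re.escape() is an added precaution, not something the original program does; it keeps a prefix that contains metacharacters such as “.” from being read as a pattern.

```python
import re

prefix = "Geeks"

# "^" anchors the match at the start of the string;
# re.escape treats the prefix as literal text.
pattern = re.compile("^" + re.escape(prefix))

# Hypothetical sample lines for illustration.
lines = ["Geeks for all", "Hello Geeks", "geeks in lowercase"]

for line in lines:
    # findall returns a list of matches; an empty list means "no prefix".
    matches = pattern.findall(line)
    print(repr(line), "->", matches)

# Keep only the lines for which the list of matches is empty.
kept = [line for line in lines if not pattern.findall(line)]
print(kept)
```

Only the first string yields a non-empty list, because the match is case-sensitive and must begin at the first character.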
Example 1: In the example, we open a file and read its content line by line. We check if that line begins with “Geeks” using regular expression. If that line begins with “Geeks” we skip that line, and we print the rest of the lines and store those lines in another file.
Python3
import re

# open Lazyroar file in read mode
file1 = open('Lazyroar.txt', 'r')

# defining object file2 to open
# LazyroarUpdated file in
# write mode
file2 = open('LazyroarUpdated.txt', 'w')

# reading each line from original
# text file
for line in file1.readlines():

    # finding a match only if the line begins
    # with "Geeks"
    x = re.findall("^Geeks", line)

    if not x:

        # printing those lines
        print(line)

        # storing only those lines that
        # do not begin with "Geeks"
        file2.write(line)

# close and save the files
file1.close()
file2.close()
Output:
Remove lines Starting with a Prefix using the find() method
The find() method is a string method in Python that returns the index of the first occurrence of a substring within another string, or -1 if the substring is not found. While find() itself does not remove lines, a result of 0 means that the line begins with the substring, so comparing the result with 0 lets us detect and skip lines that start with the prefix.
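The three possible outcomes of find() can be checked on a few strings before using it on a file. This is a minimal sketch with hypothetical sample lines; it shows why only a result of exactly 0 marks a line that starts with the prefix.

```python
prefix = "TextGenerator"

# Hypothetical sample lines: prefix at the start, in the middle, and absent.
examples = ["TextGenerator is dummy", "Hi TextGenerator", "Hi Hello"]

# find() gives 0 at the start, a positive index elsewhere, and -1 when absent.
positions = {line: line.find(prefix) for line in examples}
for line, index in positions.items():
    print(repr(line), "->", index)

# A line is dropped only when the prefix sits at index 0.
kept = [line for line in examples if line.find(prefix) != 0]
print(kept)
```

Note that a line which merely contains the prefix later on is kept, which matches the behavior of startswith().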
Consider the Lazyroar.txt which has the below content:
TextGenerator is dummy
Hi Hello
After running the code below, the output is Hi Hello
and the file LazyroarUpdated.txt has the following content:
Hi Hello
Python3
# Open Lazyroar file in read mode
file1 = open('Lazyroar.txt', 'r')

# open LazyroarUpdated file in write mode
file2 = open('LazyroarUpdated.txt', 'w')

# reading each line from original text file
for line in file1.readlines():

    # reading all lines that do not begin with "TextGenerator"
    if not (line.find('TextGenerator') == 0):

        # printing those lines
        print(line)

        # storing only those lines that do not begin with "TextGenerator"
        file2.write(line)

# close and save the files
file2.close()
file1.close()
Output :
Hi Hello