Python provides built-in functions for creating, writing, and reading files. It can handle two types of files: normal text files and binary files (stored as raw bytes, i.e. sequences of 0s and 1s).
- Text files: In this type of file, each line of text is terminated with a special character called EOL (End of Line), which in Python is the newline character ('\n') by default.
- Binary files: In this type of file, there is no line terminator, and the data is stored as raw bytes rather than as lines of readable characters.
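The difference between the two modes is easiest to see by reading the same file twice. The sketch below is illustrative only: it writes a small temporary file (the name `sample.txt` and its contents are made up for this example) and reads it back once in text mode and once in binary mode.

```python
import os
import tempfile

# Hypothetical sample file, created in a temporary directory for this demo.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "sample.txt")

# newline="\n" keeps the written line endings identical on every platform.
with open(path, "w", encoding="utf-8", newline="\n") as f:
    f.write("first line\nsecond line\n")

# Text mode ("r"): the contents are decoded to str and can be read line by line.
with open(path, "r", encoding="utf-8") as f:
    text_lines = f.readlines()

# Binary mode ("rb"): the contents come back as raw bytes, with no decoding.
with open(path, "rb") as f:
    raw_bytes = f.read()

print(text_lines)
print(raw_bytes)
```

In text mode each element of `text_lines` is a `str` that still ends with the newline character, while `raw_bytes` is a single `bytes` object.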
Here we work with a .txt file in Python. The program in this article finds the most repeated word in that file.
Approach:
- We will take the content of the file as input.
- We will save each word in a list after removing spaces and punctuation from the input string.
- Find the frequency of each word.
- Print the word with the highest frequency.
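The steps above can be sketched on a plain string before any file is involved. This is a minimal sketch, not the article's implementation: the function name `most_repeated_word` and the sample sentence are made up for illustration, and a dictionary is used for the frequency step.

```python
import string


def most_repeated_word(text):
    """Return (word, frequency) for the most frequent word in text."""
    # Step 2: lowercase, split on whitespace, strip surrounding punctuation.
    words = []
    for token in text.lower().split():
        word = token.strip(string.punctuation)
        if word:
            words.append(word)

    # Step 3: count how often each word occurs.
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1

    # Nothing to report for empty input.
    if not counts:
        return "", 0

    # Step 4: pick the word with the highest count (first one seen wins ties).
    best = max(counts, key=counts.get)
    return best, counts[best]


# Made-up sample text, used only for this demonstration.
sample = "All is well. Well, all is well that ends well!"
print(most_repeated_word(sample))
```

Stripping punctuation from both ends of each token means "well." and "Well," are counted as the same word, "well".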
Input File:
Below is the implementation of the above approach:
Python3
    # Python program to find the most repeated word
    # in a text file
    import string

    frequent_word = ""
    frequency = 0
    words = []

    # The file named "gfg.txt" is opened in reading mode;
    # the with statement closes it automatically.
    with open("gfg.txt", "r") as file:
        # Traversing the file line by line
        for line in file:
            # Lowercase the line, split it on whitespace, and strip
            # punctuation from both ends of each word
            for w in line.lower().split():
                w = w.strip(string.punctuation)
                if w:
                    words.append(w)

    # Finding the most frequent word
    for i in range(len(words)):
        # Start the count at 1 for words[i] itself
        count = 1
        # Count the later occurrences of words[i]
        for j in range(i + 1, len(words)):
            if words[i] == words[j]:
                count += 1
        # If the count is higher than the highest frequency
        # so far, record this word
        if count > frequency:
            frequency = count
            frequent_word = words[i]

    print("Most repeated word: " + frequent_word)
    print("Frequency: " + str(frequency))
Output:
Most repeated word: well
Frequency: 3
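The nested loops in the program compare every word with every later word, so the running time grows quadratically with the number of words. One possible alternative, shown here only as a sketch, counts all words in a single pass with `collections.Counter` from the standard library. The function name `most_common_word` is hypothetical and not part of the original program.

```python
import string
from collections import Counter


def most_common_word(path):
    """Return (word, frequency) for the most frequent word in the file at path."""
    counts = Counter()
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            # Same cleaning as before: lowercase, split, strip punctuation.
            for token in line.lower().split():
                word = token.strip(string.punctuation)
                if word:
                    counts[word] += 1

    # An empty file has no most frequent word.
    if not counts:
        return "", 0

    # most_common(1) returns a one-element list of (word, count) pairs.
    return counts.most_common(1)[0]


# Usage, assuming a file such as the article's gfg.txt exists:
# word, freq = most_common_word("gfg.txt")
```

Because the file is read line by line and only the counts are kept, this version also avoids holding the full word list in memory.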