Given a number of input files in a source directory, write a Python program to read data from all the files and write it to a single master file.
The source directory contains n files, all with the same structure. The objective is to read the files one by one and append their contents to a single master file that has the same structure as the source files.
Taking three input files as an example, named emp_1.txt, emp_2.txt and emp_3.txt, the output will contain the data from all three input files.
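To try either method without real data, the three input files can be generated first. The following is a minimal sketch: the helper name `make_sample_files`, the column names (modelled on the header used in Method #1) and every row value are made-up placeholders, not the contents of the original files.

```python
import tempfile
from pathlib import Path

# Hypothetical sample rows: all names and salaries are invented for illustration.
SAMPLE_ROWS = {
    "emp_1.txt": ["1,Asha,1000", "2,Ben,1500"],
    "emp_2.txt": ["3,Chen,2000"],
    "emp_3.txt": ["4,Dita,2500", "5,Emil,3000"],
}


def make_sample_files(src_dir):
    """Write the sample files, each starting with the same header, into src_dir."""
    src_dir = Path(src_dir)
    src_dir.mkdir(parents=True, exist_ok=True)
    for name, rows in SAMPLE_ROWS.items():
        text = "empno,ename,sal\n" + "\n".join(rows) + "\n"
        (src_dir / name).write_text(text, encoding="utf-8")
    # Return the file names that now exist in the directory, sorted.
    return sorted(p.name for p in src_dir.iterdir())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(make_sample_files(tmp))
```

Because every file carries its own header line, a merge has to keep the header once and skip it in the remaining files.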
Method #1: Using os module
    import os

    # source directory and target (master) file
    file_dir = 'D:\\python\\data_files\\data_files'
    tgt_dir = 'D:\\python\\data_files\\target_file'
    out_file = tgt_dir + '\\master.txt'

    # list the files in the source and target directories
    lis = os.listdir(file_dir)
    print(lis)
    tgt = os.listdir(tgt_dir)
    print('target file :', tgt)

    # check whether the master file exists; if yes, delete it,
    # otherwise data would be appended to the existing file
    if os.path.exists(out_file):
        os.remove(out_file)

    # create the master file and write the header to it
    with open(out_file, 'a') as head:
        head.write('empno, ename, sal\n')

    ct = 0
    # loop over the input files and write their data to the output file
    for line1 in lis:
        f_dir = file_dir + '\\' + line1
        # open the input file in read mode and the output in append mode
        with open(f_dir, 'r') as in_file, open(out_file, 'a') as w:
            in_file.readline()  # skip the header line of the input file
            for line2 in in_file:
                print(line2)
                # make sure every record ends with exactly one newline
                w.write(line2.rstrip('\n') + '\n')
                ct = ct + 1

    print('lines written :', ct)
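The same idea can be written without hard-coded Windows paths. The sketch below is one possible variant, not the listing above: `merge_files`, its parameters and the `emp_*.txt` pattern are hypothetical names chosen for illustration, and it assumes every input file starts with one header line.

```python
from pathlib import Path


def merge_files(src_dir, out_path, pattern="emp_*.txt"):
    """Concatenate matching files into out_path, keeping a single header line.

    Returns the number of data rows written.
    """
    sources = sorted(Path(src_dir).glob(pattern))
    rows_written = 0
    header_written = False
    # Mode "w" truncates an old master file, so no separate delete step is needed.
    with open(out_path, "w", encoding="utf-8") as out:
        for src in sources:
            with open(src, "r", encoding="utf-8") as f:
                header = f.readline()
                # Copy the header from the first non-empty file only.
                if header and not header_written:
                    out.write(header.rstrip("\n") + "\n")
                    header_written = True
                for line in f:
                    if line.strip():  # ignore blank lines
                        out.write(line.rstrip("\n") + "\n")
                        rows_written += 1
    return rows_written
```

Copying the header from the first file, rather than typing it into the program, keeps the master file in step with the inputs if their columns ever change.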
Method #2: Using pandas
    import pandas as pd

    # pd.read_csv creates dataframes
    # (raw strings keep the backslashes in the Windows paths literal)
    df1 = pd.read_csv(r'D:\python\data_files\data_files\emp_1.txt')
    df2 = pd.read_csv(r'D:\python\data_files\data_files\emp_2.txt')
    df3 = pd.read_csv(r'D:\python\data_files\data_files\emp_3.txt')
    frames = [df1, df2, df3]

    # concat function concatenates the frames
    result = pd.concat(frames)

    # to_csv function writes output to file
    result.to_csv(r'D:\python\data_files\target_file\master.txt',
                  encoding='utf-8', index=False)
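Reading three files by name does not scale to n files. As a hedged sketch, assuming all inputs sit in one directory and match a pattern such as `emp_*.txt`, the files can be collected with a glob and concatenated in one step. The function name `merge_with_pandas` and its parameters are illustrative and not part of pandas.

```python
from pathlib import Path

import pandas as pd


def merge_with_pandas(src_dir, out_path, pattern="emp_*.txt"):
    """Read every matching file into a DataFrame and write one combined CSV."""
    files = sorted(Path(src_dir).glob(pattern))
    if not files:
        raise FileNotFoundError(f"no files matching {pattern!r} in {src_dir}")
    # One DataFrame per input file; read_csv takes the first line as the header.
    frames = [pd.read_csv(f) for f in files]
    # ignore_index=True renumbers the rows instead of repeating each file's index.
    result = pd.concat(frames, ignore_index=True)
    # index=False keeps the row numbers out of the master file.
    result.to_csv(out_path, encoding="utf-8", index=False)
    return result
```

Since pandas parses the header of each file, the combined output contains the column names only once without any explicit skipping.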