In this article, we will be dealing with the conversion of Excel (.xlsx) file into .csv. There are two formats mostly used in Excel :
- (*.xlsx) : Excel Microsoft Office Open XML Format Spreadsheet file.
- (*.xls) : Excel Spreadsheet (Excel 97-2003 workbook).
Let’s Consider a dataset of a shopping store having data about Customer Serial Number, Customer Name, Customer ID, and Product Cost stored in Excel file.
check all used files here.
Python3
# importing pandas as pd import pandas as pd # read an excel file and convert # into a dataframe object df = pd.DataFrame(pd.read_excel( "Test.xlsx" )) # show the dataframe df |
Output :
Now, let’s see different ways to convert an Excel file into a CSV file :
Method 1: Convert Excel file to CSV file using the pandas library.
Pandas is an open-source software library built for data manipulation and analysis for Python programming language. It offers various functionality in terms of data structures and operations for manipulating numerical tables and time series. It can read, filter, and re-arrange small and large datasets and output them in a range of formats including Excel, JSON, CSV.
For reading an excel file, using the read_excel() method and convert the data frame into the CSV file, use to_csv() method of pandas.
Code:
Python3
#importing pandas as pd import pandas as pd # Read and store content # of an excel file read_file = pd.read_excel ( "Test.xlsx" ) # Write the dataframe object # into csv file read_file.to_csv ( "Test.csv" , index = None , header = True ) # read csv file and convert # into a dataframe object df = pd.DataFrame(pd.read_csv( "Test.csv" )) # show the dataframe df |
Output:
Method 2: Convert Excel file to CSV file using xlrd and CSV library.
xlrd is a library with the main purpose to read an excel file.
csv is a library with the main purpose to read and write a csv file.
Code:
Python3
# import all required library import xlrd import csv import pandas as pd # open workbook by sheet index, # optional - sheet_by_index() sheet = xlrd.open_workbook( "Test.xlsx" ).sheet_by_index( 0 ) # writer object is created col = csv.writer( open ( "T.csv" , 'w' , newline = "")) # writing the data into csv file for row in range (sheet.nrows): # row by row write # operation is perform col.writerow(sheet.row_values(row)) # read csv file and convert # into a dataframe object df = pd.DataFrame(pd.read_csv( "T.csv" )) # show the dataframe df |
Output:
Method 3: Convert Excel file to CSV file using openpyxl and CSV library.
openpyxl is a library to read/write Excel 2010 xlsx/xlsm/xltx/xltm files.It was born from lack of existing library to read/write natively from Python the Office Open XML format.
Code:
Python3
# importe required libraries import openpyxl import csv import pandas as pd # open given workbook # and store in excel object excel = openpyxl.load_workbook( "Test.xlsx" ) # select the active sheet sheet = excel.active # writer object is created col = csv.writer( open ( "tt.csv" , 'w' , newline = "")) # writing the data in csv file for r in sheet.rows: # row by row write # operation is perform col.writerow([cell.value for cell in r]) # read the csv file and # convert into dataframe object df = pd.DataFrame(pd.read_csv( "tt.csv" )) # show the dataframe df |
Output: