In this article, we will learn how to convert a CSV file to PDF format. The task takes two steps:
- First, we convert the CSV file to HTML using Pandas.
- Second, we use the PDFKit Python API to convert the HTML file to PDF.
Approach:
1. Converting a CSV file to HTML using Pandas
Pandas is a fast, powerful, flexible, and easy-to-use open-source data analysis and manipulation tool, built on top of the Python programming language.
CSV File Used:
For this section of the tutorial we will be using:
- pandas.read_csv(): reads a CSV file into a DataFrame so that we can operate on it. We will use it to read our input CSV file.
- DataFrame.to_html(): renders a DataFrame as an HTML table. When given a file name, it saves the HTML locally; without one, it returns the HTML as a string.
Syntax for converting CSV to HTML using Pandas :
import pandas as pd
CSV = pd.read_csv("MyCSV.csv")
CSV.to_html("MyCSV.html")
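As a slightly fuller sketch of this step, the two calls can be wrapped in a small helper. The function name csv_to_html and the index=False argument (which leaves the DataFrame's row index out of the table) are illustrative choices, not part of the original script.

```python
import pandas as pd


def csv_to_html(csv_path, html_path):
    """Read a CSV file and save it as an HTML table (hypothetical helper)."""
    df = pd.read_csv(csv_path)
    # index=False omits the automatically generated row numbers from the table
    df.to_html(html_path, index=False)
    return df


# Example usage (assumes MyCSV.csv exists in the working directory):
# csv_to_html("MyCSV.csv", "MyCSV.html")
```

Returning the DataFrame is optional; it simply makes it easy to inspect the data that was written.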
HTML File Used: MyCSV
2. Converting HTML file to PDF using PDFKit Python API
There are many approaches for generating PDFs in Python. pdfkit, a Python wrapper around the wkhtmltopdf command-line tool, is one of the better ones, as it renders HTML into PDF, including various image formats, HTML forms, and other complex printable documents.
We can create a PDF document with pdfkit in 3 ways:
- from a URL
- from an HTML file
- from a string
2.1. Generate PDF from URL: The following script gives us the pdf file from a website URL.
import pdfkit
pdfkit.from_url('https://www.geeksforgeeks.org', 'Output.pdf')
2.2. Generate PDF from file: The following script gives us the pdf file from an HTML file.
import pdfkit
pdfkit.from_file('LocalHTMLFile.html', 'Output.pdf')
2.3. Generate PDF from the string: The following script gives us the pdf file from a string.
import pdfkit
pdfkit.from_string('Geeks For Geeks', 'Output.pdf')
Since we have already converted our CSV file to HTML, we will use the first method, generating a PDF from a URL, which accepts either a website's address or the path to a local HTML file.
If wkhtmltopdf is already installed on the machine and available on the PATH, we can use this syntax directly:
Syntax for converting HTML to PDF using PDFKit :
import pdfkit
pdfkit.from_url("MyCSV.html", "FinalOutput.pdf")
Otherwise, we need to install wkhtmltopdf for the script to run. We can add the location of the installed wkhtmltopdf.exe file to the PC's environment variables, in which case we can skip the configuration section in the script.
or
Alternatively, we can create a configuration that points to the installed wkhtmltopdf.exe file and pass the config variable to the pdfkit.from_url function:
Path Configuration
path_wkhtmltopdf = r'D:\Softwares\wkhtmltopdf\bin\wkhtmltopdf.exe'
config = pdfkit.configuration(wkhtmltopdf=path_wkhtmltopdf)
Convert HTML file to PDF with pdfkit
pdfkit.from_url("MyCSV.html", "FinalOutput.pdf", configuration=config)
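To avoid hard-coding the path, one option is to look for wkhtmltopdf on the PATH first and fall back to an explicit location only when needed. The sketch below uses only the standard library; the helper name resolve_wkhtmltopdf and its parameters are hypothetical, and the commented usage assumes the same pdfkit.configuration call used in the path configuration step.

```python
import os
import shutil


def resolve_wkhtmltopdf(fallback=None, which=shutil.which):
    """Return a path to the wkhtmltopdf executable, or None if not found.

    Hypothetical helper: checks the PATH first, then an optional fallback path.
    """
    found = which("wkhtmltopdf")
    if found:
        return found
    if fallback and os.path.isfile(fallback):
        return fallback
    return None


# Example usage (the fallback is the install location used in this article):
# path = resolve_wkhtmltopdf(r"D:\Softwares\wkhtmltopdf\bin\wkhtmltopdf.exe")
# if path is None:
#     raise FileNotFoundError("wkhtmltopdf was not found")
# config = pdfkit.configuration(wkhtmltopdf=path)
```

The which parameter exists only so the lookup can be replaced when experimenting; in normal use the default shutil.which is enough.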
Implementation:
Initial files in the folder
Python
import pandas as pd
import pdfkit

# SAVE CSV TO HTML USING PANDAS
csv_file = 'MyCSV.csv'
html_file = csv_file[:-3] + 'html'
df = pd.read_csv(csv_file, sep=',')
df.to_html(html_file)

# INSTALL wkhtmltopdf AND SET PATH IN CONFIGURATION
# These two steps can be skipped by installing wkhtmltopdf
# and adding its path to the environment variables
path_wkhtmltopdf = r'D:\Softwares\wkhtmltopdf\bin\wkhtmltopdf.exe'
config = pdfkit.configuration(wkhtmltopdf=path_wkhtmltopdf)

# CONVERT HTML FILE TO PDF WITH PDFKIT
pdfkit.from_url(html_file, "FinalOutput.pdf", configuration=config)
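The script derives the HTML file name by slicing the last three characters off the CSV name, which only works for a three-letter extension. As one possible alternative, pathlib can swap the suffix regardless of its length. The helper name html_name_for is hypothetical.

```python
from pathlib import Path


def html_name_for(csv_file):
    """Return the given file name with its extension replaced by .html."""
    return str(Path(csv_file).with_suffix(".html"))


# Example usage:
# html_file = html_name_for("MyCSV.csv")
```

Only the final extension is replaced, so a name with extra dots keeps everything before the last one.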
After running the above Python script:
Final Output :