Python is well suited to converting files from one format to another, and it has libraries that make the task straightforward. In this article, we’ll see how to convert an Excel file to an Extensible Markup Language (XML) file with Python.
Modules needed
- OpenPyXL helps in interacting with Excel files. It can read and write to .xlsx and .xlsm files and can be installed as:
pip install openpyxl
- Yattag is a Python library for generating HTML or XML documents in a readable way. It is simple to use and a good fit if you want to generate HTML or XML documents more easily. It can be installed as:
pip install yattag
Functions needed
- To load the contents of the Excel file, the load_workbook() function of OpenPyXL is used.
- To iterate through the loaded sheet and read its data, iter_rows() is used with appropriate arguments.
Syntax: iter_rows(min_row=None, max_row=None, min_col=None, max_col=None, values_only=False)
Parameters:
- min_col (int) – smallest column value (1-based index)
- min_row (int) – smallest row value (1-based index)
- max_col (int) – largest column value (1-based index)
- max_row (int) – largest row value (1-based index)
- values_only (bool) – whether only cell values should be returned
- The tagtext() method is a helper method that returns a triplet composed of:
- The Doc instance itself
- The tag method of the Doc instance
- The text method of the Doc instance
- The asis method appends a string to the document without any form of escaping.
- The tag method will accept any string as a tag name.
- The indent function takes a string representing an XML or HTML document and returns a well-indented version of this document.
Database in use: demo_database.xlsx
To convert Excel data to XML, the data first needs to be read. The program below shows how to read it.
Approach
- Import module
- Load Excel file
- Create sheet object
- Iterate through rows
Example
Python3
# Import load_workbook from the openpyxl library
from openpyxl import load_workbook

# Load our Excel file
wb = load_workbook("demo_database.xlsx")

# Create the sheet object for the first worksheet
ws = wb.worksheets[0]

# Iterate over the rows to get the values of each row
for row in ws.iter_rows(min_row=1, max_row=2, min_col=1, max_col=6):
    print([cell.value for cell in row])
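The values_only parameter described earlier can replace the manual cell.value lookup. The following is a small sketch, assuming an in-memory workbook with made-up sample rows so that it runs without demo_database.xlsx; the column names and values are illustrative only.

```python
from openpyxl import Workbook

# Build a small in-memory workbook so the sketch needs no file on disk
# (the rows below are hypothetical sample data)
wb = Workbook()
ws = wb.active
ws.append(["First_Name", "Last_Name", "Age"])  # header row
ws.append(["Dulce", "Abril", 32])
ws.append(["Mara", "Hashimoto", 25])

# values_only=True yields tuples of cell values instead of Cell objects;
# min_row=2 skips the header row
rows = list(
    ws.iter_rows(min_row=2, max_row=3, min_col=1, max_col=3, values_only=True)
)

for row in rows:
    print(row)
```

With values_only=True each row is already a tuple of plain values, so no list comprehension over cell.value is needed.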
Now that we are done reading the data, let’s write the code that converts Excel to XML format.
Approach:
- Import module
- Read data
- Create XML format page
- Append to file
- Save file
Example:
Python3
from openpyxl import load_workbook
from yattag import Doc, indent

# Load our Excel file
wb = load_workbook("demo_database.xlsx")

# Get the first worksheet
ws = wb.worksheets[0]

# tagtext() returns a triplet: the Doc instance and its tag and text methods
doc, tag, text = Doc().tagtext()

xml_header = '<?xml version="1.0" encoding="UTF-8"?>'

# Append the XML declaration to the document without escaping
doc.asis(xml_header)

with tag("People"):
    # Read rows 2 to 10 (row 1 is assumed to hold the column headers)
    for row in ws.iter_rows(min_row=2, max_row=10, min_col=1, max_col=6):
        # Convert every value to a string; empty cells become ""
        row = ["" if cell.value is None else str(cell.value) for cell in row]
        with tag("Person"):
            with tag("First_Name"):
                text(row[0])
            with tag("Last_Name"):
                text(row[1])
            with tag("Gender"):
                text(row[2])
            with tag("Country"):
                text(row[3])
            with tag("Age"):
                text(row[4])
            with tag("Date"):
                text(row[5])

# Produce a well-indented version of the document
result = indent(doc.getvalue(), indentation=' ', indent_text=True)

# Save the XML to a file, using the encoding named in the XML declaration
with open("output.xml", "w", encoding="utf-8") as f:
    f.write(result)
Output: output.xml
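One way to check the result is to parse it back with Python's standard xml.etree.ElementTree module. This is only a sketch: the sample string below is a hypothetical stand-in shaped like the People/Person structure built above, and for the real file you would call ET.parse("output.xml").getroot() instead of ET.fromstring().

```python
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like the generated document; with
# indent_text=True the text of each element sits on its own line
sample_xml = """<People>
 <Person>
  <First_Name>
   Dulce
  </First_Name>
  <Age>
   32
  </Age>
 </Person>
</People>"""

# Parsing raises ParseError if the document is not well-formed
root = ET.fromstring(sample_xml)

# Collect each Person as a dict, stripping the whitespace added by indentation
people = [
    {child.tag: (child.text or "").strip() for child in person}
    for person in root.findall("Person")
]
print(people)
```

All values come back as strings, so numeric fields such as Age need an explicit int() conversion if they are used as numbers.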