In this article, we will see how to parse an XML file in Python and count the nodes whose attribute has a particular value.
What is XML?
Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. It is a markup language like HTML, but it is designed to store and transport data. Here, we will use Python's built-in XML modules to parse an XML file and then count the instances of a node. We use the ElementTree XML API and the minidom API to parse our XML file.
The sample XML data used in the examples is given below:
Save it as country_data.xml in the same directory as the Python scripts.
XML
<?xml version="1.0"?>
<data>
    <country name="France">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Germany" direction="E"/>
        <neighbor name="Spain" direction="N"/>
    </country>
    <country name="Poland">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Germany" direction="W"/>
    </country>
    <country name="Italy">
        <rank>68</rank>
        <year>2015</year>
        <gdppc>13600</gdppc>
        <neighbor name="France" direction="N"/>
    </country>
</data>
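To get a feel for the structure of this data before counting anything, the short sketch below walks the tree and lists the neighbor names under each country. It is only an illustration: SAMPLE_XML is a trimmed, inline copy of the data above (rank, year and gdppc are left out), and the variable names are chosen here, not taken from the examples that follow.

```python
import xml.etree.ElementTree as ET

# Trimmed inline copy of country_data.xml (assumption: only the
# country and neighbor elements matter for this illustration).
SAMPLE_XML = """<data>
    <country name="France">
        <neighbor name="Germany" direction="E"/>
        <neighbor name="Spain" direction="N"/>
    </country>
    <country name="Poland">
        <neighbor name="Germany" direction="W"/>
    </country>
    <country name="Italy">
        <neighbor name="France" direction="N"/>
    </country>
</data>"""

# fromstring() parses XML text and returns the root element (<data>).
root = ET.fromstring(SAMPLE_XML)

# Map each country's name attribute to the name attributes of its neighbors.
neighbors_by_country = {
    country.get("name"): [n.get("name") for n in country.findall("neighbor")]
    for country in root.findall("country")
}

for country, neighbors in neighbors_by_country.items():
    print(country, "->", neighbors)
```

Each neighbor element carries its data in attributes (name and direction), which is why the examples read values with attribute lookups and not from element text.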
Example 1:
In this example, we use the xml.etree.ElementTree module to parse the XML file and store the result in the tree variable. We then collect all neighbor nodes with the findall() method of this module, iterate over the returned list, and increment a counter by 1 each time a node's name attribute matches the value we are looking for.
Python3
# Import the ElementTree module
import xml.etree.ElementTree as ET

# Attribute value to look for: <neighbor> nodes whose name is "France"
name_attribute = "France"

# Parse the XML file and get its root element
tree = ET.parse('country_data.xml')
root = tree.getroot()

# Count the matching nodes among those returned by findall()
node_count = 0
for instance in root.findall('country/neighbor'):
    # Check whether the name attribute matches the target value
    if instance.get('name') == name_attribute:
        node_count += 1

# Print the number of matching nodes
print("total instance of given node attribute is :", node_count)
Output:
total instance of given node attribute is : 1
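As a possible shortcut, ElementTree's limited XPath support accepts an attribute predicate of the form [@attrib='value'], so the filtering can be done inside findall() itself. The sketch below assumes a trimmed inline copy of the data (SAMPLE_XML) in place of the file, and the names target and count are illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed inline copy of country_data.xml (assumption for a self-contained run).
SAMPLE_XML = """<data>
    <country name="France">
        <neighbor name="Germany" direction="E"/>
        <neighbor name="Spain" direction="N"/>
    </country>
    <country name="Poland">
        <neighbor name="Germany" direction="W"/>
    </country>
    <country name="Italy">
        <neighbor name="France" direction="N"/>
    </country>
</data>"""

root = ET.fromstring(SAMPLE_XML)

# Attribute value to look for.
target = "France"

# The predicate [@name='...'] keeps only <neighbor> elements whose
# name attribute equals the target, so no explicit loop is needed.
matches = root.findall(f"country/neighbor[@name='{target}']")
count = len(matches)

print("total instance of given node attribute is :", count)
```

Building the path with an f-string is fine for simple values like these, but a value that itself contains a quote character would break the expression, in which case the explicit loop from the example is the safer choice.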
Example 2:
In this example, we parse the XML file with the minidom module and assign the result to the doc variable. The getElementsByTagName() method returns a list of all nodes with the given tag name. We then iterate over that list and increment a counter by 1 each time a node's name attribute matches the value we are looking for.
Python3
# Import the minidom module
from xml.dom import minidom

# Attribute value to look for: <neighbor> nodes whose name is "Germany"
name_attribute = "Germany"

# Parse the XML file and collect every <neighbor> node
doc = minidom.parse('country_data.xml')
neighbors = doc.getElementsByTagName('neighbor')

# Count the nodes whose name attribute matches the target value
number_attributes = 0
for node in neighbors:
    if node.attributes['name'].value == name_attribute:
        number_attributes += 1

# Print the number of matching nodes
print("Total instance of Particular node attribute is :", number_attributes)
Output:
Total instance of Particular node attribute is : 2
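If the counts for every attribute value are needed at once, not only for one name, the same minidom traversal can feed a collections.Counter. This is a minimal sketch under the assumption that a trimmed inline copy of the data (SAMPLE_XML) stands in for the file; name_counts is an illustrative name.

```python
from collections import Counter
from xml.dom import minidom

# Trimmed inline copy of country_data.xml (assumption for a self-contained run).
SAMPLE_XML = """<data>
    <country name="France">
        <neighbor name="Germany" direction="E"/>
        <neighbor name="Spain" direction="N"/>
    </country>
    <country name="Poland">
        <neighbor name="Germany" direction="W"/>
    </country>
    <country name="Italy">
        <neighbor name="France" direction="N"/>
    </country>
</data>"""

# parseString() builds a DOM document from XML text.
doc = minidom.parseString(SAMPLE_XML)

# Tally the name attribute of every <neighbor> node in one pass.
name_counts = Counter(
    node.getAttribute("name") for node in doc.getElementsByTagName("neighbor")
)

for name, total in name_counts.items():
    print(name, ":", total)
```

Looking up name_counts["Germany"] then gives the same result as Example 2, and a name that never occurs simply yields 0.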