DOM (Document Object Model) is a cross-language API from the W3C (World Wide Web Consortium) for accessing and modifying XML documents. Python lets you parse XML files with xml.dom.minidom, a minimal implementation of the DOM interface. Its API is simpler than the full DOM, and the implementation is significantly smaller.
The steps for parsing XML are:
- Import the module
import xml.dom.minidom
Suppose your XML file, test.xml, has an info root element containing four skills elements, each with a name attribute.
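A file consistent with the outputs shown in this article could be produced with the sketch below. The exact layout is an assumption: only the info root, the skills tag, and the four name values are taken from the outputs, and write_sample is a hypothetical helper name.

```python
import xml.dom.minidom

# Hypothetical contents for test.xml. Only the info root, the skills tag,
# and the four name values come from the article's outputs; the layout is assumed.
SAMPLE_XML = """<?xml version="1.0"?>
<info>
    <skills name="Machine learning"/>
    <skills name="Deep learning"/>
    <skills name="Python"/>
    <skills name="Bootstrap"/>
</info>
"""

def write_sample(path="test.xml"):
    """Write the sample document to the given path."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(SAMPLE_XML)

# Parse the string once to confirm that the sample is well-formed XML.
sample_doc = xml.dom.minidom.parseString(SAMPLE_XML)

if __name__ == "__main__":
    write_sample()
    print(sample_doc.getElementsByTagName("skills").length, "skills written")
```

Running the script once in the working directory creates test.xml, so the later steps can load it by name.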
- Use the parse function to load and parse the XML file. In the example below, docs stores the Document object that parse returns.
docs = xml.dom.minidom.parse("test.xml")
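Besides a filename, parse also accepts an open file object, and parseString handles XML that is already held in memory. A small sketch of both, using a made-up inline document rather than test.xml:

```python
import io
import xml.dom.minidom

# Made-up XML used only for illustration.
xml_text = '<info><skills name="Python"/></info>'

# parseString takes the document as a string (or bytes).
doc_from_string = xml.dom.minidom.parseString(xml_text)

# parse takes a filename or any file-like object opened for reading.
doc_from_file = xml.dom.minidom.parse(io.BytesIO(xml_text.encode("utf-8")))

print(doc_from_string.documentElement.tagName)
print(doc_from_file.documentElement.tagName)
```

Both calls return the same kind of Document object, so the rest of the steps apply regardless of where the XML came from.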
- Print the nodeName of the document and the tagName of its first child.
Python3
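import xml.dom.minidom

# Load and parse the XML file into a Document object
docs = xml.dom.minidom.parse("test.xml")

# The document node itself, then the tag name of its first child
print(docs.nodeName)
print(docs.firstChild.tagName)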
Output:
#document
info
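Note that firstChild is the root element only when nothing else, such as a comment, precedes it in the file. The documentElement attribute always refers to the root element. A short sketch with a made-up document that starts with a comment:

```python
import xml.dom.minidom

# Made-up document with a comment placed before the root element.
xml_text = "<!-- generated file --><info><skills name='Python'/></info>"
docs = xml.dom.minidom.parseString(xml_text)

first_name = docs.firstChild.nodeName    # the comment node, not the root
root_tag = docs.documentElement.tagName  # the root element

print(first_name)  # prints "#comment"
print(root_tag)    # prints "info"
```

In this situation docs.firstChild.tagName would raise an AttributeError, because comment nodes have no tagName.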
- To read information from elements with a given tag name, call the standard DOM method getElementsByTagName, then use getAttribute to fetch the required attributes.
Python3
import xml.dom.minidom

docs = xml.dom.minidom.parse("test.xml")
print(docs.nodeName)
print(docs.firstChild.tagName)

# Collect every skills element in the document
skills = docs.getElementsByTagName("skills")
print("%d skills" % skills.length)

# Print the name attribute of each skills element
for skill in skills:
    print(skill.getAttribute("name"))
Output:
#document
info
4 skills
Machine learning
Deep learning
Python
Bootstrap
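In minidom, getAttribute returns an empty string when the attribute is missing, so it can help to check hasAttribute first. The sketch below gathers the values into a list; attribute_values is a hypothetical helper name and the inline XML is a made-up sample, not the article's test.xml.

```python
import xml.dom.minidom

def attribute_values(doc, tag, attr):
    """Return the attr value of every <tag> element that defines it."""
    return [
        element.getAttribute(attr)
        for element in doc.getElementsByTagName(tag)
        if element.hasAttribute(attr)
    ]

# Made-up sample: the last skills element has no name attribute.
xml_text = """<info>
    <skills name="Machine learning"/>
    <skills name="Deep learning"/>
    <skills/>
</info>"""

doc = xml.dom.minidom.parseString(xml_text)
names = attribute_values(doc, "skills", "name")
print(names)
```

The element without a name attribute is skipped rather than contributing an empty string to the result.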