XML documents are used to store and transport data. To read that data, we need a way to address each node and its attributes, and XPath provides it. XPath expressions traverse an XML document and select nodes (elements), attributes, and their values. XPath is a W3C Recommendation and a flexible way to reach different parts of an XML document. Writing an XPath expression is similar to writing a file-system path that leads to a specific location, such as C:/School/Homework/assignment.docx.
Consider the following XML document:
XML

    <?xml version="1.0" encoding="UTF-8"?>
    <students>
        <student branch="CSE">
            <name>Divyank Singh Sikarwar</name>
            <age>18</age>
            <city>Agra</city>
        </student>
        <student branch="CSE">
            <name>Aniket Chauhan</name>
            <age>20</age>
            <city>Shahjahanpur</city>
        </student>
        <student branch="CSE">
            <name>Simran Agarwal</name>
            <age>23</age>
            <city>Buland Shar</city>
        </student>
        <student branch="CSE">
            <name>Abhay Chauhan</name>
            <age>17</age>
            <city>Shahjahanpur</city>
        </student>
        <student branch="IT">
            <name>Himanshu Bhatia</name>
            <age>25</age>
            <city>Indore</city>
        </student>
        <student branch="IT">
            <name>Anuj Modi</name>
            <age>22</age>
            <city>Ahemdabad</city>
        </student>
        <student branch="ECE">
            <name>Manoj Yadav</name>
            <age>23</age>
            <city>Kota</city>
        </student>
    </students>

XPath symbols that are used to access different parts of an XML document:
Symbol | Description | Example | Result |
---|---|---|---|
name | Selects all elements named ‘name’ at that step of the path | /students/student/name | Selects every student’s name |
/ | At the start of an expression, represents the root of the document; between steps, separates a parent from its child | /students/student/city | Selects each student’s city |
// | Selects matching nodes anywhere in the document, regardless of depth | //age | Selects all ages |
@ | Selects an attribute of an element | /students/student/@branch | Selects each student’s branch |
[ ] | Predicate that filters nodes by position or by a condition | /students/student[2]/name | Selects the 2nd student’s name, Aniket Chauhan |
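The symbols in the table can be tried out with a few lines of Java. The following is a minimal sketch, not a definitive implementation: the class name `XPathSymbolsSketch`, the helper `select`, and the inline three-student copy of the document are assumptions made for illustration, so that the example runs without a file on disk.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSymbolsSketch {

    // Assumption: a shortened inline copy of the student document
    // (three of the seven students), used instead of a file.
    static final String XML =
        "<students>"
        + "<student branch='CSE'><name>Divyank Singh Sikarwar</name>"
        + "<age>18</age><city>Agra</city></student>"
        + "<student branch='CSE'><name>Aniket Chauhan</name>"
        + "<age>20</age><city>Shahjahanpur</city></student>"
        + "<student branch='IT'><name>Himanshu Bhatia</name>"
        + "<age>25</age><city>Indore</city></student>"
        + "</students>";

    // Hypothetical helper: evaluates the expression against XML and
    // returns the text of every selected element or attribute.
    static List<String> select(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression,
                new InputSource(new StringReader(XML)),
                XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("/students/student/name"));    // element name and /
        System.out.println(select("//age"));                     // [18, 20, 25]
        System.out.println(select("/students/student/@branch")); // [CSE, CSE, IT]
        System.out.println(select("/students/student[2]/name")); // [Aniket Chauhan]
    }
}
```

Returning plain strings keeps the sketch short; the helper works for attributes as well as elements because `getTextContent()` on an attribute node returns the attribute value.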
Let’s Practice XPath
Consider the XML document above:
Select the 2nd student’s name
/students/student[2]/name
Select the names of all students with branch IT
/students/student[@branch = "IT"]/name
Select the names of all students whose age is less than or equal to 20
/students/student[age <= 20]/name
Select the names of the first 4 students
/students/student[position() <= 4]/name
Java Code to Evaluate XPath Expression
Java

    import java.io.File;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.NodeList;

    public class XPathDemo {

        public static void main(String[] args) throws Exception {
            File xmlFile = new File("student.xml");

            // Get DOM
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            Document xml = db.parse(xmlFile);
            xml.getDocumentElement().normalize();

            // Get XPath
            XPathFactory xpf = XPathFactory.newInstance();
            XPath xpath = xpf.newXPath();

            // Find the 2nd student's name
            String name = (String) xpath.evaluate(
                "/students/student[2]/name", xml, XPathConstants.STRING);
            System.out.println("2nd Student Name: " + name);

            // Find the names of students whose branch is IT
            NodeList nodes = (NodeList) xpath.evaluate(
                "/students/student[@branch = \"IT\"]/name", xml, XPathConstants.NODESET);
            System.out.println("\nStudents with branch IT:");
            printNodes(nodes);

            // Find the names of students whose age is less than or equal to 20
            nodes = (NodeList) xpath.evaluate(
                "/students/student[age <= 20]/name", xml, XPathConstants.NODESET);
            System.out.println("\nStudents of age less than equal to 20:");
            printNodes(nodes);

            // First 4 students from the XML document
            nodes = (NodeList) xpath.evaluate(
                "/students/student[position() < 5]/name", xml, XPathConstants.NODESET);
            System.out.println("\nFirst Four Students: ");
            printNodes(nodes);
        }

        // Prints the text content of each node, numbered from 1
        public static void printNodes(NodeList nodes) {
            for (int i = 0; i < nodes.getLength(); i++) {
                System.out.println((i + 1) + ". " + nodes.item(i).getTextContent());
            }
        }
    }

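Output (with the XML document above saved as student.xml):

    2nd Student Name: Aniket Chauhan

    Students with branch IT:
    1. Himanshu Bhatia
    2. Anuj Modi

    Students of age less than equal to 20:
    1. Divyank Singh Sikarwar
    2. Aniket Chauhan
    3. Abhay Chauhan

    First Four Students: 
    1. Divyank Singh Sikarwar
    2. Aniket Chauhan
    3. Simran Agarwal
    4. Abhay Chauhan
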
Explanation of classes and methods used in the above code:
- The javax.xml.parsers.DocumentBuilder class defines the API for obtaining DOM Document instances from an XML document.
- The parse() method parses the content of the given file as an XML document and returns a new DOM Document object.
- The normalize() method puts the tree below the node it is called on into normal form: adjacent text nodes are merged and empty text nodes are removed.
- The javax.xml.xpath.XPathFactory class creates XPath objects. Their evaluate() method evaluates an XPath expression and returns the result in the type requested by the last argument, for example XPathConstants.STRING, XPathConstants.NODE, or XPathConstants.NODESET (see the evaluate() calls in the code).
- position() is an XPath function that returns the position of the node currently being tested within the nodes selected by that step (in the code above, each ‘student’ element among the selected students). XPath provides many other useful functions that are worth exploring.
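To illustrate the last two points, here is a small sketch of a few other standard XPath 1.0 functions (count(), last(), sum(), boolean()) together with the return types that evaluate() can produce. The class name `XPathFunctionsSketch`, the helper `evaluate`, and the inline three-student copy of the document are assumptions made for illustration only.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathFunctionsSketch {

    // Assumption: a shortened inline copy of the student document
    // (three of the seven students), used instead of student.xml.
    static final String XML =
        "<students>"
        + "<student branch='CSE'><name>Divyank Singh Sikarwar</name>"
        + "<age>18</age><city>Agra</city></student>"
        + "<student branch='CSE'><name>Aniket Chauhan</name>"
        + "<age>20</age><city>Shahjahanpur</city></student>"
        + "<student branch='IT'><name>Himanshu Bhatia</name>"
        + "<age>25</age><city>Indore</city></student>"
        + "</students>";

    // Hypothetical helper: evaluates the expression against XML and returns
    // the result in the Java type that corresponds to returnType.
    static Object evaluate(String expression, QName returnType) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(XML)), returnType);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // XPathConstants.NUMBER yields a java.lang.Double
        Double total = (Double) evaluate(
            "count(/students/student)", XPathConstants.NUMBER);
        Double ageSum = (Double) evaluate("sum(//age)", XPathConstants.NUMBER);

        // last() is the counterpart of position(): it matches the final node
        String lastName = (String) evaluate(
            "/students/student[last()]/name", XPathConstants.STRING);

        // XPathConstants.BOOLEAN yields a java.lang.Boolean
        Boolean anyUnder18 = (Boolean) evaluate(
            "boolean(/students/student[age < 18])", XPathConstants.BOOLEAN);

        System.out.println("Students: " + total);          // Students: 3.0
        System.out.println("Sum of ages: " + ageSum);      // Sum of ages: 63.0
        System.out.println("Last student: " + lastName);   // Last student: Himanshu Bhatia
        System.out.println("Any under 18: " + anyUnder18); // Any under 18: false
    }
}
```

Note that XPath numbers are always floating point, so count() comes back as 3.0 rather than 3; cast with intValue() if an integer is needed.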