XML - XPath Functions and Operators (Advanced Level)
XPath is a powerful query language used to navigate and extract data from XML documents. While basic XPath focuses on selecting nodes using simple paths, the advanced use of XPath relies heavily on functions, operators, axes, and predicates to perform complex queries and transformations. Understanding these elements allows precise data retrieval, filtering, and manipulation.
1. XPath Functions Overview
XPath provides a rich set of built-in functions grouped into categories such as node, string, numeric, boolean, and date/time (in XPath 2.0+). These functions help process and evaluate XML data efficiently.
Node Functions
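These functions report on nodes and their positions within the current node set:
- last() returns the context size, which is also the position of the last node in the node set.
- position() returns the position of the context node within that set, counting from 1.
- count(node-set) returns the number of nodes in the node set.
- name() returns the qualified name of a node, including any namespace prefix.
- local-name() returns the local part of a node name, without the namespace prefix.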
Example:
/bookstore/book[last()]
Selects the last book element.
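The node functions above can be tried out with a small script. This is a minimal sketch, assuming the third-party lxml library is installed; the sample catalogue, the ed namespace, and the variable names are invented for illustration.

```python
from lxml import etree  # third-party package, assumed installed (pip install lxml)

# Hypothetical catalogue, invented for this example.
XML = """
<bookstore xmlns:ed="http://example.com/editorial">
  <book><title>Learning XML</title></book>
  <book><title>XPath in Depth</title></book>
  <book><title>Data Formats</title></book>
  <ed:note>Spring catalogue</ed:note>
</bookstore>
"""
root = etree.fromstring(XML)

# last(): the predicate keeps the book whose position equals the context size.
last_title = root.xpath("/bookstore/book[last()]/title/text()")

# count() returns an XPath number, which lxml hands back as a Python float.
book_count = root.xpath("count(/bookstore/book)")

# position(): pick the second book explicitly.
second_title = root.xpath("/bookstore/book[position() = 2]/title/text()")

# name() keeps the namespace prefix, local-name() drops it.
qualified = root.xpath("name(/bookstore/*[last()])")
local = root.xpath("local-name(/bookstore/*[last()])")

print(last_title, book_count, second_title, qualified, local)
```

Note that count() yields a floating-point number because XPath 1.0 has a single numeric type.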
2. String Functions
String functions manipulate text within XML nodes:
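- contains(string1, string2) checks whether string1 contains string2.
- starts-with(string1, string2) checks whether string1 begins with string2.
- substring(string, start, length) extracts part of a string; positions count from 1 and length is optional.
- string-length() returns the number of characters in a string.
- normalize-space() strips leading and trailing whitespace and collapses internal runs of whitespace to a single space.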
Example:
//book[contains(title, 'XML')]
Selects books whose title contains the substring “XML”. The match is case-sensitive.
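The string functions can be checked the same way. The following sketch assumes the third-party lxml library; the titles, including the deliberately untidy whitespace in the first one, are invented for illustration.

```python
from lxml import etree  # third-party package, assumed installed

# Hypothetical data; the first title has stray whitespace on purpose.
XML = """
<bookstore>
  <book><title>  Learning   XML </title></book>
  <book><title>XPath in Depth</title></book>
  <book><title>XML and Web Services</title></book>
</bookstore>
"""
root = etree.fromstring(XML)

# contains(): substring test, so two of the three titles qualify.
with_xml = root.xpath("count(//book[contains(title, 'XML')])")

# starts-with(): the first title begins with spaces, so only one title matches.
starts = root.xpath("//book[starts-with(title, 'XML')]/title/text()")

# substring(): positions are 1-based; take five characters from position 1.
prefix = root.xpath("substring(//book[2]/title, 1, 5)")

# string-length(): number of characters in the second title.
length = root.xpath("string-length(//book[2]/title)")

# normalize-space(): trims the ends and collapses inner whitespace runs.
clean = root.xpath("normalize-space(//book[1]/title)")

print(with_xml, starts, prefix, length, clean)
```

Wrapping a value in normalize-space() before calling starts-with() or contains() is a common way to make such tests robust against indentation in the source document.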
3. Numeric Functions
These functions handle numerical operations:
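- sum(node-set) adds up the numeric values of the nodes in a node set.
- round(), floor(), ceiling() round a number to the nearest integer, down, or up, respectively.
- number() converts a value to a number, yielding NaN when the value is not numeric.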
Example:
sum(//book/price)
Calculates the total price of all books.
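As a rough illustration of the numeric functions, the sketch below again assumes the third-party lxml library. The prices are invented and chosen so that the sum is not a whole number.

```python
import math
from lxml import etree  # third-party package, assumed installed

# Hypothetical prices, invented for this example.
XML = """
<bookstore>
  <book><price>450.50</price></book>
  <book><price>600</price></book>
  <book><price>299.25</price></book>
</bookstore>
"""
root = etree.fromstring(XML)

# sum() converts each price node to a number and adds them up.
total = root.xpath("sum(//book/price)")

# The three rounding functions applied to the same total.
rounded = root.xpath("round(sum(//book/price))")
floored = root.xpath("floor(sum(//book/price))")
ceiled = root.xpath("ceiling(sum(//book/price))")

# number() converts strings; non-numeric input becomes NaN rather than an error.
converted = root.xpath("number('42')")
not_a_number = root.xpath("number('n/a')")

print(total, rounded, floored, ceiled, converted, math.isnan(not_a_number))
```

Because a single non-numeric price turns the whole sum into NaN, it can be worth filtering with a predicate such as price[number(.) = number(.)] before summing.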
4. Boolean Functions
Boolean functions evaluate conditions:
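- true() and false() return the boolean constants.
- not() negates a condition.
- boolean() converts a value to a boolean: an empty node set, an empty string, zero, and NaN become false; everything else becomes true.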
Example:
//book[not(@available)]
Selects books without the "available" attribute.
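The boolean functions can be sketched as follows, assuming the third-party lxml library. The document and the magazine element name used for the empty-node-set test are invented for illustration.

```python
from lxml import etree  # third-party package, assumed installed

# Hypothetical data: one book has no "available" attribute at all.
XML = """
<bookstore>
  <book available="yes"><title>Learning XML</title></book>
  <book><title>XPath in Depth</title></book>
  <book available="no"><title>Data Formats</title></book>
</bookstore>
"""
root = etree.fromstring(XML)

# not(@available) tests for the attribute's absence, not for its value,
# so the book marked available="no" is NOT selected.
no_attribute = root.xpath("//book[not(@available)]/title/text()")

# To test the value instead, compare it explicitly.
marked_no = root.xpath("//book[@available = 'no']/title/text()")

# boolean(): a non-empty node set is true, an empty one is false.
has_books = root.xpath("boolean(//book)")
has_magazines = root.xpath("boolean(//magazine)")

# An empty string converts to false.
empty_string = root.xpath("boolean('')")

print(no_attribute, marked_no, has_books, has_magazines, empty_string)
```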
5. XPath Operators
Operators are used to compare values and combine conditions.
Comparison Operators
- = equal
- != not equal
- <, >, <=, >= less than, greater than, less than or equal, greater than or equal
Example:
//book[price > 500]
Selects books priced above 500.
Logical Operators
- and
- or
Example:
//book[price > 300 and price < 700]
Selects books priced above 300 and below 700.
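To see the comparison and logical operators at work, here is a short sketch that assumes the third-party lxml library. The catalogue, its categories, and its prices are invented for illustration.

```python
from lxml import etree  # third-party package, assumed installed

# Hypothetical catalogue, invented for this example.
XML = """
<bookstore>
  <book category="fiction"><title>Learning XML</title><price>250</price></book>
  <book category="reference"><title>XPath in Depth</title><price>450</price></book>
  <book category="reference"><title>Data Formats</title><price>600</price></book>
  <book category="fiction"><title>Web Services</title><price>750</price></book>
</bookstore>
"""
root = etree.fromstring(XML)

def titles(expression):
    """Return the titles of the books selected by an XPath expression."""
    return root.xpath(expression + "/title/text()")

# Comparison: the price text is converted to a number before comparing.
expensive = titles("//book[price > 500]")

# and: both conditions must hold.
mid_range = titles("//book[price > 300 and price < 700]")

# or: at least one condition must hold.
extremes = titles("//book[price <= 250 or price >= 750]")

# != on an attribute value.
non_fiction = titles("//book[@category != 'fiction']")

print(expensive, mid_range, extremes, non_fiction)
```

When an expression like these is embedded in an XML file such as an XSLT stylesheet, the < character has to be written as &lt;; in a Python string it can be used directly.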
Arithmetic Operators
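- +, -, *, div, mod
Division is written div because / is the path separator, and a minus sign needs surrounding whitespace because hyphens are legal in XML names.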
Example:
//book[price * 2 > 1000]
Selects books whose doubled price exceeds 1000, that is, books priced above 500.
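The arithmetic operators can be exercised in the same fashion. This sketch assumes the third-party lxml library, and the prices are invented so that the average is easy to verify by hand.

```python
from lxml import etree  # third-party package, assumed installed

# Hypothetical catalogue, invented for this example.
XML = """
<bookstore>
  <book><title>Learning XML</title><price>250</price></book>
  <book><title>XPath in Depth</title><price>450</price></book>
  <book><title>Data Formats</title><price>600</price></book>
  <book><title>Web Services</title><price>750</price></book>
</bookstore>
"""
root = etree.fromstring(XML)

# Multiplication inside a predicate.
doubled = root.xpath("//book[price * 2 > 1000]/title/text()")

# div is the division operator; here it computes the average price.
average = root.xpath("sum(//book/price) div count(//book)")

# mod gives the remainder; this keeps the books at even positions.
even = root.xpath("//book[position() mod 2 = 0]/title/text()")

# Subtraction needs spaces around the minus sign.
cheap_after_discount = root.xpath("//book[price - 100 < 200]/title/text()")

print(doubled, average, even, cheap_after_discount)
```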
6. Axes in XPath
Axes define relationships between nodes and are essential for advanced navigation.
Common axes include:
- child:: selects children
- parent:: selects the parent
- ancestor:: selects all ancestors
- descendant:: selects all descendants
- following-sibling:: selects all later siblings
- preceding-sibling:: selects all earlier siblings
Example:
//book/ancestor::library
Selects every library element that is an ancestor of a book element.
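The axes are easiest to understand when evaluated from a specific context node. The sketch below assumes the third-party lxml library and uses an invented document in which a library element wraps the bookstore; the second book serves as the starting point.

```python
from lxml import etree  # third-party package, assumed installed

# Hypothetical document, invented for this example.
XML = """
<library>
  <bookstore>
    <book><title>Learning XML</title></book>
    <book><title>XPath in Depth</title></book>
    <book><title>Data Formats</title></book>
  </bookstore>
</library>
"""
root = etree.fromstring(XML)

# ancestor:: from every book; the result is a set, so library appears once.
libraries = root.xpath("//book/ancestor::library")

# Use the second book as the context node for the remaining axes.
second = root.xpath("//book[2]")[0]

parent_tag = second.xpath("parent::*")[0].tag
ancestor_tags = [node.tag for node in second.xpath("ancestor::*")]
earlier = second.xpath("preceding-sibling::book/title/text()")
later = second.xpath("following-sibling::book/title/text()")
child_tags = [node.tag for node in second.xpath("child::*")]

# descendant:: from the root reaches titles at any depth.
all_titles = root.xpath("descendant::title/text()")

print(parent_tag, ancestor_tags, earlier, later, child_tags, all_titles)
```

lxml returns node sets in document order, which is why the ancestors are listed from the outermost element inward.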
7. Predicates for Filtering
Predicates use square brackets to filter nodes based on conditions.
Examples:
//book[1]
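Selects every book that is the first book child of its parent. To select only the first book in the whole document, use (//book)[1].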
//book[@category='fiction']
Filters books by category.
//book[price > 500][author='John']
Applies the predicates from left to right: first books priced above 500, then, among those, the ones whose author is John.
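The difference between //book[1] and (//book)[1] only shows up when books are spread over several parents. The following sketch assumes the third-party lxml library; the shelf elements and all book data are invented for illustration.

```python
from lxml import etree  # third-party package, assumed installed

# Hypothetical library with two shelves, invented for this example.
XML = """
<library>
  <shelf>
    <book category="fiction"><title>Learning XML</title><author>John</author><price>600</price></book>
    <book category="reference"><title>XPath in Depth</title><author>Maria</author><price>700</price></book>
  </shelf>
  <shelf>
    <book category="fiction"><title>Data Formats</title><author>John</author><price>300</price></book>
    <book category="fiction"><title>Web Services</title><author>Lee</author><price>800</price></book>
  </shelf>
</library>
"""
root = etree.fromstring(XML)

# [1] applies per parent: one "first book" for each shelf.
first_per_parent = root.xpath("//book[1]/title/text()")

# Parentheses build the whole node set first, then take its first member.
first_overall = root.xpath("(//book)[1]/title/text()")

# Attribute predicate.
fiction = root.xpath("//book[@category='fiction']/title/text()")

# Chained predicates: price filter first, then author filter.
pricey_john = root.xpath("//book[price > 500][author='John']/title/text()")

print(first_per_parent, first_overall, fiction, pricey_john)
```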
8. Combining Functions and Predicates
Advanced XPath often combines multiple techniques for precise selection.
Example:
//book[contains(title, 'XML') and price < 500]
Selects books whose title contains “XML” and whose price is below 500.
Example with position:
//book[position() <= 3]
Selects the first three book children of each parent element.
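When positional and value-based predicates are combined, their order matters. The sketch below assumes the third-party lxml library and an invented catalogue, and it compares two expressions that differ only in predicate order.

```python
from lxml import etree  # third-party package, assumed installed

# Hypothetical catalogue, invented for this example.
XML = """
<bookstore>
  <book><title>XML Basics</title><price>700</price></book>
  <book><title>Learning XML</title><price>450</price></book>
  <book><title>XPath in Depth</title><price>400</price></book>
  <book><title>XML Schema</title><price>300</price></book>
</bookstore>
"""
root = etree.fromstring(XML)

# A function and a comparison joined with "and" in one predicate.
cheap_xml = root.xpath("//book[contains(title, 'XML') and price < 500]/title/text()")

# position() restricts the selection to the first three books of the parent.
first_three = root.xpath("//book[position() <= 3]/title/text()")

# Filter first, then take position 1 among the books that survived the filter.
first_cheap = root.xpath("//book[price < 500][1]/title/text()")

# Take position 1 first, then test it: the first book costs 700, so nothing is left.
first_if_cheap = root.xpath("//book[1][price < 500]/title/text()")

print(cheap_xml, first_three, first_cheap, first_if_cheap)
```

Each predicate renumbers the nodes that reach it, so a positional test placed after a filter counts only the nodes that passed the filter.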
9. XPath 2.0 Enhancements (Advanced Insight)
In XPath 2.0 and later:
- Support for sequences instead of node sets
- Additional functions like matches(), replace(), tokenize()
- Stronger data typing (string, integer, date)
Example:
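//book[matches(title, '^XML')]
Selects books whose title begins with “XML”; matches() tests the title against a regular expression.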
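The third-party lxml library implements XPath 1.0 only, so matches() itself is not available there. As a stand-in, and not as an XPath 2.0 implementation, the sketch below uses the EXSLT regular-expression extension function re:test, which lxml provides under the http://exslt.org/regular-expressions namespace. The sample titles are invented for illustration.

```python
from lxml import etree  # third-party package, assumed installed

# Hypothetical catalogue, invented for this example.
XML = """
<bookstore>
  <book><title>XML Basics</title></book>
  <book><title>Learning XML</title></book>
  <book><title>XPath in Depth</title></book>
</bookstore>
"""
root = etree.fromstring(XML)

# EXSLT regular expressions: an extension to XPath 1.0, not part of XPath 2.0.
EXSLT_RE = {"re": "http://exslt.org/regular-expressions"}

# Rough equivalent of //book[matches(title, '^XML')].
regex_prefix = root.xpath(
    "//book[re:test(title, '^XML')]/title/text()", namespaces=EXSLT_RE
)

# The optional third argument carries flags; 'i' makes the match case-insensitive.
ends_with_xml = root.xpath(
    "//book[re:test(title, 'xml$', 'i')]/title/text()", namespaces=EXSLT_RE
)

# For a plain prefix test, XPath 1.0 starts-with() gives the same result.
plain_prefix = root.xpath("//book[starts-with(title, 'XML')]/title/text()")

print(regex_prefix, ends_with_xml, plain_prefix)
```

A processor with native XPath 2.0 or later support would evaluate the matches() expression directly, without the extension namespace.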
Conclusion
Advanced XPath functions and operators transform simple queries into powerful data extraction tools. By combining functions, axes, predicates, and operators, users can navigate complex XML structures with precision. Mastery of these concepts is essential for working with XML in real-world applications such as data integration, web services, and document processing systems.