Parsing an XML Document with XPath
The getter methods in the org.w3c.dom
package API are commonly used to parse an XML document. J2SE 5.0 also provides the javax.xml.xpath package to parse an XML document with the XML Path Language (XPath). The JDOM org.jdom.xpath.XPath class also has methods to select XML document nodes with an XPath expression, which consists of a location path to an XML document node or a list of nodes.
Parsing an XML document with an XPath expression takes less code than using the getter methods, because with an XPath expression an Element
node may be selected without iterating over a node list. Node lists retrieved with the getter methods have to be iterated over to retrieve the value of element nodes. For example, the second article
node in the journal
node in the example XML document in this tutorial (listed in the Overview section below) may be retrieved with the XPath expression:
Element article=(Element)
(xPath.evaluate("/catalog/journal/article[2]",
inputSource,
XPathConstants.NODE));
In the code snippet, xPath
is a javax.xml.xpath.XPath
object, and inputSource
is an InputSource
object for an XML document. With the org.w3c.dom
package getter methods, the second article
node in the journal
node is retrieved with the code snippet:
Document document;
NodeList nodeList=document.getElementsByTagName("journal");
Element journal=(Element)(nodeList.item(0));
NodeList nodeList2=journal.getElementsByTagName("article");
Element article=(Element)nodeList2.item(1);
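The two approaches above can be compared side by side in a small runnable sketch. The inline XML string is a simplified stand-in for catalog.xml (an assumption: one journal, no namespace prefix, and a made-up date and level for the second article), and the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SecondArticleDemo {
    // Stand-in for catalog.xml: an assumption made for this sketch only.
    static final String XML =
        "<catalog><journal title='Java Technology' publisher='IBM developerWorks'>"
        + "<article level='Advanced' date='January-2004'>"
        + "<title>Design service-oriented architecture frameworks with J2EE technology</title>"
        + "</article>"
        + "<article level='Intermediate' date='October-2003'>"
        + "<title>Advance DAO Programming</title>"
        + "</article>"
        + "</journal></catalog>";

    /** Parses the stand-in document into a DOM tree. */
    static Document parse() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the sample XML", e);
        }
    }

    /** Selects the second article with one XPath expression. */
    static Element secondArticleWithXPath(Document document) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            return (Element) xPath.evaluate(
                    "/catalog/journal/article[2]", document, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    /** Selects the same article by walking node lists with the DOM getters. */
    static Element secondArticleWithGetters(Document document) {
        NodeList journals = document.getElementsByTagName("journal");
        Element journal = (Element) journals.item(0);
        NodeList articles = journal.getElementsByTagName("article");
        return (Element) articles.item(1); // item() is zero-based, XPath [2] is one-based
    }

    public static void main(String[] args) {
        Document document = parse();
        System.out.println(secondArticleWithXPath(document).getAttribute("date"));
        System.out.println(secondArticleWithGetters(document).getAttribute("date"));
    }
}
```

Note the indexing difference: the XPath predicate [2] counts from one, while NodeList.item(1) counts from zero.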
Also, with an XPath expression, an Attribute
node may be selected directly, in comparison to the getter methods, in which an Element
node is required to be evaluated before an Attribute
node is evaluated. For example, the value of the level
attribute for the article
node with the date
January-2004
is retrieved with an XPath expression:
String level =
xPath.evaluate("/catalog/journal/article[@date='January-2004']/@level",
inputSource);
By comparison, the org.w3c.dom
package makes you retrieve the org.w3c.dom.Element
object for the article
, and then get its level
attribute with:
String level=article.getAttribute("level");
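As a rough illustration of the difference, the following sketch reads the level attribute both ways. The inline XML is a simplified stand-in for catalog.xml (an assumption, including the second article's date and level), and the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LevelAttributeDemo {
    // Stand-in for catalog.xml: an assumption made for this sketch only.
    static final String XML =
        "<catalog><journal title='Java Technology'>"
        + "<article level='Advanced' date='January-2004'><title>A</title></article>"
        + "<article level='Intermediate' date='October-2003'><title>B</title></article>"
        + "</journal></catalog>";

    /** Reads the attribute value directly with one XPath expression. */
    static String levelWithXPath() {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.evaluate(
                    "/catalog/journal/article[@date='January-2004']/@level",
                    new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    /** Finds the article element first, then asks it for the attribute. */
    static String levelWithGetters() {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            NodeList articles = document.getElementsByTagName("article");
            for (int i = 0; i < articles.getLength(); i++) {
                Element article = (Element) articles.item(i);
                if ("January-2004".equals(article.getAttribute("date"))) {
                    return article.getAttribute("level");
                }
            }
            return ""; // no article with that date
        } catch (Exception e) {
            throw new IllegalStateException("DOM traversal failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(levelWithXPath());
        System.out.println(levelWithGetters());
    }
}
```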
In this tutorial, an example XML document is parsed with J2SE 5.0's XPath
class and JDOM's XPath
class. XML document nodes are selected with XPath expressions. Depending on the XPath expression evaluated, the nodes selected are either org.w3c.dom.Element
nodes or org.w3c.dom.Attr
nodes. The example XML document, catalog.xml, is listed below:
Design XML Schemas Using UML
Ayesha Malik
Design service-oriented architecture frameworks with J2EE technology
Naveen Balani
Advance DAO Programming
Sean Sullivan
The example XML document has a namespace declaration, xmlns:journal="
, for elements in the journal
prefix namespace.
This article is structured into the following sections:
- Preliminary Setup
- Parsing with the JDK 5.0 XPath Class
- Parsing with the JDOM XPath Class
Preliminary Setup
To use J2SE 5.0's XPath support, the javax.xml.xpath
package needs to be in the CLASSPATH
. Install the J2SE 5.0 SDK. To parse an XML document with the JDK 5.0 XPath class, add the
jre/lib/rt.jar file, found under the directory in which JDK 5.0 is installed, to the CLASSPATH
variable if it is not already there.
The org.apache.xpath.NodeSet
class is required in the CLASSPATH
. Install Xalan-Java; extract xalan-j-current-bin.jar to a directory. Add
bin/xalan.jar, relative to the directory in which Xalan-Java is installed, to the CLASSPATH
.
To parse an XML document with the JDOM XPath
class, the JDOM API classes need to be in the CLASSPATH
. Install JDOM; extract the jdom-b9.zip file to an installation directory. Add
jdom-b9/build/jdom.jar,
jdom-b9/lib/saxpath.jar,
jdom-b9/lib/jaxen-core.jar,
jdom-b9/lib/jaxen-jdom.jar, and
jdom-b9/lib/xerces.jar to the CLASSPATH
variable, where each path is relative to the directory in which JDOM is installed.
Parsing with the JDK 5.0 XPath Class
The javax.xml.xpath
package in J2SE 5.0 has classes and interfaces to parse an XML document with XPath. Some of the classes and interfaces in JDK 5.0 are listed in the following table:
Class/Interface | Description |
XPath (interface) | Provides access to the XPath evaluation environment. Provides the evaluate methods to evaluate XPath expressions in an XML document. |
XPathExpression (interface) | Provides the evaluate methods to evaluate compiled XPath expressions in an XML document. |
XPathFactory (class) | Used to create an XPath object. |
In this section, the example XML document is evaluated with the javax.xml.xpath.XPath
class. First, import the javax.xml.xpath
package.
import javax.xml.xpath.*;
The evaluate
methods in the XPath
and XPathExpression
interfaces are used to parse an XML document with XPath expressions. The XPathFactory
class is used to create an XPath
object. Create an XPathFactory
object with the static newInstance
method of the XPathFactory
class.
XPathFactory factory=XPathFactory.newInstance();
Create an XPath
object from the XPathFactory
object with the newXPath
method.
XPath xPath=factory.newXPath();
Create and compile an XPath expression with the compile
method of the XPath
object. As an example, select the title
of the article with its date
attribute set to January-2004
. An attribute in an XPath expression is specified with an @
symbol. For further reference on XPath expressions, see the for examples on creating an XPath expression.
XPathExpression xPathExpression=
xPath.compile("/catalog/journal/article[@date='January-2004']/title");
Create an InputSource
for the example XML document. An InputSource
is an input source for an XML entity. The evaluate
method of the XPathExpression
interface evaluates either an InputSource
or a node/node list of the types org.w3c.dom.Node
, org.w3c.dom.NodeList
, or org.w3c.dom.Document
.
InputSource inputSource =
new InputSource(new
FileInputStream(xmlDocument));
xmlDocument
is the java.io.File object
of the example XML document.
File xmlDocument =
new File("c:/catalog/catalog.xml");
Evaluate the XPath expression with the InputSource
of the example XML document as its input.
String title =
xPathExpression.evaluate(inputSource);
The result of the XPath expression evaluation is the title: Design service-oriented architecture frameworks with J2EE technology.
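The steps above can be put together in a short sketch. To stay self-contained it reads the XML from strings through a StringReader instead of c:/catalog/catalog.xml, and it evaluates one compiled expression against two documents to show why compiling is useful. Both XML strings, the second title, and the class and method names are assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class CompiledExpressionDemo {
    // Stand-ins for catalog.xml: assumptions made for this sketch only.
    static final String CATALOG_A =
        "<catalog><journal title='Java Technology'>"
        + "<article level='Advanced' date='January-2004'>"
        + "<title>Design service-oriented architecture frameworks with J2EE technology</title>"
        + "</article></journal></catalog>";
    static final String CATALOG_B =
        "<catalog><journal title='Hypothetical Journal'>"
        + "<article level='Basic' date='January-2004'>"
        + "<title>Hypothetical title</title>"
        + "</article></journal></catalog>";

    /** Creates the factory and XPath object, then compiles the expression once. */
    static XPathExpression compileTitleQuery() {
        try {
            XPathFactory factory = XPathFactory.newInstance();
            XPath xPath = factory.newXPath();
            return xPath.compile("/catalog/journal/article[@date='January-2004']/title");
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("expression did not compile", e);
        }
    }

    /** Evaluates the compiled expression against one document; returns the string value. */
    static String titleIn(XPathExpression expression, String xml) {
        try {
            // A fresh InputSource is needed for every evaluation.
            return expression.evaluate(new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        XPathExpression titleQuery = compileTitleQuery();
        System.out.println(titleIn(titleQuery, CATALOG_A));
        System.out.println(titleIn(titleQuery, CATALOG_B));
    }
}
```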
The XPath
object can also evaluate an XPath expression against an XML document directly, without first compiling the expression. An evaluation reads the underlying stream to its end, so create a new InputSource
.
inputSource =
new InputSource(new FileInputStream(xmlDocument));
As an example, evaluate the value of the publisher
attribute of the journal
element.
String publisher =
xPath.evaluate("/catalog/journal/@publisher", inputSource);
The result of the XPath
object evaluation is the attribute value: IBM developerWorks
. The evaluate
method in the XPath
class may also be used to evaluate a node set. For example, select the node or set of nodes that correspond to the article
element nodes in the XML document. Create the XPath expression that represents a node set.
String expression="/catalog/journal/article";
Select the node set of article
element nodes in the example XML document with the evaluate
method of the XPath
object.
NodeList nodeList =
(NodeList) xPath.evaluate(expression,
inputSource, XPathConstants.NODESET);
XPathConstants.NODESET
specifies the return type of the evaluate
method as a node set, which the API returns as an org.w3c.dom.NodeList
. The return type may also be set to NODE
, STRING
, BOOLEAN
, or NUMBER
. Cast the result to NodeList
rather than to an implementation class such as org.apache.xpath.NodeSet
, because the concrete class returned depends on the XPath implementation in use.
Thus, nodes in an XML document are selected and evaluated without chaining the getter methods of the org.w3c.dom
API. The example program XPathEvaluator.java
is used to parse an XML document with the JDK 5.0 XPath
class.
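The following sketch illustrates node-set evaluation and the other return types. It parses the document once into an org.w3c.dom.Document and evaluates several expressions against that tree, which avoids creating a new InputSource for each evaluation. The inline XML is a simplified stand-in for catalog.xml (an assumption, including the second article's date and level), and the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeSetDemo {
    // Stand-in for catalog.xml: an assumption made for this sketch only.
    static final String XML =
        "<catalog><journal title='Java Technology' publisher='IBM developerWorks'>"
        + "<article level='Advanced' date='January-2004'>"
        + "<title>Design service-oriented architecture frameworks with J2EE technology</title>"
        + "</article>"
        + "<article level='Intermediate' date='October-2003'>"
        + "<title>Advance DAO Programming</title>"
        + "</article>"
        + "</journal></catalog>";

    static Document parse() {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the sample XML", e);
        }
    }

    /** NODESET: select every article, then read each title relative to it. */
    static List<String> articleTitles() {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList articles = (NodeList) xPath.evaluate(
                    "/catalog/journal/article", parse(), XPathConstants.NODESET);
            List<String> titles = new ArrayList<String>();
            for (int i = 0; i < articles.getLength(); i++) {
                // The article node is the context for the relative expression "title".
                titles.add(xPath.evaluate("title", articles.item(i)));
            }
            return titles;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    /** NUMBER: the result object is a java.lang.Double. */
    static int articleCount() {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            Double count = (Double) xPath.evaluate(
                    "count(/catalog/journal/article)", parse(), XPathConstants.NUMBER);
            return count.intValue();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    /** BOOLEAN: the result object is a java.lang.Boolean. */
    static boolean hasAdvancedArticle() {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            return (Boolean) xPath.evaluate(
                    "boolean(//article[@level='Advanced'])", parse(), XPathConstants.BOOLEAN);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(articleTitles());
        System.out.println(articleCount());
        System.out.println(hasAdvancedArticle());
    }
}
```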
Parsing with the JDOM XPath Class
The JDOM API XPath
class supports XPath expressions to select nodes from an XML document. Some of the methods in the JDOM XPath
class are illustrated in the following table:
XPath Class Method | Description |
selectSingleNode | Used to select a single node that matches an XPath expression. |
selectNodes | Used to select a list of nodes that match an XPath expression. |
addNamespace | Used to add a namespace to match an XPath expression with namespace prefixes. |
In this section, the procedure to select nodes from the example XML document catalog.xml with the JDOM XPath
class is discussed. The nodes selected by the select
methods are modified, and the modified document is output to an XML document. First, import the JDOM org.jdom.xpath
package classes.
import org.jdom.input.SAXBuilder; // needed for the SAXBuilder created below
import org.jdom.xpath.*;
Create a SAXBuilder
.
SAXBuilder saxBuilder =
new SAXBuilder("org.apache.xerces.parsers.SAXParser");
Parse the XML document catalog.xml with the SAXBuilder
.
org.jdom.Document jdomDocument =
saxBuilder.build(xmlDocument);
xmlDocument
is the java.io.File
representation of the XML document catalog.xml. The static method selectSingleNode(java.lang.Object context, String XPathExpression)
selects a single node specified by an XPath expression. If more than one node matches the XPath expression, the first node that matches is selected. Select the attribute node level
of an element article
in a journal
with title
set to Java Technology
, and with article
attribute date
set to January-2004
, with an XPath expression.
org.jdom.Attribute levelNode =
(org.jdom.Attribute)(XPath.selectSingleNode(
jdomDocument,
"/catalog//journal[@title='Java Technology']" +
"//article[@date='January-2004']/@level"));
The level
attribute value Advanced
gets selected. Modify the level
node.
levelNode.setValue("Intermediate");
The selectSingleNode
method may also be used to select an element node in an XML document. As an example, select a title
node. Select the title
node with an XPath expression.
org.jdom.Element titleNode =
(org.jdom.Element) XPath.selectSingleNode( jdomDocument,
"/catalog//journal//article[@date='January-2004']/title");
The title
node with value Design service-oriented architecture frameworks with J2EE technology
gets selected. Modify the title
node.
titleNode.setText(
"Service Oriented Architecture Frameworks");
The static method selectNodes(java.lang.Object context, String XPathExpression)
selects all of the nodes specified by an XPath expression. Select all of the article
nodes for the journal
with a title
set to Java Technology
.
java.util.List nodeList =
XPath.selectNodes(jdomDocument,
"/catalog//journal[@title='Java Technology']//article");
Modify the article
nodes. Add an attribute to the article
nodes.
Iterator iter=nodeList.iterator();
while(iter.hasNext()) {
org.jdom.Element element =
(org.jdom.Element) iter.next();
element.setAttribute("section", "Java Technology");
}
The JDOM XPath
class supports selection of nodes with namespace prefixes. To select a node with a namespace, add a namespace to an XPath:
XPath xpath =
XPath.newInstance(
"/catalog//journal:journal//article/@journal:level");
xpath.addNamespace("journal",
"
);
A namespace with the prefix journal
gets added to the XPath
object. Select a node with a namespace prefix:
levelNode = (org.jdom.Attribute)
xpath.selectSingleNode(jdomDocument);
The attribute node journal:level
gets selected. Modify the journal:level
node.
levelNode.setValue("Advanced");
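For comparison, the JDK XPath interface from the previous section resolves prefixes through a javax.xml.namespace.NamespaceContext instead of an addNamespace method. The sketch below is a JDK counterpart to the JDOM snippet, not JDOM code. The namespace URI urn:example:journal is a hypothetical placeholder (it is not the URI declared in catalog.xml), and the inline XML, the Intermediate value, and the class and method names are likewise assumptions.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class NamespaceDemo {
    // Hypothetical namespace URI, used only for this sketch.
    static final String JOURNAL_NS = "urn:example:journal";

    // Stand-in for catalog.xml: an assumption made for this sketch only.
    static final String XML =
        "<catalog xmlns:journal='" + JOURNAL_NS + "'>"
        + "<journal:journal title='XML'>"
        + "<article journal:level='Intermediate'>"
        + "<title>Design XML Schemas Using UML</title>"
        + "</article>"
        + "</journal:journal></catalog>";

    /** Maps the prefix "journal" to JOURNAL_NS for the XPath engine. */
    static NamespaceContext journalContext() {
        return new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "journal".equals(prefix) ? JOURNAL_NS : XMLConstants.NULL_NS_URI;
            }

            @Override
            public String getPrefix(String uri) {
                return JOURNAL_NS.equals(uri) ? "journal" : null;
            }

            @Override
            public Iterator<String> getPrefixes(String uri) {
                String prefix = getPrefix(uri);
                return prefix == null
                        ? Collections.<String>emptyList().iterator()
                        : Collections.singletonList(prefix).iterator();
            }
        };
    }

    /** Selects the journal:level attribute value with a prefixed expression. */
    static String journalLevel() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for prefixed names to match
            Document document = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(journalContext());
            return xPath.evaluate(
                    "/catalog//journal:journal//article/@journal:level", document);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(journalLevel());
    }
}
```

The prefix used in the expression only has to match the NamespaceContext mapping; what the engine compares against the document is the namespace URI.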
The Java program JDomParser.java is used to select nodes from the catalog.xml XML document. In this section, the procedure to select nodes from an XML document with the JDOM XPath
class select
methods was explained. The selected nodes are modified, and the modified document is output to an XML document with the XMLOutputter
class. catalog-modified.xml is the output XML document.
Conclusion
In this tutorial, an XML document was parsed with XPath. XPath is used only to select nodes; the XPath APIs discussed in this tutorial have no provision for setting the values of XML document nodes. To set values for nodes, use the setter methods of the node classes themselves, such as those in the org.w3c.dom
package or the JDOM Attribute and Element classes used in the previous section.
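A minimal sketch of that division of labor with the JDK APIs: XPath selects the nodes, the org.w3c.dom setters change them, and a javax.xml.transform.Transformer serializes the result. The inline XML is a simplified stand-in for catalog.xml (an assumption), and the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class ModifySelectedNodeDemo {
    // Stand-in for catalog.xml: an assumption made for this sketch only.
    static final String XML =
        "<catalog><journal title='Java Technology'>"
        + "<article level='Advanced' date='January-2004'>"
        + "<title>Design service-oriented architecture frameworks with J2EE technology</title>"
        + "</article></journal></catalog>";

    /** Selects nodes with XPath, changes them with DOM setters, returns the new XML. */
    static String modifiedXml() {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xPath = XPathFactory.newInstance().newXPath();

            // Selection is XPath's job.
            Element title = (Element) xPath.evaluate(
                    "/catalog/journal/article[@date='January-2004']/title",
                    document, XPathConstants.NODE);
            Attr level = (Attr) xPath.evaluate(
                    "/catalog/journal/article[@date='January-2004']/@level",
                    document, XPathConstants.NODE);

            // Modification is done with the org.w3c.dom setters.
            title.setTextContent("Service Oriented Architecture Frameworks");
            level.setValue("Intermediate");

            // Serialize the modified tree to a string.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not modify the sample XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(modifiedXml());
    }
}
```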
From the "ITPUB Blog", link: http://blog.itpub.net/71047/viewspace-996770/. If you repost this article, please credit the source; otherwise legal action may be pursued.