net.sf.saxon.s9api
Class DocumentBuilder
java.lang.Object
net.sf.saxon.s9api.DocumentBuilder
public class DocumentBuilder
extends java.lang.Object
A document builder holds properties controlling how a Saxon document tree should be built, and
provides methods to invoke the tree construction.
This class has no public constructor. Users should construct a DocumentBuilder by calling the factory method Processor.newDocumentBuilder().
All documents used in a single Saxon query, transformation, or validation episode must be built with the same Configuration. However, there is no requirement that they should use the same DocumentBuilder.
XdmNode | build(File file) - Build a document from a supplied XML file.
XdmNode | build(Source source) - Load an XML document, to create a tree representation of the document in memory.
URI | getBaseURI() - Get the base URI of documents loaded using this DocumentBuilder when no other URI is available.
SchemaValidator | getSchemaValidator() - Get the SchemaValidator used to validate documents loaded using this DocumentBuilder.
WhitespaceStrippingPolicy | getWhitespaceStrippingPolicy() - Get the whitespace stripping policy applied when loading a document using this DocumentBuilder.
boolean | isDTDValidation() - Ask whether DTD validation is to be applied to documents loaded using this DocumentBuilder.
boolean | isLineNumbering() - Ask whether line numbering is enabled for documents loaded using this DocumentBuilder.
boolean | isRetainPSVI() - Ask whether the constructed tree should contain information derived from schema validation, specifically whether it should contain type annotations and expanded defaults of missing element and attribute content.
void | setBaseURI(URI uri) - Set the base URI of a document loaded using this DocumentBuilder.
void | setDTDValidation(boolean option) - Set whether DTD validation should be applied to documents loaded using this DocumentBuilder.
void | setLineNumbering(boolean option) - Set whether line numbering is to be enabled for documents constructed using this DocumentBuilder.
void | setRetainPSVI(boolean retainPSVI) - Set whether the constructed tree should contain information derived from schema validation, specifically whether it should contain type annotations and expanded defaults of missing element and attribute content.
void | setSchemaValidator(SchemaValidator validator) - Set the SchemaValidator to be used.
void | setWhitespaceStrippingPolicy(WhitespaceStrippingPolicy policy) - Set the whitespace stripping policy applied when loading a document using this DocumentBuilder.
XdmNode | wrap(Object node) - Create a node by wrapping a recognized external node from a supported object model.
DocumentBuilder
protected DocumentBuilder(Configuration config)
Create a DocumentBuilder. This is a protected constructor. Users should construct a DocumentBuilder by calling the factory method Processor.newDocumentBuilder().
config
- the Saxon configuration
build
public XdmNode build(File file)
throws SaxonApiException
Build a document from a supplied XML file.
- the XdmNode representing the root of the document tree
build
public XdmNode build(Source source)
throws SaxonApiException
Load an XML document, to create a tree representation of the document in memory.
source
- A JAXP Source object identifying the source of the document. This can always be a javax.xml.transform.stream.StreamSource or a javax.xml.transform.sax.SAXSource. An instance of javax.xml.transform.dom.DOMSource is accepted provided that the Saxon support code for DOM (in saxon9-dom.jar) is on the classpath. If the source is an instance of NodeInfo then the subtree rooted at this node will be copied (applying schema validation if requested) to create a new tree. Saxon also accepts an instance of PullSource, which can be used to supply a document that is to be parsed using a StAX parser.
- An XdmNode. This will be the document node at the root of the tree of the resulting in-memory document.
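As an illustration of the two Source kinds that are always accepted, the following sketch prepares a StreamSource and a SAXSource from XML held in a string, using only standard JAXP and SAX classes. The class name SourceSketch and the example.com system ID are assumptions made for this example; the sketch does not call Saxon itself, but either object is of the kind that could then be supplied to build(Source).

```java
import java.io.StringReader;
import javax.xml.transform.Source;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;

final class SourceSketch {
    // Hypothetical system ID, used only for illustration.
    static final String SYSTEM_ID = "http://example.com/docs/input.xml";

    /** Wraps XML text in a StreamSource and records a system ID for it. */
    static Source streamSource(String xml) {
        StreamSource source = new StreamSource(new StringReader(xml));
        source.setSystemId(SYSTEM_ID);
        return source;
    }

    /** Wraps the same text in a SAXSource; no XMLReader is set on it here. */
    static Source saxSource(String xml) {
        InputSource input = new InputSource(new StringReader(xml));
        input.setSystemId(SYSTEM_ID);
        return new SAXSource(input);
    }

    public static void main(String[] args) {
        Source source = streamSource("<doc/>");
        System.out.println(source.getClass().getSimpleName() + " " + source.getSystemId());
    }
}
```

Setting a system ID on the Source gives the document an intrinsic URI, which matters for the base URI handling described under setBaseURI.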
getBaseURI
public URI getBaseURI()
Get the base URI of documents loaded using this DocumentBuilder when no other URI is available.
- the base URI to be used, or null if no value has been set.
getSchemaValidator
public SchemaValidator getSchemaValidator()
Get the SchemaValidator used to validate documents loaded using this DocumentBuilder.
- the SchemaValidator if one has been set; otherwise null.
getWhitespaceStrippingPolicy
public WhitespaceStrippingPolicy getWhitespaceStrippingPolicy()
Get the whitespace stripping policy applied when loading a document using this DocumentBuilder.
- the policy for stripping whitespace-only text nodes
isDTDValidation
public boolean isDTDValidation()
Ask whether DTD validation is to be applied to documents loaded using this DocumentBuilder.
- true if DTD validation is to be applied
isLineNumbering
public boolean isLineNumbering()
Ask whether line numbering is enabled for documents loaded using this DocumentBuilder. By default, line numbering is disabled.
Line numbering is not available for all kinds of source: in particular, it is not available when loading from an existing DOM Document.
The resulting line numbers are accessible to applications using the extension function saxon:line-number() applied to a node, or using the Java method NodeInfo.getLineNumber().
Line numbers are maintained only for element nodes; the line number returned for any other node will be that of the most recent element. For an element node, the line number is generally that of the closing angle bracket at the end of the start tag (this is what a SAX parser notifies).
- true if line numbering is enabled
isRetainPSVI
public boolean isRetainPSVI()
Ask whether the constructed tree should contain information derived from schema
validation, specifically whether it should contain type annotations and expanded
defaults of missing element and attribute content. If no schema validator is set
then this option has no effect.
Not yet implemented.
- true if the constructed tree will contain type annotations and expanded defaults of missing element and attribute content; false if the tree that is returned will be the same as if schema validation did not take place (except that if the document is invalid, no tree will be constructed)
setBaseURI
public void setBaseURI(URI uri)
Set the base URI of a document loaded using this DocumentBuilder.
This is used for resolving any relative URIs appearing
within the document, for example in references to DTDs and external entities.
This information is required when the document is loaded from a source that does not
provide an intrinsic URI, notably when loading from a Stream or a DOMSource. The value is
ignored when loading from a source that does have an intrinsic base URI.
uri
- the base URI of documents loaded using this DocumentBuilder. This must be an absolute URI.
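To make the role of an absolute base URI concrete, here is a minimal sketch of relative URI resolution using java.net.URI from the JDK. The class name BaseUriSketch and the example.com addresses are hypothetical, and the absoluteness check is this sketch's own guard, not a statement of how Saxon reacts to a relative base URI.

```java
import java.net.URI;

final class BaseUriSketch {
    /** Resolves a relative reference against a base URI; this sketch rejects a non-absolute base. */
    static URI resolve(URI base, String relative) {
        if (!base.isAbsolute()) {
            throw new IllegalArgumentException("Base URI must be absolute: " + base);
        }
        return base.resolve(relative);
    }

    public static void main(String[] args) {
        // Hypothetical document location and a DTD reference relative to it.
        URI base = URI.create("http://example.com/docs/input.xml");
        System.out.println(resolve(base, "dtd/book.dtd"));
    }
}
```

A relative reference such as a DTD system identifier can only be turned into a fetchable location when the base supplies the scheme and authority, which is why a source without an intrinsic URI needs one to be set.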
setDTDValidation
public void setDTDValidation(boolean option)
Set whether DTD validation should be applied to documents loaded using this DocumentBuilder.
By default, no DTD validation takes place.
option
- true if DTD validation is to be applied to the document
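The effect of switching DTD validation on can be seen at the parser level. The sketch below uses only the JAXP SAX parser bundled with the JDK, not Saxon, and collects the validity errors reported for a document with an internal DTD subset. The class name DtdValidationSketch and the sample documents are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class DtdValidationSketch {
    /** Parses the XML and returns the validity errors reported by the parser. */
    static List<String> validityErrors(String xml, boolean validate) {
        List<String> errors = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(validate); // turn DTD validation on or off
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    // Validity errors are recoverable, so they arrive here rather than as exceptions.
                    errors.add("line " + e.getLineNumber() + ": " + e.getMessage());
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Parse failed", e);
        }
        return errors;
    }

    public static void main(String[] args) {
        // The DTD requires <a> to contain exactly one <b>, so <c/> is invalid.
        String xml = "<!DOCTYPE a [<!ELEMENT a (b)><!ELEMENT b EMPTY>]><a><c/></a>";
        validityErrors(xml, true).forEach(System.out::println);
    }
}
```

With validation off, the same document parses without any reported errors because it is well-formed; only a validating parse checks it against the declared content model.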
setLineNumbering
public void setLineNumbering(boolean option)
Set whether line numbering is to be enabled for documents constructed using this DocumentBuilder.
This has the effect that the line number in the original source document is maintained in the constructed
tree, for each element node (and only for elements). The line number in question is generally the line number
on which the closing ">" of the element start tag appears.
By default, line numbers are not maintained.
Errors relating to document parsing and validation will generally contain line numbers whether or not
this option is set, because such errors are detected during document construction.
Line numbering is not available for all kinds of source: for example,
it is not available when loading from an existing DOM Document.
The resulting line numbers are accessible to applications using the XPath extension function saxon:line-number() applied to a node, or using the Java method NodeInfo.getLineNumber().
Line numbers are maintained only for element nodes; the line number returned for any other node will be that of the most recent element. For an element node, the line number is generally that of the closing angle bracket at the end of the start tag (this is what a SAX parser notifies).
option
- true if line numbers are to be maintained, false otherwise.
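The remark about what a SAX parser notifies can be checked with a small experiment. The following sketch uses the JDK's SAX parser directly rather than Saxon and records the line number the Locator reports at each startElement event. The class name LineNumberSketch and the sample document are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

final class LineNumberSketch {
    /** Returns, for each element name, the line number reported when its start tag was notified. */
    static Map<String, Integer> startTagLines(String xml) {
        Map<String, Integer> lines = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            private Locator locator;

            @Override
            public void setDocumentLocator(Locator locator) {
                this.locator = locator; // supplied by the parser before parsing starts
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                lines.put(qName, locator.getLineNumber());
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Parse failed", e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // The start tag of <a> opens on line 1 but its closing ">" is on line 3.
        String xml = "<a\n  x=\"1\"\n>\n  <b/>\n</a>";
        System.out.println(startTagLines(xml));
    }
}
```

Because the start tag of the a element spans three lines, the reported number shows whether the parser tracks the opening or the closing angle bracket; a SAX Locator is specified to report the position just after the text of the event, which is consistent with the behaviour described above.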
setRetainPSVI
public void setRetainPSVI(boolean retainPSVI)
Set whether the constructed tree should contain information derived from schema
validation, specifically whether it should contain type annotations and expanded
defaults of missing element and attribute content. If no schema validator is set
then this option has no effect. The default value is true.
Not yet implemented.
retainPSVI
- if true, the constructed tree will contain type annotations
and expanded defaults of missing element and attribute content. If false, the
tree that is returned will be the same as if schema validation did not take place
(except that if the document is invalid, no tree will be constructed)
setSchemaValidator
public void setSchemaValidator(SchemaValidator validator)
Set the SchemaValidator to be used. This determines whether schema validation is applied to an input
document and whether type annotations in a supplied document are retained. If no SchemaValidator
is supplied, then schema validation does not take place.
This option requires the schema-aware version of the Saxon product (Saxon-SA).
validator
- the SchemaValidator to be used
setWhitespaceStrippingPolicy
public void setWhitespaceStrippingPolicy(WhitespaceStrippingPolicy policy)
Set the whitespace stripping policy applied when loading a document using this DocumentBuilder.
By default, whitespace text nodes appearing in element-only content
are stripped, and all other whitespace text nodes are retained.
policy
- the policy for stripping whitespace-only text nodes from
source documents
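For readers unsure what counts as a whitespace-only text node, the sketch below parses a small indented document with the JDK's DOM parser and counts the children of the root element that consist solely of whitespace. It does not use Saxon or WhitespaceStrippingPolicy; the class name WhitespaceSketch and the sample document are assumptions made for illustration. Note that the JDK class javax.xml.parsers.DocumentBuilder used implicitly here is unrelated to the Saxon class documented on this page.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class WhitespaceSketch {
    /** Counts the children of the root element that are whitespace-only text nodes. */
    static int whitespaceOnlyChildren(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Parse failed", e);
        }
        NodeList children = doc.getDocumentElement().getChildNodes();
        int count = 0;
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // A text node whose content trims to nothing holds only whitespace.
            if (child.getNodeType() == Node.TEXT_NODE && child.getNodeValue().trim().isEmpty()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Indentation between the child elements produces whitespace-only text nodes.
        System.out.println(whitespaceOnlyChildren("<a>\n  <b/>\n  <c>x</c>\n</a>"));
    }
}
```

Nodes of this kind, typically indentation between elements, are what a stripping policy decides to keep or discard; the text inside the c element is unaffected because it contains non-whitespace characters.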
wrap
public XdmNode wrap(Object node)
throws IllegalArgumentException
Create a node by wrapping a recognized external node from a supported object model.
The support module for the external object model must be on the class path and registered
with the Saxon configuration.
It is best to avoid calling this method repeatedly to wrap different nodes in the same document.
Each such wrapper conceptually creates a new XDM tree instance with its own identity. Although the
memory is shared, operations that rely on node identity might not have the expected result. It is
best to create a single wrapper for the document node, and then to navigate to the other nodes in the
tree using S9API interfaces.
node
- the node in the external tree representation
- the supplied node wrapped as an XdmNode