Represents a position between two logical tokens in an XML document. The tokens themselves are not exposed as objects, but their type and properties are discoverable through methods on the cursor. In particular, the general category of token is represented by a {@link XmlCursor.TokenType TokenType}.
You use an XmlCursor instance to navigate through and manipulate an XML instance document. Once you obtain an XML document, you can create a cursor to represent a specific place in the XML. Because you can use a cursor with or without a schema corresponding to the XML, cursors are an ideal way to handle XML without a schema. You can create a new cursor by calling the {@link XmlTokenSource#newCursor() newCursor} method exposed by an object representing the XML, whether it was parsed into a strong type compiled from schema or an {@link XmlObject XmlObject} (as in the no-schema case).
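As a minimal sketch of the no-schema case described above, the following parses a string into an untyped XmlObject, obtains a cursor from it, and reads the name of the document element. The class and method names (CursorBasics, rootElementName) are hypothetical and exist only for illustration. The cursor is released with dispose() when no longer needed; some newer releases may also allow closing the cursor with try-with-resources, so check the version you use.

```java
import org.apache.xmlbeans.XmlCursor;
import org.apache.xmlbeans.XmlException;
import org.apache.xmlbeans.XmlObject;

// Hypothetical helper class, for illustration only.
final class CursorBasics {

    /** Parses untyped XML and returns the local name of the document element. */
    static String rootElementName(String xml) {
        XmlObject doc;
        try {
            doc = XmlObject.Factory.parse(xml);    // no schema involved
        } catch (XmlException e) {
            throw new IllegalArgumentException("XML could not be parsed", e);
        }
        XmlCursor cursor = doc.newCursor();        // positioned before STARTDOC
        try {
            cursor.toFirstChild();                 // move to the document element
            return cursor.getName().getLocalPart();
        } finally {
            cursor.dispose();                      // release the cursor when done
        }
    }

    public static void main(String[] args) {
        System.out.println(rootElementName("<sample x='y'><value>foo</value></sample>"));
    }
}
```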
With an XmlCursor, you can also:
- Execute XQuery and XPath expressions against the XML with the execQuery and selectPath methods.
- Edit and reshape the document by inserting, moving, copying, and removing XML.
- Insert bookmarks that "stick" to the XML at the cursor's position even if the cursor or XML moves.
- Get and set values for containers (elements and whole documents), attributes, processing instructions, and comments.
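The editing capabilities listed above might be combined as in the following sketch, which replaces the text of one element and inserts a new sibling after it. The class and method names (CursorEditing, editSample) and the inserted element name "extra" are assumptions made for the example. Note that insert methods place new content immediately before the cursor's current position.

```java
import org.apache.xmlbeans.XmlCursor;
import org.apache.xmlbeans.XmlException;
import org.apache.xmlbeans.XmlObject;

// Hypothetical helper class, for illustration only.
final class CursorEditing {

    /** Changes the text of <value> and appends an <extra> element inside <sample>. */
    static String editSample() {
        XmlObject doc;
        try {
            doc = XmlObject.Factory.parse("<sample x='y'><value>foo</value></sample>");
        } catch (XmlException e) {
            throw new IllegalStateException("sample XML could not be parsed", e);
        }
        XmlCursor cursor = doc.newCursor();
        try {
            cursor.toFirstChild();                 // START (sample)
            cursor.toFirstChild();                 // START (value)
            cursor.setTextValue("bar");            // replace the element's text
            cursor.toParent();                     // back to START (sample)
            cursor.toEndToken();                   // END (sample)
            cursor.insertElementWithText("extra", "baz"); // lands before END (sample)
        } finally {
            cursor.dispose();
        }
        return doc.xmlText();                      // serialize the edited document
    }

    public static void main(String[] args) {
        System.out.println(editSample());
    }
}
```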
A cursor moves through XML by moving past tokens. A token represents a category of XML markup, such as the start of an element, its end, an attribute, a comment, and so on. XmlCursor methods such as toNextToken, toNextSibling, and toParent move the cursor among tokens. Each token's category is of a particular type, represented by one of the nine types defined by the {@link XmlCursor.TokenType TokenType} class.
When you get a new cursor for a whole instance document, the cursor is initially located before the STARTDOC token. This token, which has no analog in the XML specification, is present in this logical model of XML so that you may distinguish between the document as a whole and the content of the document. Terminating the document is an ENDDOC token. This token is also not part of the XML specification. A cursor located immediately before this token is at the very end of the document. It is not possible to position the cursor after the ENDDOC token. Thus, the STARTDOC and ENDDOC tokens are effectively "bookends" for the content of the document.
For example, if you were to navigate a cursor through the following XML document using toNextToken(), you would encounter the sequence of token types listed after it.
<sample x='y'>
 <value>foo</value>
</sample>
STARTDOC
START (sample)
ATTR (x='y')
TEXT ("\n ")
START (value)
TEXT ("foo")
END (value)
TEXT ("\n")
END (sample)
ENDDOC
When there are no more tokens available, hasNextToken() returns false and toNextToken() returns the special token type NONE and does not move the cursor.
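A walk over every token, as described above, might look like the following sketch. It records the type of the token after the cursor, then advances with toNextToken() for as long as hasNextToken() reports another token. The class and method names (TokenWalk, tokenTypes) are hypothetical. The sketch relies on TokenType's toString() yielding the token type's name.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.xmlbeans.XmlCursor;
import org.apache.xmlbeans.XmlException;
import org.apache.xmlbeans.XmlObject;

// Hypothetical helper class, for illustration only.
final class TokenWalk {

    /** Returns the token type names encountered from STARTDOC through ENDDOC. */
    static List<String> tokenTypes(String xml) {
        XmlObject doc;
        try {
            doc = XmlObject.Factory.parse(xml);
        } catch (XmlException e) {
            throw new IllegalArgumentException("XML could not be parsed", e);
        }
        XmlCursor cursor = doc.newCursor();
        try {
            List<String> names = new ArrayList<>();
            // The token immediately after a new document cursor is STARTDOC.
            names.add(cursor.currentTokenType().toString());
            // Advance one token at a time until no further token is available.
            while (cursor.hasNextToken()) {
                names.add(cursor.toNextToken().toString());
            }
            return names;
        } finally {
            cursor.dispose();
        }
    }

    public static void main(String[] args) {
        for (String name : tokenTypes("<sample x='y'>\n <value>foo</value>\n</sample>")) {
            System.out.println(name);
        }
    }
}
```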
The {@link #currentTokenType() currentTokenType()} method will return the type of the token that is immediately after the cursor. You can also use a number of convenience methods that test for a particular token type. These include the methods isStart(), isStartdoc(), isText(), isAttr(), and so on. Each returns a boolean value indicating whether the token that follows the cursor is the type in question.
A few other methods determine whether the token is of a kind that may include multiple token types. The isAnyAttr() method, for example, returns true if the token immediately following the cursor is any kind of attribute, including those of the ATTR token type and xmlns attributes.
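To illustrate the difference between the specific and the broader tests, the sketch below counts the attribute-like tokens that follow the document element's START token, using isAnyAttr(), isAttr(), and isNamespace(). The class and method names (AttrKinds, countAttributeTokens) are assumptions made for the example.

```java
import org.apache.xmlbeans.XmlCursor;
import org.apache.xmlbeans.XmlException;
import org.apache.xmlbeans.XmlObject;

// Hypothetical helper class, for illustration only.
final class AttrKinds {

    /** Returns counts {any attribute, ATTR, NAMESPACE} for the document element. */
    static int[] countAttributeTokens(String xml) {
        XmlObject doc;
        try {
            doc = XmlObject.Factory.parse(xml);
        } catch (XmlException e) {
            throw new IllegalArgumentException("XML could not be parsed", e);
        }
        XmlCursor cursor = doc.newCursor();
        try {
            int any = 0, attr = 0, ns = 0;
            cursor.toFirstChild();              // START of the document element
            cursor.toNextToken();               // first token after START
            // Attribute-like tokens directly follow their container token.
            while (cursor.isAnyAttr()) {
                any++;
                if (cursor.isAttr()) attr++;        // ordinary attribute
                if (cursor.isNamespace()) ns++;     // xmlns declaration
                cursor.toNextToken();
            }
            return new int[] { any, attr, ns };
        } finally {
            cursor.dispose();
        }
    }

    public static void main(String[] args) {
        int[] counts = countAttributeTokens("<a xmlns:p='urn:x' k='v'/>");
        System.out.println(counts[0] + " " + counts[1] + " " + counts[2]);
    }
}
```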
Legitimate sequences of tokens for an XML document are described by the following Backus-Naur Form (BNF):
<doc> ::= STARTDOC <attributes> <content> ENDDOC
<element> ::= START <attributes> <content> END
<attributes> ::= ( ATTR | NAMESPACE ) *
<content> ::= ( COMMENT | PROCINST | TEXT | <element> ) *
Note that STARTDOC ENDDOC is a legitimate sequence: it is the result of creating a brand new instance of an empty document. Also note that attributes may only follow container tokens (STARTDOC or START).
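The grammar can be checked mechanically. The following sketch is a small recursive-descent recognizer over token type names, written with the standard library only and independent of XmlCursor itself. It assumes that attributes and content may each repeat zero or more times, which is what makes STARTDOC ENDDOC legitimate. The class and method names (TokenGrammar, isLegitimate) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical recognizer for the token grammar, for illustration only.
final class TokenGrammar {
    private final List<String> tokens;
    private int pos;

    private TokenGrammar(List<String> tokens) {
        this.tokens = tokens;
    }

    /** Returns true if the token names form a complete, legitimate document. */
    static boolean isLegitimate(String... tokens) {
        TokenGrammar g = new TokenGrammar(Arrays.asList(tokens));
        return g.doc() && g.pos == g.tokens.size();
    }

    private boolean peek(String t) {
        return pos < tokens.size() && tokens.get(pos).equals(t);
    }

    private boolean accept(String t) {
        if (peek(t)) { pos++; return true; }
        return false;
    }

    // <doc> ::= STARTDOC <attributes> <content> ENDDOC
    private boolean doc() {
        if (!accept("STARTDOC")) return false;
        attributes();
        return content() && accept("ENDDOC");
    }

    // <element> ::= START <attributes> <content> END
    private boolean element() {
        if (!accept("START")) return false;
        attributes();
        return content() && accept("END");
    }

    // <attributes>: zero or more ATTR or NAMESPACE tokens
    private void attributes() {
        while (accept("ATTR") || accept("NAMESPACE")) {
            // keep consuming attribute-like tokens
        }
    }

    // <content>: zero or more COMMENT, PROCINST, TEXT, or nested elements
    private boolean content() {
        while (true) {
            if (accept("COMMENT") || accept("PROCINST") || accept("TEXT")) continue;
            if (peek("START")) {
                if (!element()) return false;   // malformed nested element
                continue;
            }
            return true;                         // nothing more that is content
        }
    }

    public static void main(String[] args) {
        System.out.println(isLegitimate("STARTDOC", "ENDDOC"));
        System.out.println(isLegitimate("STARTDOC", "TEXT", "ATTR", "ENDDOC"));
    }
}
```

In this sketch, an ATTR that appears after a TEXT token is rejected, because attribute-like tokens are consumed only directly after STARTDOC or START.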