Parses the document into a tree of nodes using the {@link NodeTokenizer}. Each node is defined by a {@link Token}, that is, a token or offset range in the document. Attributes in beginning nodes are also parsed into token offsets by the {@link AttributeTokenizer}.
The resulting document tree represents the nodes in the target document. The document can be an HTML fragment that is not well-formed or an XML fragment of an XHTML document.
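
The idea of a tree whose nodes hold offset ranges rather than copied text can be illustrated with a minimal sketch. Everything in it is hypothetical: `OffsetTreeSketch`, `Span`, `Node`, `parse` and `describe` are illustrative names, not the actual {@link NodeTokenizer}, {@link Token} or {@link AttributeTokenizer} API, and attribute parsing is left out. It assumes a simple recovery policy for fragments that are not well-formed: stray end tags are ignored, and an end tag closes any nodes still open inside the element it matches.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch only; not the real tokenizer or Token classes.
public class OffsetTreeSketch {

    /** An offset range [start, end) into the source text. */
    public static final class Span {
        final int start, end;
        Span(int start, int end) { this.start = start; this.end = end; }
        String text(String src) { return src.substring(start, end); }
    }

    /** A tree node that stores offsets into the document, not copies of it. */
    public static final class Node {
        final Span name; // null for the synthetic root
        final List<Node> children = new ArrayList<>();
        Node(Span name) { this.name = name; }
    }

    /** Builds a tree from tag names only, tolerating malformed input. */
    public static Node parse(String src) {
        Node root = new Node(null);
        Deque<Node> open = new ArrayDeque<>(); // stack of currently open nodes
        open.push(root);
        int i = 0;
        while ((i = src.indexOf('<', i)) >= 0) {
            int close = src.indexOf('>', i);
            if (close < 0) break; // unterminated tag: stop scanning
            boolean isEnd = i + 1 < close && src.charAt(i + 1) == '/';
            int nameStart = isEnd ? i + 2 : i + 1;
            int nameEnd = nameStart;
            while (nameEnd < close && Character.isLetterOrDigit(src.charAt(nameEnd))) nameEnd++;
            if (nameEnd > nameStart) { // skips comments, "<>" and similar
                Span name = new Span(nameStart, nameEnd);
                if (isEnd) {
                    closeNearest(open, name.text(src), src);
                } else {
                    Node node = new Node(name);
                    open.peek().children.add(node);
                    boolean selfClosing = src.charAt(close - 1) == '/';
                    if (!selfClosing) open.push(node);
                }
            }
            i = close + 1;
        }
        return root; // nodes still open at end of input are closed implicitly
    }

    /** Pops the stack down to the nearest open node with this name, if any. */
    static void closeNearest(Deque<Node> open, String name, String src) {
        boolean found = false;
        for (Node n : open) { // iterates from the top of the stack down
            if (n.name != null && n.name.text(src).equalsIgnoreCase(name)) { found = true; break; }
        }
        if (!found) return; // stray end tag: ignore it
        while (!open.pop().name.text(src).equalsIgnoreCase(name)) {
            // keep popping unclosed descendants until the match is removed
        }
    }

    /** Renders the tree as name(child,child) for inspection. */
    public static String describe(Node node, String src) {
        StringBuilder sb = new StringBuilder(node.name == null ? "#root" : node.name.text(src));
        if (!node.children.isEmpty()) {
            sb.append('(');
            for (int k = 0; k < node.children.size(); k++) {
                if (k > 0) sb.append(',');
                sb.append(describe(node.children.get(k), src));
            }
            sb.append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String src = "<div><p>one<b>two</p></i><br/></div>";
        System.out.println(describe(parse(src), src)); // prints #root(div(p(b),br))
    }
}
```

In this sketch the unclosed `<b>` is closed when `</p>` arrives and the stray `</i>` is dropped, so a malformed fragment still yields a usable tree. Because each node keeps only a `Span`, the original text can be recovered from the source string at any time.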