Decodes (unescapes) HTML entities. The complication is that the characters are received one at a time, so they must be stored temporarily until the entity is complete. Any "junk" characters received before the actual entity are discarded.
This class is designed to be 100% compatible with the corresponding logic in the C version of the {@link com.google.security.streamhtmlparser.HtmlParser}, found in htmlparser.c. There are, however, a few intentional differences, outlined below:
In the C version, processChar returns the output {@code String}, whereas in Java we return a status code and then provide the {@code String} through a separate method, getEntity. This is cleaner because it avoids the need to return empty {@code String}s while processing is incomplete.
Valid HTML entities have one of the following three forms:
{@code &#dd;} where dd is a number in decimal (base 10) form.
{@code &#xyy;} or {@code &#Xyy;} where yy is a hex number (base 16).
{@code &<html-entity>;} where <html-entity> is one of lt, gt, amp, quot or apos.
A reset method is provided to facilitate object re-use.
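The character-at-a-time contract described above can be illustrated with a minimal sketch. This is not the streamhtmlparser implementation: the class name StreamingEntityDecoder, the Status values, the 10-character cap, and the choice to hand back malformed entities as raw text are all assumptions made for illustration. Only the method names processChar, getEntity and reset come from the description above.

```java
import java.util.Map;

/** Hypothetical sketch of a streaming entity decoder; not the actual class. */
final class StreamingEntityDecoder {
    enum Status { NOT_STARTED, IN_PROGRESS, COMPLETED }

    // The five named entities listed in the description above.
    private static final Map<String, String> NAMED = Map.of(
        "lt", "<", "gt", ">", "amp", "&", "quot", "\"", "apos", "'");
    private static final int MAX_LENGTH = 10; // assumed cap on buffered characters

    private final StringBuilder buffer = new StringBuilder();
    private Status status = Status.NOT_STARTED;
    private String entity = "";

    /** Feeds one character and reports progress instead of returning text. */
    Status processChar(char c) {
        if (status == Status.COMPLETED) {
            return status; // caller is expected to call reset() first
        }
        if (status == Status.NOT_STARTED) {
            if (c == '&') {
                status = Status.IN_PROGRESS;
            }
            return status; // anything before '&' is junk and is discarded
        }
        if (c == ';') {
            entity = decode(buffer.toString());
            status = Status.COMPLETED;
        } else if (buffer.length() >= MAX_LENGTH) {
            entity = "&" + buffer + c; // assumption: give up and emit raw text
            status = Status.COMPLETED;
        } else {
            buffer.append(c);
        }
        return status;
    }

    /** Decoded text; meaningful only after processChar returned COMPLETED. */
    String getEntity() {
        return entity;
    }

    /** Clears all state so the same object can decode another entity. */
    void reset() {
        buffer.setLength(0);
        status = Status.NOT_STARTED;
        entity = "";
    }

    private static String decode(String body) {
        String named = NAMED.get(body);
        if (named != null) {
            return named;
        }
        if (body.startsWith("#x") || body.startsWith("#X")) {
            return numeric(body.substring(2), 16, body);
        }
        if (body.startsWith("#")) {
            return numeric(body.substring(1), 10, body);
        }
        return "&" + body + ";"; // assumption: unknown names pass through
    }

    private static String numeric(String digits, int radix, String body) {
        boolean allDigits = !digits.isEmpty()
            && digits.chars().allMatch(ch -> Character.digit(ch, radix) >= 0);
        if (allDigits) {
            try {
                int codePoint = Integer.parseInt(digits, radix);
                if (Character.isValidCodePoint(codePoint)) {
                    return new String(Character.toChars(codePoint));
                }
            } catch (NumberFormatException e) {
                // value too large for an int; fall through to raw text
            }
        }
        return "&" + body + ";";
    }
}
```

A caller would feed characters until processChar returns COMPLETED, read the result with getEntity, and then call reset before decoding the next entity.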
EntityResolver is thread-safe.
@since 1.1
@author Andrus Adamchik
If a SAX application needs to implement customized handling for external entities, it must implement this interface and register an instance with the SAX driver using the {@link org.xml.sax.XMLReader#setEntityResolver setEntityResolver} method.
The XML reader will then allow the application to intercept any external entities (including the external DTD subset and external parameter entities, if any) before including them.
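As a rough illustration of registration and interception, the sketch below installs a resolver on an XMLReader and records every system identifier the reader asks about. The class InterceptDemo, the method parseAndRecord, and the decision to answer every request with an empty character stream are assumptions for this example. It also assumes the JDK's default JAXP parser, which consults the resolver for the external DTD subset even when not validating.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical demo: records which external entities the reader requests. */
final class InterceptDemo {
    static List<String> parseAndRecord(String xml) throws Exception {
        List<String> seen = new ArrayList<>();
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();

        // The resolver runs before the reader tries to fetch the entity itself.
        reader.setEntityResolver((publicId, systemId) -> {
            seen.add(systemId);
            // Assumption: substitute an empty entity so nothing is downloaded.
            return new InputSource(new StringReader(""));
        });
        reader.setContentHandler(new DefaultHandler());
        reader.parse(new InputSource(new StringReader(xml)));
        return seen;
    }
}
```

Calling parseAndRecord on a document whose DOCTYPE names an external DTD should return a list containing that DTD's system identifier, without any network access taking place.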
Many SAX applications will not need to implement this interface, but it will be especially useful for applications that build XML documents from databases or other specialised input sources, or for applications that use URI types other than URLs.
The following resolver would provide the application with a special character stream for the entity with the system identifier "http://www.myhost.com/today":
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

public class MyResolver implements EntityResolver {
  public InputSource resolveEntity(String publicId, String systemId) {
    if (systemId.equals("http://www.myhost.com/today")) {
      // return a special input source
      MyReader reader = new MyReader();
      return new InputSource(reader);
    } else {
      // use the default behaviour
      return null;
    }
  }
}
The application can also use this interface to redirect system identifiers to local URIs or to look up replacements in a catalog (possibly by using the public identifier).
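Redirection to local URIs might look like the following sketch. The class name LocalCatalogResolver and the simple in-memory map standing in for a catalog are assumptions for illustration; a real catalog lookup could equally key on the public identifier.

```java
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

/** Hypothetical resolver that maps remote system identifiers to local URIs. */
final class LocalCatalogResolver implements EntityResolver {
    private final Map<String, String> systemIdToLocalUri;

    LocalCatalogResolver(Map<String, String> systemIdToLocalUri) {
        // Defensive copy; HashMap also tolerates lookups with a null key.
        this.systemIdToLocalUri = new HashMap<>(systemIdToLocalUri);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String localUri = systemIdToLocalUri.get(systemId);
        if (localUri == null) {
            return null; // no mapping: let the reader use its default behaviour
        }
        // The reader opens the local URI instead of the original identifier.
        InputSource source = new InputSource(localUri);
        source.setPublicId(publicId);
        return source;
    }
}
```

Returning null for unmapped identifiers keeps the default behaviour intact, so the resolver only affects the entities it knows about.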
@since SAX 1.0
@author David Megginson
@version 2.0.1 (sax2r2)
@see org.xml.sax.XMLReader#setEntityResolver
@see org.xml.sax.InputSource