This class represents a UTF-8 stream reader.
This reader supports surrogate char pairs (representing characters in the range [U+10000 .. U+10FFFF]). It can also be used to read Unicode code points (31 bits) directly (see {@link #read()}).
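As a rough illustration of the surrogate-pair support described above, the sketch below decodes one 4-byte UTF-8 sequence into a code point and then splits it into a surrogate pair. The class `SurrogateSketch` and the method `decode4` are hypothetical names chosen for this example; they are not part of this class, and the actual decoding logic of the reader may differ.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the documented class.
final class SurrogateSketch {

    /** Decodes the 4-byte UTF-8 sequence starting at b[off] into a code point (no validation). */
    static int decode4(byte[] b, int off) {
        return ((b[off] & 0x07) << 18)       // 3 payload bits from the lead byte
             | ((b[off + 1] & 0x3F) << 12)   // 6 payload bits from each continuation byte
             | ((b[off + 2] & 0x3F) << 6)
             |  (b[off + 3] & 0x3F);
    }

    public static void main(String[] args) {
        // U+1D11E (MUSICAL SYMBOL G CLEF) is encoded in UTF-8 as F0 9D 84 9E.
        byte[] utf8 = "\uD834\uDD1E".getBytes(StandardCharsets.UTF_8);
        int codePoint = decode4(utf8, 0);
        // Character.toChars returns two chars for code points above U+FFFF.
        char[] pair = Character.toChars(codePoint);
        System.out.printf("U+%X -> %04X %04X%n", codePoint, (int) pair[0], (int) pair[1]);
        // prints: U+1D11E -> D834 DD1E
    }
}
```

A caller that reads into a char buffer would see the two surrogate chars, whereas a caller that reads code points directly would see the single value 0x1D11E.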
Each invocation of one of the read() methods may cause one or more bytes to be read from the underlying byte-input stream. To enable the efficient conversion of bytes to characters, more bytes may be read ahead from the underlying stream than are necessary to satisfy the current read operation.
Instances of this class can be reused for different input streams and can be part of a higher-level component (e.g. a parser) in order to avoid dynamic buffer allocation when the input source changes. Wrapping an instance in a java.io.BufferedReader is also unnecessary, as instances of this class embed their own data buffers.
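The read-ahead and reuse behavior described above can be sketched with a small standalone class. Everything here is an assumption made for illustration: the class name `ReusableUtf8Sketch`, the `setInput` method, and the 2048-byte buffer size are hypothetical and are not taken from this class's API.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; names and buffer size are assumptions.
final class ReusableUtf8Sketch {
    private final byte[] buf = new byte[2048]; // allocated once, reused across inputs
    private InputStream in;
    private int pos, end;

    /** Binds this reader to a new stream without reallocating the buffer. */
    ReusableUtf8Sketch setInput(InputStream in) {
        this.in = in;
        pos = end = 0; // discard any bytes left over from the previous stream
        return this;
    }

    /** Returns the next byte (0..255), refilling the buffer in bulk, or -1 at end of stream. */
    private int nextByte() throws IOException {
        if (pos >= end) {
            end = in.read(buf, 0, buf.length); // may read ahead far more than one character
            pos = 0;
            if (end <= 0) { end = 0; return -1; }
        }
        return buf[pos++] & 0xFF;
    }

    /** Returns the next code point, or -1 at end of stream. No well-formedness checks. */
    int read() throws IOException {
        int b = nextByte();
        if (b < 0x80) return b; // ASCII, or -1 at end of stream
        int extra = (b >= 0xF0) ? 3 : (b >= 0xE0) ? 2 : 1; // number of continuation bytes
        int cp = b & (0x3F >> extra);                      // payload bits of the lead byte
        for (int i = 0; i < extra; i++) {
            int c = nextByte();
            if (c < 0) throw new EOFException("truncated UTF-8 sequence");
            cp = (cp << 6) | (c & 0x3F);
        }
        return cp;
    }

    public static void main(String[] args) throws IOException {
        ReusableUtf8Sketch reader = new ReusableUtf8Sketch();
        String[] inputs = { "caf\u00E9", "\uD834\uDD1E" };
        for (String s : inputs) {
            // The same instance, and the same buffer, serves each input in turn.
            reader.setInput(new ByteArrayInputStream(s.getBytes(StandardCharsets.UTF_8)));
            for (int cp = reader.read(); cp != -1; cp = reader.read()) {
                System.out.printf("U+%04X ", cp);
            }
            System.out.println();
        }
    }
}
```

Because the bulk `read` call already fills an internal buffer, placing a buffering wrapper in front of such a reader would only add a second copy of the data.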
Note: This reader is unsynchronized and does not check that the UTF-8 encoding is well-formed (e.g. it accepts UTF-8 sequences that are longer than necessary to encode a character).
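To make the note about overlong sequences concrete, the sketch below shows how a lenient 2-byte decode accepts the overlong form C0 AF and yields '/' (U+002F), and how a caller that needs strict input could detect this separately. The class `OverlongSketch` and its methods are hypothetical and are not part of this class.

```java
// Hypothetical helper for illustration only; not part of the documented class.
final class OverlongSketch {

    /** Lenient decode of a 2-byte UTF-8 sequence: no well-formedness check. */
    static int decode2(int b0, int b1) {
        return ((b0 & 0x1F) << 6) | (b1 & 0x3F);
    }

    /** True if the code point would have fit in fewer bytes than the sequence used. */
    static boolean isOverlong(int codePoint, int length) {
        switch (length) {
            case 2:  return codePoint < 0x80;    // fits in 1 byte
            case 3:  return codePoint < 0x800;   // fits in 2 bytes
            case 4:  return codePoint < 0x10000; // fits in 3 bytes
            default: return false;
        }
    }

    public static void main(String[] args) {
        int overlong = decode2(0xC0, 0xAF); // overlong encoding of '/'
        int wellFormed = decode2(0xC3, 0xA9); // shortest-form encoding of U+00E9
        System.out.printf("U+%04X overlong=%b%n", overlong, isOverlong(overlong, 2));
        System.out.printf("U+%04X overlong=%b%n", wellFormed, isOverlong(wellFormed, 2));
    }
}
```

Input from untrusted sources is the usual case where such a check matters, since overlong forms can slip a character past a filter that inspects the raw bytes.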
@author Jean-Marie Dautelle
@version 2.0, December 9, 2004
@see UTF8StreamWriter