A minimal InputStreamReader implementation that reads exactly the bytes it needs from the wrapped InputStream and not a byte more. Java's built-in InputStreamReader may buffer and read ahead, which does not work for streams that mix binary and text data, because bytes belonging to the binary portion can be consumed by the reader. This InputStreamReader variant was created to avoid that.
Note that its original user, JsonStreamReader, reads only one character at a time, so this implementation is optimized for single-character reads. This can be revisited if other use cases come along. Even so, multi-byte characters are handled efficiently: the first byte of a UTF-8 character encodes the character's length, so the remaining bytes are fetched in one bulk read. As a further optimization, one-byte UTF-8 characters bypass the CharsetDecoder entirely.
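The length-from-first-byte idea can be sketched as follows. This is an illustrative sketch, not the actual implementation: the class name Utf8LeadByteSketch and the methods sequenceLength and readOne are hypothetical, it decodes with the String constructor rather than a CharsetDecoder, and it throws on unsupported lead bytes for brevity.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: names are illustrative, not taken from the real class.
class Utf8LeadByteSketch {

    /** Total sequence length announced by a lead byte, or -1 if it is not a 1- to 4-byte lead. */
    static int sequenceLength(int b) {
        b &= 0xFF;
        if ((b & 0x80) == 0x00) return 1; // 0xxxxxxx: single-byte (ASCII)
        if ((b & 0xE0) == 0xC0) return 2; // 110xxxxx
        if ((b & 0xF0) == 0xE0) return 3; // 1110xxxx
        if ((b & 0xF8) == 0xF0) return 4; // 11110xxx
        return -1;                        // continuation byte, 5/6-byte lead, 0xFE or 0xFF
    }

    /** Reads exactly one character's bytes from the stream; returns null at end of stream. */
    static String readOne(InputStream in) throws IOException {
        int first = in.read();
        if (first < 0) return null;
        int len = sequenceLength(first);
        if (len == 1) return String.valueOf((char) first); // fast path: no decoder involved
        if (len < 0) throw new IOException("unsupported lead byte 0x" + Integer.toHexString(first));

        byte[] buf = new byte[len];
        buf[0] = (byte) first;
        int off = 1;
        while (off < len) { // bulk-read the remaining bytes, never more than len
            int n = in.read(buf, off, len - off);
            if (n < 0) throw new EOFException("stream ended inside a UTF-8 sequence");
            off += n;
        }
        return new String(buf, StandardCharsets.UTF_8);
    }
}
```

Because readOne never requests more than the announced length, the byte that follows a character stays in the stream for whoever reads next, for example a binary parser.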
This Reader assumes that all characters in the InputStream are UTF-8. Byte order marks (BOM) encoded as UTF-8 are dropped, since a byte order mark is meaningless in UTF-8 (and we don't want to contaminate any UTF-16 output). 5-byte and 6-byte UTF-8 byte sequences are currently dropped. Bytes with the value 0xfe or 0xff are dropped, as neither is a valid UTF-8 byte.
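The drop rules above could be expressed as simple byte tests, as in the following sketch. The class name Utf8DropRulesSketch and its methods are hypothetical, and the assumption that the trailing bytes of a 5-byte or 6-byte sequence are consumed and discarded along with the lead byte is an interpretation of "sequences are dropped", not something confirmed by the original code.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: names and skip strategy are assumptions for illustration.
class Utf8DropRulesSketch {

    /** U+FEFF, the code point that a UTF-8 encoded BOM (EF BB BF) decodes to. */
    static final int BOM = 0xFEFF;

    /** True for lead bytes the reader drops: 5-byte leads, 6-byte leads, 0xFE and 0xFF. */
    static boolean isDroppedLeadByte(int b) {
        return (b & 0xFF) >= 0xF8;
    }

    /** Number of continuation bytes to discard after a dropped lead byte. */
    static int trailingBytesToSkip(int b) {
        b &= 0xFF;
        if ((b & 0xFC) == 0xF8) return 4; // 111110xx: 5-byte sequence
        if ((b & 0xFE) == 0xFC) return 5; // 1111110x: 6-byte sequence
        return 0;                         // 0xFE and 0xFF stand alone
    }

    /** True if a decoded code point is a byte order mark and should not be emitted. */
    static boolean isBom(int codePoint) {
        return codePoint == BOM;
    }

    /** Decodes a complete UTF-8 sequence and returns its first code point. */
    static int decode(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8).codePointAt(0);
    }
}
```

With these helpers, decoding the bytes EF BB BF yields U+FEFF, which isBom flags so the reader can skip it instead of returning it to the caller.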
Codepoints that are represented in UTF-16 as surrogate pairs are supported by this reader.
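Since a Java char holds a single UTF-16 code unit, a reader that returns one character per call has to hand out a supplementary code point in two steps. One way to do that, shown here purely as an assumption about how such a reader might work, is to return the high surrogate immediately and hold the low surrogate for the next call. The class SurrogateEmitterSketch and its methods are hypothetical.

```java
// Hypothetical sketch: one possible way to emit surrogate pairs one char at a time.
class SurrogateEmitterSketch {
    private char pendingLow;
    private boolean hasPending;

    /** True if a low surrogate is waiting to be returned by the next read. */
    boolean hasPending() {
        return hasPending;
    }

    /** Accepts a decoded code point and returns the first UTF-16 unit to emit. */
    char accept(int codePoint) {
        if (Character.isBmpCodePoint(codePoint)) {
            return (char) codePoint; // fits in a single char
        }
        pendingLow = Character.lowSurrogate(codePoint); // saved for the next read
        hasPending = true;
        return Character.highSurrogate(codePoint);
    }

    /** Returns the saved low surrogate and clears the pending state. */
    char takePending() {
        hasPending = false;
        return pendingLow;
    }

    /** Decodes a 4-byte UTF-8 sequence (11110xxx 10xxxxxx 10xxxxxx 10xxxxxx) to a code point. */
    static int decode4(byte[] b) {
        return ((b[0] & 0x07) << 18)
             | ((b[1] & 0x3F) << 12)
             | ((b[2] & 0x3F) << 6)
             |  (b[3] & 0x3F);
    }
}
```

For example, the bytes F0 9F 98 80 decode to U+1F600, which is emitted as the high surrogate 0xD83D followed by the low surrogate 0xDE00.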