Represents a stream of tokens parsed from a {@code String}.
The Java standard library provides many classes that you might think would be useful for implementing this, but aren't. For example:
- {@code java.io.StreamTokenizer}: This almost does what we want -- or, at least, something that would get us close to what we want -- except for one fatal flaw: It automatically un-escapes strings using Java escape sequences, which do not include all the escape sequences we need to support (e.g. '\x').
- {@code java.util.Scanner}: This seems like a great way at least to parse regular expressions out of a stream (so we wouldn't have to load the entire input into a single string before parsing). Sadly, {@code Scanner} requires that tokens be delimited with some delimiter. Thus, although the text "foo:" should parse to two tokens ("foo" and ":"), {@code Scanner} would recognize it only as a single token. Furthermore, {@code Scanner} provides no way to inspect the contents of delimiters, making it impossible to keep track of line and column numbers.
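The {@code Scanner} limitation is easy to reproduce. The following is a minimal sketch, not part of this class: the class name {@code ScannerDelimiterSketch} and the helper {@code firstToken} are hypothetical, and the scanner uses its default whitespace delimiter.

```java
import java.util.Scanner;

public class ScannerDelimiterSketch {
    // Hypothetical helper: returns the first token Scanner reports for the input,
    // using Scanner's default delimiter (whitespace).
    static String firstToken(String input) {
        try (Scanner scanner = new Scanner(input)) {
            return scanner.next();
        }
    }

    public static void main(String[] args) {
        // "foo" and ":" are not separated by a delimiter, so they come back fused.
        System.out.println(firstToken("foo: bar")); // prints "foo:"
    }
}
```

Changing the delimiter pattern does not help here, because no delimiter text exists between "foo" and ":" for {@code Scanner} to split on.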
Luckily, Java's regular expression support does manage to be useful to us. (Barely: We need {@code Matcher.usePattern()}, which is new in Java 1.5.) So, we can use that, at least. Unfortunately, this implies that we need to have the entire input in one contiguous string.
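As a rough illustration of the approach, the sketch below walks one {@code Matcher} over a string and switches patterns with {@code Matcher.usePattern()}. It is an assumption-laden example rather than this class's actual implementation: the class name {@code UsePatternSketch}, the method {@code tokenize}, and the three token patterns are all hypothetical, and it uses the diamond operator, so it needs a newer Java than 1.5.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UsePatternSketch {
    // Hypothetical token patterns, chosen only for illustration.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");
    private static final Pattern SYMBOL = Pattern.compile("[:;,{}=]");

    // Splits the input into identifier and symbol tokens, skipping whitespace.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = WHITESPACE.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            // Restrict matching to the unread tail of the input.
            matcher.region(pos, input.length());
            if (matcher.usePattern(WHITESPACE).lookingAt()) {
                // Whitespace is visible here, so a real tokenizer could
                // count newlines to track line and column numbers.
                pos = matcher.end();
            } else if (matcher.usePattern(IDENTIFIER).lookingAt()
                    || matcher.usePattern(SYMBOL).lookingAt()) {
                tokens.add(matcher.group());
                pos = matcher.end();
            } else {
                throw new IllegalArgumentException("Unexpected character at offset " + pos);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("foo:")); // prints "[foo, :]"
    }
}
```

Because {@code usePattern()} keeps the matcher's region, each pattern is tried at the same offset without creating a new {@code Matcher} per token, and no delimiter is needed between "foo" and ":".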