The string tokenizer class allows an application to break a string into tokens by performing code point comparison. The StringTokenizer
methods do not distinguish among identifiers, numbers, and quoted strings, nor do they recognize and skip comments.
The set of delimiters (the code points that separate tokens) may be specified either at creation time or on a per-token basis.
An instance of StringTokenizer behaves in one of three ways, depending on whether it was created with the returnDelims and coalesceDelims flags having the value true or false:
If returnDelims is false, delimiter code points serve to separate tokens. A token is a maximal sequence of consecutive code points that are not delimiters.
If returnDelims is true, delimiter code points are themselves considered to be tokens. In this case, if coalesceDelims is true, each such token is a maximal sequence of consecutive code points that are delimiters. If coalesceDelims is false, a token is returned for each delimiter code point.
A token is thus either one delimiter code point, a maximal sequence of consecutive code points that are delimiters, or a maximal sequence of consecutive code points that are not delimiters.
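The three behaviours can be illustrated with a small stand-alone sketch in plain Java. The class TokenModesSketch and its tokenize helper are hypothetical names introduced here for illustration. They are not part of ICU and do not reproduce its implementation. They only mimic the rules described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical illustration only; this is not ICU's StringTokenizer.
class TokenModesSketch {

    // Splits text by code point under the returnDelims / coalesceDelims rules.
    static List<String> tokenize(String text, String delims,
                                 boolean returnDelims, boolean coalesceDelims) {
        // Collect the delimiter code points (not chars) into a set.
        Set<Integer> delimSet = delims.codePoints().boxed().collect(Collectors.toSet());
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            int start = pos;
            int cp = text.codePointAt(pos);
            boolean isDelim = delimSet.contains(cp);
            pos += Character.charCount(cp);
            // Extend the run, unless every delimiter must be its own token.
            if (!isDelim || coalesceDelims || !returnDelims) {
                while (pos < text.length()) {
                    int next = text.codePointAt(pos);
                    if (delimSet.contains(next) != isDelim) {
                        break;
                    }
                    pos += Character.charCount(next);
                }
            }
            // Delimiter runs are emitted only when returnDelims is true.
            if (!isDelim || returnDelims) {
                tokens.add(text.substring(start, pos));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "a--b";
        System.out.println(tokenize(text, "-", false, false)); // [a, b]
        System.out.println(tokenize(text, "-", true, true));   // [a, --, b]
        System.out.println(tokenize(text, "-", true, false));  // [a, -, -, b]
    }
}
```

In this sketch the value of coalesceDelims has no visible effect when returnDelims is false, since delimiter runs are then skipped in either case.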
A StringTokenizer object internally maintains a current position within the string to be tokenized. Some operations advance this current position past the code point processed.
A token is returned by taking a substring of the string that was used to create the StringTokenizer object.
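As a rough sketch of the idea of a current position and of tokens being substrings of the original string, the following plain-Java class keeps a char index and advances it one code point at a time. CursorTokenizerSketch and its position() accessor are hypothetical and exist only for this illustration. The sketch covers only the case where delimiters separate tokens and makes no claim about how ICU implements it.

```java
import java.util.NoSuchElementException;

// Hypothetical illustration only; this is not ICU's StringTokenizer.
class CursorTokenizerSketch {
    private final String text;
    private final String delims;
    private int pos = 0; // current position, as a char index into text

    CursorTokenizerSketch(String text, String delims) {
        this.text = text;
        this.delims = delims;
    }

    // True if the code point starting at index is one of the delimiters.
    private boolean isDelimiterAt(int index) {
        int cp = text.codePointAt(index);
        return delims.codePoints().anyMatch(d -> d == cp);
    }

    // Returns the first index at or after from that is not a delimiter.
    private int skipDelimiters(int from) {
        int i = from;
        while (i < text.length() && isDelimiterAt(i)) {
            i += Character.charCount(text.codePointAt(i));
        }
        return i;
    }

    // Looks ahead without moving the current position.
    boolean hasMoreTokens() {
        return skipDelimiters(pos) < text.length();
    }

    // Advances the current position past the token that is returned.
    String nextToken() {
        int start = skipDelimiters(pos);
        if (start >= text.length()) {
            throw new NoSuchElementException();
        }
        pos = start;
        while (pos < text.length() && !isDelimiterAt(pos)) {
            pos += Character.charCount(text.codePointAt(pos));
        }
        return text.substring(start, pos); // a substring of the original text
    }

    int position() {
        return pos;
    }

    public static void main(String[] args) {
        CursorTokenizerSketch st = new CursorTokenizerSketch("this is a test", " ");
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            System.out.println(token + " (position now " + st.position() + ")");
        }
    }
}
```

Stepping by Character.charCount rather than by one char keeps a supplementary code point from being split in the middle of its surrogate pair.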
Example of the use of the tokenizer with the default delimiters:
    StringTokenizer st = new StringTokenizer("this is a test");
    while (st.hasMoreTokens()) {
        System.out.println(st.nextToken());
    }
prints the following output:
    this
    is
    a
    test
Example of the use of the tokenizer with user-specified delimiters:
    StringTokenizer st = new StringTokenizer(
        "this is a test with supplementary characters \ud800\ud800\udc00\udc00",
        " \ud800\udc00");
    while (st.hasMoreTokens()) {
        System.out.println(st.nextToken());
    }
prints the following output:
    this
    is
    a
    test
    with
    supplementary
    characters
    \ud800
    \udc00
@author syn wee
@stable ICU 2.4
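The supplementary-character example depends on how Java strings map chars to code points. In the delimiter string, the pair \ud800\udc00 forms the single code point U+10000, while the unpaired \ud800 and \udc00 in the input are separate code points that do not match it. The following sketch uses only standard JDK methods to show that decomposition. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration of how the example's trailing chars group into code points.
class SurrogateSketch {

    // Returns each code point of s as a lowercase hexadecimal string.
    static List<String> codePointsHex(String s) {
        return s.codePoints()
                .mapToObj(Integer::toHexString)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Lone high surrogate, a full surrogate pair, then a lone low surrogate.
        String tail = "\ud800\ud800\udc00\udc00";
        System.out.println(tail.length());                          // 4 chars
        System.out.println(tail.codePointCount(0, tail.length()));  // 3 code points
        System.out.println(codePointsHex(tail));                    // [d800, 10000, dc00]
    }
}
```

Because the middle code point U+10000 is a delimiter in that example, the two unpaired surrogates on either side of it come out as separate tokens.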