util.StringSplitter
StringSplitter is a utility that divides a string into substrings (called tokens) by recognizing separators (called delimiters). It provides an Iterator, called tokenizer, that generalizes StringTokenizer. Whereas StringTokenizer allows only single-character delimiters, StringSplitter allows any string to be defined as a delimiter, and any number of different delimiters may be in use at once. Unlike StringTokenizer, StringSplitter does the splitting when the object is instantiated, so the delimiters are fixed once and for all; it is not possible to change delimiter sets for different parts of the string.
Constructors allow delimiters to be specified as an array of strings, as a StringVector, as a single string (whose characters are each taken to be delimiters, as in StringTokenizer), or as a keyword naming one of the pre-defined delimiter sets: whitespace (the default here and in StringTokenizer), newline, word (uses any punctuation mark or whitespace as a delimiter), paragraph (splits on blank lines or on newlines followed by a tab or space), character (no delimiters: each character is a token), and operator (splits on arithmetic operators). A boolean argument lets the user choose whether the delimiters are themselves included as tokens. The default is false: delimiters are not returned.
Various extensions for returning the tokens in a convenient form are provided. Although the Iterator is provided for compatibility with StringTokenizer, tokens can be accessed more flexibly by index, because StringSplitter extends the ArrayList class and therefore all the methods of ArrayList are available in StringSplitter.
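The splitting behaviour described above can be illustrated with a small stand-alone sketch. This is not the StringSplitter source: the class name SplitSketch, the method split, and its parameter names are hypothetical, and two behaviours are assumptions made for the sketch (the longest delimiter wins when several match at the same position, and empty tokens are dropped).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; names and details are assumptions,
// not the actual StringSplitter API.
final class SplitSketch {

    /**
     * Splits input on any of the given delimiter strings in one pass.
     * Assumption: when several delimiters match at one position, the longest wins.
     * Assumption: empty tokens between adjacent delimiters are dropped.
     */
    static List<String> split(String input, String[] delimiters, boolean returnDelimiters) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            // Find the longest delimiter that starts at position i, if any.
            String match = null;
            for (String d : delimiters) {
                if (!d.isEmpty() && input.startsWith(d, i)
                        && (match == null || d.length() > match.length())) {
                    match = d;
                }
            }
            if (match == null) {
                // Ordinary character: extend the token being built.
                current.append(input.charAt(i));
                i++;
            } else {
                // Delimiter found: close the pending token, optionally keep the delimiter.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
                if (returnDelimiters) {
                    tokens.add(match);
                }
                i += match.length();
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String[] delims = {"->", "::"};
        System.out.println(split("a->b::c", delims, false)); // [a, b, c]
        System.out.println(split("a->b::c", delims, true));  // [a, ->, b, ::, c]
    }
}
```

Because the result is an ordinary List, tokens can be read by index (for example `tokens.get(1)`) as well as through an Iterator, which mirrors the index-based access the class description mentions.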
A test is also provided to see if the input string begins with a delimiter or not; this is useful if delimiter strings are returned as tokens, since one may want to know if the odd-numbered tokens or the even ones are delimiters. Another test is provided to see if the input string ends in a delimiter.
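To show why the begins-with-delimiter test matters, here is a minimal sketch. The class DelimiterEdgeSketch and its method names are hypothetical, not the methods StringSplitter actually exposes, and the parity rule assumes that tokens and delimiters strictly alternate (no two delimiters are adjacent in the input).

```java
// Hypothetical illustration only; these are not the real StringSplitter methods.
final class DelimiterEdgeSketch {

    /** True if the input starts with any of the (non-empty) delimiter strings. */
    static boolean startsWithDelimiter(String input, String[] delimiters) {
        for (String d : delimiters) {
            if (!d.isEmpty() && input.startsWith(d)) {
                return true;
            }
        }
        return false;
    }

    /** True if the input ends with any of the (non-empty) delimiter strings. */
    static boolean endsWithDelimiter(String input, String[] delimiters) {
        for (String d : delimiters) {
            if (!d.isEmpty() && input.endsWith(d)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Assumes delimiters are returned as tokens and strictly alternate with
     * ordinary tokens. With 0-based indices, delimiters sit at even positions
     * when the input starts with a delimiter, and at odd positions otherwise.
     */
    static boolean isDelimiterIndex(int index, boolean inputStartsWithDelimiter) {
        return (index % 2 == 0) == inputStartsWithDelimiter;
    }

    public static void main(String[] args) {
        String[] delims = {"::"};
        boolean starts = startsWithDelimiter("::a::b", delims);
        System.out.println(starts);                                // true
        System.out.println(endsWithDelimiter("::a::b", delims));   // false
        System.out.println(isDelimiterIndex(0, starts));           // true
        System.out.println(isDelimiterIndex(1, starts));           // false
    }
}
```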
@author B F Schutz, modified by Ian Taylor, 2004
@created 12 July 1998
@version $Revision: 1.1 $