The CSVTokenizer class allows an application to break a string in comma-separated values (CSV) format into tokens. The tokenization method is much simpler than the one used by the StringTokenizer class. The CSVTokenizer methods do not distinguish among identifiers, numbers, and quoted strings, nor do they recognize and skip comments.
The set of separators (the characters that separate tokens) may be specified either at creation time or on a per-token basis.
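The CSVTokenizer constructors and overloads are not listed here, so the following is only a sketch of the two ways of supplying separators, written against java.util.StringTokenizer, which offers both. That CSVTokenizer exposes equivalent signatures is an assumption; the class name SeparatorSetSketch and its two helper methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical illustration class; uses java.util.StringTokenizer as a stand-in.
class SeparatorSetSketch {

    // Separators fixed at creation time: every token is split on ',' or ';'.
    static List<String> fixedSeparators(String input) {
        StringTokenizer st = new StringTokenizer(input, ",;");
        List<String> tokens = new ArrayList<>();
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // Separators chosen per token: the first token ends at ':',
    // later tokens are split on ':' or ','.
    static List<String> perTokenSeparators(String input) {
        StringTokenizer st = new StringTokenizer(input);
        List<String> tokens = new ArrayList<>();
        tokens.add(st.nextToken(":"));
        while (st.hasMoreTokens()) {
            // ':' stays in the set so the colon left behind by the
            // first call is skipped instead of starting the next token.
            tokens.add(st.nextToken(":,"));
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(fixedSeparators("a,b;c"));
        System.out.println(perTokenSeparators("name:a,b,c"));
    }
}
```

With StringTokenizer, a separator set passed to nextToken remains in effect for later calls; whether CSVTokenizer does the same is not stated here.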
An instance of CSVTokenizer behaves in one of two ways, depending on whether it was created with the returnSeparators flag having the value true or false:
- If the flag is false, separator characters serve to separate tokens. A token is a maximal sequence of consecutive characters that are not separators.
- If the flag is true, separator characters are themselves considered to be tokens. A token is thus either one separator character, or a maximal sequence of consecutive characters that are not separators.
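The two behaviours can be compared side by side. Because the CSVTokenizer constructor that takes the flag is not shown here, this sketch uses java.util.StringTokenizer, whose returnDelims argument is documented with the same two cases; treating it as a stand-in for returnSeparators is an assumption, and the class name ReturnSeparatorsSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical illustration class; StringTokenizer stands in for CSVTokenizer.
class ReturnSeparatorsSketch {

    // Collects all tokens of input, splitting on ','.
    // The boolean mirrors the returnSeparators flag described above.
    static List<String> tokenize(String input, boolean returnSeparators) {
        StringTokenizer st = new StringTokenizer(input, ",", returnSeparators);
        List<String> tokens = new ArrayList<>();
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Flag false: only the fields are returned.
        System.out.println(String.join(" | ", tokenize("a,b,c", false)));
        // Flag true: each comma is returned as a one-character token.
        System.out.println(String.join(" | ", tokenize("a,b,c", true)));
    }
}
```

With the flag set to false the call yields three tokens; with it set to true the same input yields five, because each of the two commas becomes a token of its own.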
A CSVTokenizer object internally maintains a current position within the string to be tokenized. Some operations advance this current position past the characters processed.
A token is returned by taking a substring of the string that was used to create the CSVTokenizer object.
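The position tracking and substring extraction described above can be sketched as follows. This is not the CSVTokenizer source; PositionSketch and all of its members are hypothetical, and it assumes a single separator character with separators not returned as tokens.

```java
import java.util.NoSuchElementException;

// Hypothetical minimal tokenizer showing the current-position idea.
class PositionSketch {
    private final String text;    // the string being tokenized
    private final char separator; // single separator character (assumption)
    private int position = 0;     // current position within text

    PositionSketch(String text, char separator) {
        this.text = text;
        this.separator = separator;
    }

    // Returns the index of the first non-separator character at or after from.
    private int skipSeparators(int from) {
        while (from < text.length() && text.charAt(from) == separator) {
            from++;
        }
        return from;
    }

    // Does not advance the current position.
    boolean hasMoreTokens() {
        return skipSeparators(position) < text.length();
    }

    // Advances the current position past the characters processed
    // and returns the token as a substring of the original string.
    String nextToken() {
        int start = skipSeparators(position);
        if (start >= text.length()) {
            throw new NoSuchElementException();
        }
        int end = start;
        while (end < text.length() && text.charAt(end) != separator) {
            end++;
        }
        position = end;
        return text.substring(start, end);
    }

    int currentPosition() {
        return position;
    }
}
```

In this sketch only nextToken moves the position; hasMoreTokens looks ahead without changing it, which is one way to read "some operations advance this current position".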
The following is one example of the use of the tokenizer. The code:
CSVTokenizer csvt = new CSVTokenizer("this,is,a,test");
while (csvt.hasMoreTokens()) {
    System.out.println(csvt.nextToken());
}
prints the following output:
this
is
a
test
@author abupon