A TextLine is a type of {@link cascading.scheme.Scheme} for plain text files. Files are broken into lines. Either a line-feed or a carriage-return signals the end of a line.
By default, this scheme returns a {@link Tuple} with two fields, "offset" and "line".
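As a rough illustration of the default "offset" and "line" pair, the following plain-Java sketch splits a string on line-feed or carriage-return and records where each line starts. It does not use the Cascading API: the class {@code OffsetLineSketch} and its nested {@code OffsetLine} type are hypothetical, the offset is assumed to be the position of the line's first character, and a carriage-return followed by a line-feed is assumed to count as a single terminator.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Cascading.
final class OffsetLineSketch {

    /** Stand-in for a two-field tuple: ("offset", "line"). */
    static final class OffsetLine {
        final long offset;
        final String line;

        OffsetLine(long offset, String line) {
            this.offset = offset;
            this.line = line;
        }
    }

    /** Splits text on '\n' or '\r', pairing each line with its start position. */
    static List<OffsetLine> split(String text) {
        List<OffsetLine> result = new ArrayList<>();
        int start = 0;
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (c == '\n' || c == '\r') {
                result.add(new OffsetLine(start, text.substring(start, i)));
                // Assumption: "\r\n" is treated as one terminator, not two.
                if (c == '\r' && i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    i++;
                }
                start = i + 1;
            }
            i++;
        }
        // Emit a final line that has no trailing terminator.
        if (start < text.length()) {
            result.add(new OffsetLine(start, text.substring(start)));
        }
        return result;
    }
}
```

Each returned element plays the role of one Tuple; in the real scheme the offset may be measured in bytes rather than characters.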
Many of the constructors take both "sourceFields" and "sinkFields". sourceFields denotes the field names to be used instead of the names "offset" and "line". sinkFields is a selector and is by default {@link Fields#ALL}. Any available field names can be given if only a subset of the incoming fields should be used.
If a {@link Fields} instance having only one field is passed to the constructor as sourceFields, the returned tuples will simply contain the "line" value under the given field name.
Note that TextLine will concatenate all the Tuple values for the selected fields with a TAB delimiter before writing out the line.
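The TAB concatenation described above can be sketched in plain Java as follows. This is not the scheme's actual implementation: {@code TabJoinSketch} and {@code toLine} are hypothetical names, and rendering a {@code null} value as an empty string is an assumption made for this sketch only.

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical illustration only; not part of Cascading.
final class TabJoinSketch {

    /** Joins the selected values into a single output line, TAB-delimited. */
    static String toLine(List<?> values) {
        StringJoiner joiner = new StringJoiner("\t");
        for (Object value : values) {
            // Assumption: null values are written as empty strings.
            joiner.add(value == null ? "" : value.toString());
        }
        return joiner.toString();
    }
}
```

One consequence worth keeping in mind is that a value which itself contains a TAB cannot be told apart from a field boundary once the line is written.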
Note that sink compression is {@link Compress#DISABLE} by default. If {@code null} is passed to the constructor for the compression value, it will remain disabled.
If any of the input files end with ".zip", an error will be thrown.
By default, all text is encoded/decoded as UTF-8. This can be changed via the {@code charsetName} constructor argument.
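To show why the charset matters, this plain-Java sketch decodes the same bytes under two charset names. It relies only on {@code java.nio.charset}; the class {@code CharsetSketch} is hypothetical, and falling back to UTF-8 when the name is {@code null} is an assumption of the sketch, not documented behavior of the scheme.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of Cascading.
final class CharsetSketch {

    /** Decodes raw line bytes using the named charset. */
    static String decode(byte[] bytes, String charsetName) {
        // Assumption: a null name means "use the UTF-8 default".
        Charset charset = charsetName == null
                ? StandardCharsets.UTF_8
                : Charset.forName(charsetName);
        return new String(bytes, charset);
    }
}
```

The two bytes 0xC3 0xA9 decode to a single character under UTF-8 but to two characters under ISO-8859-1, so the source and sink should agree on the charset.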