A wrapper around a SequenceFile that contains Cascading's Tuples that implements a Pangool-friendly InputFormat. The Schema is lazily discovered with the first seen Cascading Tuple. The type correspondence is:
- Integer - INT
- Long - LONG
- Float - FLOAT
- Double - DOUBLE
- String - STRING
- Short - INT
- Boolean - BOOLEAN
Any other type is unrecognized and an IOException is thrown.
Column names must be provided to the InputFormat, this is of course because Cascading doesn't save them anywhere. The schemaName is used to instantiate a Pangool Schema.
Note that for this to work Cascading serialization must have been enabled in Hadoop Configuration. You can do this by calling static method {@link #setSerializations(Configuration)}.