A Tuple represents a set of values. Consider a Tuple the same as a database record, where every value is a column in that record.
A "tuple stream" is a set of Tuple instances passed consecutively through a Pipe assembly.
Tuples work in tandem with {@link Fields} and {@link TupleEntry} instances. A TupleEntry holds an instance of Fields and a Tuple. It allows a Tuple to be accessed by its field names, and will help maintain consistent types if any are given on the Fields instance. That is, if a field is declared as an Integer, calling {@link #set(int, Object)} with a String will force the String to be coerced into an Integer instance.
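The coercion behavior above can be sketched as follows. This is a minimal, hedged example, assuming the Cascading jars are on the classpath; the field names ("name", "count") are illustrative, and it assumes the {@code Fields#applyTypes(Type...)} and {@code TupleEntry} accessor methods available in recent Cascading releases.

```java
import cascading.tuple.Fields;
import cascading.tuple.Tuple;
import cascading.tuple.TupleEntry;

public class CoercionSketch
  {
  public static void main( String[] args )
    {
    // declare two fields, typing "count" as Integer (names are hypothetical)
    Fields fields = new Fields( "name", "count" ).applyTypes( String.class, Integer.class );

    // wrap an empty two-position Tuple in a TupleEntry so values can be
    // addressed by field name rather than by position
    TupleEntry entry = new TupleEntry( fields, Tuple.size( 2 ) );

    entry.setObject( "name", "widgets" );
    entry.setObject( "count", "42" ); // the String "42" is coerced to an Integer

    // reading back by field name yields the canonical Integer value
    Integer count = entry.getInteger( "count" );
    System.out.println( count );
    }
  }
```

Because the Fields instance carries the declared types, the TupleEntry maintains the canonical representation on both set and get, so downstream operations never see the raw String.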
For managing custom types, see the {@link CoercibleType} interface which extends {@link Type}.
Tuple instances created by user code are, by default, mutable (modifiable). Tuple instances created by the system are immutable (unmodifiable; test by calling {@link #isUnmodifiable()}).
For example, Tuple instances returned by {@link cascading.operation.FunctionCall#getArguments()} are always unmodifiable. Thus they must be copied if they will be changed by user code or cached in the local context. See the Tuple copy constructor, or the {@code *Copy()} methods on {@link TupleEntry}.
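The copy-before-modify rule can be sketched as below. This is an illustrative example, not runnable inside an actual Operation: it assumes the Cascading jars are on the classpath and uses {@code Tuples#asUnmodifiable(Tuple)} to stand in for a system-created tuple such as the one returned by {@code FunctionCall#getArguments()}.

```java
import cascading.tuple.Tuple;
import cascading.tuple.Tuples;

public class CopySketch
  {
  public static void main( String[] args )
    {
    // simulate a system-created tuple; inside an Operation this would come
    // from functionCall.getArguments()
    Tuple arguments = Tuples.asUnmodifiable( new Tuple( "a", 1 ) );

    System.out.println( arguments.isUnmodifiable() ); // system tuples report true

    // arguments.set( 1, 2 ) here would fail, since the tuple is unmodifiable;
    // instead, copy the tuple first via the copy constructor
    Tuple copy = new Tuple( arguments );
    copy.set( 1, 2 ); // safe: the copy is modifiable

    System.out.println( copy );
    }
  }
```

Copying is also required before caching a tuple locally, because the framework may reuse the underlying system tuple on the next call.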
Because a Tuple can hold any Object type, it is suitable for storing custom types. But every custom type must have serialization support in the underlying framework.
For Hadoop, a {@link org.apache.hadoop.io.serializer.Serialization} implementation must be registered with Hadoop. For further performance improvements, see the {@link cascading.tuple.hadoop.SerializationToken} Java annotation.
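Registering a Serialization amounts to adding its class name to Hadoop's {@code io.serializations} property. A minimal sketch, assuming a hypothetical {@code com.example.MySerialization} class that implements {@code org.apache.hadoop.io.serializer.Serialization}:

```java
import java.util.Properties;

public class SerializationConfigSketch
  {
  public static void main( String[] args )
    {
    Properties properties = new Properties();

    // append the custom Serialization (class name is hypothetical) to the
    // Hadoop "io.serializations" list; Hadoop's default WritableSerialization
    // must remain registered alongside it
    properties.setProperty( "io.serializations",
      "org.apache.hadoop.io.serializer.WritableSerialization,"
        + "com.example.MySerialization" );

    System.out.println( properties.getProperty( "io.serializations" ) );
    }
  }
```

These properties are typically handed to the flow connector when the flow is created, so the setting reaches every Hadoop job the flow spawns.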
@see org.apache.hadoop.io.serializer.Serialization
@see cascading.tuple.hadoop.SerializationToken