Class TemplateTap can be used to write tuple streams out to sub-directories based on the values in the {@link Tuple} instance.
The constructor takes a {@link Hfs} {@link cascading.tap.Tap} and a {@link java.util.Formatter} format syntax String. This allows Tuple values at given positions to be used as directory names. Note that Hadoop can only sink to directories, and all files in those directories are named "part-xxxxx".
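As a rough illustration of how a {@link java.util.Formatter} template turns Tuple values into a directory name, the sketch below uses only {@code String.format}. The class {@code PathTemplateSketch} and the method {@code populate} are hypothetical names chosen for this example; they are not part of the TemplateTap API, and the actual implementation may differ.

```java
import java.util.Locale;

// Hypothetical helper class for illustration only; not part of Cascading.
class PathTemplateSketch {

    // Fills a Formatter-style template with values taken, in order,
    // from the tuple positions selected for the path.
    static String populate(String pathTemplate, Object... values) {
        return String.format(Locale.ROOT, pathTemplate, values);
    }

    public static void main(String[] args) {
        // A template of "%s/%s" with the values (2012, "June")
        // yields the relative sub-directory "2012/June".
        System.out.println(populate("%s/%s", 2012, "June"));
    }
}
```

Under this assumption, each distinct combination of values produces its own sub-directory beneath the parent tap's path.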
{@code openTapsThreshold} limits the number of files that may be open for output at once. This value defaults to 300 files. Each time the threshold is exceeded, 10% of the least recently used open files will be closed.
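The eviction behavior described above can be sketched with an access-ordered {@link java.util.LinkedHashMap}. This is a minimal model under stated assumptions, not the TemplateTap implementation: the class {@code OpenTapsSketch}, its fields, and the use of a {@code StringBuilder} as a stand-in for an open file are all hypothetical, and the choice to close 10% of the threshold (at least one file) is one plausible reading of the rule.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a bounded set of open sinks; not part of Cascading.
class OpenTapsSketch {
    private final int threshold;

    // accessOrder=true makes iteration run from least to most recently used.
    private final LinkedHashMap<String, StringBuilder> open =
        new LinkedHashMap<>(16, 0.75f, true);

    // Paths closed so far, in the order they were closed.
    final List<String> closed = new ArrayList<>();

    OpenTapsSketch(int threshold) {
        this.threshold = threshold;
    }

    void write(String path, String record) {
        StringBuilder sink = open.get(path); // get() marks the path as recently used
        if (sink == null) {
            sink = new StringBuilder();
            open.put(path, sink);
            if (open.size() > threshold)
                purge();
        }
        sink.append(record).append('\n');
    }

    // Closes roughly 10% of the threshold, least recently used first.
    private void purge() {
        int toClose = Math.max(1, threshold / 10);
        Iterator<Map.Entry<String, StringBuilder>> it = open.entrySet().iterator();
        while (toClose-- > 0 && it.hasNext()) {
            closed.add(it.next().getKey());
            it.remove();
        }
    }

    int openCount() {
        return open.size();
    }
}
```

With a threshold of 10, writing to an eleventh distinct path closes the single least recently used path, so the number of open sinks returns to 10.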
TemplateTap will populate a given {@code pathTemplate} without regard to the case of the values being used. Thus the resulting paths {@code 2012/June/} and {@code 2012/june/} will likely result in two open files writing to the same location. Forcing the case to be consistent with an upstream {@link cascading.operation.Function} is recommended, see {@link cascading.operation.expression.ExpressionFunction}.
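The effect of forcing a consistent case before the template is populated can be sketched in plain Java. The class {@code CaseNormalizeSketch} and the method {@code normalizedPath} are hypothetical and stand in for whatever upstream operation performs the normalization; because every value is converted to a lower-cased String, this sketch assumes the template uses only {@code %s} conversions.

```java
import java.util.Locale;

// Hypothetical helper class for illustration only; not part of Cascading.
class CaseNormalizeSketch {

    // Lower-cases every value before filling the template, so values that
    // differ only by case map to the same path.
    static String normalizedPath(String pathTemplate, Object... values) {
        Object[] lowered = new Object[values.length];
        for (int i = 0; i < values.length; i++)
            lowered[i] = String.valueOf(values[i]).toLowerCase(Locale.ROOT);
        return String.format(Locale.ROOT, pathTemplate, lowered);
    }

    public static void main(String[] args) {
        // Both calls produce the same path, "2012/june".
        System.out.println(normalizedPath("%s/%s", 2012, "June"));
        System.out.println(normalizedPath("%s/%s", 2012, "june"));
    }
}
```

In a flow, the same normalization would be applied to the relevant fields by an upstream operation so that the tap only ever sees one spelling of each value.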
Though Hadoop has no mechanism to prevent simultaneous writes to a directory from multiple jobs, that does not mean it is safe to do so. The same is true of the TemplateTap. Interleaving writes to a common parent (root) directory across multiple flows will very likely lead to data loss.