An Aggregator takes the set of all values associated with a unique grouping and returns zero or more values. {@link cascading.operation.aggregator.MaxValue}, {@link cascading.operation.aggregator.MinValue}, {@link cascading.operation.aggregator.Count}, and {@link cascading.operation.aggregator.Average} are good examples.
Aggregator implementations should be reentrant. There is no guarantee that an Aggregator instance will be executed in a single JVM or by a single thread, so intermediate values should not be kept in instance fields. The {@link #start(cascading.flow.FlowProcess,AggregatorCall)} method provides a mechanism for maintaining a 'context' object to hold intermediate values.
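The context mechanism described above can be illustrated with a minimal, self-contained sketch. It does not use the Cascading classes: {@code Call}, {@code SumContext}, and {@code SumAggregator} are hypothetical stand-ins that only mimic the start/aggregate/complete lifecycle, and {@code complete} returns its result directly, whereas a real Aggregator would emit it through the call's output collector.

```java
import java.util.Arrays;
import java.util.List;

public class ContextSketch {
    // Hypothetical stand-in for AggregatorCall: it only carries the per-group context.
    static final class Call<C> {
        private C context;
        C getContext() { return context; }
        void setContext(C context) { this.context = context; }
    }

    // Per-group intermediate state; it lives in the call, never in the aggregator.
    static final class SumContext {
        long sum;
    }

    // Holds no mutable fields, so one instance can be shared across groups.
    static final class SumAggregator {
        void start(Call<SumContext> call) {
            call.setContext(new SumContext()); // fresh state for each grouping
        }

        void aggregate(Call<SumContext> call, long value) {
            call.getContext().sum += value;
        }

        long complete(Call<SumContext> call) {
            return call.getContext().sum;
        }
    }

    // Drives one grouping through the start/aggregate/complete lifecycle.
    static long sumGroup(SumAggregator aggregator, List<Long> values) {
        Call<SumContext> call = new Call<>();
        aggregator.start(call);
        for (long v : values) {
            aggregator.aggregate(call, v);
        }
        return aggregator.complete(call);
    }

    public static void main(String[] args) {
        SumAggregator shared = new SumAggregator();
        System.out.println(sumGroup(shared, Arrays.asList(1L, 2L, 3L))); // 6
        System.out.println(sumGroup(shared, Arrays.asList(10L, 20L)));   // 30
    }
}
```

Because the running sum is stored in the context rather than in the aggregator, reusing the same instance for a second grouping does not leak state from the first.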
Note that {@link TupleEntry} instances are reused internally, so they should not be stored. Instead, use the TupleEntry or Tuple copy constructors to make safe copies.
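The hazard of storing a reused instance can be shown with a small sketch. {@code Entry} is a hypothetical stand-in for a reused TupleEntry, not the Cascading class; the loop plays the role of the framework refilling the same object for every value.

```java
import java.util.ArrayList;
import java.util.List;

public class ReuseSketch {
    // Hypothetical stand-in for a reused TupleEntry: one mutable holder, refilled per value.
    static final class Entry {
        String value;

        Entry() {}

        // Copy constructor: snapshots the current contents.
        Entry(Entry other) { this.value = other.value; }
    }

    // Keeps either the shared instance itself or a copy of it for each input.
    static List<String> collect(String[] inputs, boolean copy) {
        Entry reused = new Entry();
        List<Entry> kept = new ArrayList<>();
        for (String input : inputs) {
            reused.value = input; // the same instance is overwritten each time
            kept.add(copy ? new Entry(reused) : reused);
        }
        List<String> seen = new ArrayList<>();
        for (Entry e : kept) {
            seen.add(e.value);
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] inputs = {"a", "b", "c"};
        System.out.println(collect(inputs, false)); // [c, c, c]
        System.out.println(collect(inputs, true));  // [a, b, c]
    }
}
```

Without the copy, every stored reference points at the one shared instance and therefore shows only the last value written to it.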
Because Aggregators can be chained, and Cascading pipelines all operation results, any Aggregator ahead of the current Aggregator must return a value before the {@link #complete(cascading.flow.FlowProcess,AggregatorCall)} method on this Aggregator is called. Consequently, if any previous Aggregator returns more than one Tuple result, this complete() method will be called once for each Tuple emitted.
Thus it is a best practice to implement a {@link Buffer} instead when emitting zero, or more than one, Tuple results.
@see AggregatorCall
@see OperationCall