// The "each" primitive allows us to apply either filters or functions to the stream
// We always have to select the input fields.
topology
.newStream("filter", spout)
.each(new Fields("actor"), new RegexFilter("pere"))
.each(new Fields("text", "actor"), new Print());
// Functions declare their output fields, which are always appended to the input fields.
// As you can see, "each" operations can be chained.
topology
.newStream("function", spout)
.each(new Fields("text"), new ToUpperCase(), new Fields("uppercased_text"))
.each(new Fields("text", "uppercased_text"), new Print());
// You can prune unnecessary fields using "project"
topology
.newStream("projection", spout)
.each(new Fields("text"), new ToUpperCase(), new Fields("uppercased_text"))
.project(new Fields("uppercased_text"))
.each(new Fields("uppercased_text"), new Print());
// A stream can be parallelized with "parallelismHint".
// The parallelism hint applies upstream: to every operation before it, up until the
// previous partitioning operation (we will see partitioning later).
// This topology therefore creates 5 spouts and 5 bolts.
// Let's verify that with TridentOperationContext.getPartitionIndex()!
topology
.newStream("parallel", spout)
.each(new Fields("actor"), new RegexFilter("pere"))
.parallelismHint(5)
.each(new Fields("text", "actor"), new Print());
// You can perform aggregations by grouping the stream and then applying an aggregator.
// Note how each actor appears more than once: we are aggregating within each small batch
// (a.k.a. micro-batch), not across the whole stream.
// This is useful for pre-aggregating results before storing them in a database.
topology
.newStream("aggregation", spout)
.groupBy(new Fields("actor"))
.aggregate(new Count(), new Fields("count"))
.each(new Fields("actor", "count"), new Print());
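// For intuition: the built-in Count (storm.trident.operation.builtin.Count) is a
// CombinerAggregator. A sketch of the same idea, assuming the CombinerAggregator
// interface from storm.trident.operation (check the Storm sources for the real class):
class CountSketch implements CombinerAggregator<Long> {
    @Override
    public Long init(TridentTuple tuple) {
        return 1L; // each tuple contributes 1
    }

    @Override
    public Long combine(Long val1, Long val2) {
        return val1 + val2; // partial counts are summed together
    }

    @Override
    public Long zero() {
        return 0L; // identity value, used for empty partitions
    }
}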
// In order to aggregate across batches, we need "persistentAggregate".
// This example increments a count in the DB, using the results of these micro-batch
// aggregations (here we simply use a hash map as the "database").
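// A minimal sketch of what that could look like, using Trident's in-memory
// MemoryMapState (storm.trident.testing.MemoryMapState) as the hash-map "database".
// The stream name is an arbitrary choice for this sketch:
topology
.newStream("persistent_aggregation", spout)
.groupBy(new Fields("actor"))
.persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("count"))
.newValuesStream()
.each(new Fields("actor", "count"), new Print());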