Examples of storm.trident.TridentTopology.newStream()


        // The returned TridentState object can also be queried from other topologies.
        //
        // The state is commonly backed by a data store such as Memcached or Cassandra.
        // Here we simply use an in-memory hash map (MemoryMapState).
        TridentState countState =
                topology
                        .newStream("spout", spout)
                        .groupBy(new Fields("actor"))
                        .persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("count"));


        // There are a few ready-made state libraries that you can use


        // Also, Spouts are "batched".
        TridentTopology topology = new TridentTopology();


        // The "each" primitive allows us to apply either filters or functions to the stream
        // We always have to select the input fields.
        topology
                .newStream("filter", spout)
                .each(new Fields("actor"), new RegexFilter("pere"))
                .each(new Fields("text", "actor"), new Print());
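
The filter form of "each" can be pictured with a small plain-Java model. This is only a sketch and does not use the Storm API: tuples are modelled as maps, and `FilterSketch`, `filter` and `tuple` are hypothetical names introduced for illustration. It shows that a filter inspects the selected field but lets every field of a surviving tuple through.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class FilterSketch {
    // Keeps only the tuples whose selected field passes the predicate.
    // Surviving tuples keep all of their fields, not just the selected one.
    static List<Map<String, Object>> filter(List<Map<String, Object>> tuples,
                                            String field,
                                            Predicate<Object> keep) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> t : tuples) {
            if (keep.test(t.get(field))) {
                out.add(t);
            }
        }
        return out;
    }

    // Builds a two-field tuple (hypothetical helper, not part of Storm).
    static Map<String, Object> tuple(String actor, String text) {
        Map<String, Object> t = new LinkedHashMap<>();
        t.put("actor", actor);
        t.put("text", text);
        return t;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> batch = new ArrayList<>();
        batch.add(tuple("pere", "hello"));
        batch.add(tuple("dave", "hi"));
        batch.add(tuple("pere", "bye"));
        // Assumption: the filter keeps tuples whose actor is exactly "pere".
        System.out.println(filter(batch, "actor", v -> "pere".equals(v)));
    }
}
```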




        // Functions describe their output fields, which are always appended to the input fields.
        // As you can see, "each" operations can be chained.
        topology
                .newStream("function", spout)
                .each(new Fields("text"), new ToUpperCase(), new Fields("uppercased_text"))
                .each(new Fields("text", "uppercased_text"), new Print());
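
The "output fields are appended" behaviour can be sketched in plain Java. This is a model under stated assumptions, not Storm code: a tuple is an ordered map, and `AppendFieldSketch` and `applyFunction` are hypothetical names. The function reads one input field and the result is added under a new field name while the original fields stay in place.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

public class AppendFieldSketch {
    // Applies fn to one input field and appends the result under outputField.
    // The input tuple is copied, so the original fields are preserved in order.
    static Map<String, Object> applyFunction(Map<String, Object> tuple,
                                             String inputField,
                                             Function<Object, Object> fn,
                                             String outputField) {
        Map<String, Object> out = new LinkedHashMap<>(tuple);
        out.put(outputField, fn.apply(tuple.get(inputField)));
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> tuple = new LinkedHashMap<>();
        tuple.put("actor", "pere");
        tuple.put("text", "hello storm");
        Map<String, Object> result = applyFunction(
                tuple, "text",
                v -> ((String) v).toUpperCase(Locale.ROOT),
                "uppercased_text");
        // The result holds actor, text and the new uppercased_text field.
        System.out.println(result);
    }
}
```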




        // You can prune unnecessary fields using "project"
        topology
                .newStream("projection", spout)
                .each(new Fields("text"), new ToUpperCase(), new Fields("uppercased_text"))
                .project(new Fields("uppercased_text"))
                .each(new Fields("uppercased_text"), new Print());
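
Projection can be modelled the same way. The sketch below is plain Java rather than Storm code, and `ProjectSketch` and `project` are hypothetical names: only the listed fields are copied into the outgoing tuple, so anything else is dropped before the next operation sees it.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ProjectSketch {
    // Returns a new tuple containing only the requested fields, in the requested order.
    static Map<String, Object> project(Map<String, Object> tuple, List<String> keep) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String field : keep) {
            out.put(field, tuple.get(field));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> tuple = new LinkedHashMap<>();
        tuple.put("actor", "pere");
        tuple.put("text", "hello");
        tuple.put("uppercased_text", "HELLO");
        // Only uppercased_text survives the projection.
        System.out.println(project(tuple, Arrays.asList("uppercased_text")));
    }
}
```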



        // A stream can be parallelized with "parallelismHint".
        // The parallelism hint is applied downwards until a partitioning operation (we will see this later).
        // This topology creates 5 spouts and 5 bolts.
        // Let's check that with TridentOperationContext.getPartitionIndex().
        topology
                .newStream("parallel", spout)
                .each(new Fields("actor"), new RegexFilter("pere"))
                .parallelismHint(5)
                .each(new Fields("text", "actor"), new Print());



        // You can perform aggregations by grouping the stream and then applying an aggregation
        // Note how each actor appears more than once. We are aggregating inside small batches (aka micro batches)
        // This is useful for pre-processing before storing the result to databases
        topology
                .newStream("aggregation", spout)
                .groupBy(new Fields("actor"))
                .aggregate(new Count(), new Fields("count"))
                .each(new Fields("actor", "count"), new Print());
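
Why an actor shows up more than once becomes clearer with a plain-Java model of per-batch counting. This is a sketch, not the Storm implementation: batches are plain lists, and `BatchCountSketch` and `countBatch` are hypothetical names. Each batch is counted on its own, so the same key produces one result per batch it appears in.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class BatchCountSketch {
    // Counts occurrences of each actor inside a single batch only.
    static Map<String, Long> countBatch(List<String> actors) {
        Map<String, Long> counts = new TreeMap<>();
        for (String actor : actors) {
            counts.merge(actor, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> batches = Arrays.asList(
                Arrays.asList("pere", "dave", "pere"),
                Arrays.asList("pere", "dave"));
        // One independent count map is printed per batch;
        // "pere" appears in both, with a separate count each time.
        for (List<String> batch : batches) {
            System.out.println(countBatch(batch));
        }
    }
}
```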



        // In order to aggregate across batches, we need persistentAggregate.
        // This example is incrementing a count in the DB, using the result of these micro batch aggregations
        // (here we are simply using a hash map for the "database")
        topology
                .newStream("aggregation", spout)
                .groupBy(new Fields("actor"))
                .persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("count"));
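
The idea of folding micro-batch results into a store can be sketched in plain Java. This is an illustrative model only: a `HashMap` stands in for the database, and `PersistentCountSketch`, `applyBatch` and `get` are hypothetical names, not Storm or MemoryMapState APIs. Each batch is first counted locally, then the partial counts are added to the running totals.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PersistentCountSketch {
    // Stands in for the "database"; it survives across batches.
    private final Map<String, Long> state = new HashMap<>();

    // Counts one batch locally, then merges the partial counts into the state.
    void applyBatch(List<String> actors) {
        Map<String, Long> partial = new HashMap<>();
        for (String actor : actors) {
            partial.merge(actor, 1L, Long::sum);
        }
        for (Map.Entry<String, Long> e : partial.entrySet()) {
            state.merge(e.getKey(), e.getValue(), Long::sum);
        }
    }

    // Returns the running total for an actor, or 0 if it was never seen.
    long get(String actor) {
        return state.getOrDefault(actor, 0L);
    }

    public static void main(String[] args) {
        PersistentCountSketch counts = new PersistentCountSketch();
        counts.applyBatch(Arrays.asList("pere", "dave", "pere"));
        counts.applyBatch(Arrays.asList("pere", "dave"));
        System.out.println("pere=" + counts.get("pere") + " dave=" + counts.get("dave"));
    }
}
```

Merging one partial result per key, rather than one update per tuple, is what keeps the number of writes to the backing store small.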



        TridentTopology topology = new TridentTopology();


        // We have seen how to use groupBy, but you can use a more low-level form of aggregation as well
        // This example keeps track of counts, but this time it aggregates the result into a hash map
        topology
                .newStream("aggregation", spout)
                .aggregate(new Fields("location"), new StringCounter(), new Fields("aggregated_result"))
                .parallelismHint(3);



        // We can affect how the processing is parallelized by using "partitioning"
        topology
                .newStream("aggregation", spout)
                .partitionBy(new Fields("location"))
                .partitionAggregate(new Fields("location"), new StringCounter(), new Fields("aggregated_result"))
                .parallelismHint(3);
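
The effect of partitioning by a field before a per-partition aggregation can be sketched in plain Java. This is a model under assumptions, not Storm's code: the target partition is taken as the hash of the field value modulo the number of partitions, `String.hashCode` is used as a stand-in for whatever hash Storm applies, and `PartitionBySketch`, `partitionFor` and `partitionAndCount` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PartitionBySketch {
    // Picks a partition from the field value; equal values always map to the same partition.
    static int partitionFor(String fieldValue, int numPartitions) {
        return Math.floorMod(fieldValue.hashCode(), numPartitions);
    }

    // Routes every location to its partition and counts locations inside each partition.
    static List<Map<String, Long>> partitionAndCount(List<String> locations, int numPartitions) {
        List<Map<String, Long>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new TreeMap<>());
        }
        for (String location : locations) {
            partitions.get(partitionFor(location, numPartitions))
                      .merge(location, 1L, Long::sum);
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<String> locations = Arrays.asList("Spain", "USA", "Spain", "UK", "Spain");
        List<Map<String, Long>> result = partitionAndCount(locations, 3);
        for (int i = 0; i < result.size(); i++) {
            System.out.println("partition " + i + ": " + result.get(i));
        }
    }
}
```

Because every "Spain" tuple lands in the same partition, its count is complete within that partition. The same property means a very frequent value concentrates work on a single partition.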


        // aggregators. In the latter, all inputs with a given location are routed to the same aggregator instance.
        // This means that more summarization can be done in the latter, which makes subsequent processing more
        // efficient. However, note that if your input is skewed, the workload can become skewed too.


        // Here is an example of how to deal with such skew
        topology
                .newStream("aggregation", spout)
                .partitionBy(new Fields("location"))
                .partitionAggregate(new Fields("location"), new StringCounter(), new Fields("count_map"))
                .each(new Fields("count_map"), new HasSpain())
                .each(new Fields("count_map"), new Print("AFTER-HAS-SPAIN"))
