Examples of cascading.cascade.Cascade

cascading.cascade.Cascade
A Cascade is an assembly of {@link cascading.flow.Flow} instances that share or depend on equivalent {@link Tap} instances and are executed as a single group. The most common case is where one Flow instance depends on a Tap created by a second Flow instance. This dependency chain can continue as far as is practical.
Note: Flow instances that have no shared dependencies will be executed in parallel.
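The dependency rule above can be illustrated with a small stand-alone model. This is only a sketch and does not use the Cascading API: `ToyFlow`, `ToyCascadePlanner`, and the tap identifiers are hypothetical names invented for the illustration. Flows are grouped into "waves", where each wave reads only from taps that no unscheduled flow still has to write, so the flows inside one wave have no shared dependencies and could run in parallel.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a flow: a name plus the identifiers of the taps it reads and writes.
// NOT the Cascading API, only a model of the dependency rule described above.
final class ToyFlow {
    final String name;
    final Set<String> sources;
    final Set<String> sinks;

    ToyFlow(String name, Set<String> sources, Set<String> sinks) {
        this.name = name;
        this.sources = sources;
        this.sinks = sinks;
    }
}

final class ToyCascadePlanner {
    // Groups flows into waves; the order in which flows are passed in does not matter.
    static List<List<String>> plan(List<ToyFlow> flows) {
        List<ToyFlow> remaining = new ArrayList<>(flows);
        List<List<String>> waves = new ArrayList<>();
        while (!remaining.isEmpty()) {
            // Taps that are still going to be written by flows not yet scheduled.
            Set<String> pendingSinks = new HashSet<>();
            for (ToyFlow flow : remaining) {
                pendingSinks.addAll(flow.sinks);
            }
            // A flow is ready when none of its sources is still waiting to be written.
            List<ToyFlow> ready = new ArrayList<>();
            for (ToyFlow flow : remaining) {
                if (Collections.disjoint(flow.sources, pendingSinks)) {
                    ready.add(flow);
                }
            }
            if (ready.isEmpty()) {
                throw new IllegalStateException("cyclic dependency between flows");
            }
            List<String> wave = new ArrayList<>();
            for (ToyFlow flow : ready) {
                wave.add(flow.name);
            }
            Collections.sort(wave); // stable, readable output
            waves.add(wave);
            remaining.removeAll(ready);
        }
        return waves;
    }

    public static void main(String[] args) {
        List<ToyFlow> flows = Arrays.asList(
            new ToyFlow("export", Collections.singleton("counts"), Collections.singleton("local")),
            new ToyFlow("import", Collections.singleton("raw"), Collections.singleton("pages")),
            new ToyFlow("count", Collections.singleton("pages"), Collections.singleton("counts")));
        System.out.println(plan(flows));
    }
}
```

The model mirrors the "order is not significant" comments in the excerpts below: the flows are handed over in arbitrary order and the ordering is derived from the taps they share.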
Additionally, a Cascade allows for incremental builds of complex data processing pipelines. If a given source {@link Tap} is newer than a subsequent sink {@link Tap} in the assembly, the connecting {@link cascading.flow.Flow}(s) will be executed when the Cascade is executed. If all the targets (sinks) are up to date, the Cascade exits immediately and does nothing.
The concept of 'stale' is pluggable; see the {@link cascading.flow.FlowSkipStrategy} class.
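As a rough illustration of the staleness rule, the sketch below compares modification timestamps. It is a simplified model, not the Cascading implementation: `ToySkipStrategy` and `ToyStaleCheck` are hypothetical names, the real extension point is {@link cascading.flow.FlowSkipStrategy} (whose exact signature is not reproduced here), and treating a timestamp of 0 as "sink does not exist" is an assumption of this sketch.

```java
// Hypothetical pluggable rule, loosely mirroring the idea of a skip strategy.
// Timestamps are modification times in milliseconds; 0 is assumed to mean "does not exist".
interface ToySkipStrategy {
    boolean skip(long[] sourceModified, long[] sinkModified);
}

final class ToyStaleCheck {
    // Skip a flow only when every sink exists and is at least as new as the newest source.
    static final ToySkipStrategy SKIP_IF_SINK_NOT_STALE = (sources, sinks) -> {
        long newestSource = 0L;
        for (long modified : sources) {
            newestSource = Math.max(newestSource, modified);
        }
        for (long modified : sinks) {
            if (modified <= 0L || modified < newestSource) {
                return false; // missing or stale sink: the flow has to run
            }
        }
        return true;
    };

    public static void main(String[] args) {
        // Sink written after the source: up to date, so the flow can be skipped.
        System.out.println(SKIP_IF_SINK_NOT_STALE.skip(new long[] {100L}, new long[] {200L}));
        // Source changed after the sink was written: the flow must run again.
        System.out.println(SKIP_IF_SINK_NOT_STALE.skip(new long[] {300L}, new long[] {200L}));
    }
}
```

Because the rule sits behind an interface, a different notion of "stale" (for example, always run, or compare checksums) could be swapped in without touching the code that schedules the flows.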
When a Cascade starts up, it first verifies which Flow instances have stale sinks. If the sinks are not stale, the method {@link cascading.flow.BaseFlow#deleteSinksIfNotUpdate()} is called. Before appends/updates were supported (logically), the Cascade deleted all the sinks in a Flow.
The consequence of this is that if the Cascade fails, but does complete a Flow that appended or updated data, re-running the Cascade (and the successful append/update Flow) will apply that append or update again. Systems that are idempotent will not see any side effects from this; for others, plan accordingly.
Use the {@link CascadeListener} to receive any events on the life-cycle of the Cascade as it executes. Any {@link Tap} instances owned by managed Flows that also implement CascadeListener will automatically be added to the set of listeners. @see CascadeListener @see cascading.flow.Flow @see cascading.flow.FlowSkipStrategy
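The listener idea can be sketched with a small stand-alone model. None of the types below are Cascading classes: `ToyCascadeListener`, `ToyCascade`, and `RecordingListener` are hypothetical, and the choice to fire the completed event after both success and failure is an assumption of this sketch rather than a statement about the real {@link CascadeListener} contract.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical life-cycle callbacks; the real interface is cascading.cascade.CascadeListener.
interface ToyCascadeListener {
    void onStarting(String cascadeName);
    void onCompleted(String cascadeName);
    void onThrowable(String cascadeName, Throwable throwable);
}

// Listener that simply records the events it receives, in order.
final class RecordingListener implements ToyCascadeListener {
    final List<String> events = new ArrayList<>();

    public void onStarting(String cascadeName) {
        events.add("starting:" + cascadeName);
    }

    public void onCompleted(String cascadeName) {
        events.add("completed:" + cascadeName);
    }

    public void onThrowable(String cascadeName, Throwable throwable) {
        events.add("throwable:" + cascadeName + ":" + throwable.getMessage());
    }
}

// Toy cascade: runs its flows (plain Runnables here) in order and notifies listeners.
final class ToyCascade {
    private final String name;
    private final List<Runnable> flows;
    private final List<ToyCascadeListener> listeners = new ArrayList<>();

    ToyCascade(String name, List<Runnable> flows) {
        this.name = name;
        this.flows = flows;
    }

    void addListener(ToyCascadeListener listener) {
        listeners.add(listener);
    }

    void complete() {
        for (ToyCascadeListener listener : listeners) {
            listener.onStarting(name);
        }
        try {
            for (Runnable flow : flows) {
                flow.run();
            }
        } catch (RuntimeException exception) {
            for (ToyCascadeListener listener : listeners) {
                listener.onThrowable(name, exception);
            }
            throw exception; // the caller still sees the failure
        } finally {
            // Assumption of this sketch: completion is reported on success and on failure.
            for (ToyCascadeListener listener : listeners) {
                listener.onCompleted(name);
            }
        }
    }

    public static void main(String[] args) {
        RecordingListener listener = new RecordingListener();
        ToyCascade cascade = new ToyCascade("demo", Arrays.<Runnable>asList(() -> { }, () -> { }));
        cascade.addListener(listener);
        cascade.complete();
        System.out.println(listener.events);
    }
}
```

The failure path is the same pattern the "mapper fail" and "reducer fail" excerpts below exercise against the real API: the exception reaches the caller, and the failure is also observable afterwards.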

    // connect up both sinks using the same exportPipe assembly
    Flow exportFromUrl = flowConnector.connect( "export url", sinkUrl, localSinkUrl, exportPipe );
    Flow exportFromWord = flowConnector.connect( "export word", sinkWord, localSinkWord, exportPipe );


    // connect up all the flows, order is not significant
    Cascade cascade = new CascadeConnector().connect( importPagesFlow, count, exportFromUrl, exportFromWord );


    // run the cascade to completion
    cascade.complete();
    }



    // optionally print out the arrivalRateFlow to a graph file for import into a graphics package
    //arrivalRateFlow.writeDOT( "arrivalrate.dot" );


    // connect the flows by their dependencies, order is not significant
    Cascade cascade = cascadeConnector.connect( importLogFlow, arrivalRateFlow );


    // execute the cascade, which in turn executes each flow in dependency order
    cascade.complete();
    }


    Tap sink2 = getPlatform().getTextFile( getOutputPath( "flowstats2" ), SinkMode.REPLACE );


    Flow flow1 = getPlatform().getFlowConnector().connect( "stats1 test", source, sink1, pipe );
    Flow flow2 = getPlatform().getFlowConnector().connect( "stats2 test", source, sink2, pipe );


    Cascade cascade = new CascadeConnector().connect( flow1, flow2 );


    cascade.complete();


    CascadeStats cascadeStats = cascade.getCascadeStats();


    assertNotNull( cascadeStats.getID() );


    // unsure why this has changed
//    if( getPlatform() instanceof HadoopPlatform )


    ProcessFlow fourthProcess = new ProcessFlow( "fourth", fourth );


    LockingFlowListener flowListener = new LockingFlowListener();
    secondProcess.addListener( flowListener );


    Cascade cascade = new CascadeConnector().connect( fourthProcess, secondProcess, firstProcess, thirdProcess );


    cascade.start();


    cascade.complete();


    assertTrue( "did not start", flowListener.started.tryAcquire( 2, TimeUnit.SECONDS ) );
    assertTrue( "did not complete", flowListener.completed.tryAcquire( 2, TimeUnit.SECONDS ) );


    validateLength( fourth, 20 );



    Tap sink = getPlatform().getTextFile( getOutputPath( "mapperfail" ), SinkMode.REPLACE );


    Flow flow = getPlatform().getFlowConnector().connect( "mapper fail test", source, sink, pipe );


    Cascade cascade = new CascadeConnector().connect( flow );
    assertNull( cascade.getCascadeStats().getThrowable() );


    try
      {
      cascade.complete();
      fail( "An exception should have been thrown" );
      }
    catch( Throwable throwable )
      {
      CascadeStats cascadeStats = cascade.getCascadeStats();
      assertEquals( throwable, cascadeStats.getThrowable() );
      }
    }



    Tap sink = getPlatform().getTextFile( getOutputPath( "reducerfail" ), SinkMode.REPLACE );


    Flow flow = getPlatform().getFlowConnector().connect( "reducer fail test", source, sink, pipe );


    Cascade cascade = new CascadeConnector().connect( flow );
    assertNull( cascade.getCascadeStats().getThrowable() );
    try
      {
      cascade.complete();
      fail( "An exception should have been thrown" );
      }
    catch( Throwable throwable )
      {
      CascadeStats cascadeStats = cascade.getCascadeStats();
      assertEquals( throwable, cascadeStats.getThrowable() );
      }
    }



    Flow flowKeyValue = getPlatform().getFlowConnector( getProperties() ).connect( source, tapKeyValue, pipe );
    Flow flowKey = getPlatform().getFlowConnector( getProperties() ).connect( tapKeyValue, tapKey, new Pipe( "key" ) );
    Flow flowValue = getPlatform().getFlowConnector( getProperties() ).connect( tapKeyValue, tapValue, new Pipe( "value" ) );


    Cascade cascade = new CascadeConnector().connect( "keyvalues", flowKeyValue, flowKey, flowValue );


    cascade.complete();


    validateLength( flowKeyValue, 10, 2 );
    validateLength( flowKey, 10, 1 );
    validateLength( flowValue, 10, 1 );
    }


    Tap sink2 = getPlatform().getTextFile( getOutputPath( "flowstats2" ), SinkMode.REPLACE );


    Flow flow1 = getPlatform().getFlowConnector().connect( "stats1 test", source, sink1, pipe );
    Flow flow2 = getPlatform().getFlowConnector().connect( "stats2 test", source, sink2, pipe );


    Cascade cascade = new CascadeConnector().connect( flow1, flow2 );


    cascade.complete();


    CascadeStats cascadeStats = cascade.getCascadeStats();


    assertNotNull( cascadeStats.getID() );


    assertEquals( 1, cascadeStats.getCounterGroupsMatching( "cascading\\.stats\\..*" ).size() );
    assertEquals( 2, cascadeStats.getCountersFor( TestEnum.class.getName() ).size() );




    Tap nextSink = new Hfs( new TextLine(), getOutputPath( "glob2" ), SinkMode.REPLACE );


    Flow nextFlow = getPlatform().getFlowConnector( getProperties() ).connect( "second", sink, nextSink, concatPipe );


    Cascade cascade = new CascadeConnector().connect( concatFlow, nextFlow );


    cascade.complete();


    validateLength( concatFlow, 10 );
    }



Related Classes of cascading.cascade.Cascade

cascading.cascade.hadoop.RiffleCascadePlatformTest

cascading.cascade.planner.TapGraph

cascading.flow.Flow

cascading.flow.hadoop.MapReduceFlowPlatformTest

cascading.management.CascadingServices

cascading.scheme.hadoop.WritableSequenceFilePlatformTest

cascading.stats.CascadeStats

cascading.stats.hadoop.CascadingStatsLocalHadoopErrorPlatformTest

cascading.stats.hadoop.HadoopStatsPlatformTest

cascading.stats.local.CascadingStatsPlatformTest
