A Cascade is an assembly of {@link cascading.flow.Flow} instances that share or depend on equivalent {@link Tap} instances and are executed as a single group. The most common case is where one Flow instance depends on a Tap created by a second Flow instance. This dependency chain can continue as far as is practical.
Note that Flow instances that have no shared dependencies will be executed in parallel.
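The ordering described above can be sketched as follows. This is a simplified, self-contained model and not Cascading's actual planner: {@code FlowSpec} and {@code CascadePlanSketch} are hypothetical names, and taps are reduced to plain string identifiers. Flows are grouped into "waves", where each flow depends only on flows in earlier waves, so the flows within one wave share no dependencies and could run in parallel.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a Flow: a name plus the identifiers of the taps it reads and writes.
final class FlowSpec {
    final String name;
    final Set<String> sources;
    final Set<String> sinks;

    FlowSpec(String name, Set<String> sources, Set<String> sinks) {
        this.name = name;
        this.sources = sources;
        this.sinks = sinks;
    }
}

final class CascadePlanSketch {
    // Groups flows into waves: every flow in a wave depends only on flows in earlier waves.
    static List<List<String>> plan(List<FlowSpec> flows) {
        // Map each sink identifier to the name of the flow that produces it.
        Map<String, String> producerOf = new HashMap<>();
        for (FlowSpec flow : flows)
            for (String sink : flow.sinks)
                producerOf.put(sink, flow.name);

        List<List<String>> waves = new ArrayList<>();
        List<FlowSpec> remaining = new ArrayList<>(flows);
        Set<String> done = new HashSet<>();

        while (!remaining.isEmpty()) {
            List<String> wave = new ArrayList<>();
            for (FlowSpec flow : remaining) {
                boolean ready = true;
                for (String source : flow.sources) {
                    String producer = producerOf.get(source);
                    // A source produced by another, unfinished flow blocks this flow.
                    if (producer != null && !producer.equals(flow.name) && !done.contains(producer))
                        ready = false;
                }
                if (ready)
                    wave.add(flow.name);
            }
            if (wave.isEmpty())
                throw new IllegalStateException("cyclic dependency between flows");
            done.addAll(wave);
            remaining.removeIf(flow -> wave.contains(flow.name));
            waves.add(wave);
        }
        return waves;
    }

    public static void main(String[] args) {
        List<FlowSpec> flows = List.of(
            new FlowSpec("import", Set.of("raw"), Set.of("clean")),
            new FlowSpec("report", Set.of("clean"), Set.of("summary")),
            new FlowSpec("audit", Set.of("logs"), Set.of("audit-out")));
        // "report" reads the tap written by "import"; "audit" is independent of both.
        System.out.println(plan(flows)); // prints [[import, audit], [report]]
    }
}
```

In this model "import" and "audit" land in the same wave because neither reads a tap the other writes, while "report" must wait for "import" to produce its source.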
Additionally, a Cascade allows for incremental builds of complex data processing workloads. If a given source {@link Tap} is newer than a subsequent sink {@link Tap} in the assembly, the connecting {@link cascading.flow.Flow}(s) will be executed when the Cascade is executed. If all the targets (sinks) are up to date, the Cascade exits immediately and does nothing.
The concept of 'stale' is pluggable; see the {@link cascading.flow.FlowSkipStrategy} class.
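A minimal sketch of a pluggable staleness check is shown here. It is a self-contained illustration only: {@code SkipStrategySketch} and {@code StalenessSketch} are hypothetical names, not the Cascading interfaces, and the timestamp rule (a flow may be skipped when no source is newer than the oldest sink) is an assumed reading of "newer than" above. Taps are reduced to their modification times.

```java
import java.util.List;

// Hypothetical, simplified analogue of a skip strategy; not the Cascading interface itself.
interface SkipStrategySketch {
    // Returns true when the flow can be skipped because its sinks are considered up to date.
    boolean skipFlow(List<Long> sourceModifiedTimes, List<Long> sinkModifiedTimes);
}

final class StalenessSketch {
    // Assumed rule: sinks are stale when any source was modified after the oldest sink.
    static final SkipStrategySketch BY_TIMESTAMP = (sources, sinks) -> {
        if (sinks.isEmpty())
            return false; // assumption: with nothing to compare against, the flow runs
        long newestSource = sources.stream().mapToLong(Long::longValue).max().orElse(Long.MIN_VALUE);
        long oldestSink = sinks.stream().mapToLong(Long::longValue).min().getAsLong();
        return newestSource <= oldestSink;
    };

    // A different notion of "stale" plugs in by supplying another strategy.
    static final SkipStrategySketch NEVER_SKIP = (sources, sinks) -> false;

    public static void main(String[] args) {
        // Source written at t=100, sink at t=200: the sink is up to date.
        System.out.println(BY_TIMESTAMP.skipFlow(List.of(100L), List.of(200L))); // prints true
        // Source rewritten at t=300 after the sink at t=200: the sink is stale.
        System.out.println(BY_TIMESTAMP.skipFlow(List.of(300L), List.of(200L))); // prints false
        // The alternative strategy always runs the flow.
        System.out.println(NEVER_SKIP.skipFlow(List.of(100L), List.of(200L)));   // prints false
    }
}
```

Keeping the decision behind a single-method interface is what makes the notion of "stale" replaceable without touching the code that schedules flows.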
When a Cascade starts up, it first verifies which Flow instances have stale sinks. If the sinks are stale, the method {@link cascading.flow.BaseFlow#deleteSinksIfNotUpdate()} is called. Before appends/updates were supported, (logically) the Cascade deleted all the sinks in a Flow.
A consequence of this is that if the Cascade fails after completing a Flow that appended or updated data, re-running the Cascade (and with it the successful append/update Flow) will append or update that data again. Some systems may be idempotent and show no side effects, but others are not, so plan accordingly.
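The difference between an idempotent and a non-idempotent sink on re-run can be illustrated with a small, self-contained sketch. It does not use Cascading; {@code RerunSketch} and its two methods are hypothetical stand-ins for a Flow that appends to a sink and a Flow that performs a keyed update.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RerunSketch {
    // Appending is not idempotent: each run adds the same records again.
    static void appendFlow(List<String> sink) {
        sink.add("order-1");
        sink.add("order-2");
    }

    // A keyed update is idempotent: a second run rewrites the same keys with the same values.
    static void upsertFlow(Map<String, String> sink) {
        sink.put("order-1", "shipped");
        sink.put("order-2", "pending");
    }

    public static void main(String[] args) {
        List<String> appendSink = new ArrayList<>();
        Map<String, String> upsertSink = new HashMap<>();

        // First run succeeds, a later flow fails, and the whole group is run again.
        appendFlow(appendSink);
        upsertFlow(upsertSink);
        appendFlow(appendSink);
        upsertFlow(upsertSink);

        System.out.println(appendSink.size()); // prints 4: the records were duplicated
        System.out.println(upsertSink.size()); // prints 2: the re-run had no visible effect
    }
}
```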
Use the {@link CascadeListener} to receive events on the life-cycle of the Cascade as it executes. Any {@link Tap} instances owned by managed Flows that also implement CascadeListener will automatically be added to the set of listeners.
@see CascadeListener
@see cascading.flow.Flow
@see cascading.flow.FlowSkipStrategy