table: The name of the table in HBase to write to.
columnFamily: The column family in HBase to write to.
This sink will commit each transaction if the table's write buffer size is reached or if the number of events in the current transaction reaches the batch size, whichever comes first.
Other optional parameters are:
serializer: A class implementing {@link HBaseEventSerializer}. An instance of this class will be used to write out events to HBase.
serializer.*: These properties are passed to the serializer's configure() method as an object of {@link org.apache.flume.Context}.
batchSize: The maximum number of events the sink will commit per transaction. The default batch size is 100 events.
Note: While this sink flushes all events in a transaction to HBase in one shot, HBase does not guarantee atomic commits on multiple rows. If only a subset of the events in a batch has been written to disk when HBase fails, the Flume transaction is rolled back and Flume writes all the events in the transaction again, which causes duplicates. The serializer is expected to handle such duplicates. HBase also does not support batch increments, so if the serializer returns multiple increments, an HBase failure will cause them to be re-written when HBase comes back up.
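For illustration, a minimal agent configuration using this sink might look like the following sketch. The agent, sink, and channel names (a1, k1, c1), the table and column family names, and the choice of RegexHbaseEventSerializer are assumptions made for the example; batchSize is shown with its default value.

  a1.channels = c1
  a1.sinks = k1
  a1.sinks.k1.type = hbase
  a1.sinks.k1.table = foo_table
  a1.sinks.k1.columnFamily = bar_cf
  a1.sinks.k1.serializer = org.apache.flume.sink.hbase.RegexHbaseEventSerializer
  a1.sinks.k1.batchSize = 100
  a1.sinks.k1.channel = c1

Any serializer-specific settings would be supplied as serializer.* properties and delivered to the serializer's configure() method as described above.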