HRegion stores data for a certain region of a table. It stores all columns for each row. A given table consists of one or more HRegions.
We maintain multiple HStores for a single HRegion.
An Store is a set of rows with some column data; together, they make up all the data for the rows.
Each HRegion has a 'startKey' and 'endKey'.
The first is inclusive, the second is exclusive (except for the final region) The endKey of region 0 is the same as startKey for region 1 (if it exists). The startKey for the first region is null. The endKey for the final region is null.
Locking at the HRegion level serves only one purpose: preventing the region from being closed (and consequently split) while other operations are ongoing. Each row level operation obtains both a row lock and a region read lock for the duration of the operation. While a scanner is being constructed, getScanner holds a read lock. If the scanner is successfully constructed, it holds a read lock until it is closed. A close takes out a write lock and consequently will block for ongoing operations and will block new operations from starting while the close is in progress.
An HRegion is defined by its table and its key extent.
It consists of at least one Store. The number of Stores should be configurable, so that data which is accessed together is stored in the same Store. Right now, we approximate that by building a single Store for each column family. (This config info will be communicated via the tabledesc.)
The HTableDescriptor contains metainfo about the HRegion's table. regionName is a unique identifier for this HRegion. (startKey, endKey] defines the keyspace for this HRegion.