com.sleepycat.je.tree.MapLN
A MapLN represents a Leaf Node in the JE Db Mapping Tree.

MapLNs contain a DatabaseImpl, which in turn contains three categories of information: database configuration information, the per-database File Summary utilization information, and each database's btree root. While LNs are written to the log as the result of API operations which create new data records, MapLNs are written to the log as a result of configuration changes, utilization information changes, or updates to the btree which cascade up the tree and result in a new root. Because they serve as a bridge between the application data btree and the db mapping tree, MapLNs must be written with special rules, and should only be written from DbTree.modifyDbRoot. The basic rule is that, in order to ensure that the MapLN contains the proper btree root, the btree root latch is used to protect both any logging of the MapLN and any updates to the root lsn.

Updates to the internal btree nodes obey a strict bottom-up approach, in accordance with the log semantics, which require that later log entries are known to supersede earlier log entries. In other words, for a btree that looks like

    MapLN
      |
     IN
      |
     BIN
      |
     LN

we know that update operations require the btree nodes to be logged in this order: LN, BIN, IN, MapLN, so that the reference to each on-disk node is correct. (Note that logging order is special and different when the btree is initially created.)

However, MapLNs may need to be written to disk at arbitrary points in time in order to save database config or utilization data. Those writes don't have the time and context to be done in a cascading-upwards fashion. We ensure that MapLNs are not erroneously written with an out-of-sync root by requiring that DbTree.modifyDbRoot takes the root latch for the application data btree. RootINs are also written with the root latch, so it serves to ensure that the root doesn't change during the time when the MapLN is written.
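The latching rule described above can be illustrated with a toy model. This is a minimal sketch, not JE code: RootLatchSketch, ToyLog, ToyDatabase, logNewRoot and logMapLN are hypothetical names invented for illustration, a java.util.concurrent ReentrantLock stands in for the btree root latch, and the log holds only root INs and MapLNs for a single database. The one point it tries to show is that the same latch guards both the update of the root lsn and the logging of a MapLN.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Toy model only: none of these types or methods are JE classes.
final class RootLatchSketch {

    /** One log record: either a root IN, or a MapLN that refers to a root IN. */
    static final class Entry {
        final long lsn;
        final boolean isRootIN;
        final long rootRef; // LSN of the root a MapLN refers to; -1 for a root IN

        Entry(long lsn, boolean isRootIN, long rootRef) {
            this.lsn = lsn;
            this.isRootIN = isRootIN;
            this.rootRef = rootRef;
        }
    }

    /** Append-only log that hands out increasing LSNs. */
    static final class ToyLog {
        private final List<Entry> entries = new ArrayList<>();
        private long nextLsn = 10;

        synchronized long append(boolean isRootIN, long rootRef) {
            long lsn = nextLsn;
            nextLsn += 10;
            entries.add(new Entry(lsn, isRootIN, rootRef));
            return lsn;
        }

        synchronized List<Entry> snapshot() {
            return new ArrayList<>(entries);
        }
    }

    /** Stands in for the per-database state carried by a MapLN. */
    static final class ToyDatabase {
        private final ReentrantLock rootLatch = new ReentrantLock();
        private final ToyLog log;
        private long rootLsn = -1; // guarded by rootLatch

        ToyDatabase(ToyLog log) {
            this.log = log;
            logNewRoot(); // every toy database starts with one logged root
        }

        /** Logs a new root IN and updates the root LSN under the root latch. */
        long logNewRoot() {
            rootLatch.lock();
            try {
                rootLsn = log.append(true, -1);
                return rootLsn;
            } finally {
                rootLatch.unlock();
            }
        }

        /** Logs a MapLN at an arbitrary point, reading the root LSN under the same latch. */
        long logMapLN() {
            rootLatch.lock();
            try {
                return log.append(false, rootLsn);
            } finally {
                rootLatch.unlock();
            }
        }
    }

    /** True if every MapLN refers to the most recent root IN logged before it. */
    static boolean mapLNsReferToLatestRoot(List<Entry> entries) {
        long latestRoot = -1;
        for (Entry e : entries) {
            if (e.isRootIN) {
                latestRoot = e.lsn;
            } else if (e.rootRef != latestRoot) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        ToyLog log = new ToyLog();
        ToyDatabase db = new ToyDatabase(log);
        Thread rootWriter = new Thread(() -> { for (int i = 0; i < 1000; i++) db.logNewRoot(); });
        Thread mapLNWriter = new Thread(() -> { for (int i = 0; i < 1000; i++) db.logMapLN(); });
        rootWriter.start();
        mapLNWriter.start();
        rootWriter.join();
        mapLNWriter.join();
        System.out.println("consistent: " + mapLNsReferToLatestRoot(log.snapshot()));
    }
}
```

In this model, if logMapLN read rootLsn without taking the latch, a root IN could be appended between the read and the MapLN append, and the MapLN would then refer to a superseded root.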
For example, suppose thread 1 is doing a cascading-up MapLN write, and thread 2 is doing an arbitrary-point MapLN write:

    Thread 1                     Thread 2
    --------                     --------
    latch root                   latch BIN parent of MapLN
    log root IN
    log MapLN (Tree root)        wants to log MapLN too -- but has to take
     to refer to new root IN      root latch, so we'll get the right rootIN

Without latching the root, this could produce the following incorrect log:

    30  LNa
    40  BIN
    50  IN (first version of root)
    60  MapLN, refers to IN(50)
    ...
    90  LNb
    100 BIN
    110 IN (second version of root)
    120 CkptStart (the tree is not dirty, no IN will be logged during the ckpt interval)
    ..  something arbitrarily writes out the MapLN
    130 MapLN refers to first root, IN(50)    <------ impossible

While a MapLN can't be written out with the wrong root, it's possible for a rootIN to be logged without the MapLN, and for that rootIN not to be processed at recovery. Suppose a checkpoint begins and ends in the window between when a rootIN is written and when DbTree.modifyDbRoot is called:

    300 log new root IN,
        update root reference in tree,
        unlatch root
    310 Checkpoint starts
    320 Checkpoint ends

If we crash at this point, before the MapLN is logged, we won't see the new root IN at lsn 300. However, the IN is non-txnal and will be recreated during replay of txnal information (LNs) by normal recovery processing.
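The property broken by the incorrect log in this example can be stated as a simple check: a MapLN must refer to the most recent root IN that precedes it in the log. The sketch below scans a simplified log and reports MapLNs that violate this. It assumes a log for a single database in which every IN entry is a root, and StaleRootCheck, Kind, Entry and findStaleMapLNs are hypothetical names, not part of JE.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a simplified single-database log, not JE's log format.
final class StaleRootCheck {

    enum Kind { LN, BIN, ROOT_IN, MAP_LN, CKPT_START }

    static final class Entry {
        final long lsn;
        final Kind kind;
        final long rootRef; // for MAP_LN: LSN of the root it refers to; otherwise -1

        Entry(long lsn, Kind kind, long rootRef) {
            this.lsn = lsn;
            this.kind = kind;
            this.rootRef = rootRef;
        }
    }

    /** Returns the LSNs of MapLNs that do not refer to the latest preceding root IN. */
    static List<Long> findStaleMapLNs(List<Entry> log) {
        List<Long> stale = new ArrayList<>();
        long latestRoot = -1;
        for (Entry e : log) {
            if (e.kind == Kind.ROOT_IN) {
                latestRoot = e.lsn;
            } else if (e.kind == Kind.MAP_LN && e.rootRef != latestRoot) {
                stale.add(e.lsn);
            }
        }
        return stale;
    }

    /** The incorrect log from the example above, transcribed entry by entry. */
    static List<Entry> incorrectLogFromExample() {
        List<Entry> log = new ArrayList<>();
        log.add(new Entry(30, Kind.LN, -1));          // LNa
        log.add(new Entry(40, Kind.BIN, -1));
        log.add(new Entry(50, Kind.ROOT_IN, -1));     // first version of root
        log.add(new Entry(60, Kind.MAP_LN, 50));
        log.add(new Entry(90, Kind.LN, -1));          // LNb
        log.add(new Entry(100, Kind.BIN, -1));
        log.add(new Entry(110, Kind.ROOT_IN, -1));    // second version of root
        log.add(new Entry(120, Kind.CKPT_START, -1));
        log.add(new Entry(130, Kind.MAP_LN, 50));     // refers to the first root
        return log;
    }

    public static void main(String[] args) {
        System.out.println("stale MapLNs at: " + findStaleMapLNs(incorrectLogFromExample()));
    }
}
```

On the transcribed log, the only entry flagged is the MapLN at lsn 130. The MapLN at lsn 60 passes because IN(50) was still the latest root when it was logged.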