Examples of com.persistit.Transaction

Persistit uses Snapshot Isolation for high concurrency and throughput. This protocol is optimistic in that competing concurrent transactions run at full speed without locking, but can arrive in a state where not all transactions can be allowed to commit while preserving correct semantics. In such cases one or more of the transactions must roll back (abort) and retry. These topics are covered in greater detail below.

The Transaction object itself is not thread-safe and may be accessed and used by only one thread at a time. However, the database operations being executed within the scope defined by begin and end are capable of highly concurrent execution.

Lexical Scoping

Applications must manage the scope of any transaction by ensuring that whenever the begin method is called, there is a concluding invocation of the end method. The commit and rollback methods do not end the scope of a transaction; they merely signify the transaction's intended outcome when end is invoked. Applications should follow either the try/finally or the TransactionRunnable pattern to ensure correctness.

Commit Policy

Persistit provides three policies that determine the durability of a transaction after it has executed the commit method. These are:

{@link CommitPolicy#HARD}: The commit method does not return until all updates created by the transaction have been written to non-volatile storage (e.g., disk storage).
{@link CommitPolicy#GROUP}: The commit method does not return until all updates created by the transaction have been written to non-volatile storage. In addition, the committing transaction waits briefly in an attempt to recruit other concurrently running transactions to write their updates with the same physical I/O operation.
{@link CommitPolicy#SOFT}: The commit method returns before the updates have been recorded on non-volatile storage. Persistit attempts to write them within 100 milliseconds, but this interval is not guaranteed.

You can specify a default policy in the Persistit initialization properties using the {@value com.persistit.Configuration#COMMIT_POLICY_PROPERTY_NAME} property or under program control using {@link Persistit#setDefaultTransactionCommitPolicy}. The default policy applies to the {@link #commit()} method. You can override the default policy using {@link #commit(CommitPolicy)}.

HARD and GROUP ensure each transaction is written durably to non-volatile storage before the commit method returns. The difference is that GROUP can improve throughput somewhat when many transactions are running concurrently because the average number of I/O operations needed to commit N transactions can be smaller than N. However, for one or a small number of concurrent threads, GROUP reduces throughput because it works by introducing a delay to allow other concurrent transactions to commit within a single I/O operation.
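The trade-off between HARD and GROUP can be illustrated with a toy model that only counts physical flushes. This is a sketch under stated assumptions, not Persistit's implementation: the class GroupCommitModel, its method names and the fixed wait window are all hypothetical.

```java
import java.util.List;

/** Toy model (hypothetical): counts physical flushes for a series of commit requests. */
final class GroupCommitModel {

    /** HARD-style: every commit request performs its own flush. */
    static int hardFlushes(List<Long> commitTimesMillis) {
        return commitTimesMillis.size();
    }

    /**
     * GROUP-style: the first waiting commit opens a window of windowMillis,
     * and every commit arriving inside that window shares the same flush.
     * commitTimesMillis must be sorted in ascending order.
     */
    static int groupFlushes(List<Long> commitTimesMillis, long windowMillis) {
        int flushes = 0;
        long windowEnd = 0;
        for (long t : commitTimesMillis) {
            if (flushes == 0 || t > windowEnd) {
                flushes++;                      // start a new flush group
                windowEnd = t + windowMillis;   // later arrivals up to here join it
            }
        }
        return flushes;
    }
}
```

In this model a burst of commits shares one flush, while a lone commit still needs one flush and additionally pays the wait window, which matches the throughput penalty described for a small number of threads.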

SOFT commits are generally much faster than HARD or GROUP commits, especially for single-threaded applications, because the results of numerous update operations can be aggregated and written to disk in a much smaller number of I/O operations. However, transactions written with the SOFT commit policy are not immediately durable and it is possible that the recovered state of a database will be missing transactions that were committed shortly before a crash.

For SOFT commits, the state of the database after restart is such that for any committed transaction T, either all or none of its modifications will be present in the recovered database. Further, if a transaction T2 reads or updates data that was written by any other transaction T1, and if T2 is present in the recovered database, then so is T1. Any transaction that was in progress, but had not been committed at the time of the failure, is guaranteed not to be present in the recovered database. SOFT commits are designed to be durable within 100 milliseconds after the commit returns. However, this interval is determined by computing the average duration of recent I/O operations to predict the completion time of the I/O that will write the transaction to disk, and therefore the interval cannot be guaranteed.

Optimistic Concurrent Scheduling - MVCC

Persistit schedules concurrently executing transactions optimistically, without locking any database records. Instead, Persistit uses a well-known protocol called Snapshot Isolation to achieve atomicity and isolation. While transactions are modifying data, Persistit maintains multiple versions of values being modified. Each version is labeled with the commit timestamp of the transaction that modified it. Whenever a transaction reads a value that has been modified by other transactions, it reads the latest version that was committed before its own start timestamp. In other words, all read operations are performed as if from a "snapshot" of the state of the database made at the transaction's start timestamp - hence the name "Snapshot Isolation."
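The read rule can be sketched with a sorted map of versions. This is a minimal illustration, assuming versions are keyed by commit timestamp; the class VersionedValue and its methods are hypothetical and do not reflect Persistit's internal storage format.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy multi-version value (hypothetical): versions keyed by their writer's commit timestamp. */
final class VersionedValue {
    private final TreeMap<Long, String> versionsByCommitTs = new TreeMap<>();

    /** Records a version written by a transaction that committed at commitTs. */
    void addCommitted(long commitTs, String value) {
        versionsByCommitTs.put(commitTs, value);
    }

    /**
     * Returns the latest version committed strictly before startTs,
     * or null if no version is visible to that snapshot.
     */
    String readAsOf(long startTs) {
        Map.Entry<Long, String> entry = versionsByCommitTs.lowerEntry(startTs);
        return entry == null ? null : entry.getValue();
    }
}
```

A reader that started at timestamp 15 keeps seeing the version committed at 10 even after another transaction commits a newer version at 20.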

Pruning

Given that all updates written through transactions are created as versions within the MVCC scheme, a large number of versions can accumulate over time. Persistit reduces this number through an activity called "version pruning." Pruning resolves the final state of each version by removing any versions created by aborted transactions and removing obsolete versions no longer needed by currently executing transactions. If a value contains only one version and the commit timestamp of the transaction that created it is before the start of any currently running transaction, that value is called primordial. The goal of pruning is to reduce almost all values in a Persistit Tree to their primordial states because updating and reading primordial values is more efficient than the handling required for multiple version values. Pruning happens automatically and is generally not visible to the application.
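As a rough sketch of the pruning rule, the following code drops versions from aborted transactions and versions that no running transaction can still read. The class PruningSketch, the Version type and the way aborted versions are flagged are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy version pruning (hypothetical types and names). */
final class PruningSketch {

    /** One version of a value; aborted marks a version written by a rolled-back transaction. */
    static final class Version {
        final long commitTs;
        final boolean aborted;
        final String value;

        Version(long commitTs, boolean aborted, String value) {
            this.commitTs = commitTs;
            this.aborted = aborted;
            this.value = value;
        }
    }

    /**
     * versions must be sorted by ascending commitTs. oldestActiveStartTs is the
     * start timestamp of the oldest transaction still running. Keeps the newest
     * version committed before that timestamp, plus every later version.
     */
    static List<Version> prune(List<Version> versions, long oldestActiveStartTs) {
        List<Version> live = new ArrayList<>();
        for (Version v : versions) {
            if (!v.aborted) {
                live.add(v);                 // aborted versions are never needed
            }
        }
        int keepFrom = 0;
        for (int i = 0; i < live.size(); i++) {
            if (live.get(i).commitTs < oldestActiveStartTs) {
                keepFrom = i;                // newest version the oldest snapshot can see
            }
        }
        return new ArrayList<>(live.subList(keepFrom, live.size()));
    }
}
```

When the result contains a single version that is older than every running transaction, the value corresponds to what the text calls primordial.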

Rollbacks

Usually Snapshot Isolation allows concurrent transactions to commit without interference but this is not always the case. Two concurrent transactions that attempt to modify the same Persistit key-value pair before committing are said to have a "write-write dependency". To avoid anomalous results one of them must abort, rolling back any other updates it may have created, and retry. Persistit implements a "first updater wins" policy in which if two transactions attempt to update the same record, the first transaction "wins" by being allowed to continue, while the second transaction "loses" and is required to abort.
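The "first updater wins" rule can be modeled with a small conflict table. This sketch is a simplification and not Persistit's implementation: the loser aborts immediately, and the class FirstUpdaterWins, its ConflictException (a stand-in for RollbackException) and the explicit timestamps are hypothetical.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Toy write-write conflict detection (hypothetical names). */
final class FirstUpdaterWins {

    /** Thrown to the losing transaction; stand-in for RollbackException. */
    static final class ConflictException extends RuntimeException {
        ConflictException(String message) {
            super(message);
        }
    }

    // key -> id of the transaction holding an uncommitted update
    private final Map<String, Integer> pendingWriter = new HashMap<>();
    // key -> commit timestamp of the most recent committed update
    private final Map<String, Long> lastCommitTs = new HashMap<>();

    synchronized void update(int txnId, long txnStartTs, String key) {
        Integer owner = pendingWriter.get(key);
        if (owner != null && owner.intValue() != txnId) {
            throw new ConflictException("concurrent uncommitted update of " + key);
        }
        Long committed = lastCommitTs.get(key);
        if (committed != null && committed.longValue() > txnStartTs) {
            throw new ConflictException("update of " + key + " committed after this snapshot");
        }
        pendingWriter.put(key, txnId);       // this transaction is the first updater
    }

    synchronized void commit(int txnId, long commitTs) {
        Iterator<Map.Entry<String, Integer>> it = pendingWriter.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> e = it.next();
            if (e.getValue().intValue() == txnId) {
                lastCommitTs.put(e.getKey(), commitTs);
                it.remove();
            }
        }
    }
}
```

A transaction that starts after the winner has committed does not conflict, because the committed update is already part of its snapshot.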

Once a transaction has aborted, any subsequent database operation it attempts throws a {@link RollbackException}. Application code should generally catch and handle the RollbackException. Usually the correct and desired behavior is simply to retry the transaction. See try/finally for a code pattern that accomplishes this.

A transaction can also voluntarily roll back. For example, transaction logic could detect an error condition that it chooses to handle by throwing an exception back to the application. In this case the transaction should invoke the {@link #rollback} method to explicitly declare its intent to abort the transaction.

Read-Only Transactions

Under Snapshot Isolation, transactions that read but do not modify data cannot generate any write-write dependencies and are therefore not subject to being rolled back because of the actions of other transactions. However, note that even if it modifies no data, a long-running transaction can force Persistit to retain old value versions for its duration in order to provide a snapshot view. This behavior can cause congestion and performance degradation by preventing very old values from being pruned. The degree to which this is a problem depends on the volume of update transactions being processed and the duration of long-running transactions.

The try/finally/retry Code Pattern

The following code fragment illustrates a transaction executed with up to RETRY_COUNT retries. If the commit method succeeds, the whole transaction is completed and the retry loop terminates. If after RETRY_COUNT retries commit has not been successfully completed, the application throws a TransactionFailedException.

    Transaction txn = Persistit.getTransaction();
    int remainingAttempts = RETRY_COUNT;
    for (;;) {
        txn.begin();              // Begin transaction scope
        try {
            //
            // ...perform Persistit fetch, remove and store operations...
            //
            txn.commit();         // Attempt to commit the updates
            break;                // Exit retry loop
        } catch (RollbackException re) {
            if (--remainingAttempts < 0) {
                throw new TransactionFailedException();
            }
        } finally {
            txn.end();            // End transaction scope. Implicitly
                                  // roll back all updates unless
                                  // commit completed successfully.
        }
    }

The TransactionRunnable Pattern

As an alternative, the application can embed the actual database operations within a {@link TransactionRunnable} and invoke the {@link #run} method to execute it. The retry logic detailed in the fragment shown above is handled automatically by run; it could be rewritten as follows:

    Transaction txn = Persistit.getTransaction();
    txn.run(new TransactionRunnable() {
        public void runTransaction() throws PersistitException, RollbackException {
            //
            // ...perform Persistit fetch, remove and store operations...
            //
        }
    }, RETRY_COUNT, 0);
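To make the retry behavior concrete without a database, the following stand-alone sketch shows the kind of loop a run-style helper could wrap around the supplied work. It is an illustration only: RetrySketch, RollbackSignal and RetriesExhausted are hypothetical stand-ins, not Persistit classes, and the begin/end calls are omitted.

```java
/** Toy retry helper (hypothetical); models only the retry counting. */
final class RetrySketch {

    /** Stand-in for RollbackException. */
    static final class RollbackSignal extends RuntimeException {
    }

    /** Stand-in for TransactionFailedException. */
    static final class RetriesExhausted extends RuntimeException {
    }

    /**
     * Runs work until it completes without a RollbackSignal, allowing up to
     * retryCount retries after the first attempt. Returns the number of
     * attempts that were needed.
     */
    static int runWithRetries(Runnable work, int retryCount) {
        int attempts = 0;
        for (;;) {
            attempts++;
            try {
                work.run();              // a real helper would begin, commit and end here
                return attempts;
            } catch (RollbackSignal e) {
                if (attempts > retryCount) {
                    throw new RetriesExhausted();
                }
            }
        }
    }
}
```

As in the try/finally/retry fragment, a retry count of N permits N + 1 attempts in total before the failure is reported.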

Nested Transaction Scope

Persistit supports nested transactions by counting the number of nested {@link #begin} and {@link #end} operations. Each invocation of begin increments this count and each invocation of end decrements the count. These methods are intended to be used in a standard essential pattern, shown here, to ensure that the scope of the transaction is reliably determined by the lexical structure of the code rather than conditional logic:

    txn.begin();
    try {
        //
        // Application transaction logic here, possibly including
        // invocation of methods that also call txn.begin() and
        // txn.end().
        //
        txn.commit();
    } finally {
        txn.end();
    }

This pattern ensures that the transaction scope is ended properly regardless of whether the application code throws an exception or completes and commits normally.

The {@link #commit} method performs the actual commit operation only when the current nested level count (see {@link #getNestedTransactionDepth()}) is 1. That is, if begin has been invoked N times, then commit will actually commit the data only when end is invoked the Nth time. Data updated by an inner (nested) transaction is never actually committed until the outermost commit is called. This permits transactional code to invoke other code (possibly an opaque library supplied by a third party) that may itself begin and commit transactions.

Invoking {@link #rollback} removes all pending but uncommitted updates and marks the current transaction scope as rollback pending. Any subsequent attempt to perform any Persistit operation, including commit in the current transaction scope, will fail with a RollbackException.

Application developers should beware that the {@link #end} method performs an implicit rollback if commit has not completed. Therefore, if an application fails to call commit, the transaction will silently fail. The end method sends a warning message to the log subsystem when this happens, but does not throw an exception. The end method is designed this way to allow an exception thrown within the application code to be caught and handled without being obscured by a RollbackException thrown by end. But as a consequence, developers must carefully verify that the end method is always invoked whether or not the transaction completes normally.
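The counting behavior described in this section can be modeled with a few fields. The sketch below is a toy, not the Transaction class: NestedScopeSketch and its counters are hypothetical, and IllegalStateException stands in for the RollbackException the real API throws.

```java
/** Toy nested-scope counter (hypothetical); models depth, commit and implicit rollback. */
final class NestedScopeSketch {
    private int depth;
    private boolean commitCompleted;
    private boolean rollbackPending;
    private int committedCount;
    private int rolledBackCount;

    void begin() {
        if (depth == 0) {                 // outermost begin starts a fresh transaction
            commitCompleted = false;
            rollbackPending = false;
        }
        depth++;
    }

    void commit() {
        if (depth < 1) throw new IllegalStateException("commit without begin");
        if (rollbackPending) throw new IllegalStateException("rollback pending");
        if (depth == 1) {
            commitCompleted = true;       // only the outermost commit takes effect
        }
    }

    void rollback() {
        if (depth < 1) throw new IllegalStateException("rollback without begin");
        rollbackPending = true;
    }

    void end() {
        if (depth < 1) throw new IllegalStateException("end without begin");
        depth--;
        if (depth == 0) {
            if (commitCompleted) {
                committedCount++;
            } else {
                rolledBackCount++;        // implicit rollback, no exception thrown
            }
        }
    }

    int depth() { return depth; }
    int committedCount() { return committedCount; }
    int rolledBackCount() { return rolledBackCount; }
}
```

Calling begin and end without commit in this model increments only the rollback counter, which mirrors the silent failure the text warns about.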

Step Index: Controlling Visibility of Uncommitted Updates

By default, application logic within the scope of a transaction can read two kinds of values: those that were committed by other transactions prior to the start of the current transaction (from the "snapshot") and those that were modified by the transaction itself. However, in some applications it is useful to control the visibility of modifications made by the current transaction. For example, update queries that select records to update and then change the very values used as selection criteria can produce anomalous results. See Halloween Problem for a succinct description of this issue. Persistit provides a mechanism to control visibility of a transaction's own modifications to avoid this problem.

While a transaction is executing, every updated value it generates is stored within a multi-version value and labeled with the transaction ID of the transaction that produced it and a small integer index (0-99) called the step.

The current step index is an attribute of the Transaction object available from {@link #getStep}. The begin method resets its value to zero. An application can invoke {@link #incrementStep} to increment it, or {@link #setStep} to control its current value. Modifications created by the transaction are labeled with the current step value.

When reading data, modifications created by the current transaction are visible to Persistit if and only if the step number they were assigned is less than or equal to the Transaction's current step number. An application can take advantage of this by controlling the current step index, for example, by reading data using step 0 while posting updates with a step value of 1.
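The visibility rule can be sketched as follows. This is a toy model under stated assumptions: StepSketch, its in-memory maps and its store and fetch methods are hypothetical and only borrow the setStep name from the API described above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy step-based visibility of a transaction's own updates (hypothetical). */
final class StepSketch {
    // values committed before this transaction started
    private final Map<String, Integer> snapshot = new HashMap<>();
    // key -> (step -> value) for updates made by this transaction
    private final Map<String, TreeMap<Integer, Integer>> ownUpdates = new HashMap<>();
    private int step;

    void putSnapshot(String key, int value) {
        snapshot.put(key, value);
    }

    void setStep(int newStep) {
        step = newStep;
    }

    /** Labels the update with the current step. */
    void store(String key, int value) {
        ownUpdates.computeIfAbsent(key, k -> new TreeMap<>()).put(step, value);
    }

    /** Sees own updates only if their step is less than or equal to the current step. */
    Integer fetch(String key) {
        TreeMap<Integer, Integer> mine = ownUpdates.get(key);
        if (mine != null) {
            Map.Entry<Integer, Integer> visible = mine.floorEntry(step);
            if (visible != null) {
                return visible.getValue();
            }
        }
        return snapshot.get(key);
    }
}
```

Reading at step 0 while writing at step 1 keeps a scan from observing the values it has just changed, which is the situation the Halloween Problem describes.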

Thread Management

As noted above, a Transaction typically belongs to one thread for its entire lifetime and is not thread-safe. However, to support server applications which may manage a large number of sessions among a smaller number of threads, Persistit allows an application to manage sessions explicitly. See {@link Persistit#getSessionId()} and {@link Persistit#setSessionId(SessionId)}. The method {@link Persistit#getTransaction()} is sensitive to the thread's current SessionId, and therefore the following style of interaction is possible:

Thread T1 is assigned work for session S.
Thread T1 invokes begin, does some work and then returns control to a client.
Thread T2 receives additional work to perform on behalf of session S.
Thread T2 sets its current SessionId to session S.
Thread T2 then uses {@link Persistit#getTransaction()} to acquire the same transaction context previously started by T1.
Thread T2 does additional work and then calls commit and end to complete the transaction.

Applications that use this technique must be written carefully to ensure that multiple threads never execute with the same SessionId. Concurrent access to a Transaction or Exchange can cause serious errors, including database corruption.
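The hand-off described in the steps above can be modeled with a thread-local session id and a map from session to transaction context. This is a sketch, not Persistit's implementation: SessionContextSketch, TxnContext and the use of String session ids are hypothetical, and the per-thread default session is an assumption.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy session-to-transaction lookup (hypothetical names and types). */
final class SessionContextSketch {

    /** Stand-in for a per-session transaction context. */
    static final class TxnContext {
        final String sessionId;
        boolean begun;

        TxnContext(String sessionId) {
            this.sessionId = sessionId;
        }
    }

    private final Map<String, TxnContext> bySession = new ConcurrentHashMap<>();

    // Assumption: each thread starts with its own private session id.
    private final ThreadLocal<String> currentSession =
            ThreadLocal.withInitial(() -> "thread-" + Thread.currentThread().getId());

    /** Binds the calling thread to the given session. */
    void setSessionId(String sessionId) {
        currentSession.set(sessionId);
    }

    /** Returns the context of whichever session the calling thread is bound to. */
    TxnContext getTransaction() {
        return bySession.computeIfAbsent(currentSession.get(), TxnContext::new);
    }
}
```

The model does nothing to stop two threads from binding to the same session at once; as the text notes, preventing that is the application's responsibility.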

Additional Notes

Optimistic concurrency control works well when the likelihood of conflicting transactions - that is, concurrent execution of two or more transactions that modify the same database records - is low. For most applications this assumption is valid.

For best performance, applications in which multiple threads frequently operate on overlapping data, such that rollbacks are likely, should implement their own locks to prevent or reduce the likelihood of collisions.
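One possible form of such application-level locking is a fixed set of striped locks keyed by record. The sketch below is only one option and is not part of Persistit: KeyLocks, its stripe count and the withLock helper are hypothetical, and the transaction body is passed in as a plain Runnable.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Toy striped locks for serializing transactions that touch the same key (hypothetical). */
final class KeyLocks {
    private final ReentrantLock[] stripes;

    KeyLocks(int stripeCount) {
        stripes = new ReentrantLock[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            stripes[i] = new ReentrantLock();
        }
    }

    /** Equal keys always map to the same lock; different keys may share one. */
    ReentrantLock lockFor(Object key) {
        return stripes[Math.floorMod(key.hashCode(), stripes.length)];
    }

    /** Runs the transaction body while holding the lock for key. */
    void withLock(Object key, Runnable transactionBody) {
        ReentrantLock lock = lockFor(key);
        lock.lock();
        try {
            transactionBody.run();       // e.g. a begin/commit/end block
        } finally {
            lock.unlock();
        }
    }
}
```

Striping bounds the number of lock objects at the cost of occasional false sharing between unrelated keys; transactions that touch several keys would need a consistent lock ordering to avoid deadlock.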

An application can examine counts of commits, rollbacks and rollbacks since the last successful commit using {@link #getCommittedTransactionCount()}, {@link #getRolledBackTransactionCount()} and {@link #getRolledBackSinceLastCommitCount()}, respectively.

@author peter @version 1.1