Software Transactional Memory III - Making Transactions Atomic

Thursday, July 5, 2007

Now that the basic data unit of my .NET Software Transactional Memory (NSTM) has been introduced - transactional objects (txo), aka INstmObject, which implement the Isolation property of transactions - the question is where Atomicity comes from. Enter: the transaction log.

Recording Memory Interactions

The transaction log (txlog) records all objects you touch during a transaction. Whenever you write to a txo, that's logged in the txlog. Whenever you read from a txo, that's logged in the txlog. So the txlog contains a list of all objects interacted with in a transaction, including their current values. Even though my previous posting might have suggested that a txo maintains a clone of its value, it is in fact the txlog attached to each transaction that does.

When you read from or write to a txo you don't access its value directly. Rather, the INstmObject goes to the current transaction and asks it what to do: which value to return or where to store a new value. The transaction then consults its transaction log:

If an object is accessed for the first time during the transaction, a log entry for it is added to the txlog.
If a txo is written to, the new value is put into its txlog entry instead of the txo itself. This isolates changes made to the same txo in different transactions from each other.
If a txo is read from, the transaction checks which value to return. If a new value is already present in the log entry, that one is chosen. If no new value has been assigned, the current value is returned: either the real current value from the txo, or the current value as cloned on the first read access if the clone option is CloneOnRead.
Also, the txo is validated if the transaction's isolation level is Serializable. That means the current version number of the txo is compared to the version number it had when it was first accessed during the transaction. Validation fails if those versions do not match, which means some other transaction has committed changes to the object in the meantime. This avoids inconsistencies in the form of different values read from the same txo during one transaction. If you want to allow such changes, set the isolation level to ReadCommitted.
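The rules above can be sketched in a few lines. This is not NSTM's code (NSTM is a .NET library); it is a minimal Python illustration, and every name in it (Txo, LogEntry, Transaction, ValidationError) is made up for the example. Only the two isolation levels, Serializable and ReadCommitted, are taken from the text. The CloneOnRead option is left out to keep the sketch short.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

_NO_VALUE = object()  # marker: no new value has been written yet


class IsolationLevel(Enum):
    READ_COMMITTED = auto()
    SERIALIZABLE = auto()


class ValidationError(Exception):
    """The txo was changed by another commit since its first access."""


@dataclass
class Txo:
    """Hypothetical stand-in for a transactional object."""
    value: object
    version: int = 0


@dataclass
class LogEntry:
    txo: Txo
    version_at_first_access: int
    new_value: object = _NO_VALUE


@dataclass
class Transaction:
    isolation: IsolationLevel = IsolationLevel.SERIALIZABLE
    log: dict = field(default_factory=dict)  # id(txo) -> LogEntry

    def _entry(self, txo):
        # The first access during this transaction adds a log entry.
        entry = self.log.get(id(txo))
        if entry is None:
            entry = LogEntry(txo, txo.version)
            self.log[id(txo)] = entry
        return entry

    def write(self, txo, value):
        # New values go into the log entry, never into the txo itself.
        self._entry(txo).new_value = value

    def read(self, txo):
        entry = self._entry(txo)
        # Serializable: fail if someone else committed in the meantime.
        if (self.isolation is IsolationLevel.SERIALIZABLE
                and txo.version != entry.version_at_first_access):
            raise ValidationError("txo changed since first access")
        # Prefer the value written in this transaction, if any.
        if entry.new_value is not _NO_VALUE:
            return entry.new_value
        return txo.value
```

With this sketch, a write followed by a read inside the same transaction returns the new value while the txo itself still holds the old one. A Serializable transaction raises ValidationError on the next read after the txo's version has changed, whereas a ReadCommitted transaction simply returns the newly committed value.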

Any changes to transactional objects during a transaction are accumulated in the transaction log. Txos are thus never changed directly by an application. This provides Isolation and the first half of Atomicity: nothing happens to transactional objects if a transaction fails, because all changes recorded in the txlog are simply lost.

Ending a Transaction

A transaction can be ended in two ways: either by rolling it back and discarding all changes or by committing it.

Rolling back is easy: the transaction log is simply discarded. That's it. No further effort is needed. No locks were held on txos which would need unlocking. No changes were made which would need to be undone.
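To make that concrete, here is a tiny Python sketch of a rollback, assuming a log that only buffers pending writes. The class and method names are hypothetical and not NSTM's API.

```python
class Txo:
    """Hypothetical stand-in for a transactional object."""

    def __init__(self, value):
        self.value = value


class Transaction:
    def __init__(self):
        self.log = {}  # id(txo) -> (txo, pending new value)

    def write(self, txo, value):
        # The write is buffered in the log; the txo stays untouched.
        self.log[id(txo)] = (txo, value)

    def rollback(self):
        # Discarding the log is all there is to do:
        # nothing to unlock, nothing to undo.
        self.log.clear()


counter = Txo(1)
tx = Transaction()
tx.write(counter, 2)
tx.rollback()
print(counter.value)  # prints 1
```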

Committing a transaction, on the other hand, is a two-step process:

First, all txos read from with more than just PassingReadOnly mode are validated. (Currently this is also true for ReadWrite mode txos, but I'm unsure if that's necessary. Also, I'm currently not content with how to switch between validation on Commit() only and validation on each Read().) During validation, all transactional objects opened in ReadWrite mode are also locked. This freezes the current view of transactional memory for the duration of the commit. No other transaction may commit to the same txo at the same time.
Where locking comes into play, deadlocks need to be avoided. Therefore all txos are kept in a sorted list so that each transaction locks them in the same order. This is a common way to rule out deadlocks.
If any transactional object cannot be validated, the commit is aborted and the transaction is rolled back.
Second, all locked txos that were written to are visited again to copy their new values to the txo itself. At the same time the version number of each txo is incremented to allow for easy optimistic locking, aka validation. Afterwards the txo is unlocked.
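The two steps can be sketched as follows. Again this is only a Python illustration under assumptions, not NSTM's implementation: the names are invented, every logged txo is validated (there is no PassingReadOnly mode here), the lock order is given by a hypothetical per-object sort_key, and all locks are released together at the end rather than one by one.

```python
import itertools
import threading

_ids = itertools.count()  # hypothetical source of a global lock order


class CommitFailed(Exception):
    """Validation failed; the transaction was rolled back."""


class Txo:
    def __init__(self, value):
        self.value = value
        self.version = 0
        self.lock = threading.Lock()
        self.sort_key = next(_ids)


class Transaction:
    def __init__(self):
        # sort_key -> [txo, version seen at first access, new value, written?]
        self.log = {}

    def _entry(self, txo):
        if txo.sort_key not in self.log:
            self.log[txo.sort_key] = [txo, txo.version, None, False]
        return self.log[txo.sort_key]

    def read(self, txo):
        entry = self._entry(txo)
        return entry[2] if entry[3] else txo.value

    def write(self, txo, value):
        entry = self._entry(txo)
        entry[2], entry[3] = value, True

    def commit(self):
        # Visiting entries in sorted order means every transaction
        # acquires locks in the same order, so no deadlock can form.
        entries = [self.log[key] for key in sorted(self.log)]
        locked = []
        try:
            # Step 1: lock the written txos and validate all versions.
            for txo, seen, _, written in entries:
                if written:
                    txo.lock.acquire()
                    locked.append(txo)
                if txo.version != seen:
                    raise CommitFailed("txo changed since first access")
            # Step 2: copy new values to the txos and bump their versions.
            for txo, _, new_value, written in entries:
                if written:
                    txo.value = new_value
                    txo.version += 1
        finally:
            for txo in locked:
                txo.lock.release()
            self.log.clear()  # on failure this is the rollback
```

If two transactions both write to the same txo, the one that commits first wins. The second one finds a version mismatch in step 1, raises CommitFailed, releases its locks and leaves the txo as the first commit wrote it.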

By locking modified txos (for a very short time) during commit, Atomicity is ensured. An application either sees no changes at all when a transaction is rolled back, or all changes at once after Commit() has finished and all modified objects have been updated and unlocked.

What's next?

Now that I've explained how a single transaction works, it's time to look at how multiple concurrent transactions on one or more threads are managed. Stay tuned if you are interested to see how NSTM implements truly nested transactions.

Very nifty! Great articles, looking forward to the rest!

However after reading through the code, a few questions surfaced in my head:

- I haven't seen an optimization for journalling blocks (Arrays). Do you plan to include those in the future?

- Since you've used your own API to create the collection, how did you find its usability? How would a novice perceive it?
...

Cheers,

Cornelius van Berkel - Thursday, July 5, 2007 8:42:14 PM

@Cornelius: What do you mean by "journalling blocks"?

The API of my collections tries to focus on the most important operations of each data structure. I did not strive for completeness or try to mimic the .NET collections as closely as possible. Rather, I wanted to arrive as quickly as possible at workable data structures of different kinds to be able to play around with NSTM on a higher level.

If NSTM proves to be useful I'll invest more time in fleshing out the collections.

-Ralf

PS: I tried to open your website berkelsoftware.nl, but it seems to be down.

ralfw - Friday, July 6, 2007 8:58:28 AM

