As a generalist, what must I know regarding ‘transactions’?

I believe a generalist, or a team of generalists, must offer the following skills related to the implementation of ‘business transactions’.

The failure of a business transaction must still leave the system in a safe, and valid state. As a ‘generalist’ I must either know how to achieve that goal, or know where to quickly find a solution.


As always, I want to be able to solve this problem on two platforms – the Java eco-system, and Node.js.


Short-lived business transactions

Most of us are familiar with ACID transaction support in relational databases.  Relying on this support is only recommended for short-lived business transactions.

Database transactions are typically implemented by locking the affected rows or tables on behalf of one user, which forces all others to wait until the locks are released. This necessarily slows the system down, among other complexities. Hence the recommendation that ACID transactions be very short-lived.

Here are some examples of short-lived business transactions.

  • Change the address of a customer. Typically you have already gathered the new address, and you simply have to update a few tables with the new data.
  • Apply a payment against a policy. Again, the whole transaction is typically an update to a few tables.

In the Java eco-system, this sort of short-lived transaction, when applied against a single database, is implemented with the JDBC API. We must be able to do that using Java, Scala, and Groovy.
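To make the JDBC case concrete, here is a minimal sketch of the "change the address of a customer" example as a single short-lived transaction. The table names (customer, billing_contact), the column names, and the class and method names are all assumptions made up for illustration; only the JDBC calls themselves (setAutoCommit, commit, rollback) are standard.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class AddressUpdate {

    // Updates two hypothetical tables as one ACID transaction:
    // either both updates are committed, or neither is.
    static void changeAddress(Connection conn, long customerId, String newAddress)
            throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // take manual control of the transaction
        try {
            // Hypothetical schema: customer(id, address)
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPDATE customer SET address = ? WHERE id = ?")) {
                ps.setString(1, newAddress);
                ps.setLong(2, customerId);
                ps.executeUpdate();
            }
            // Hypothetical schema: billing_contact(customer_id, address)
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPDATE billing_contact SET address = ? WHERE customer_id = ?")) {
                ps.setString(1, newAddress);
                ps.setLong(2, customerId);
                ps.executeUpdate();
            }
            conn.commit(); // both updates become visible together
        } catch (SQLException | RuntimeException e) {
            try {
                conn.rollback(); // discard any partial work
            } catch (SQLException rollbackFailure) {
                e.addSuppressed(rollbackFailure);
            }
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit); // restore the caller's setting
        }
    }
}
```

The transaction stays open only for the duration of the two updates, which is the point of keeping it short-lived. The caller is assumed to obtain the Connection from a driver or a connection pool and to close it afterwards.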

I must be able to talk to relational databases, and manage ACID transactions against a single database, using Node.js.


Long-lived business transactions

However, often, there are business transactions that are long-lived, and must behave gracefully.

Here is one example of a long-lived transaction – migrating old insurance policies from one system to another.

This is historical data, often several years' worth, and can be voluminous. It contains many different parts, like contacts, coverages, changes made to the policy over the years, documents, etc. Accepting all of the policy into a new system can take a significant time. We used to run into policies that took 20 seconds to process completely. Things will go wrong, and when they do, you would like to cleanly roll back all of the incoming data. However, you cannot keep an ACID database transaction open for 20 seconds, 10, or even 5 seconds. That holds database locks for the whole duration, which in turn will severely diminish your ability to handle load.

Here is another example – business workflows that extend over several days.

They are initiated, passed around to several folks, and then eventually completed. If for some reason this business process ends in some kind of rejection, or failure, you may want to discard, or perhaps archive data that this workflow created.

Consider that new accident information is received for an automobile policy. Maybe some documents are uploaded. Some premium adjustments are made. Some bills are generated. Underwriting, and billing managers review, and sign off. This process may take several days to finish. Say after doing a lot of work, you discover all this work was done on the wrong policy – perhaps an ex-husband’s. How do you roll back this work, and data that has been accumulating over several days? Surely not with a database transaction.

So how do you ensure that such operations are well behaved? If necessary, how do you ensure that these long-lived business transactions exhibit ACID properties?

As a ‘generalist’ I must know standard approaches, and solutions to this problem. Further, I must be able to implement these solutions in the Java eco-system, and in Node.js.


Single database vs. multiple

In a large, heterogeneous environment, you are often working with several databases. Perhaps all documents are in some legacy SQL Server DB, and day-to-day transactional data are in a fast MySQL DB.

How do you implement ACID when a business transaction, even a short-lived one, works with data that is distributed across more than one database?

In the Java world, there is the JTA API, which supports so-called ‘distributed transactions’. But very few people seem to use this. As a ‘generalist’ I must know what the alternative is.

Similarly, I must know how this problem can be handled in Node.js.


Polyglot persistence

This is just a special case of the multiple database scenario. The data repository can be anything at all – relational DB, NoSQL DB, text-based index, messaging endpoint, flat disk file, etc.

How do you implement ACID when the business transaction works with many different kinds of data repositories?

For instance, say you are recording an auto accident. Photos of the car might go into a NoSQL database, like MongoDB. A description of the accident is saved to Oracle, and this information is also parsed and pushed into Lucene. Finally, a notification is dropped into a queue that a claims adjuster is watching. If this transaction dies for some reason, you want to roll back the changes you made to each of these very disparate data repositories.
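One commonly used approach here, which this post has not named so far, is compensating actions (often called the saga pattern): every step that writes to a repository is paired with an action that undoes it, and on failure the completed steps are undone in reverse order. The sketch below is only an illustration of that idea in plain Java. The class, the Step type, and the step names are hypothetical, and the real MongoDB, Oracle, Lucene, and queue calls would go inside the callbacks.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

final class CompensatingTransaction {

    /** Like Runnable, but allowed to throw checked exceptions. */
    interface Action {
        void run() throws Exception;
    }

    /** One unit of work paired with the action that undoes it. */
    static final class Step {
        final String name;
        final Action apply;
        final Action compensate;

        Step(String name, Action apply, Action compensate) {
            this.name = name;
            this.apply = apply;
            this.compensate = compensate;
        }
    }

    // Runs the steps in order. If one fails, undoes the steps that already
    // completed, most recent first, and then rethrows the original failure.
    static void execute(List<Step> steps) throws Exception {
        Deque<Step> completed = new ArrayDeque<>();
        try {
            for (Step step : steps) {
                step.apply.run();
                completed.push(step); // remember it only after it succeeded
            }
        } catch (Exception failure) {
            while (!completed.isEmpty()) {
                Step done = completed.pop();
                try {
                    done.compensate.run();
                } catch (Exception undoFailure) {
                    // Keep undoing the remaining steps; report this one alongside the cause.
                    failure.addSuppressed(undoFailure);
                }
            }
            throw failure;
        }
    }
}
```

For the accident example, the steps would be something like "save photos / delete photos", "save description / delete description", "index description / remove from index", and "enqueue notification / enqueue a cancellation". Note that this gives an approximation of atomicity only, not isolation: other readers can see the intermediate state before the undo runs, and an undo can itself fail, so this is not a full substitute for ACID.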