Red Hat Developer Program

Transactions in the Cloud

For this meetup we have two short talks, both on the theme "Transactions in the Cloud". The talks cover novel research undertaken by this summer's intern students.

Transactions and NoSQL Review

It is common to forgo transactions in an application because of untested performance or scalability concerns. While transactions do come at a cost, they also bring strong guarantees that are essential to a correctly running application. Many transaction options are available, each with different costs and guarantees. Rather than dismiss transactions altogether, it is more prudent to consider how your application would perform under several transaction models and to select the option that strikes the right balance between performance and guarantees.

This presentation provides a useful reference for developers and architects who need the guarantees that transactions can bring but don't know which model is right for them. In particular, the presentation will:

1) Review each of the transaction options available in NoSQL today.
2) Compare ACID and Extended (relaxed ACID) transaction models and explain how these models relate to NoSQL and scalability.
3) Present performance results for a selection of transaction options under different classes of workload, covering a single node as well as a cluster of sharded nodes.

Scaling a transaction manager in the Cloud

Transaction coordination systems such as Narayana rely on storage facilities to recreate transaction state during recovery from a system crash. The transaction logs that hold this state are usually stored on RAID hard drives to maximize reliability. However, disk I/O is relatively slow, and writing the logs can become a bottleneck in high-performance transaction systems. Using the fastest available disk subsystems is thus an expensive necessity for many enterprise applications.
Using a cluster of inexpensive, unreliable hardware nodes together with data replication for redundancy is an increasingly popular and cost-effective architecture for many highly scalable, high-reliability applications. Data grids such as Infinispan provide distributed in-memory replication of Java objects, which makes programming for this kind of cluster relatively simple. This presentation will cover a new transaction logging plugin for Narayana that uses a data grid to replicate the logs to main memory on other nodes in a cluster, in preference to writing them to disk. It will also cover performance benchmarks and reliability trade-offs.