IEEE CLOUD'16: Optimized Durable Commitlog for Apache Cassandra Using CAPI-Flash


High-velocity data imposes high durability overheads on Big Data technology components such as NoSQL data stores. In Apache Cassandra, a widely used NoSQL solution with high scalability and availability, write-ahead logging is used to support Commitlog operations, which in turn provide fault tolerance to applications. However, current write-ahead logging techniques are limited by excessive overhead in the I/O subsystem. To address this performance gap, we have designed a novel CAPI-Flash-based, high-performance durable Commitlog for Apache Cassandra. We take advantage of the high-throughput, low-latency path to flash storage provided by the Coherent Accelerator Processor Interface (CAPI) on IBM POWER8 Systems. Our experimental results show that for write-intensive workloads CAPI-Flash logging provides up to 107% improvement in throughput compared to Cassandra’s durable alternative. We also provide 77% better throughput in update-mostly workloads.