A Linearizable and Partition-tolerant Scalable Key-Value Store |
IntroductionCATS is a distributed key-value store that guarantees linearizability and partition tolerance in partially synchronous and dynamic network conditions. CATS is scalable, elastic, and self-organizing; key properties for modern cloud storage middleware. Distributed key-value stores, such as Cassandra and Dynamo, employ principles from Distributed Hashtables (DHTs) to build scalable and self-managing data stores. In contrast to CATS, these systems chose availability over atomic consistency, hence only providing eventual consistency . While eventual consistency is sufficient for some applications, the complexities of merging divergent replicas can be non-trivial. We avoid the complexities entailed by eventual consistency while providing scalable storage for critical applications which need atomic consistency, guaranteeing it at the cost of a modest decrease in throughput. Our system evaluation shows that consistency can be achieved with practical performance and modest overhead: 5% decrease in throughput for read-intensive workloads, and 25% throughput loss for write-intensive workloads. CATS delivers submillisecond operation latencies under light load, single-digit millisecond operation latencies at 50% load, and it sustains a throughput of one thousand operations per second, per server, while scaling linearly to hundreds of servers. CATS employs decentralized P2P techniques, thus enabling unlimited scalability. It uses consistent hashing for distributing storage responsibilities amongst the nodes, and for enabling self-organization. Read and Write (Get and Put) operations use ABD, extended with Consistent Quorums, and replication groups are reconfigured using Paxos. CATS is implemented in Java using the Kompics framework. The project was carried out at KTH - Royal Institute of Technology, and the Swedish Institute of Computer Science in Sweden. |
PeopleThere have been significant contributions from master thesis students to add features (including persistence, recovery, range queries) to the original in-memory CATS implementation. These include:
|
Publications
|
Source CodeSource code for the in-memory implementation of CATS can be downloaded from here. A public git clone URL is coming soon.An extended version of CATS, with features including persistence, crash-recovery, and range queries, can be downloaded from here. |
StatusScalable storage systems are important building blocks for today's web-scale applications. While CATS will be extended with more features, there are follow-up projects on storage systems in our lab. These include Caracal DB (contact Lars or Alex), and LLD (contact Jim). |
|