Key-Value Store
A key-value store is a non-relational database that stores data as key-value pairs. Each key must be unique, and the value associated with a key can only be accessed through that key. Keys can be plain text or hashed values.
Some common key-value stores are Amazon DynamoDB, Redis, Azure Cosmos DB, and Memcached.
Key-value stores usually support these common operations:
- Get(key)
- Put(key, value)
- Delete(key)
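To make the interface concrete, here is a minimal single-node sketch in Python that backs Get, Put, and Delete with an in-memory dictionary; the class name and the behaviour on missing keys are illustrative assumptions, not taken from any particular product.

```python
# Minimal in-memory key-value store sketch (illustrative, single node).
class KeyValueStore:
    def __init__(self):
        self._data = {}

    def get(self, key):
        # Return the value for key, or None if the key is absent.
        return self._data.get(key)

    def put(self, key, value):
        # Insert a new pair or overwrite the existing value for key.
        self._data[key] = value

    def delete(self, key):
        # Remove the key if present; deleting a missing key is a no-op here.
        self._data.pop(key, None)


store = KeyValueStore()
store.put("user:42", {"name": "Alice"})
print(store.get("user:42"))   # {'name': 'Alice'}
store.delete("user:42")
print(store.get("user:42"))   # None
```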
When designing a distributed key-value store, it is important to understand the CAP theorem. See here for more details: CAP Theorem
Key-value stores are classified based on the two CAP characteristics they support:
- CP: consistency and partition tolerance
- AP: availability and partition tolerance
- CA: consistency and availability. Since network failure is unavoidable, a distributed system must tolerate network partitions. Therefore, a CA system cannot exist in a real-world application.
Partitioning Data in Key-Value Stores
Consistent hashing should be used to distribute data between servers in a key-value store. Refer here for more info: Consistent Hashing
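As a rough sketch of the idea (see the linked reference for the full treatment), the snippet below hashes servers onto a ring and assigns each key to the first server found clockwise from the key's position. The MD5-based hash, the server names, and the absence of virtual nodes are simplifying assumptions.

```python
import bisect
import hashlib

def _hash(value: str) -> int:
    # Map a string to a position on the ring (illustrative choice of hash).
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, servers):
        # The ring is a sorted list of (position, server) pairs.
        self._ring = sorted((_hash(s), s) for s in servers)
        self._positions = [pos for pos, _ in self._ring]

    def lookup(self, key: str) -> str:
        # Walk clockwise: the first server position past hash(key), wrapping around.
        idx = bisect.bisect_right(self._positions, _hash(key)) % len(self._ring)
        return self._ring[idx][1]

ring = HashRing(["s0", "s1", "s2", "s3"])
print(ring.lookup("user:42"))  # stays stable unless ring membership changes
```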
Data Replication
To achieve high availability and reliability, data must be replicated to n servers, where n is a configurable parameter. These n servers are chosen as follows: after a key is mapped to a position on the hash ring, walk clockwise from that position and store copies of the data on the first n distinct servers encountered.
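The sketch below illustrates this placement rule under the same simplifying assumptions as the partitioning sketch above: it walks clockwise from the key's position and collects the first n distinct servers (skipping duplicates, which matters when virtual nodes are used).

```python
import bisect
import hashlib

def _hash(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

def replicas_for(key: str, servers, n: int):
    # Build the ring, then walk clockwise from the key's position and
    # collect the first n distinct servers to hold the data copies.
    ring = sorted((_hash(s), s) for s in servers)
    positions = [pos for pos, _ in ring]
    start = bisect.bisect_right(positions, _hash(key))
    chosen = []
    for i in range(len(ring)):
        server = ring[(start + i) % len(ring)][1]
        if server not in chosen:
            chosen.append(server)
        if len(chosen) == n:
            break
    return chosen

print(replicas_for("user:42", ["s0", "s1", "s2", "s3", "s4"], n=3))
# e.g. ['s3', 's4', 's0'] -- the first 3 servers clockwise from the key
```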
Consistency
Since data is replicated to multiple nodes, it must be kept synchronized across all replicas. Quorum consensus can guarantee consistency for both read and write operations.
Let us first establish some definitions before diving deeper into consistency:
- n: the number of replicas
- r: the read quorum size; a read operation must receive responses from at least r replicas
- w: the write quorum size; a write operation must be acknowledged by at least w replicas
w = 1 means that the coordinator must receive at least one acknowledgement before the write is considered successful. For example, if we have 3 replicas (s0, s1, and s2) and s1 acknowledges the write, we no longer need to wait for s0 or s2. The same rule applies to reads.
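To make the acknowledgement flow concrete, here is a hedged sketch of a coordinator that sends a write to all n replicas in parallel and reports success as soon as w acknowledgements arrive. The replica_write function merely simulates replica latency; in a real store it would be a network call.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def replica_write(replica: str, key: str, value: str) -> str:
    # Simulated replica: sleep for a random "network + disk" delay, then ack.
    time.sleep(random.uniform(0.01, 0.1))
    return replica

def quorum_write(replicas, key, value, w):
    # Send the write to every replica, return success after w acknowledgements.
    pool = ThreadPoolExecutor(max_workers=len(replicas))
    futures = [pool.submit(replica_write, r, key, value) for r in replicas]
    acks = []
    for future in as_completed(futures):
        acks.append(future.result())
        if len(acks) >= w:
            pool.shutdown(wait=False)   # don't block on the slower replicas
            return True, acks
    pool.shutdown()
    return False, acks

ok, acks = quorum_write(["s0", "s1", "s2"], "user:42", "Alice", w=1)
print(ok, acks)   # True, e.g. ['s1'] -- one acknowledgement was enough
```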
The configuration of w, r, and n is a typical trade-off between latency and consistency. If w = 1 or r = 1, an operation returns quickly because the coordinator only needs to wait for a response from one replica. If w or r is greater than 1, the system offers better consistency but higher latency. If w + r > n, strong consistency is guaranteed because the read and write quorums must overlap in at least one replica that holds the latest data.
If r = 1 and w = n, the system is optimized for fast reads. If w = 1 and r = n, the system is optimized for fast writes. If w + r > n, strong consistency is guaranteed. If w + r <= n, strong consistency is not guaranteed.
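These rules can be captured in a tiny helper that checks the overlap condition; the function name is purely illustrative.

```python
def is_strongly_consistent(n: int, w: int, r: int) -> bool:
    # Read and write quorums overlap in at least one replica iff w + r > n.
    return w + r > n

print(is_strongly_consistent(n=3, w=3, r=1))  # True  -- fast reads (r = 1, w = n)
print(is_strongly_consistent(n=3, w=1, r=3))  # True  -- fast writes (w = 1, r = n)
print(is_strongly_consistent(n=3, w=1, r=1))  # False -- w + r <= n, not guaranteed
```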
Consistency Models
- Strong consistency: a client never sees out-of-date data; every read returns the value of the most recent write.
- Weak consistency: subsequent read operations may not see the most recent value.
- Eventual consistency: a form of weak consistency; given enough time, all updates are propagated and all replicas become consistent.