CRDT - Replicated Data Types
Wikipedia has a generally good introductory article on CRDTs. Conflict-free replicated data types are subdivided into two main subtypes: op-based Commutative (CmRDT) and state-based Convergent (CvRDT). CvRDT was originally defined in terms of merging state snapshots. CmRDT was defined in terms of applying atomic ops to the state.
Although the two types are formally equivalent, there is a great difference from the engineering perspective. The op-based subtype needs a "causal broadcast" layer that provides exactly-once delivery guarantees for mutations (ops). The state-based subtype needs more of per-object metadata to merge any two versions of that object.
Despite the differences, there was a great deal of convergence between those two types.
CmRDT (op-based)
The top difficulty of the op-based approach is the exactly-once op delivery. We believe that difficulty to be greatly exaggerated. From the standpoint of some abstract lossy message-passing environment that may be hard indeed. Practically, many of those problems are reliably solved in existing systems. The Swarm solves exactly-once delivery by combining:
- unique op identifiers,
- a log-structured storage engine and
- a TCP-like reliable transport.
Technically, the key advantage of CmRDT is the efficient propagation of mutations. One op only carries its change, and not the rest of the object's state. In line with the classic state machine replication model, CmRDT may compress its op log into a state snapshot.
The biggest win is architectural: orthogonality of concerns. Ops are immutable. The broadcast layer can store and forward ops based on generic type-independent rules. As the broadcast solves delivery issues, data type implementation is greatly simplified too.
CvRDT (state-based)
CvRDTs are rather forgiving to the delivery order. Actually, the mutation delivery order does not matter at all. The cost of such a resilience is more of per-object metadata. Objects can only be delivered as a full state snapshot, including all of the metadata. That is especially painful for client-side use. Just imagine that your Google Doc re-downloads itself on every your keystroke (Google Docs is op-based, but not CRDT).
Delta-enabled CvRDT resolve that inefficiency. Deltas take the middle ground between state and ops. They only carry the changed state. The cost of efficiency is the new requirement: at-least-once delivery. That puts "delta-enabled state-based" much closer to "snapshot-enabled op-based" in terms of abilities and requirements.
Some less-severe issues remain:
- delta handling is type-specific,
- deltas are mutable, and
- objects still carry metadata for state-to-state merge.
The main deficiency of CvRDT is also architectural. There is no clear separation of concerns between the transport/broadcast layer and the CRDT/math layer. They entangle.