Consensus and Distributed Systems

Computing is at its simplest when there is a single machine. Your brother’s laptop can fail without making your phone crash. Distributed systems increase complexity by coordinating a set of connected machines. Consensus algorithms help coordinate and provide a sequence by which the distributed system comes to agreement.

The CAP theorem

The choice between safety and liveness in blockchains consensus protocols stems from the CAP theorem, which states that is impossible for a distributed system with read-write storage in an asynchronous network to provide all of the following guarantees: consistency, availability, and partition tolerance.

  • Consistency: Every read receives the most recent correct state update or an error.
  • Availability: Every read eventually receives a valid response without the guarantee it is the most recent update.
  • Partition tolerance: The network can continue to operate despite a communications failure.

Under normal circumstances, a network can provide both availability and consistency. However, when a partition occurs a decision must be made of which of the remaining two guarantees will be provided. Phrased differently, an asynchronous network with read-write storage can only provide consistency or availability in the presence of a partition.

Choosing availability over consistency will allow the network to always return a query, even if the query isn’t the most recent update. Choosing consistency over availability will make the network return an error if messages aren’t the latest update.

Other models of consistency and availability explore the tradeoff space between consistency and availability in the presence of partitions. If the system wants to prioritize strong availability, weaker consistency models can be provided in the presence of partitions. Eventual consistency is one such consistency model where all processes will eventually return the last updated value if no new updates are made to the given value.

Conversely, weaker availability can be provided if strong consistency is prioritized. A system can remain available for reads but not writes to preserve strong consistency during a partition.

The FLP impossibility

The FLP impossibility is another closely related impossibility result to the CAP theorem in distributed systems. The result states that for a deterministic fault-tolerant consensus protocol, consensus cannot be achieved in the asynchronous setting. The problem stems from the fact that a node cannot determine the difference between a crashed node and a delayed message as the asynchronous setting assumes that messages have an unbounded cap on delivery.

The impossibility result relates to the consensus problem, which states that each process has an initial value and that all correct processes must agree on a single value.

  • Agreement: All correct processes decide on the same value.
  • Termination: All correct processes eventually decide on a value.
  • Validity: All correct processes with the same initial value decide on that same value.

Because of the inability to distinguish between crashed nodes and delayed messages, termination cannot be achieved in the asynchronous setting. The impossibility result even applies to a very weak form of the consensus problem, where only some process is required to decide. However, only crash failures are assumed, and the message system is assumed to deliver all messages correctly and exactly once.

Many of the assumptions that the impossibility result makes are limiting and unrealistic for practical systems. To circumvent the impossibility result, the assumptions can be relaxed. A common solution is to change the communication model.

  • Synchronous: Messages will be received within a fixed amount of time.
  • Partially synchronous: Messages will be received within a fixed but unknown amount of time.
  • Asynchronous: Messages have an infinite amount of time to be delivered.

Changing from an asynchronous to a partially synchronous model, the protocol can satisfy termination because nodes can utilize timeouts to assume that a message won’t be delivered. Additionally, the partially synchronous model still assumes that messages can be arbitrarily dropped or delayed, rather than the strong assumptions made by the synchronous model that state messages will always reach their destination within a known and fixed amount of time.

Byzantine fault tolerance

Under the premise of open distributed systems, where processes can enter and exit in a permissionless nature, it is not enough to only assume crash failures as processes may be able to act arbitrarily against the rules of the protocol. A byzantine failure is a type of failure where there are no assumptions made about the way a byzantine process can act. Byzantine failures are particularly relevant to the concept of blockchains and permissionless systems, where anybody can enter with the ability to behave with malicious intent.

Byzantine fault tolerance (BFT) is the ability of a distributed system to tolerate byzantine failures. Importantly, byzantine fault tolerant systems can only tolerate byzantine failures up to a specific threshold. At a certain point, the byzantine processes will be break guarantees provided by the system as stated in the CAP theorem.

The pBFT algorithm, as proposed by the practical byzantine fault tolerance paper in 1999, follows a three stage commit protocol for state machine replication that tolerates at most f byzantine processes with at minimum 3f + 1 processes present to hold availability and consistency guarantees.

The algorithm follows four main steps:

  1. Client invokes a request
  2. The primary receives the request and forwards it to all other replicas
  3. Replicas execute the request and reply to the client
  4. The client waits for f + 1 matching replies from replicas

With the emergence of blockchains, beginning with Bitcoin in 2009, interest in BFT consensus algorithms renewed due to their permissionless properties which require strong byzantine fault tolerance. Since then, an emerging field of research into byzantine fault tolerant consensus algorithms has taken place, much of which is squarely focused on blockchains.