How to Reach Consensus Without Trust?
“Consensus” is a major buzzword in the world of blockchain-based cryptocurrency: it’s the technical foundation of blockchain, it’s the title of the annual crypto convention, and it evokes the philosophy of trust in egalitarian anonymity among enthusiasts.
But cryptocurrency, despite its name recognition and hype, is just one use case of blockchain, and blockchain is a tiny fraction of the amazing world of systems that use consensus to make computers work well together. In fact, consensus-based distributed systems are the reason we can trust global real-time banking, collaborate with others on shared documents, and organize ourselves into a cloud-based economy. The kind of consensus you need depends on the amount of trust you already have.
Consensus from Humans to Computers
Imagine trying to decide where to go for dinner with a group of friends. One friend loves sushi; another is craving pizza; there’s a vegetarian or two; and someone else suggests trying a new taco place. Nobody in the group has the authority to make the decision on their own. Most importantly, the group wants to eat together, so consensus will be required. After a bit of back-and-forth, you all settle on the pizza spot. Consensus reached!
Computers reach consensus in a similar way. In situations where multiple computers need to work together without a single authority telling them what to do, they need to constantly agree on what they’re doing. So, every step of the way, a group of computers (often called nodes in this context) spends a bit of time chatting back and forth before agreeing on a single version of the truth: which restaurant meets everyone’s needs, which piece of data is correct, what transaction to record, or what the next action in a sequence should be.
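To make the back-and-forth concrete, here is a minimal sketch of the dinner decision in Python. The friends' names, their preferences, and the `agree` function are all made up for illustration, and the rule used (pick the first proposal that every participant accepts) is a deliberate simplification, not a real consensus protocol.

```python
from typing import Optional

# Hypothetical data: each friend (node) lists the options they can live with.
acceptable = {
    "ana": {"pizza", "tacos"},
    "ben": {"pizza", "sushi"},
    "cho": {"pizza", "tacos", "sushi"},
}

def agree(proposals: list[str], acceptable: dict[str, set[str]]) -> Optional[str]:
    """Return the first proposal that every node accepts, or None if none does."""
    for option in proposals:
        # Each node "votes" yes if the option is in its acceptable set.
        votes = [option in prefs for prefs in acceptable.values()]
        if all(votes):
            return option
    return None

decision = agree(["sushi", "tacos", "pizza"], acceptable)
print(decision)  # pizza
```

Sushi and tacos are each rejected by one friend, so the group settles on pizza. If no proposal satisfies everyone, the function returns `None`, the equivalent of the group failing to reach consensus.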
This agreement is critical in environments where knowledge and actions are spread out across many different nodes. Without consensus, each node in a group trying to work together would do its own thing and potentially undertake redundant or contradictory tasks. But even more importantly, the constant agreement feeds into a control plane that sees the whole picture and can orchestrate the work holistically.
Consensus with No Trust
Computers are like humans in that they need to operate with a basic level of trust in order to reach consensus. But in some contexts, like cryptocurrency, that trust does not exist at all. So, ingenious developers have figured out how to create that trust within the consensus process.
Going back to our metaphor, reaching consensus among untrustworthy nodes would be like choosing a place to eat with suspicious-looking strangers. That is going to take longer because it’s fraught with risk: everyone’s tastes and allergies are unknown; nobody knows exactly how the check will get paid; someone might leave before the check comes, steal the money everyone’s contributed, or even poison your food! Without mutual trust, and without a trusted authority to make unilateral decisions, it would be a real challenge to ensure that everyone has an equal opportunity to eat what they want and only pay their fair share.
Cryptocurrencies face that problem because a large part of their purpose is to counteract the presumed untrustworthiness of actors on each side of a transaction, in the absence of a regulating authority. So, they depend on consensus algorithms that actually incorporate human incentives and behavior into the consensus process through “proof of” some kind of investment or commitment to the process, such as proof of work or proof of stake. As a consequence, their consensus processes are doing more work, and they take significantly more time and resources. Bitcoin, for example, adds a new block of transactions roughly every ten minutes, which works out to only a handful of transactions per second. For those who believe deeply in the potential of crypto, that is a small price to pay for enabling anonymous nodes to engage in trustworthy interactions without depending on a governing authority.
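The “proof of” idea is easiest to see in proof of work, where a node has to spend computing effort before its proposal is accepted. The sketch below is a toy version, assuming a made-up transaction string and a difficulty expressed as a number of leading zero hex digits. Real cryptocurrencies hash structured block headers against a numeric target, so treat this as an illustration of the principle only.

```python
import hashlib

def proof_of_work(data: str, difficulty: int) -> int:
    """Find the smallest nonce whose SHA-256 digest starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce
        nonce += 1

def verify(data: str, nonce: int, difficulty: int) -> bool:
    """Checking a claimed nonce takes one hash, however long the search took."""
    digest = hashlib.sha256(f"{data}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)

# Hypothetical transaction text; difficulty 4 needs about 65,000 hashes on average.
nonce = proof_of_work("alice pays bob 5", difficulty=4)
print(verify("alice pays bob 5", nonce, difficulty=4))  # True
```

The asymmetry is the point: finding the nonce is expensive, while verifying it is cheap. That cost is what makes cheating unattractive, and it is also why this style of consensus is slow.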
Consensus with Trusted Nodes
That said, most of the time we can depend on a governing authority, and in most computer science situations the actors themselves are not anonymous, so they can be trusted. Often all of the nodes are owned by the same entity, or shared across entities that trust each other’s intentions and have legal recourse if that trust is broken. In this case, the consensus algorithm has an easier job: we are back to choosing a restaurant among friends. It’s not risk-free, and the process does take a non-zero amount of time. However, as long as the setting is conducive to reaching agreement, the friends usually agree quickly without any issue and move on with their lives. Consensus in the broader computer science and IT world executes at rates measurable in thousands and tens of thousands of transactions per second.
In general, though, we can’t assume the setting is conducive to reaching a trustworthy agreement. Let’s say multiple friends are texting each other to decide, and one of them has her phone put away because she’s driving, while another has poor reception and is experiencing dropped messages. A bad outcome might result, not because the actors were untrustworthy but because the environment was not conducive to a fair and quick process.
In this context, we still need an algorithm to ensure that the correct truth is recorded and acted upon. The major consensus protocols that accomplish this are called Paxos and Raft. These protocols help a group of computers (nodes) to quickly agree on a single data value or action, even if some of the nodes are unreliable or the network is experiencing issues. They include checks and mechanisms to determine the best path forward assuming an untrustworthy environment.
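One core mechanism shared by Paxos and Raft is the majority quorum: a value counts as agreed once more than half of the nodes have acknowledged it, so a few silent or unreachable nodes cannot block progress. The sketch below shows only that counting rule. The `try_commit` function, the node names, and the log are hypothetical, and everything else a real protocol needs (leader election, terms, retries, log repair) is left out.

```python
def try_commit(entry: str, follower_acks: dict[str, bool], log: list[str]) -> bool:
    """Append `entry` to the log only if a majority of the cluster acknowledged it.

    `follower_acks` maps each follower to whether its acknowledgement arrived.
    The leader counts as one acknowledgement for its own proposal.
    """
    cluster_size = 1 + len(follower_acks)
    quorum = cluster_size // 2 + 1
    acks = 1 + sum(follower_acks.values())  # True counts as 1, False as 0
    if acks >= quorum:
        log.append(entry)
        return True
    return False

log: list[str] = []

# Five-node cluster: two followers are unreachable, but 3 of 5 is still a majority.
ok = try_commit("x = 1", {"n2": True, "n3": True, "n4": False, "n5": False}, log)

# Three followers unreachable: only 2 of 5 acknowledged, so nothing is committed.
blocked = try_commit("x = 2", {"n2": True, "n3": False, "n4": False, "n5": False}, log)

print(ok, blocked, log)  # True False ['x = 1']
```

In the texting metaphor, the group can still pick a restaurant while one friend is driving and another has bad reception, as long as most of the group has confirmed.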
Consensus with a Trusted Environment
It’s often totally fine to have friends texting each other across unreliable network connections when the process is casual and low-stakes, like choosing a restaurant. That’s what you get when you follow the instructions for a standard Kubernetes or Kafka deployment. But for high-throughput, mission-critical, enterprise-grade distributed systems, that’s a recipe for disaster. Consensus is a bottleneck: if it executes slowly, or encounters disruptive circumstances that force it to keep starting over, the delay cascades through the rest of a tightly interconnected system.
What’s needed in those important enterprise use cases is expert attention to the consensus process and what that process needs in order to thrive. In short, once isolated into a safe, predictable, trustworthy environment, the consensus algorithm runs much more smoothly. It can even be programmed to forgo various checks and processes that are no longer necessary. This makes a huge difference in how distributed systems perform for both enterprise operations and customer-facing cloud products. Consensus in a trustworthy environment can execute at hundreds of thousands or even millions of times per second.
Senior distributed systems engineers and cloud architects know how to create this kind of environment for consensus in each distributed system they implement and maintain. But that’s an expensive, time-consuming, labor-intensive, and ongoing process that takes away from revenue-generating innovation. And the fact is, most of those talented engineers work at the ten or twenty leading cloud companies. They’re simply not available to design and maintain enterprise systems in other sectors.
Cachai offers the solution to that problem: a drop-in consensus environment that quickly and easily provides the capability for distributed systems to reach enterprise scale and performance without specialized talent or reliance on the public cloud. If your enterprise is struggling to accomplish an IT transformation with existing talent, Cachai could be the tool your folks need to achieve their vision.