This is the first post in a series diving into the concepts, data, and actionable insights of decentralization transparency on the Ethereum 2.0 Beacon Chain.
In 1964, sensational stories broke about the murder of Catherine Genovese, a bar manager from Queens, and how 38 bystanders apparently watched and did nothing¹. Later reporting debunked the claim that every witness stood idle, but the shocking tragedy led social psychologists to research and quantify the diffusion of responsibility among crowds, later popularized as the ‘bystander effect’².
Adopting this narrative, there’s a fascinating talk, “bugs in the brain,” by the famed security researcher Peter Gutmann at Kiwicon 2007 (New Zealand’s hacker conference)³ on the psychology of how the bystander effect is alive and well in open-source development⁴. Decentralization transparency on public blockchains like Ethereum shoulders this narrative as well, as do its infrastructure providers. In fact, the integrity of the system depends on its service providers sharing the goal of decentralization transparency. You can follow the trail yourself to see why transparency is the source of actionable insight in public, open-source, decentralized systems. But, for argument’s sake, I’ll connect the dots here in a series that clarifies key concepts, identifies important data and metrics to track, and specifies an important initiative to address decentralization transparency on the Ethereum 2.0 Beacon Chain.
This specific post dives into key concepts that surround the narrative of decentralization transparency on the Ethereum 2.0 Beacon Chain and why it’s important to do something about it. Vanity metrics aside, objective transparency into the state of the system from security and open-source principles is paramount to a robust network. Subsequent posts will dive deeper into:
- Quantifiable metrics we should be tracking
- An initiative for easily revealing decentralization transparency through an open-source, independently hosted dashboard
The scalability trilemma has ushered forth a tremendous, collective effort from the Ethereum developer community to activate the Beacon Chain and kick off the rolling, phased Ethereum 2.0 migration. The march has progressed without any catastrophic hitches so far. Beyond the first slashing⁵, caused by redundant validator instances, we’ve seen only a handful of violations (11 thus far⁶) with an impressive ~77k validators (both active⁷ and pending⁸) contributing to the network. The launch of a system is never perfect, but robust validator infrastructure is always vital to the health of the network.
This series of posts attempts to track a sequence of steps that leads from questions on one concept to the next. “The First Slash, A Retrospective”⁵ presented us with an important question about the robustness and quality of infrastructure setups across the network. Validator service provider infrastructure quality was an interesting data point to identify as they can potentially have a non-trivial share of the network. This led to wondering how many validator service providers there are, how much of the Beacon Chain they actually maintain, and what practical actions can be taken to achieve reasonable decentralization. This inevitably leads to questions about decentralization and what decentralization rightfully means in this context. Finally, this brings us to specific actionable insights that would necessitate additional tools and dashboards to provide clear transparency into the decentralization of the Beacon Chain network.
Robust infrastructure is undeniably a key pillar in securing the progression of the Ethereum 2.0 network. Let’s zoom out and take a look at the overall landscape and paths worth investigating and dive into the concepts of what a validator on Ethereum 2.0 is, what decentralization transparency exactly is, the scalability trilemma, and the tragedy of the open-source commons with regards to the bystander effect.
Validator? Validator Service Provider? What’s the Difference?
For clarification and disambiguation: some of the general terms used to describe validators in Ethereum 2.0 carry slightly different connotations than the terms most people are familiar with from other Proof of Stake (PoS) networks. On most other PoS networks, specific entities can represent uncapped delegated amounts of a staked token and are primarily referred to as validators. On Ethereum 2.0’s Beacon Chain, each validator’s effective balance is capped at 32 ETH. This means that, though you can theoretically add an unlimited amount of ETH to your validator, the amount “in play” for rewards and penalties is capped at an effective 32 ETH⁹. Therefore, by tracking particular Beacon Chain metrics, we can measure the total number of validators, generally represented by deposits in 32 ETH increments. This effective 32 ETH cap provides a lower barrier to entry for participants than the initially suggested 1,500 ETH minimum¹⁰, a choice made in favor of increased decentralization. This is great. Decentralization is a key component of the scalability trilemma the Ethereum 2.0 development roadmap is looking to solve. But are all of these validators independent entities? No, and there were never claims to that end. However, as most things go, narratives can have a mind of their own.
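To make the cap concrete, here is a minimal sketch of how an effective balance is derived from an actual balance, using the phase 0 spec constants (balances are denominated in Gwei). This is simplified: the live protocol also applies hysteresis when updating effective balances, which is omitted here.

```python
# Phase 0 spec constants, in Gwei (1 ETH = 10**9 Gwei).
MAX_EFFECTIVE_BALANCE = 32 * 10**9       # 32 ETH cap
EFFECTIVE_BALANCE_INCREMENT = 10**9      # 1 ETH increments

def effective_balance(balance_gwei: int) -> int:
    """Return the balance 'in play' for rewards and penalties:
    rounded down to a 1 ETH increment and capped at 32 ETH."""
    return min(
        balance_gwei - balance_gwei % EFFECTIVE_BALANCE_INCREMENT,
        MAX_EFFECTIVE_BALANCE,
    )

# A validator holding 40 ETH still earns on an effective 32 ETH:
print(effective_balance(40 * 10**9) // 10**9)  # 32
```

This is why depositing more than 32 ETH to a single validator is wasted: the surplus earns nothing, so large stakers instead spin up many 32 ETH validators.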
What do we mean by that? As some have noted¹¹, clarifying “what is what” is quite important for understanding how total numbers are interpreted in the context of the decentralization narrative, specifically against the existing naming conventions the PoS ecosystem has widely adopted.
Generally, the term “validator” has referred to service providers like us (stakefish) that stake uncapped amounts on behalf of a PoS network’s stakers. In the Ethereum 2.0 context, stakefish is indeed a validator, but it also provides non-custodial services that allow Ethereum participants with 32 ETH (or more) to spin up a validator on stakefish’s infrastructure while still retaining their withdrawal keys. Do we count all validators spun up via a service provider like stakefish as the collective sum of its parts or as distinctly separate entities?
Here’s a quick graphic that roughly breaks down and disambiguates the players/entities:
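One common heuristic for attributing the 32 ETH validator slots to entities is to group validator public keys by the Eth1 address that funded their deposits (an approach block explorers use; it is only a heuristic, since providers can use many addresses). The addresses and pubkeys below are hypothetical placeholders, purely for illustration:

```python
from collections import defaultdict

# Hypothetical deposit records: (eth1 depositor address, validator pubkey).
deposits = [
    ("0xaaa...", "0xpubkey1"),
    ("0xaaa...", "0xpubkey2"),  # same depositor -> likely same entity
    ("0xbbb...", "0xpubkey3"),
]

def validators_by_depositor(records):
    """Group validator pubkeys by the address that deposited for them."""
    groups = defaultdict(set)
    for depositor, pubkey in records:
        groups[depositor].add(pubkey)
    return groups

for addr, keys in sorted(validators_by_depositor(deposits).items()):
    print(addr, len(keys))
```

Under this grouping, the network’s ~77k validators collapse into a much smaller set of distinct depositing entities, which is the number that actually matters for the decentralization question raised above.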
Decentralization. Transparency. Decentralization Transparency?
Let’s take a step back and remind ourselves: why? Why is decentralization important? Why is transparency important? Why is decentralization transparency important to the network’s security and robustness? The answers should clarify important milestones and serve as guideposts on our development path through this hopefully (not so dark¹²) forest of Ethereum 2.0’s upgrade cycle.
In part 3 of the ConsenSys Research Interoperability Series, “The Importance of Decentralization”¹³, we are presented with a great framework for quantifying and measuring decentralization across 4 categories, broken down into 19 subcategories. I highly recommend the research and analysis there for insights into categories that include protocol-level decentralization, node counts, ecosystem metrics, and token impacts. The report concludes that the continued development, adoption, and value accrual in Ethereum has contributed significantly to decentralization over time.
This continued growth and upgrade schedule for Ethereum 2.0 is pushing forward a solution to the infamous scalability trilemma¹⁴ (decentralization, scalability, security). Many years in the making, careful and well-thought-out design choices¹⁵ have brought us to Beacon Chain activation and progress towards subsequent phases like sharding¹⁶ and the docking¹⁷ of mainnet into the Beacon Chain. Solutions to the scalability trilemma are thus non-trivial, and of its three properties, scalability has drawn attempts of all forms and in many flavors.
However, decentralization and security are two key properties that I believe are fundamentally and ideologically inherent to the robustness of trust and transparency in Ethereum 2.0. From a transparency perspective, the network may represent somewhat of an adversarial environment, as many developers and active DeFi traders experience daily in the privacy-scarce “Dark Forest” of Ethereum.
Transparency, or the fear of it in this supposedly adversarial environment, often gets conflated with safety and security principles. It is important not to equate privacy, or the lack of transparency, with security; the two are related but distinct. Here, transparency over privacy is paramount to identifying the actual state of decentralization.
Unlike other PoS chains that provide built-in¹⁸ validator data labeling, Ethereum 2.0 does not have dedicated on-chain properties for deeper validator identification. As clarified above, each Ethereum 2.0 validator, practically speaking, stands as a 32 ETH denominated participant. These participants can range from individuals at home with their own self-hosted machines to large scale service providers with significant resources and sophisticated setups.
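To see what “no dedicated on-chain identification” looks like in practice, consider the shape of a validator record as served by the standard Beacon Node HTTP API (e.g. `GET /eth/v1/beacon/states/head/validators/{validator_id}`). The sample below is a hand-written, abridged response (the real record also carries fields like `withdrawal_credentials` and activation epochs); the point is that nothing in it names an operator:

```python
import json

# Representative, abridged response shaped like the standard Beacon
# Node API's validator endpoint. Values here are made up.
sample = json.loads("""
{
  "data": {
    "index": "12345",
    "balance": "32001234567",
    "status": "active_ongoing",
    "validator": {
      "pubkey": "0xb89b...",
      "effective_balance": "32000000000",
      "slashed": false
    }
  }
}
""")

fields = set(sample["data"]["validator"])
# No operator name, label, or IP anywhere in the record:
print(sorted(fields))  # ['effective_balance', 'pubkey', 'slashed']
```

Any mapping from pubkeys to real-world operators therefore has to come from off-chain sources, which is exactly the transparency gap this series is concerned with.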
A purposeful omission of deeper validator identification can provide some semblance of anonymity. This may be important for smaller, independent parties because it limits the surface area for attack vectors. Many independent validator participants won’t necessarily have the hands-on resources to readily respond to attacks such as distributed denial-of-service (DDoS) or any more sophisticated scheme.
However, this is a matter that must be addressed by well-resourced validator service providers. Security through obscurity (and privacy) may be a well-intentioned first line of defense, but from a robust security engineering perspective, a reliance on security through obscurity has been widely rejected.
Borrowing from the field of cryptography, Kerckhoffs’s principle¹⁹ is an important and fundamental look into security design (and, to an extent, a strong proponent of transparency over certain privacies). The principle states:
A cryptosystem should be secure even if everything about the system, except the key, is public knowledge.
Ross Anderson²⁰, the well-known security researcher, puts Kerckhoffs’s principle another way:
The security of a system should depend on its key, not on its design remaining obscure.
Claude Shannon²¹ simply puts it as:
The enemy knows the system
and that “one ought to design systems under the assumption that the enemy will immediately gain full familiarity with them”. This should be particularly meaningful given we are operating on public and (expectedly) transparent systems.
Privacy affords security through obscurity at the expense of decentralization transparency. From a security engineering perspective, we’ve seen that security through obscurity presents a flimsy case against exposing certain information, in this case information about Ethereum 2.0 validator service providers. Insight into validator service provider information benefits the system’s transparency into its state of decentralization. The security arguments for upholding privacy at all costs can therefore be relegated in favor of transparency, particularly for large service providers, while privacy is perhaps applied sparingly for individual participants.

The dangers of exposure for independent participants are also not novel. Plenty of independently run validators have maintained robust uptime on other Proof of Stake networks with no catastrophic issues from exposure and transparency. The solutions to these attack vectors are well documented and, in certain cases, recommended by specific Proof of Stake networks. Cosmos Hub’s Sentry Node²² architecture illustrates one mitigation tactic, since validator nodes have fixed IP addresses with open ports exposed to the internet; there are also other common solutions that involve flexibility, scalability, and routing²³.
Finally, it is important to note that the goal of decentralization transparency is not a complete eschewal of privacy. As we’ve seen, privacy can have its place as a limited and superficial stopgap to protect independent validators. Relying on that measure alone, however, fails to robustly secure a system or infrastructure setup. It has also been relatively trivial to target and expose²⁴ independent validators, as Jonny Rhea demonstrated²⁵ in June 2020 on the Witti testnet, and even more so through the customized deposit contracts many validator service providers utilize. Admittedly, this example of validator snooping on a testnet paints a significantly scaled-down picture, but it nonetheless shows how a motivated actor was able to upend the privacy of such validators.
In a recent post on secure Ethereum 2.0 infrastructure setups, the Blockdaemon team does a fantastic job of thoroughly describing the issues with privacy guarantees. They demonstrate specifically that “it is relatively trivial to correlate validators to the IP address of the beacon node they attach to”.
Although the validators themselves should not be exposed to the internet, their beacon nodes are, and with this knowledge, it becomes feasible to selectively DoS the beacon node that owns a validator that’s about to propose a new block. This means a malicious attacker can prevent your validator from doing its duties by overloading the beacon node it’s attached to for about 12 seconds every 6 minutes when your validator is due to make an attestation or block proposal. — Blockdaemon²⁶
Ultimately, we see that privacy can afford certain measures of security, but it lacks appropriate robustness when compared against the security engineering principles of cryptosystems. What we want from a security perspective is rather the resilience that comes from a sufficiently large and decentralized validator set.
Moving past the layer of privacy and external exposure risks towards security: the well-established understanding of security on Proof of Stake networks relies on the size and decentralization of the validator set. A diverse set of validator pools provides more guarantees of decentralization and (hopefully) more collusion resistance²⁷. The design rationale for lowering the barriers to validator entry is thus important for contributing to decentralization and the robustness guarantees gained from it. Validating on the Ethereum 2.0 Beacon Chain literally allows for a “typical consumer laptop … to process/validate … shards (including any system level validation such as the beacon chain)”, meaning “no participant will be required to have big-iron” in order to be a full participant of the system²⁷.
Given the security provisions and importance of decentralization, we should also be able to verify it. Decentralization transparency is paramount in ensuring the robustness and security of the Ethereum 2.0 Beacon Chain and beyond.
The Tragedy of the Open Source Commons: The Bystander Effect and Decentralization Transparency.
Sovereignty necessitates understanding. If you don’t understand a system you’re using, you don’t control it. If nobody understands the system, the system is in control. — Philip Monk, Precepts²⁸
Decentralization transparency is only valuable if you can understand its state. Making source code public and open-source doesn’t necessarily mean it will be robust and audited⁴. We’ve seen plenty of examples in the open-source community that relied on these assumptions (PGP’s xorbytes bug²⁹, GPG’s memorial xorbytes bug³⁰, the Linux /dev/random flaw³¹).
These are just a few examples of how fundamental open-source systems can themselves be a source of failure and exposure. The promise that ‘many eyes make bugs shallow’ did not hold up here. As Peter Gutmann put it, “it only works if there are people actually interested in finding the bugs and with security code [unfortunately] the bystander effect [still] applies — everyone assumes that someone is looking at it”⁴. The burden falls on no one and yet everyone.
In his 2007 Kiwicon talk, Gutmann argues that the bystander effect in open source essentially boils down to the ‘bugs in the brain’. Human minds work differently. Developer minds work differently and thus make particular assumptions about their users, usually internal biases about responsibility and expectations. If you use a particular piece of open-source software, check for yourself that it does what it says it does. It’s open source, so rationally, you should check it. Never mind the legal shrinkwrap agreements that users skim over; check the code for yourself. Sound familiar?
What to do.
In public, transparent, and (expectedly) decentralized systems, we have popularized mottos that ask developers and users alike to shoulder the responsibility and burden of securing themselves. Self-sovereignty is an incredibly powerful and vital property of these decentralized systems. Yet how often do we practice what we preach? Independently, we each shoulder the responsibility of verification ourselves. As an ecosystem supporting the system, the burden falls on everyone, yet also on no one.
Ethereum operates on a fully auditable, public chain. The transparency this affords is powerful, but it must also be fully utilized to be meaningful. If no one cares to verify it themselves, specifically in the absence of some overarching regulatory power enforcing verification for the public good, then it may just as well be private. The lack of understanding and transparency into a public, open-source system could even be considered dangerous (dark forest territory). Open source gives some semblance of plausible deniability to the designers and developers who build public, decentralized applications. The liability shifts to the users, but oftentimes those users do not have the ability or technical acumen to assess for themselves. As the motto goes: “don’t trust, verify”. But what if they cannot verify?
Decentralization transparency must be made easy to verify and understandable to developers and users alike. There needs to be a way to provide actionable visibility into the system’s state. Transparency builds trust and confidence in a system, and that trust can be earned through credibly neutral clarity into it.
An open-source, independently-hosted, and credibly neutral dashboard for clear transparency into Eth2 Beacon Chain decentralization should exist. Hopefully, this can ease the burden of the commons by providing more avenues of clarity and transparency for those who cannot do so themselves.
In a following post, I will be going into the types of data and analytics that can clearly provide understandable insight into Beacon Chain data. In a final post, I will lead into the initiative of creating this dashboard as well. Stay tuned!
stakefish is the leading validator for Proof of Stake blockchains. With support for 10+ networks, our mission is to secure and contribute to this exciting new ecosystem while enabling our users to stake with confidence. Because our nodes and our team are globally distributed, we are able to maintain 24-hour coverage.