The Burden of Proof(s): Code Merkleization
As Ethereum inches closer to Serenity and the Eth1/Eth2 merge, the Stateless Ethereum initiative is gaining momentum. The ultimate goal of this ambitious project is to remove the requirement of an Ethereum node to keep a full copy of the updated state trie at all times. Instead, changes of state will rely on a much smaller piece of data that proves a particular transaction is making a valid change. This solves a major problem for Ethereum, one that has so far only been pushed further out by improved client software: state growth.
What is a Witness?
In the Stateless Ethereum paradigm, a witness is a Merkle proof that attests to a state change by providing all of the unchanged intermediate hashes required to arrive at a new valid state root. Witnesses are theoretically a lot smaller than the full Ethereum state, but they are still a lot larger than a block, which needs to propagate to the whole network in just a few seconds. Leaning out the size of witnesses is therefore paramount to getting Stateless Ethereum to minimum-viable-utility.
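To make the idea of "unchanged intermediate hashes" concrete, here is a minimal sketch of a Merkle proof. It is an illustration under simplifying assumptions, not Ethereum's actual scheme: it uses a binary tree and SHA-256 from Python's standard library, whereas Ethereum's state lives in a hexary Merkle-Patricia Trie hashed with Keccak-256. The function names are hypothetical.

```python
import hashlib


def _hash(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for Ethereum's Keccak-256.
    return hashlib.sha256(data).digest()


def _next_level(level):
    # Pair up nodes; an odd node out is paired with a copy of itself.
    if len(level) % 2:
        level = level + [level[-1]]
    return [_hash(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    """Root of a binary Merkle tree over a non-empty list of byte strings."""
    level = [_hash(leaf) for leaf in leaves]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves, index):
    """Sibling hashes needed to recompute the root from leaves[index]."""
    level = [_hash(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        proof.append(level[index ^ 1])  # the sibling at this level
        level = _next_level(level)
        index //= 2
    return proof


def verify_proof(leaf, index, proof, root):
    """Recompute the root from one leaf plus its sibling hashes."""
    node = _hash(leaf)
    for sibling in proof:
        node = _hash(node + sibling) if index % 2 == 0 else _hash(sibling + node)
        index //= 2
    return node == root


leaves = [b"account-a", b"account-b", b"account-c", b"account-d", b"account-e"]
root = merkle_root(leaves)
proof = merkle_proof(leaves, 2)
print(verify_proof(b"account-c", 2, proof, root))
```

The verifier never sees the other leaves, only one hash per tree level. That is why a witness can be far smaller than the state it attests to, and also why every extra hash in it counts.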
The Code's Witness
Smart contracts in Ethereum live in the same place that externally-owned accounts do: as leaf nodes in the enormous single-rooted state trie. Contracts are in many ways no different from the externally-owned accounts humans use. They have an address, can send message calls to other accounts, and hold a balance of Ether and any other token. But contract accounts are special because they also carry their own program logic (code): the account stores a hash of that code in its codeHash field. Another associated Merkle-Patricia Trie, called the storageTrie, keeps any variables or persistent state that an active contract uses to go about its business during execution.
Code Merkleization
Code merkleization aims to split up the giant chunk of code that represents a smart contract, and to replace the field codeHash in an Ethereum account with the root of another Merkle Trie, aptly named the codeTrie. Because codeHash commits to the bytecode as a single blob, a witness today has to carry a contract's entire code to prove any part of it. With a codeTrie, the witness needs only the chunks a transaction actually executes, plus the hashes that stand in for the rest.
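As a rough sketch of the idea, the snippet below splits bytecode into fixed-size chunks and commits to them with a single root. The 32-byte default chunk size, the binary tree shape, the use of SHA-256 and the function names are all assumptions made for illustration; they are not taken from any codeTrie specification.

```python
import hashlib


def chunk_code(code: bytes, chunk_size: int = 32):
    """Split bytecode into fixed-size chunks (the last one may be shorter)."""
    return [code[i:i + chunk_size] for i in range(0, len(code), chunk_size)]


def code_root(chunks):
    """Hypothetical codeTrie root: a binary Merkle root over the chunks."""
    level = [hashlib.sha256(chunk).digest() for chunk in chunks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # pair an odd node with itself
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def chunks_for(offsets, chunk_size: int = 32):
    """Indices of the chunks that contain the executed byte offsets."""
    return sorted({offset // chunk_size for offset in offsets})


code = bytes(range(256)) * 2            # 512 bytes of stand-in "bytecode"
chunks = chunk_code(code)
print(len(chunks), code_root(chunks).hex()[:16])
print(chunks_for([0, 5, 40, 500]))      # only these chunks go in the witness
```

A call that touches four byte offsets here needs three of the sixteen chunks; the remaining thirteen are represented in the witness by hashes alone.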
Worth its Weight in Hashes
Let's look at an example from this Ethereum Engineering Group video, which analyzes some methods of code chunking using an ERC20 token contract. Because bytecode is long and unruly, let's use a simple shorthand in which every four bytes of code (8 hexadecimal characters) are replaced by a single character: a dot (.) for bytecode that is not needed, or an X for bytecode required to execute a specific function (in the example, the ERC20.transfer() function is used throughout).
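The shorthand itself is easy to reproduce. The following sketch assumes we already know which byte offsets a call executed (for instance from an execution trace); the function name and the sample offsets are hypothetical and do not come from a real ERC20 contract.

```python
def shorthand(code: bytes, used_offsets, group: int = 4) -> str:
    """Render bytecode as one character per `group` bytes.

    'X' marks a group containing at least one executed byte,
    '.' marks a group that the call never touched.
    """
    used = set(used_offsets)
    marks = []
    for start in range(0, len(code), group):
        end = min(start + group, len(code))
        touched = any(offset in used for offset in range(start, end))
        marks.append("X" if touched else ".")
    return "".join(marks)


code = bytes(32)                      # 32 placeholder bytes, 8 groups
print(shorthand(code, {0, 1, 2, 3, 17, 30}))
```

Runs of dots are the code a merkleized witness can replace with hashes, while every X has to be shipped as real bytecode.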
Choosing the Right Chunk Size
The choice of chunk size is critical, because it determines how efficient the code merkleization scheme is. Smaller chunks mean a deeper tree, so more hashes are needed to stand in for the unused code. Larger chunks mean fewer hashes, but each chunk that is touched drags more unexecuted bytes into the witness as wasted space.
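The trade-off can be explored with a back-of-the-envelope estimate. The sketch below counts the bytes of a code witness as touched chunks plus the sibling hashes needed to recompute the root. It assumes a binary tree padded to a power of two and charges every chunk at full size; the function names and the sample numbers are hypothetical and are not the measurements discussed in this article.

```python
import math


def sibling_hash_count(num_chunks: int, touched) -> int:
    """Hashes a verifier needs besides the touched chunks themselves."""
    width = 1
    while width < num_chunks:
        width *= 2                      # assumption: pad to a power of two
    known = set(touched)
    count = 0
    while width > 1 and known:
        # A sibling must be supplied only if it cannot be computed
        # from chunks or hashes the verifier already has.
        count += sum(1 for index in known if (index ^ 1) not in known)
        known = {index // 2 for index in known}
        width //= 2
    return count


def witness_bytes(code_len, used_offsets, chunk_size, hash_size=32):
    """Estimated witness size: touched chunks plus sibling hashes."""
    num_chunks = math.ceil(code_len / chunk_size)
    touched = {offset // chunk_size for offset in used_offsets}
    hashes = sibling_hash_count(num_chunks, touched)
    return len(touched) * chunk_size + hashes * hash_size


used = list(range(0, 32)) + list(range(200, 216))   # made-up access pattern
for size in (16, 32, 64, 128):
    print(size, witness_bytes(1024, used, size), witness_bytes(1024, used, size, 8))
```

Changing the access pattern or the hash size shifts the balance between chunk bytes and hash bytes, which is why the question has to be settled with data from real contracts rather than a single example.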
The Data Speaks
Already we have some promising results, collected using a purpose-built tool developed by Horacio Mijail from Consensys' TeamX research team, which shows overheads as small as 25%, which is not bad at all! In short, the data shows that, by and large, smaller chunk sizes are more efficient than larger ones, especially if smaller hashes (8 bytes) are used.
Conclusion
Lightening the burden of proof(s) is critical to the Stateless Ethereum initiative, and code merkleization is a key part of that effort. By splitting a contract's bytecode into chunks and committing to them with a Merkle root, a witness can carry only the code a transaction touches instead of the whole contract. How well this works depends largely on the choice of chunk size. As we continue to collect data and refine our understanding of this complex problem, we will be better equipped to make informed decisions about the future of Stateless Ethereum.
Forward-Looking Thoughts
As we move forward with the Stateless Ethereum initiative, it's clear that code merkleization will play a critical role in achieving our goals. But it's not just about the technical details - it's about the real-world implications of this technology. As we continue to push the boundaries of what's possible with blockchain technology, we must also consider the social and economic implications of our work. By doing so, we can create a more equitable and just society, where everyone has access to the benefits of this technology.
Source: https://blog.ethereum.org/en/2020/11/30/the-1x-files-code-merkleization




