Challenging Merkle Trees
In previous articles I identified several challenges of current public crypto-networks, such as Bitcoin and Ethereum, that make them difficult to scale. These challenges include their two-tier de-centralized design, their leader-based consensus mechanism, their smart contract inefficiencies, their application model, and several other design choices.
In this article I address the method used by most current crypto-networks to store blockchain information. This includes all the transactions in the blockchain history and all other information that is stored on behalf of blockchain applications.
When storing blockchain information, the basic objectives are data persistence and integrity. We cannot achieve the further blockchain objectives of security and immutability if, as we are trying to retrieve the data, we find that our information was corrupted. That is, the data may have been lost, an error may have occurred, or if we may not retrieve exactly the same information we stored.
The challenges of information corruption within a particular computer system (or node of a crypto-network) can be categorized into intentional attacks or system errors. Merkle trees (or tries) are good at discovering information corruption, but not very good at maintaining data integrity. Database solutions are better at maintaining data integrity within a system.
In both cases, consensus protocols compensate for the possible shortcomings of one node by agreeing on what data is replicated on the blockchain, on all crypto-network nodes, providing us with ultimate data integrity: If one node has the wrong information, everyone can see it.
In current crypto-networks, blockchain information is organized in Merkle tree structures. This is an organization of data based on hashes. These structures are then stored on disk.
While the original Bitcoin code used simple files to store blocks (see figure below), current implementations eventually store blockchain information in a database.
Figure 1: A few instructions from the original Bitcoin code
In current crypto-networks, the solution of storing Merkle tree information into database is another factor contributing to lack of scalability, as it multiplies the number of I/O operations required to read and update blockchain information.
Reading a key (e.g. the balance of a user account) implies up to 64 I/O operations, in addition to the amplification introduced by the DB itself3.
For example, the state tree, one of the Ethereum Merkle structures holding user account data, contains millions of nodes, most of which take the form of a single hash referencing up to 16 other hashes.
This is a very inefficient way to store data, because nodes are alien to the structures used by databases. These nodes are random hashes referencing more random hashes. Database models cannot easily represent and optimize the handling of such structures.
Such I/O access happens in Ethereum two hundred times a second, when adding and modifying data in the Merkle tree. Ethereum supports less than 15 transactions per second. Scaling by a factor of one thousand would require about 200,000 access operations, or an unworkable 20 million database operations per second, in each blockchain device.
Why are Merkle trees used?
This is probably the most important question in our analysis.
The idea is that a client requesting information from the blockchain, can receive not only the information, but a proof that the information is correct. Since the root of the blockchain is public, anyone can access its hashes and verify the information by verifying the proof that came with it.
The Merkle structures used by current cryptonetworks try to solve two different problems: Securely storing data, and securely delivering a receipt. However, as we have noted, this is not done efficiently.
We know how to solve each problem separately: Communicating data securely can be done through encryption techniques. Verification of data can be done through data lookup in a trusted copy of the blockchain.
The design choice of a combined solution was, at least partially, due to the decentralized (and not completely distributed) design of current crypto-networks (See Figures 2 and 3).
Current crypto-networks assume that the information is used by a client device that is physically separated from the blockchain database (B).
Currently this is true, because the gate-keepers of the actual stored blockchain are intermediaries, such as miners and verifiers.
(Note that Figures 2 and 3 show only crypto-network nodes and not internet nodes).
When clients are separated from the blockchain content, a number of challenges arise:
- the identification of the client to the intermediary, and vice-versa,
- the secure communication of the means of verification, such as passwords, and identification information (i.e., Are we talking to the right network?),
- the user has to understand key management and accept the risk of possible key loss,
- the user keys may come under attack, as they need to be used for encrypting a message in a non-secure environment,
- the user has to have an account with some intermediary that can maintain a replica of the blockchain (miner, verifier, Exchange).
If the client is separated from the server, it cannot necessarily trust that its communication with the server is authentic, or that a transaction has been successfully completed, or that the result of the operation has been securely transmitted. Thus, the ability to check the received Merkle proof against the blockchain is indeed useful.
But now we realize that we need to add another challenge to the list: To provide a client with a proof that the requested data is correct, Merkle tree implementations introduce two orders of magnitude amplification of I/O operations (i.e., a 10,000% loss of efficiency).
How can this be acceptable? After so many years of database design, we know how to store and retrieve all types of information efficiently. Why design something that is specific to a particular ledger application? How expensive can it be to produce a receipt? Instead of tackling these questions directly, let’s step back and think.
A better mousetrap
The best mousetrap is often the simplest. Similarly architectural simplicity, however rare and difficult to achieve, often provides the best results.
New generation stochastic crypto-networks challenge the assumption of a necessary separation between a client and the blockchain data. In stochastic networks, each user device is a node of the network. Each device maintains a replica of the blockchain, participates to the verification of transactions and plays an active role in the consensus process.
When using this simpler architecture, the challenges listed above disappear. Those challenges were due to the a two-tier (decentralized) design of the crypto-network.
When each network node is a peer and has no need of intermediaries, then each node has the privilege and power of having the blockchain data (B) available in its own device.
Users are not clients any more.
They are the owners and the verifiers of the information.
Under this scenario, new-generation crypto-networks have no need for Merkle trees and their inefficiencies. Each user device owns the information already, and needs no further proof.
The secure communication, replication and storage of the information are assured by the crypto-network consensus and block equalization processes that maintain the blockchain on each device. User devices do not need to ask anyone else if a transaction has been completed successfully: they are in charge of block verification.
What devices need, is an efficient way to access their blockchain replica to add, update and retrieve information. Within one device, these are common database functions. Each device then can use the most efficient and reliable data storage techniques available.
A simpler approach to storing information will also simplify the retrieving and verification of blockchain information performed by exchanges and by blockchain applications, improving the overall efficiency of the system.
Designing a bank ledger
Bitcoin and Ethereum used the powerful blockchain concept in the design of a bank ledger.
Ethereum initially organized its Merkle Trees into three sets: to store transactions, accounts (or state tree), and receipts. We can see that their design was specific and limited to one type of application: a currency and/or tokens.
Ethereum went on to support smart contracts, and then applications based on smart contracts. For storing information for these applications, they again used a Merkle structure.
But there are requirements for supporting a much broader range of distributed applications.
The use of Merkle trees, a choice implied by other design decisions, does not even meet the requirements of first generation, decentralized, crypto-networks.
Designing with a vision of the future
The design of new-generation stochastic crypto-networks has a longer term vision: the support of any distributed application, exchanging any amount of data.
Any design should start by analyzing current and future requirements. Then the best structures to support the final objective can be selected. Supporting specific cases, such as a cryptocurrency application, is a secondary task.
When applied to everything that needs to be stored, as in Ethereum, Merkle trees become unbearably inefficient and are one of the factors contributing to first generation network’s lack of scalability.
Although new research indicates that Merkle tree performance can be improved, second generation crypto-networks have a better option: they will will be able to store blockchain information in the secure environment of each user device in more direct and efficient ways that do not require the use of Merkle trees.
 See my article at: https://medium.com/@gori70/blockchain-de-centralization-is-the-problem-79984bbf24f5
 Some Ethereum versions use the rocksDB, others use the Geth-LevelDB databases.
 RAJU, P., KADEKODI, R., CHIDAMBARAM, V., AND ABRAHAM, I. PebblesDB: Building Key-Value Stores Using Fragmented Log-Structured Merge Trees. In Proceedings of the 26th Symposium on Operating Systems Principles (New York, NY, USA, 2017), SOSP ’17, ACM, pp. 497–514.
 These nodes represent linked structures in programming terms, not nodes of a computer network.
 Accessing blockchain information is a separate problem from securely linking blocks through their hashes, or securely completing transactions through encryption techniques and digital signatures.