What is Blockchain State?
State is the data a blockchain tracks about every account: the accounts themselves, their balances, and contract code. Every transaction changes some part of that state.
For example, if person A transfers tokens to person B, the balances of both A and B need to be updated. This is what it means for the state to change.
Components of State
- Accounts: Addresses and their associated data
- Balances: Token holdings for each account
- Nonces: Transaction counters preventing replay attacks
- Contract bytecode: The compiled smart contract code
- Contract storage: Variables and data stored by contracts
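As a rough illustration of the components listed above, the sketch below models state as a mapping from address to an account record and applies the A-to-B transfer from the earlier example. The `Account` fields and the `apply_transfer` helper are hypothetical names chosen for this example, not any chain's actual data layout.

```python
from dataclasses import dataclass, field

@dataclass
class Account:
    balance: int = 0                 # token holdings
    nonce: int = 0                   # number of transactions sent by this account
    code: bytes = b""                # contract bytecode (empty for plain accounts)
    storage: dict = field(default_factory=dict)  # contract storage slots

def apply_transfer(state, sender, recipient, amount, nonce):
    """Apply one transfer in place; reject replays and overdrafts."""
    src = state[sender]
    if nonce != src.nonce:
        raise ValueError("bad nonce (possible replay)")
    if amount > src.balance:
        raise ValueError("insufficient balance")
    dst = state.setdefault(recipient, Account())  # a new account grows the state
    src.balance -= amount
    src.nonce += 1
    dst.balance += amount

state = {"A": Account(balance=100)}
apply_transfer(state, "A", "B", 30, nonce=0)
```

After the call, both balances have changed, A's nonce has advanced, and the state holds one more account than before. Resubmitting the same transaction with `nonce=0` is rejected.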
Key Insight
Even transactions that only modify existing state, without creating new accounts, leave a record in the blockchain's history. Because this "historical state" must also be kept, every on-chain transaction contributes to state growth.
A Helpful Analogy
Back in 2010, when Facebook was growing rapidly, it stored over 260 billion images, amounting to over 20 petabytes of data, and continued adding 60 TB of data every week. To manage this explosive growth, Facebook built Haystack - a system that minimized metadata requirements and allowed lookups to occur directly in main memory.
Blockchain state management methods follow a similar philosophy: minimize what needs to be stored, and optimize access patterns for frequently-used data.
The Problems of State Growth
As long as a blockchain is used - with accounts transacting and contracts being created - the state will continue to grow. This creates several problems:
1. Increased Node Operation Costs
Full nodes must store the entire blockchain state. As state grows, storage costs increase and hardware requirements escalate. This makes running nodes more expensive, which could lead to centralization as fewer people can afford to participate.
2. Decreased Blockchain Performance
A larger state means more time for nodes to process and verify transactions. Whenever state-changing transactions occur, nodes need to read and update relevant values. As state grows, there's more data to access and more values to change, ultimately resulting in slower performance.
3. Node Synchronization Issues
New nodes must download the entire ledger to participate in the network. Chains take "snapshots" of the record at specific points, which new nodes use to synchronize. If the state is too large:
- Taking snapshots takes longer
- During snapshot creation, new transactions keep adding data
- This discrepancy makes synchronization difficult
- Nodes that fall behind face significant time and cost to catch up
State Bloat
The problem of the state becoming too large is called state bloat. If transaction throughput increases without database improvements, state bloats further, preventing the benefits of higher throughput from being realized.
Heaviest Contributors to Ethereum State
Data from Paradigm shows that ERC-20 and ERC-721 tokens are the heaviest contributors to Ethereum's state. Each token contract stores balances for potentially millions of holders, creating enormous state footprints.
Fast Chains & Accelerated State Growth
Fast blockchains face a unique challenge: the faster you process transactions, the faster state grows. A chain like Sei that processes more transactions in a given period will see its state grow much more rapidly than that of a slower chain.
The Parallel Execution Paradox
Adding parallel execution makes this worse. If you execute transactions in parallel without database improvements, the state bloats even faster, causing the problems mentioned above. These issues ultimately prevent the benefits of parallel execution from being realized.
This is why high-performance chains like Monad, Sei, and Fuel have invested heavily in custom database solutions - they recognized this challenge from the beginning.
Ethereum: Verkle Trees & Statelessness
Ethereum uses Merkle Patricia Tries (MPT) to store data such as accounts, smart contracts, transactions, and receipts. The structure resembles an inverted tree, with a single root at the top and branches leading down to the leaves.
The Witness Problem
MPTs store large amounts of data efficiently and can produce a proof (called a "witness") that a given piece of data belongs to the tree. However, as the tree grows, witness sizes grow too. Current witness sizes can range from 18 to 47 MB in worst-case scenarios.
Why does this matter? A witness needs to be transferred between validators fast enough to be received and processed within the block time (12 seconds). Larger witnesses slow down transfers and increase verification times.
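A quick back-of-the-envelope calculation shows why these sizes are a problem. The witness sizes and the 12-second block time come from the text above; the 25 Mbit/s link speed is purely an assumed figure for illustration, not a measured validator bandwidth.

```python
BLOCK_TIME_S = 12  # block time cited in the text

def transfer_seconds(witness_mb: float, bandwidth_mbps: float) -> float:
    """Time to send a witness: megabytes * 8 bits per byte / megabits per second."""
    return witness_mb * 8 / bandwidth_mbps

ASSUMED_BANDWIDTH_MBPS = 25  # hypothetical uplink, chosen only for this example

for size_mb in (18, 47):  # worst-case range cited in the text
    t = transfer_seconds(size_mb, ASSUMED_BANDWIDTH_MBPS)
    print(f"{size_mb} MB -> {t:.1f} s ({t / BLOCK_TIME_S:.0%} of block time)")
```

Under this assumed link speed, the 47 MB case takes about 15 seconds to transfer, which already exceeds the block time before any verification work starts.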
Verkle Trees (EIP-6800)
Ethereum is working on Verkle trees as an alternative data structure. Verkle tree witnesses are significantly smaller because:
- Smaller hierarchy of intermediate nodes
- Reduced distance between leaf nodes and root node
- Pedersen commitments generate more compact proofs
These upgrades help manage state growth and enable faster, more cost-efficient state access.
Node Pruning
To avoid constantly running out of disk space, Ethereum clients like Geth enable node pruning. This uses a snapshot of the state to decide which parts are stale and prunes them to make the database more compact.
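The general idea can be sketched as a mark-and-sweep pass: start from the state roots that are still needed, mark everything reachable from them, and delete the rest. This is a simplified illustration with a toy node store, not Geth's actual pruning implementation; the `reachable` and `prune` helpers are hypothetical.

```python
def reachable(nodes, roots):
    """Return the hashes of all trie nodes reachable from the given state roots."""
    seen, stack = set(), list(roots)
    while stack:
        node_hash = stack.pop()
        if node_hash in seen or node_hash not in nodes:
            continue
        seen.add(node_hash)
        stack.extend(nodes[node_hash])   # children of this node
    return seen

def prune(nodes, live_roots):
    """Delete every node that no live root references; return how many were removed."""
    keep = reachable(nodes, live_roots)
    stale = [k for k in nodes if k not in keep]
    for k in stale:
        del nodes[k]
    return len(stale)

# Toy node store: hash -> list of child hashes. "root_old" was replaced by "root_new".
nodes = {
    "root_old": ["a", "b_old"],
    "root_new": ["a", "b_new"],
    "a": [], "b_old": [], "b_new": [],
}
removed = prune(nodes, live_roots=["root_new"])
```

Node `a` survives because the new root still references it, while `root_old` and `b_old` are only reachable from the superseded root and are removed.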
Statelessness Vision
The ultimate goal is "stateless clients" - nodes that don't need to store the entire state to validate blocks. Instead, each block would include the witnesses needed to verify it. This dramatically reduces node requirements.
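To make the witness idea concrete, here is a minimal sketch using a plain binary Merkle tree with SHA-256. Ethereum's MPT is a 16-way trie with its own node encoding and Verkle trees use a different commitment scheme, so this shows only the general principle: a verifier that holds nothing but the root can check a value using the sibling hashes along its path. All function names are illustrative.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(leaves):
    """Return all tree levels, from hashed leaves up to the root."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:               # duplicate the last node on odd levels
            level = level + [level[-1]]
            levels[-1] = level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def make_witness(levels, index):
    """Collect the sibling hash at every level on the path to the root."""
    witness = []
    for level in levels[:-1]:
        sibling = index ^ 1
        witness.append((level[sibling], sibling < index))
        index //= 2
    return witness

def verify(leaf, witness, root):
    """Stateless check: recompute the root from the leaf and its siblings."""
    node = h(leaf)
    for sibling, sibling_is_left in witness:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

leaves = [f"account-{i}".encode() for i in range(8)]
levels = build_levels(leaves)
root = levels[-1][0]
witness = make_witness(levels, 5)
is_valid = verify(leaves[5], witness, root)
```

With 8 leaves the witness holds 3 hashes, one per level. The witness grows with tree depth, which is why wider, shallower structures with compact proofs are attractive for stateless clients.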
Monad: MonadDB
Because Monad executes transactions in parallel, it requires a database that supports many simultaneous read and write operations. LevelDB and RocksDB, the databases commonly used by Ethereum clients, don't natively support asynchronous I/O.
The Problem with Traditional DBs
If Ethereum optimistically executed transactions in parallel, its synchronous database operations would be a bottleneck. Every read/write operation would block, negating the benefits of parallel execution.
MonadDB Solution
MonadDB is purpose-built for parallelized execution:
- Patricia Trie on disk and memory: More efficient updates and verification than Ethereum's MPT
- Async I/O via io_uring: A Linux kernel interface (available since kernel 5.1) that enables non-blocking operations
- Reduced kernel overhead: Traditional databases rely on blocking system calls and on kernel-managed threads, memory, and synchronization. Parallel execution would multiply that overhead; io_uring avoids much of it.
Why io_uring Matters
When RocksDB performs read/write operations, it issues blocking system calls, and the kernel must manage the threads and memory behind them. Executing transactions in parallel would multiply those calls and cause CPU contention. io_uring lets an application submit many read/write operations at once and collect the results as they complete, without this overhead.
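The difference between waiting on each read and keeping many reads in flight can be simulated in a few lines. Python's `asyncio` is used here purely as a stand-in for the concept; it is not io_uring, and `FakeDisk` is a hypothetical device with an artificial latency, not MonadDB or RocksDB.

```python
import asyncio

class FakeDisk:
    """Stand-in for a storage device; tracks how many reads are in flight at once."""
    def __init__(self):
        self.in_flight = 0
        self.max_in_flight = 0

    async def read(self, key):
        self.in_flight += 1
        self.max_in_flight = max(self.max_in_flight, self.in_flight)
        await asyncio.sleep(0.01)        # simulated device latency
        self.in_flight -= 1
        return f"value-of-{key}"

async def read_one_at_a_time(disk, keys):
    # Each read waits for the previous one, as a blocking call would.
    return [await disk.read(k) for k in keys]

async def read_concurrently(disk, keys):
    # All reads are submitted up front and complete as the "device" finishes them.
    return await asyncio.gather(*(disk.read(k) for k in keys))

keys = [f"acct{i}" for i in range(8)]
serial_disk, async_disk = FakeDisk(), FakeDisk()
serial_values = asyncio.run(read_one_at_a_time(serial_disk, keys))
async_values = asyncio.run(read_concurrently(async_disk, keys))
```

Both approaches return the same values, but the serial version never has more than one read outstanding, while the concurrent version keeps all eight in flight and so pays the device latency roughly once instead of eight times.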
Sei: SeiDB & Modular Storage
SeiDB takes a modular approach to state storage, dividing it into two layers optimized for different access patterns.
Dual-Layer Architecture
State Commitment (SC) Layer
Manages active or "warm" state that is frequently accessed. It is stored in a memory-mapped IAVL tree (MemIAVL) to optimize access, enabling faster reads and writes.
State Storage (SS) Layer
Stores historic or "cold" state data in DBs such as PebbleDB, RocksDB, or SQLite. Validators can choose based on their requirements.
Asynchronous Pruning
Sei asynchronously prunes state data, removing stale information without blocking transaction processing. This keeps the active state lean while maintaining historical data availability.
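A toy model of this split might look like the following. It is a sketch of the general idea only: `TwoLayerStore` and its methods are hypothetical, plain dictionaries stand in for MemIAVL and the on-disk databases, and `prune` is an ordinary method here, whereas a real node would run it in the background.

```python
class TwoLayerStore:
    """Toy split between a latest-state layer and a versioned history layer."""

    def __init__(self):
        self.commitment = {}   # SC-like layer: key -> latest value only
        self.history = {}      # SS-like layer: key -> {height: value}

    def write(self, height, key, value):
        self.commitment[key] = value
        self.history.setdefault(key, {})[height] = value

    def get_latest(self, key):
        return self.commitment.get(key)

    def get_at(self, key, height):
        """Return the value as of `height` (the newest write at or below it)."""
        versions = self.history.get(key, {})
        eligible = [v for v in versions if v <= height]
        return versions[max(eligible)] if eligible else None

    def prune(self, keep_from):
        """Drop versions older than `keep_from`, except the newest of them,
        which is still the correct answer for queries at `keep_from`."""
        for versions in self.history.values():
            older = sorted(v for v in versions if v < keep_from)
            for v in older[:-1]:
                del versions[v]

store = TwoLayerStore()
store.write(1, "bal/alice", 100)
store.write(2, "bal/alice", 70)
store.write(3, "bal/alice", 55)
```

Reads of current state touch only the small `commitment` map, while historical queries such as `store.get_at("bal/alice", 2)` go to the history layer, which pruning can shrink without affecting the latest values.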
Fuel: State Rehydration & Predicates
Fuel has the most innovative state growth management methods. Unlike Ethereum, Monad, and Solana, which use an account-based model, Fuel uses the UTXO model (like Bitcoin).
UTXO Advantage
Unlike accounts that hold balances and internal contract logic, UTXOs are independently trackable units of state. This simplifies the data structure, focusing on lean data that minimizes state growth.
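The following sketch shows the basic shape of a UTXO set: each unspent output is its own entry, spending removes entries, and new outputs are added as independent entries. It is a generic illustration of the UTXO model, not Fuel's actual transaction format; `apply_tx` and the tuple layout are assumptions made for the example.

```python
def apply_tx(utxos, inputs, outputs, txid):
    """Spend `inputs` (keys into the UTXO set) and create `outputs` as (owner, amount) pairs."""
    if len(set(inputs)) != len(inputs):
        raise ValueError("duplicate input")
    missing = [i for i in inputs if i not in utxos]
    if missing:
        raise ValueError(f"unknown or already spent: {missing}")
    spent = sum(utxos[i][1] for i in inputs)
    created = sum(amount for _, amount in outputs)
    if created > spent:
        raise ValueError("outputs exceed inputs")
    for i in inputs:                       # spent outputs leave the set entirely
        del utxos[i]
    for n, out in enumerate(outputs):      # new outputs are independent entries
        utxos[(txid, n)] = out

# UTXO set: (txid, output index) -> (owner, amount)
utxos = {("genesis", 0): ("alice", 100)}
apply_tx(utxos, inputs=[("genesis", 0)],
         outputs=[("bob", 30), ("alice", 70)], txid="tx1")
```

Because a spent output is deleted rather than updated, trying to spend `("genesis", 0)` a second time fails immediately, and no per-account balance or nonce needs to be maintained.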
Three Primary Methods
1. Native Token Standards
Ethereum uses token standards like ERC-20 implemented as layered smart contracts. Fuel integrates assets directly into the core protocol as native elements.
- Eliminates additional state footprint from external contracts
- Transferring an asset affects only one database key-value pair
- Eliminates state changes from approval/transferFrom functions
2. State Rehydration
Instead of storing the entire state on-chain, developers can decompose smart contract states into smaller segments and store minimal records or root hashes.
- Each smart contract relies on localized state trees
- State elements are "rehydrated" from external sources when needed
- More efficient than storing everything on-chain
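The pattern described above can be sketched as follows: the contract keeps only a hash per state segment, and anyone who later supplies the segment must match that hash. This is a simplified illustration, not Fuel's API; the `Contract` class, the JSON encoding, and the use of a single SHA-256 hash in place of a Merkle root are all assumptions made for the example.

```python
import hashlib
import json

def commit(segment: dict) -> str:
    """Deterministic hash of a state segment (sorted keys give a stable encoding)."""
    encoded = json.dumps(segment, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()

class Contract:
    """Keeps only one hash per segment on-chain; the data itself lives elsewhere."""
    def __init__(self):
        self.roots = {}                      # segment name -> committed hash

    def store(self, name, segment):
        self.roots[name] = commit(segment)

    def rehydrate(self, name, supplied_segment):
        """Accept externally supplied data only if it matches the stored hash."""
        if commit(supplied_segment) != self.roots[name]:
            raise ValueError("supplied data does not match on-chain commitment")
        return supplied_segment

contract = Contract()
orders = {"order-1": 250, "order-2": 90}
contract.store("orders", orders)
restored = contract.rehydrate("orders", {"order-2": 90, "order-1": 250})
```

The on-chain footprint is a fixed-size hash regardless of how large the segment is, and tampered data is rejected because its hash no longer matches.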
3. Predicates & Scripts
Transaction authorization and execution use stateless mechanisms:
- Predicates: Authorize on-chain actions without accessing global state
- Scripts: On-chain logic embedded within transactions, discarded after execution
- Instead of storing code on-chain, a hash generates the address
- Full bytecode included in transactions, rehydrating state as needed
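A toy version of the predicate flow is sketched below. The address is the hash of the code, the spender supplies the code inside the transaction, and the check reads nothing from global state. Real Fuel predicates are FuelVM bytecode with their own address derivation; the SHA-256 hashing and the one-instruction `EQ:` format here are invented purely for illustration.

```python
import hashlib

def predicate_address(bytecode: bytes) -> str:
    """The address is derived from the code, so the code itself need not be stored."""
    return hashlib.sha256(bytecode).hexdigest()

def run_predicate(bytecode: bytes, witness: bytes) -> bool:
    """Toy evaluator: the 'bytecode' is b'EQ:' + expected, true iff witness matches."""
    op, _, expected = bytecode.partition(b":")
    return op == b"EQ" and witness == expected

def can_spend(owner_address: str, bytecode: bytes, witness: bytes) -> bool:
    # The transaction carries the full bytecode; it must hash to the owner
    # address before it is evaluated, and nothing is read from global state.
    if predicate_address(bytecode) != owner_address:
        return False
    return run_predicate(bytecode, witness)

code = b"EQ:open-sesame"
address = predicate_address(code)       # coins are locked to this address
allowed = can_spend(address, code, b"open-sesame")
```

Supplying different bytecode fails the hash check, and supplying the right bytecode with the wrong witness fails evaluation, so the chain only ever needs to remember the address.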
Fuel's Philosophy
Fuel's approach is fundamentally different: minimize what must be stored permanently, and provide mechanisms to reconstruct state when needed. This trades some complexity for dramatically reduced state growth.
Economic Approaches to State Management
Beyond technical solutions, some blockchains use economic mechanisms to incentivize optimal state management. The core idea: charge users for storage rather than placing the cost on validators and future users.
State Rent Concept
State rent charges users for storage while they transact. This:
- Discourages unnecessary state creation
- Pushes developers to use state efficiently
- Encourages cleanup of unused accounts/data
- Shifts storage costs to those who benefit from storage
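In its simplest form, the mechanism amounts to a periodic charge proportional to the bytes an account occupies, with eviction when the charge can no longer be paid. The sketch below is a generic model under that assumption; the `charge_rent` function, the rate, and the account layout are hypothetical and do not reflect any specific chain's rules.

```python
def charge_rent(accounts, rate_per_byte, epochs=1):
    """Deduct rent from every account and evict those that can no longer pay.

    `accounts` maps name -> {"balance": int, "data": bytes}.
    Returns the names of evicted accounts.
    """
    evicted = []
    for name, acct in list(accounts.items()):
        due = len(acct["data"]) * rate_per_byte * epochs
        if acct["balance"] < due:
            evicted.append(name)
            del accounts[name]          # state is reclaimed
        else:
            acct["balance"] -= due
    return evicted

accounts = {
    "active":    {"balance": 1_000, "data": b"x" * 100},
    "abandoned": {"balance": 50,    "data": b"x" * 100},
}
evicted = charge_rent(accounts, rate_per_byte=1)
```

The funded account pays 100 units and stays, while the underfunded one is removed and its storage reclaimed.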
UX Challenges
Pure state rent models have UX issues. Solana initially used state rent, under which an account's memory was evicted once its balance reached zero. The resulting complexity and user confusion led Solana to move away from this model.
Solana's State Compression
After moving away from pure state rent, Solana has reintroduced state management through several innovative approaches.
Lightweight Simple Rent (LSR)
LSR implements a bonding curve for rent rates where the rent price increases as state size approaches hardware limits.
- Discourages state bloat through economics
- Pushes developers to use state efficiently
- Encourages discarding unused accounts
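The text does not specify the shape of the curve, so the sketch below uses an assumed one, `base_rate / (1 - utilization)`, only to show the qualitative behavior: nearly flat while state is small, then rising steeply as usage approaches the limit. The function name, the formula, and the numbers are all hypothetical.

```python
def rent_rate(state_bytes, limit_bytes, base_rate=1.0):
    """Hypothetical curve: base_rate / (1 - utilization).

    Flat while state is small, then rising steeply as usage nears the limit.
    """
    if not 0 <= state_bytes < limit_bytes:
        raise ValueError("state size must be within [0, limit)")
    utilization = state_bytes / limit_bytes
    return base_rate / (1.0 - utilization)

LIMIT = 1_000  # illustrative hardware limit, arbitrary units
rates = {used: rent_rate(used, LIMIT) for used in (0, 500, 900, 990)}
```

Under this assumed curve, the rate doubles at 50% utilization and is roughly ten times the base rate at 90%, so adding state becomes progressively more expensive as the limit nears.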
Hot Account Management
Frequently accessed "hot accounts" must burn a portion of their rent balance to remain in the cache. This ensures lamports (the smallest unit of SOL) are allocated to accounts that are actively used.
Chilly: Runtime Cache Management
Chilly implements a least recently used (LRU) cache for account data:
- Frequently accessed accounts remain in memory
- Less frequently used accounts move to disk
- Determines when accounts are "cold" and should leave memory
- Maintains optimal balance between RAM and disk usage
- Uses "load_limit" to prioritize transactions by memory usage
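The LRU behavior listed above can be sketched with an ordered dictionary. This is a generic LRU cache, not Solana's implementation; `AccountCache` is a hypothetical name, a plain dictionary stands in for disk, and the `load_limit` prioritization is not modeled.

```python
from collections import OrderedDict

class AccountCache:
    """LRU cache: hot accounts stay in memory, the least recently used spill to disk."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.memory = OrderedDict()   # ordered from least to most recently used
        self.disk = {}                # stand-in for slower storage

    def get(self, key):
        if key in self.memory:
            self.memory.move_to_end(key)        # mark as most recently used
            return self.memory[key]
        value = self.disk.pop(key)              # cold read: promote back to memory
        self.put(key, value)
        return value

    def put(self, key, value):
        self.disk.pop(key, None)                # drop any stale on-disk copy
        self.memory[key] = value
        self.memory.move_to_end(key)
        while len(self.memory) > self.capacity:
            cold_key, cold_value = self.memory.popitem(last=False)
            self.disk[cold_key] = cold_value    # evict the coldest account

cache = AccountCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")        # "a" is now hotter than "b"
cache.put("c", 3)     # evicts "b", the least recently used
```

Accessing `b` again would load it from disk and push out whichever account in memory has gone longest without being touched.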
State Compression (Avocado)
Solana's compression plan has two parts:
State Compression
Account data is compressed by replacing it with a hash. Anatoly Yakovenko (Solana co-founder) noted that over 75% of accounts haven't been accessed in six months. Compressing them could reduce snapshot size by 50%.
State can be decompressed when required, similar to loading a program. Decompression costs the same as setting up a new account.
Index Compression
Uses a binary tree structure to store accounts, with incentives for validators to participate in state compression.
Sui & Aptos: Storage Funds
Sui and Aptos take similar approaches with upfront storage fees and rebate mechanisms.
Sui's Storage Fund
On Sui, users pay upfront fees for both computation and storage:
- Storage fees go into a storage fund
- SUI in the fund is used as stake, earning rewards
- Rewards split between current validators and reinvestment
- Future validators are rewarded for carrying past state weight
- Principal SUI remains intact
- Deleting data provides partial refund
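The fee-and-rebate part of this flow can be modeled in a few lines. The sketch leaves out staking rewards entirely, and the `StorageFund` class, the fee rate, and the 50% rebate are assumptions chosen for the example rather than Sui's actual parameters.

```python
class StorageFund:
    """Toy storage fund: fees are held as principal and partly refunded on deletion."""

    def __init__(self, fee_per_byte, rebate_fraction):
        self.fee_per_byte = fee_per_byte
        self.rebate_fraction = rebate_fraction   # assumed parameter, not Sui's actual rate
        self.principal = 0
        self.deposits = {}                       # object id -> fee paid

    def store(self, object_id, size_bytes):
        """Charge the upfront storage fee and add it to the fund."""
        fee = size_bytes * self.fee_per_byte
        self.deposits[object_id] = fee
        self.principal += fee
        return fee

    def delete(self, object_id):
        """Remove an object and return the partial refund owed to the user."""
        fee = self.deposits.pop(object_id)
        refund = int(fee * self.rebate_fraction)
        self.principal -= refund
        return refund

fund = StorageFund(fee_per_byte=10, rebate_fraction=0.5)
paid = fund.store("ticket-42", size_bytes=200)
refund = fund.delete("ticket-42")
```

The user gets part of the fee back for freeing the storage, which gives a direct incentive to delete data that is no longer needed, while the remainder stays in the fund.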
What Can Be Deleted?
Metadata and event data (such as auctions and tickets) can be deleted. Transaction history data remains intact.
Sui Pruning Policies
- Aggressive Pruning: Remove old data ASAP, minimal disk usage
- Epoch-based Pruning: Retain data for specified epochs before pruning
For pruned data, Sui provides fallback retrieval from a remote key-value store managed by Mysten Labs.
Aptos Storage Deposit
Similar to Sui, Aptos charges for storage separately from execution:
- Data can be deleted with a full refund (refund rates that decline over time are likely to be introduced)
- Exploring ephemeral storage with time-to-live (TTL)
- Resources automatically deleted after expiration
Jellyfish Merkle Tree (JMT)
Aptos uses JMT, a variant of the sparse Merkle tree optimized for parallel execution:
- Modified leaf node structures for lower I/O overhead
- Better data structure for computational efficiency
- Layered storage: warm state in performant memory, cold state in archive
Economic + Technical
The most effective approaches combine economic incentives (storage fees, deletion rebates) with technical optimizations (efficient trees, tiered storage). Neither alone fully solves state growth.