A Guide To Ethereum Sharding

This post is part 3 of our Ethereum series. In case you missed the earlier posts, please go through Ethereum Roadmap Explained and How Could Plasma Help Ethereum Scale Up?.

The Ethereum community is undergoing an overhaul. While its focus remains on decentralization and security, it is keen to scale the Ethereum network up to practical levels, where all DApps have smooth sailing. Announced by Ethereum’s co-founder Vitalik Buterin in April 2018, Sharding is a method of increasing the number of transactions that a blockchain can process, which helps keep gas prices in check and reduces transaction confirmation times. Let’s explore in detail what Sharding is, when it is being implemented, and how Ethereum Sharding works.

What is Sharding?

Sharding is not a new concept. It existed even before blockchain came into existence and can be applied to any database. As SearchCloudComputing defines it: Sharding is a type of horizontal database partitioning that separates very large databases into smaller, faster, more easily managed parts called data shards. The word shard means a small part of a whole.

Yes, it’s horizontal partitioning and not vertical partitioning. Let’s pick an example to understand it better:

TABLE 1

Name   Age  Occupation  City
Alex   32   Employed    London
Appy   32   Freelancer  Singapore
Chole  43   Retired     Berlin
Ajay   43   Business    Chennai

If we partition the table vertically [splitting it into two], the result is

Name   Age
Alex   32
Appy   32
Chole  43
Ajay   43

Occupation  City
Employed    London
Freelancer  Singapore
Retired     Berlin
Business    Chennai

And in case we split horizontally

Name   Age  Occupation  City
Alex   32   Employed    London
Appy   32   Freelancer  Singapore

Name   Age  Occupation  City
Chole  43   Retired     Berlin
Ajay   43   Business    Chennai

So splitting the table vertically divides the columns: each new table holds only some attributes of every row, and in practice both need a shared key so the rows can be joined back together, which takes extra space. Splitting horizontally divides the rows: each new table keeps all the columns and simply holds a group of similar rows.

Well, that looks like a simple partitioning of a database, doesn’t it? Now imagine this partitioning happening on a decentralized, peer-to-peer network that gets constant updates globally. Hence, Ethereum Sharding differs a bit. Let’s dive deep to know more.
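The two ways of splitting Table 1 can be sketched in a few lines of Python. This is only an illustration: the function names split_vertically and split_horizontally are made up for this post, and grouping rows by Age is an assumption chosen to reproduce the tables above.

```python
# Table 1 from the text, as a list of rows.
TABLE_1 = [
    {"Name": "Alex", "Age": 32, "Occupation": "Employed", "City": "London"},
    {"Name": "Appy", "Age": 32, "Occupation": "Freelancer", "City": "Singapore"},
    {"Name": "Chole", "Age": 43, "Occupation": "Retired", "City": "Berlin"},
    {"Name": "Ajay", "Age": 43, "Occupation": "Business", "City": "Chennai"},
]


def split_vertically(rows, left_columns):
    """Split by columns: every row shows up in both parts, with different fields."""
    left = [{k: v for k, v in row.items() if k in left_columns} for row in rows]
    right = [{k: v for k, v in row.items() if k not in left_columns} for row in rows]
    return left, right


def split_horizontally(rows, key):
    """Split by rows: rows sharing the same value for `key` land in the same part."""
    parts = {}
    for row in rows:
        parts.setdefault(row[key], []).append(row)
    return parts


name_age, occupation_city = split_vertically(TABLE_1, {"Name", "Age"})
by_age = split_horizontally(TABLE_1, "Age")
```

Here name_age and occupation_city each still contain four rows, while by_age holds two smaller tables that keep every column.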

What is Ethereum Sharding?

As Vitalik Buterin himself puts it:

“Imagine that Ethereum has been split into thousands of islands. Each island can do its own thing. Each of the islands has its own unique features and everyone belonging on that island, i.e., the accounts, can interact with each other AND they can freely indulge in all its features. If they want to contact other islands, they will have to use some sort of protocol.”

A blockchain is a distributed, decentralized peer-to-peer network made up of nodes. At any given point, each node of a blockchain network like Bitcoin or Ethereum stores the entire state of the network and processes all of the transactions. This provides a high level of security but creates real scaling issues.

Because nodes on such a network store and process every piece of information and every transaction, the network gets strained. With shards in place, one could group sets of nodes into shards, so that each group processes only the transactions of its specific shard, and does so faster. Basically, you are allowing the system to process multiple transactions in parallel, thereby increasing throughput.

A layman’s example could be a “SCHOOL,” where all students [as nodes] are under the same network [ABC School], yet they are governed by the section and class they belong to, each of which has its own set of rules and protocols.

Technically, Ethereum Sharding would split the state [a set of information that represents the “current state” of a system] and the history [an ordered list of all transactions that have taken place since genesis] up into

K = O(n/c) partitions

referred to as shards, where

K = number of shards,
n = size of the ecosystem [in abstract form], assuming that transaction load, state size, and the market cap of a cryptocurrency are all proportional to n,
c = size of the computational resources available to each node, and
O = Big O notation, which characterizes functions according to their growth rates.
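To get a feel for the formula, here is a tiny worked example in Python. It is a sketch only: Big O hides a constant factor, which is assumed to be 1 here, and the numbers for n and c are invented units rather than real network figures.

```python
import math


def shard_count(n, c):
    """Illustrative K for K = O(n / c), taking the hidden constant factor as 1."""
    if n <= 0 or c <= 0:
        raise ValueError("n and c must be positive")
    return math.ceil(n / c)


# Hypothetical units: an ecosystem of size 1000 and nodes that can each handle 10.
k = shard_count(1000, 10)
```

The point is the growth rate: if the ecosystem doubles in size while each node stays the same, the number of shards needed roughly doubles too.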

As an example, a sharding scheme on the Ethereum network could be:

Shard One: all addresses starting with 0x00
Shard Two: all addresses starting with 0x01
Shard Three: all addresses starting with 0x02

Please note that each shard would have its own transaction history and also the effect of transactions in a shard ‘k’ would be limited to the state of shard ‘k.’
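The prefix-based scheme above can be sketched as a simple lookup. This is an assumption-laden illustration, not how any Ethereum client does it: shard_for_address and same_shard are hypothetical helpers, and shard index 0 corresponds to “Shard One” in the list above.

```python
def shard_for_address(address):
    """Map a hex address to a shard index using its first byte (0x00 -> 0, 0x01 -> 1, ...)."""
    hex_digits = address.lower()
    if hex_digits.startswith("0x"):
        hex_digits = hex_digits[2:]
    if len(hex_digits) < 2:
        raise ValueError("address needs at least one full byte")
    return int(hex_digits[:2], 16)


def same_shard(sender, recipient):
    """True when a transfer would touch only one shard's state."""
    return shard_for_address(sender) == shard_for_address(recipient)
```

A transfer between two addresses for which same_shard returns False would need some form of cross-shard communication.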

For example, if you have a multi-asset blockchain, the sharding scheme could be one where each shard stores the balances and processes the transactions associated with one specific asset. A more complicated scheme could add the ability of one shard to trigger events on another shard, referred to as cross-shard communication capability.

How Would Sharding Work in Ethereum?

Ethereum Sharding

Before diving into “How” let’s get familiar with the terminologies –

Image courtesy: Medium @icebearhww

State

The complete set of information, i.e., current balances, nonces, smart contract code, transactions, etc., that describes a system at any given point in time. Each transaction could bring the network into a new state.

Transaction

A set of instructions/actions issued by the individual to change the state of the system.

Merkle Tree

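A data structure that organizes large data sets via cryptographic hashes: each leaf is the hash of a data item, each parent is the hash of its children, and the single hash at the top [the Merkle root] summarizes the whole set.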
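To make the idea concrete, here is a minimal sketch of computing a Merkle root in Python. It is a simplification under stated assumptions: it builds a plain binary tree with SHA-256 and duplicates the last hash when a level has an odd number of entries, which is one common convention. Ethereum itself uses Keccak-256 and Patricia-Merkle tries, so this shows the concept only.

```python
import hashlib


def sha256(data):
    """Hash raw bytes with SHA-256."""
    return hashlib.sha256(data).digest()


def merkle_root(items):
    """Return the Merkle root of a non-empty list of byte strings."""
    if not items:
        raise ValueError("need at least one item")
    level = [sha256(item) for item in items]  # leaves: hash of each data item
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed convention: duplicate the last hash
        # each parent is the hash of its two children, concatenated
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]
```

Changing any single item changes the root, which is why a short root hash is enough to commit to a large set of data.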

Receipt

A result/by-product of a transaction that is kept in a Merkle tree but not stored in the state of the system. For example, smart contract logs in Ethereum are kept as receipts in Merkle trees.

Collation

A shard-specific block. Just imagine that the transactions are wrapped in a “collation,” similar to a block.

Proposers

The nodes that accept blobs on shard k and assemble them into proposals. Depending on the protocol, proposers either choose which shard k to work on or are randomly assigned some k.

Collation Header

A collation has a collation header, a short message of the form “This is a collation of blobs on shard k, the parent collation is 0x7f1e74, and the Merkle root of the blobs is 0x3f98ea.”
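As a rough sketch, the three pieces of information in that message could be modelled like this. The class and field names (CollationHeader, shard_id, parent_hash, blob_root) are assumptions made for illustration and are not taken from any sharding specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollationHeader:
    """Hypothetical model of the short message carried by a collation header."""
    shard_id: int      # which shard the collation belongs to
    parent_hash: str   # hash of the parent collation
    blob_root: str     # Merkle root of the blobs in the collation

    def describe(self):
        return (
            f"This is a collation of blobs on shard {self.shard_id}, "
            f"the parent collation is {self.parent_hash}, "
            f"and the Merkle root of the blobs is {self.blob_root}."
        )


header = CollationHeader(shard_id=3, parent_hash="0x7f1e74", blob_root="0x3f98ea")
```

The header stays small no matter how many blobs the collation contains, because the blobs are represented only by their Merkle root.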

Prolators

Agents that act as both a proposer and a collator may be referred to as prolators.

Main Chain

The main chain still exists, with its role limited to storing collation headers for all shards.

For a complete list of terminology, please refer here.

Sharding Roadmap

Sharding would partition all network computational resources into shards, so that a node doesn’t have to process every transaction in the history of the blockchain, in order to make a new transaction or otherwise participate in securing and using Ethereum; instead a node can just participate in a single shard, or more if it so chooses.

Here are the different phases through which Sharding would be implemented on the Ethereum network. However, this is still under research and subject to change.

  1. Phase 1: Basic sharding without EVM [Ethereum Virtual Machine] – A smart contract called the Validator Manager Contract will exist on the main chain. It manages how data and transactions in shards are accepted as valid by the main chain. Notaries vote on the validity and data availability of collations (collections of data and transactions, analogous to blocks, but occurring more frequently than blocks) in shards, and proposers propose blobs (analogous to transactions, but without execution in phase 1), which are collected into collations.
  2. Phase 2: EVM state transition function – This phase would have
    1. Full nodes only, i.e., nodes that fully download every collation of every shard, as well as the main chain, fully verifying everything.
    2. Asynchronous cross-contract calls only
    3. Account abstraction – In Ethereum there are two types of accounts: external accounts [controlled by private keys] and contract accounts [controlled by code deployed on the blockchain]. Account abstraction would make the two account types similar and modify the logic controlling external accounts to make them more flexible. So with account abstraction, all accounts are contracts. External accounts become forwarding contracts: when they receive an incoming message, they perform some checks [signature and nonce checks] and forward it to the intended recipient with the specified data.
    4. Ethereum flavored WebAssembly (eWASM) – WebAssembly (or Wasm as a contraction) is a new, portable, size- and load-time-efficient format. It aims to execute at native speed by taking advantage of common hardware capabilities available on a wide range of platforms, and is currently being designed as an open standard by a W3C Community Group. eWASM is a restricted subset of Wasm to be used for contracts in Ethereum. For more details about the project, please click here
    5. Archive accumulators – as quoted on ethresear.ch: Ethereum currently uses an accumulator (the Patricia-Merkle trie) which is designed for the state. There’s an alternative accumulator design which is an excellent match with the stateless client model, but it works for history only. By separating history and state, and encouraging the use of history versus state, we can make the stateless client model more practical and scalable than initially thought.
    6. Storage rent – An idea floated by Vitalik Buterin, whereby users would pay to use the network based on how long they’d like their data to remain accessible on the blockchain. Adding such “rent fees” may help move data off the blockchain, which in turn may help with load balancing.
    7. Bandwidth fees – Within a blockchain network, bandwidth fees aim to establish fair pricing for both providers and consumers of blockchain resources. In Ethereum transactions, three base resources have been identified: computation, network, and storage. Amongst these, computation currently enjoys the most robust such pricing mechanism, with a two-sided gas market between miners and clients and relatively trustless derivative instruments enabling more advanced forms of speculation and risk. With the bandwidth fees idea, the community is seeking to incentivize the p2p network layer.
  3. Phase 3: Light client state protocol
  4. Phase 4: Cross-shard transactions
  5. Phase 5: Tight coupling with the main chain security
  6. Phase 6: Super-quadratic or exponential sharding

Also, note that what follows explains only Phase 1 of quadratic sharding. Phases 2, 3, 4 and 5 and super-quadratic sharding [Ethereum 3.0] are currently out of scope.

Quadratic Sharding Phase 1

When Sharding is implemented on the Ethereum network, it would divide the nodes into smaller subsets, each able to process a particular set of transactions and to update the global state of the network. Ethereum breaks the network down into specific shards. Each shard is assigned a specific group of transactions, determined by grouping specific accounts (including smart contracts) into a shard. Each transaction group has a header and a body that consist of the following.


Header: the shard ID of the transaction group.
State Root: the Merkle root of the shard’s state before and after the transactions are added.
Body: all of the transactions that belong to the transaction group that is part of the specific shard.
Transaction Group Root: the Merkle root of all of the transaction groups from the specific shards for that block of transactions.

Each block on Ethereum contains both a state root and the transaction group root.

So how does a global state update happen with shards in place?

A transaction that occurs within a specific shard, between accounts native to that shard, is verified, and the state of the network changes: storage, account balances, etc. are updated. Next comes the task of verifying whether the transaction group is valid. For this, the pre-state root of the transaction group must match the shard’s root in the global state. Once the match is confirmed, the transaction group is validated, and the global state is updated by setting the state root stored under that shard ID.
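The check described above can be sketched in Python as follows. This is a toy model under stated assumptions: the global state is represented as a plain dictionary from shard ID to state root, and the field names shard_id, pre_state_root and post_state_root are hypothetical.

```python
def apply_transaction_group(global_state, group):
    """Accept a transaction group only if its pre-state root matches the global state.

    global_state: dict mapping shard ID -> current state root of that shard.
    group: dict with 'shard_id', 'pre_state_root' and 'post_state_root' keys.
    Returns True and updates global_state on success, otherwise returns False.
    """
    shard_id = group["shard_id"]
    if global_state.get(shard_id) != group["pre_state_root"]:
        return False  # the group was built on a stale or unknown shard state
    global_state[shard_id] = group["post_state_root"]
    return True
```

Only the entry for the affected shard changes, so groups belonging to different shards do not get in each other’s way.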

And where does VMC fit in?

On the main blockchain, we publish a contract called the Validator Manager Contract [VMC], which has an internal PoS [Proof of Stake] consensus. The VMC also keeps a record of the shards. During each cycle, the VMC picks a random validator that has the right to create the next block on each shard.
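A minimal sketch of such a per-cycle pick is shown below. It is not the VMC’s actual logic: pick_validator is a hypothetical helper, the seed is assumed to come from some shared source of randomness, SHA-256 stands in for whatever hash the contract would use, and stake weighting is ignored.

```python
import hashlib


def pick_validator(validators, shard_id, cycle, seed):
    """Pseudo-randomly pick one validator for a shard in a given cycle.

    Every node that knows the same seed computes the same answer,
    so the choice can be checked by anyone.
    """
    if not validators:
        raise ValueError("validator set is empty")
    digest = hashlib.sha256(f"{seed}:{shard_id}:{cycle}".encode()).digest()
    index = int.from_bytes(digest, "big") % len(validators)
    return validators[index]
```

Because the shard ID and the cycle number both feed into the hash, each shard gets its own pick, and the pick is recomputed every cycle.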

Each shard has a certain number of blocks and transactions that would NOT BE replicated to the main chain but would be replicated one level down. At each shard level, we have blocks referred to as collations, i.e., groups of transactions, each with a collation header that would be pushed into the VMC. But all the actual transactions happening in the shards, the shard states, and the shard collations would go off-chain.

So the main chain would only have the collation headers and the VMC, which needs to keep track of these PoS-signed block headers to track the state roots of each shard. The other validators of the shard also have to watch the VMC all the time to get the latest status and then verify that the transactions are valid.

In this whole process, the VMC acts as a light client for each shard.

Ethereum Sharding Implementation – Future, timelines

As mentioned by Justin Drake, Casper would be launched first, i.e., in 2019, and Sharding Phase one [Data] and Phase two [Layer] would come in 2020 and 2021 respectively.

Meanwhile, the team behind Status, a mobile DApp browser on the Ethereum network, has been building its own sharding client, Nimbus, to address scalability concerns with the blockchain.

Another platform, Zilliqa, has also implemented Sharding by utilizing transaction and computational sharding.

References

Sharding FAQ, Medium@icebearhww, Sharding doc, Blockgeeks, Sharding Roadmap, infographic, prysmatic-labs, trustnodes, Hiswai.