If the math doesn't render on the website, read this
Hi there, it’s Agnish again. After racking my brain over A LOT of topics, and even more GitHub issues, I am finally able to narrow down what I REALLY want to contribute to and work on for this EPF Cohort.
So, this week I have dived deep into Verkle trees and how the roadmap for Stateless Ethereum is being planned. This is an overall update on whatever I’ve learnt in this domain over the past week.
I started off with Vitalik’s first Verkle Tree post, where he repeatedly refers to the original Verkle Tree paper by Kuszmaul from 2018. He also discusses the merits of VKTs over Merkle trees, the most important one being the much smaller proof size: a Verkle proof only needs to include the index of the particular child and its path leading towards the root, rather than all the sibling hashes at every level. So here’s a basic comparison table stating the details of PROOF SIZES for both MKTs and VKTs:
| Trie | Contents of each proof | Levels | Proof size | Cryptographic requirements |
| --- | --- | --- | --- | --- |
| Merkle Patricia Trie | leaf data + 15 siblings of 32 bytes each, per level | ~7 | ~3.5 MB for 1,000 leaves | Collision-resistant hash functions |
| Verkle Trie | leaf data + polynomial commitment (32 bytes) + value (32 bytes) + index of child (1 byte) per level + small constant-size proof data | ~4 | ~150 KB for 1,000 leaves | Earlier (for the Eth1 state proposal): KZG polynomial commitments over the BLS12_381 curve; now: Pedersen commitments over the Bandersnatch curve |
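Just to sanity-check those numbers, here’s a rough back-of-envelope calculation in Python. The per-level and per-leaf byte counts are my own approximations lifted from the table above, not exact protocol constants:

```python
# Rough back-of-envelope check of the proof sizes quoted in the table above.
# The per-level / per-leaf figures are approximations, not exact protocol constants.

LEAVES = 1_000

# Merkle Patricia Trie: each proof carries ~15 sibling hashes of 32 bytes per level
mpt_levels = 7
mpt_bytes_per_leaf = mpt_levels * 15 * 32          # ~3,360 bytes per leaf
mpt_total = mpt_bytes_per_leaf * LEAVES            # ~3.4 MB for 1,000 leaves

# Verkle Trie: a 32-byte commitment and a 1-byte child index per level, a 32-byte value,
# plus a small constant-size opening proof (ignored here)
vkt_levels = 4
vkt_bytes_per_leaf = vkt_levels * (32 + 1) + 32    # ~164 bytes per leaf
vkt_total = vkt_bytes_per_leaf * LEAVES            # ~164 KB for 1,000 leaves

print(f"MPT proof size: {mpt_total / 1_000_000:.1f} MB")   # ~3.4 MB
print(f"VKT proof size: {vkt_total / 1_000:.0f} KB")       # ~164 KB
```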
In real-world scenarios, we can use a polynomial commitment as a vector commitment to generate the proof for the root of a Verkle Trie. Polynomial commitments let us hash a polynomial and make a proof for its evaluation at any point; we usually call that evaluation proof an opening.
First, we need to agree upon a standardized set of coordinates, let us call them $[c_1,c_2,c_3,…,c_n]$. Given a list of final points $[y_1,y_2,y_3,…,y_n]$, we can commit to the polynomial $P$ such that $P(c_i) = y_i$ for all $i ∈ [1…n]$. We can find this particular polynomial using Lagrange’s interpolation method.
This method is used to interpolate a given dataset and find the lowest possible degree polynomial that passes through the points of the dataset.
For values $[y_0,y_1,…,y_k]$ given at distinct coordinates $[x_0,x_1,…,x_k]$, the Lagrange interpolation is the linear combination
$$P(x) = \sum_{j=0}^{k} y_j\,\ell_j(x)$$
where $\ell_j(x)$ is the Lagrange basis polynomial
$$ \ell_j(x) = \prod_{\substack{0 \le m \le k \\ m \neq j}}\frac{x - x_m}{x_j - x_m}. $$
To understand this better, you can play around with this Desmos graphing calculator.
Here, I have plotted 4 arbitrary points, $a,b,c,d$, along with the Lagrange basis for point $a$.
This was the given polynomial, with $L_a$:
And hence, the curve would look like this:
This Lagrange Interpolation method can be used for computing polynomial commitments for both KZG and Bulletproof-style commitments (IPA).
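Before moving on to KZG, here’s a minimal Python sketch of the interpolation formula above. The four points are arbitrary made-up values (like the $a,b,c,d$ in the plot), just to show that the interpolant really passes through the data:

```python
# Minimal Lagrange interpolation: evaluate the unique lowest-degree polynomial
# passing through the given points at an arbitrary x.

def lagrange_eval(points, x):
    """points: list of (x_j, y_j) pairs with distinct x_j; returns P(x)."""
    total = 0.0
    for j, (xj, yj) in enumerate(points):
        # l_j(x) = product over all m != j of (x - x_m) / (x_j - x_m)
        lj = 1.0
        for m, (xm, _) in enumerate(points):
            if m != j:
                lj *= (x - xm) / (xj - xm)
        total += yj * lj
    return total

# Four arbitrary points, analogous to a, b, c, d in the plot above
pts = [(0, 1), (1, 3), (2, 2), (4, 5)]
print(lagrange_eval(pts, 1))    # 3.0 -- the interpolant passes through (1, 3)
print(lagrange_eval(pts, 3))    # value of the degree-3 interpolant between the data points
```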
VKTs for Eth1 were largely based on KZG polynomial commitment schemes. Initially, in a Merkle tree of depth $d$, we could compute a commitment to the vector of elements in the Merkle tree, which is $a_0,a_1,…,a_{2^d-1}$.
By the definition of polynomial commitments, we define a polynomial $p(X)$ of degree $n$ as the function:
$$p(X) = \sum_{i=0}^{n}p_iX^i$$
where $p_i$ are the coefficients of the corresponding polynomial.
Also, here the degree of the polynomial is tied to the depth of the tree: $n = 2^d - 1$. By assigning $a_i = p_i$, the prover wants to show the verifier that $p(z) = y$ for some $z$. The prover can do this naively by sending the verifier all the $p_i$, and the verifier can then check that indeed $p(z) = y$.
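Here’s a tiny sketch of that naive scheme, where the prover simply reveals all the coefficients and the verifier re-evaluates $p(z)$ itself; the toy prime field and the values of $d$ and $z$ below are for illustration only:

```python
# Naive version of the scheme above: the prover reveals all coefficients p_i
# and the verifier recomputes p(z) directly. A small toy prime field is used here.

FIELD_MODULUS = 2**31 - 1          # toy prime, NOT the real BLS12_381 scalar field

def eval_poly(coeffs, z, modulus=FIELD_MODULUS):
    """Evaluate p(z) = sum_i coeffs[i] * z**i using Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * z + c) % modulus
    return acc

d = 3                              # tree depth
a = list(range(1, 2**d + 1))       # vector a_0 ... a_{2^d - 1} (toy data)
p = a                              # assign a_i = p_i, so deg(p) = 2^d - 1 = 7

z = 12345
y = eval_poly(p, z)                # prover's claim: p(z) = y

# Verifier, given all the p_i, checks the claim by recomputing p(z):
assert eval_poly(p, z) == y
```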
Let $\mathbb{G_1}$ and $\mathbb{G_2}$ be two elliptic curve groups with a pairing $e : \mathbb{G_1} \times \mathbb{G_2} \to \mathbb{G_T}$. Let $p$ be the order of $\mathbb{G_1}$ and $\mathbb{G_2}$, and let $G$ and $H$ be generators of $\mathbb{G_1}$ and $\mathbb{G_2}$ respectively. We can represent that in the following ways:
$$[x]_1 = xG ∈ \mathbb{G_1}$$
and
$$[x]_2 = xH ∈ \mathbb{G_2}$$
for any $x ∈ \mathbb{F}_p$.
Now, let us assume that we have a trusted setup, so that for some secret $s$, the elements $[s^i]_1$ and $[s^i]_2$ are available to both the prover and the verifier for all $i = 0,…,n-1$.
Now, because of the hardness of the elliptic-curve discrete logarithm problem, it’s impossible to find out what $s$ actually is from the trusted-setup group elements. It’s a number in the finite field, but the prover cannot actually recover the original number; they can only perform a certain set of group operations with the setup elements.
Hence, given a polynomial $p(X) = \sum_{i=0}^{n} p_iX^i$, the prover can compute the following:
$$[p(s)]_1 = \left[\sum_{i=0}^{n}p_is^i\right]_1 = \sum_{i=0}^{n}p_i[s^i]_1$$
Hence the commitment can be represented as a single group element.
In this way, everyone can evaluate the polynomial at $s$ without ever knowing what $s$ is; they only get an elliptic curve point such that $[p(s)]_1 = p(s)G$.
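Here’s a rough sketch of that computation in Python, assuming the py_ecc library for the BLS12_381 group operations (G1, add, multiply, curve_order). The “trusted setup” below is generated locally with a known $s$ purely for illustration, which is exactly what a real ceremony must prevent:

```python
# Toy KZG-style commitment: commit to p(X) as [p(s)]_1 = sum_i p_i * [s^i]_1.
# In reality the powers [s^i]_1 come from a trusted setup ceremony and nobody knows s.

from py_ecc.bls12_381 import G1, add, multiply, curve_order

# --- toy "trusted setup" (for illustration only: here we obviously know s!) ---
s = 123456789 % curve_order
n = 8
setup_g1 = [multiply(G1, pow(s, i, curve_order)) for i in range(n)]   # the points [s^i]_1

# --- prover commits to a polynomial using only the setup points ---
p_coeffs = [5, 3, 7, 1, 4, 3, 2, 9]                                   # coefficients p_i

commitment = None
for p_i, s_i_g1 in zip(p_coeffs, setup_g1):
    term = multiply(s_i_g1, p_i % curve_order)                        # p_i * [s^i]_1
    commitment = term if commitment is None else add(commitment, term)

# Sanity check (only possible because this toy setup leaked s):
p_at_s = sum(p_i * pow(s, i, curve_order) for i, p_i in enumerate(p_coeffs)) % curve_order
assert commitment == multiply(G1, p_at_s)                             # [p(s)]_1 = p(s)·G
```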
Pedersen commitments, which the current design uses instead, work differently. Let $g$ and $h$ be elements of $G_q$ such that nobody knows $\log_g(h)$. These elements can either be chosen by a trusted center when the system is initialized, or by (some of) the participants using a randomizing protocol.
This entire process DOES NOT need a trusted setup ceremony like KZG does.
Note that here $p$ and $q$ denote large primes such that $q$ divides $p-1$, $G_q$ is the unique subgroup of $\mathbb{Z^*_p}$ of order $q$, and $g$ is a generator of $G_q$. It can easily be tested whether an element $a ∈ \mathbb{Z^*_p}$ is in $G_q$, since $a ∈ G_q \iff a^q = 1$. As any element $b \neq 1$ in $G_q$ generates the same group, the discrete log of $a ∈ G_q$ with respect to the base $b$ is well defined, and it is denoted by $\log_b(a)$.
The committer commits herself to an $s ∈ \mathbb{Z_q}$ by choosing $t ∈ \mathbb{Z_q}$ at random and computing
$$E(s,t) = g^sh^t.$$
Such a commitment can later be opened by revealing $s$ and $t$. The following theorem is then very easy to prove, and shows that $E(s,t)$ reveals no information about $s$ at all, and that the committer cannot open a commitment to $s$ as some $s' \neq s$ unless she can find $\log_g(h)$.
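Here’s a toy Python version of this commitment with tiny hand-picked values ($p = 23$, $q = 11$, $g = 2$, $h = 3$) so the arithmetic is easy to follow; real instantiations of course use cryptographically large groups (or elliptic-curve groups like Bandersnatch):

```python
# Toy Pedersen commitment E(s, t) = g^s * h^t mod p over a tiny group.
# p and q are deliberately small so the numbers are easy to follow;
# real deployments use large primes or elliptic-curve groups.

import secrets

p = 23                      # prime (large in practice; tiny here for readability)
q = 11                      # prime with q | p - 1   (22 = 2 * 11)
g = 2                       # generator of the order-q subgroup G_q of Z_p^*
h = 3                       # another element of G_q; nobody should know log_g(h)

assert pow(g, q, p) == 1 and pow(h, q, p) == 1    # both really lie in G_q

def commit(s: int, t: int) -> int:
    """E(s, t) = g^s * h^t mod p."""
    return (pow(g, s, p) * pow(h, t, p)) % p

# The committer picks the secret s and a random blinding factor t
s = 7
t = secrets.randbelow(q)
c = commit(s, t)

# Later the commitment is opened by revealing (s, t); anyone can re-check it
assert c == commit(s, t)
print(f"commitment to s={s} with blinding t={t}: {c}")
```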
I am yet to read more about the merits of the Bandersnatch curve over the BLS12_381 curve, how it is a 42% performance improvement over the Jubjub curve, how it is implemented using the GLV multi-scalar multiplication method, and the deeper math behind how multiproofs are computed using Inner Product Arguments.
Hoping to cover all of that by this week! I hope to dive deeper and work on the Hyperledger Besu implementation of Verkle Tries in this GitHub repo, and to draft improvements to that repo in my project proposal as well. I am also looking forward to understanding more about State Expiry, Address Space Extension for several periods of data on Ethereum, and finally Portal Networks.
Happy Reading!