zkExchange

The Problem

Data is only anonymous with respect to its context. In our project, we particularly highlight the problems in modern-day verification. With the uprising of Data-Privacy Laws it's getting excessively hard to abide by policies, and take decisions over the legitimacies of certain outcomes.

One such problem is the Proof of Solvency. It is a mechanism in which verifying organisations such as Auditing Companies / Public Forums shall have a way to validate that the total amount of assets in a private/public bank is more than its total liabilities.

Every time a bank/business goes bankrupt, a common question that arises is whether we can rely on external auditing companies, government authenticators, or special policies and methods are needed to validate the Proof of Solvency, but in that process we end up giving away a lot of innocent user-data to these agencies solely relying on their word, that the data won't get laundered.

What if we didn't have to do that?

There could be several ways to mask/privatise the user-data, but the best way to anonymise is not reveal it AT ALL. We are planning to achieve this with one of the marvels of modern-day cryptography, that is, Zero-Knowledge Proofs.

A Background

Zero-knowledge proofs (ZKPs) are a type of cryptographic protocol that allows one party (the prover) to prove to another party (the verifier) that a certain statement is true, without revealing any additional information beyond the validity of the statement. In other words, ZKPs allow you to prove that you know a piece of information without actually revealing what that information is.

Click here to read more

Some important properties

Some properties of ZKPs include:

Completeness: If the statement is true, the verifier will always accept the proof provided by the prover.
Soundness: If the statement is false, no prover, no matter how powerful, can convince the verifier that it is true.
Zero-knowledge: The verifier learns nothing beyond the validity of the statement being proved.
Non-interactivity: The proof can be generated without interaction between the prover and the verifier, or with only a limited amount of interaction.

ZKPs have a wide range of applications, such as in blockchain technology, authentication systems, and secure communication protocols, where it is important to prove the authenticity of information without revealing sensitive details.

An-interactive-zero-knowledge-protocol

ZKP Implemenation Overview: Using zkSNARKs

zk-SNARKs (Zero-Knowledge Succinct Non-Interactive Argument of Knowledge) are a specific type of ZKP that are designed to be highly efficient and allow for very fast verification of proofs.

One of the key advantages of zk-SNARKs is that they are able to prove the validity of a statement without requiring any interaction between the prover and the verifier. This means that once a proof has been generated, it can be easily verified by anyone without the need for further communication with the prover.

Another important feature of zk-SNARKs is that they are “succinct,” meaning that the proof itself is very small, usually just a few hundred bytes.

Our Solution: zkExchange

A 3rd party auditing software that ingests data through csv balance sheets and proves the information regarding assets and ongoing liabilities without revealing any data at all to the verifier, thereby abiding by sensitive data-privacy laws, using Zero-Knowledge Proofs

In simple terms, here the proof is nothing but a file containing a token of legitimacy, for the audited company (the Prover), generated in WebAssembly or a C++ Executable, that will be provided to the Verification API provided to the auditing (the Verifier) side. Thereby, the Verifier can get a authentication status whether the Proof gets verified or not without having to look into or audit the real data. Moreover, since zk-SNARKS are non-interactive, the Verifier never really has to interact with the Prover (or the real data) to authenticate.

Algorithm

The Prover's Side

The software initially ingests the data, preferrably csv/xls balance sheets consisting the users’ sensitive information, structures it to a more workable form.
Feeds it into the ZKP cryptosystem and generates a proof in O(N logN) time.
Passes it over to the Verification API either internally or as a data packet via a HTTP response.

The Verifier's Side

The Verifier receives it and verifies the proof in O(1) time.

Our Project folders are structured in the following way:

/zkExchange
- zkDataPrep/
- zkCircuits/
- zkVerifier/
- README/

-zkDataPrep - An integral library written in Typescript, that decomposes the datasets by hashing and storing them in a Merkle Sum Tree, one of the fastest ways to prepare data before feeding them to ZKP modules.

-zkCircuits - The primary module that feeds the user data into the cryptographic system, and thereby generates a proof which is furthermore entirely independent of the data.

-zkVerfier - The verification library that will essentially work as a scalable API and shall be provided to whoever applies for a Proof of Solvency over the bank or agency.

-README - Overview and Documentation

zkDataPrep

Structuring datasets into Merkle Sum Trees

This is a Typescript (backend-friendly) library used to create Merkle Sum Trees where each node entry is username -> balance. The root of this tree contains the aggregated sum of all the node entries, thereby, representing the total assets/liabilities of a firm.

But coming to the point.

What is a Merkle Sum Tree?

A merkle sum tree is a binary Merkle Tree with the following properties:

Each entry of the merkle tree is a struct, of username and balance
Each leaf node of the merkle tree contains a hash and a sum. The hash is equal to h(username, balance). The sum is equal to the balance itself, at the node level.
Each non-leaf node contains a hash and a sum. The hash is equal to h(left_child.hash, left_child.sum, right_child.hash, right_child.sum). The sum is therefore equal to the sum of the sum of it’s children.
The root node represents the executed/constructed state of the tree and contains all the users’/entry balances. The merkleSumTree() is a Typescript implementation of a Merkle Sum Tree and it provides all the functions to construct the above mentioned modified Merkle Tree by just ingesting data from a csv file, picking up only the username -> balance.

This diagram is a representation of a similar Merkle Sum Tree

Hashing Algorihtm

In this implementation we’re using Poseidon Hash, a zkSNARK friendly hashing algorithm introduced in 2019, compatible with Secure Multi-Party Computation (MPC).

Why Poseidon?

Permutation-based Design: Poseidon Hash is built upon a permutation-based design, utilizing a series of permutation rounds to process the input data. The permutation function employed in Poseidon is based on the Merkle-Damgård construction.
Customizable Parameters: Poseidon allows for customizable parameters, such as the number of permutation rounds, the number of words processed per round, and the size of the internal state. These parameters can be adjusted depending on the specific security requirements and performance considerations of the application.
Efficient Arithmetic Circuit Implementation: Poseidon is highly optimized for arithmetic circuits commonly used in zk-SNARKs. It exhibits efficient behavior when implemented within these circuits, offering faster computation times and reduced circuit sizes compared to other hash functions.
Resistance to Cryptographic Attacks: Poseidon has undergone extensive analysis and evaluation for security. It has been designed to resist known cryptographic attacks, including differential and linear attacks. However, as with any cryptographic primitive, ongoing analysis and scrutiny are essential to ensure its continued security.
Trusted Setup Compatibility: Poseidon is compatible with trusted setup procedures commonly used in certain zk-SNARK constructions. The structure of Poseidon allows for efficient and secure generation of the necessary cryptographic parameters during the trusted setup phase.
Multiple Variants: Poseidon comes in different variants, such as Poseidon-128, Poseidon-192, and Poseidon-256, which differ in the number of rounds and the width of the internal state. These variants offer flexibility in choosing the appropriate security level and performance trade-offs.

Setup

Installing the package

The package can be directly installed from the npm registry with the following command:
```
$ npm i @0xagnish/zk-data-prep
```

For Database

Import your database to a csv file, for example this

API Guide

new MerkleSumTree (pathToCsv : string) : MerkleSumTree

import { MerkleSumTree } from "0xagnish/zk-data-prep";

const pathToCsv = "test/entryPatterns/entry-16-valid.csv";

const tree = new MerkleSumTree(pathToCsv);
//constructs a tree using the given entries in the csv dataset

entries: [] Entry

The entries contain the username parsed as BigInt. This transformation is needed in order to make the entries hashing friendly and later on operate with zkSNARKs.

const entries = tree.entries;
// [
//       Entry {
//         _usernameToBigInt: 7440338505707899769n,
//         _balance: 7534n,
//         _username: 'gAdsIaKy'
//       },
//       Entry {
//         _usernameToBigInt: 6008493982388733799n,
//         _balance: 2060n,
//         _username: 'SbuqOZGg'
//       },
//       ...
// ]

leaves : [] Node

const root = tree.root;
// {
//   hash: 5256203632563331423195629050622063453704745190370457907459595269961493651429n,
//   sum: 84359n
// }

indexOf (username: string, balance: bigint) : number

Returns the index of an entry in the tree. If the entry does not exist in the tree, it returns -1.

const index = tree.indexOf("gAdsIaKy", BigInt(7534)); // 0

createProof (index: number) : MerkleProof

Creates a proof of membership for an entry identified by its index. The MerkleProof contains the path from the leaf to the root.

const proof = tree.createProof(0);

verifyProof(proof: MerkleProof) : boolean

Verifies a proof and returns true or false. It verifies that a leaf is included in the tree and that the sum computed from the leaf to the root is equal to the total sum of the tree.

tree.verifyProof(proof);

Benchmarking

To construct a Merkle Sum Tree with 262144 (2**18 leaves) it takes:

Time taken	OS/Kernel	RAM
108s	Ubuntu 22 LTS	8 GB
154s	Macbook Air M1	8 GB

zkSNARK construction on Circom

What is Circom?

Circom, short for Circuit Compiler, is a domain-specific language (DSL) and compiler designed for creating arithmetic circuits. It is commonly used in the field of zero-knowledge proofs (ZKPs) and secure multiparty computation (MPC).

What are these circuits?

Arithmetic circuits are mathematical representations of computations, where inputs and outputs are represented as wires, and gates perform operations on these wires. Circom allows you to express complex computations as circuits in a high-level language, making it easier to reason about and analyze the behavior of these circuits.

Importance of Circom

Circom is often used in conjunction with other tools and libraries in the area of ZKPs, such as zk-SNARKs (zero-knowledge succinct non-interactive arguments of knowledge). These cryptographic constructions enable the verification of computations without revealing the inputs or intermediate values, providing privacy and security guarantees.

What can we do with Circom?

By using Circom, developers can define the desired computation, compile it into an arithmetic circuit, and then generate the necessary proofs or verification keys to interact with the circuit. This process allows for the creation of privacy-preserving applications and protocols where sensitive data can be processed securely without exposing its contents.

You can find the Research Paper for Circom here

Circuits for Proof Of Solvency.

The circuit, written in circom, enforces the rules that the Exchange must abide by when generating a Proof Of Solvency for a specific user.

The circuit checks that:

- A user-balance entry has been included in the Merkle Sum Tree
- The computation of the sum going from the user's entry to the root has been performed correctly
- No sum overflow happened during the computation
- The computed sum (namely the total liabilities of an exchange) is less or equal to the total sum of the assets of the exchange

The prover system guarantees credible and self-auditable proof while preserving the secrecy of the Exchange’s business information such as:

Number of users of the exchanges
Users balances
Siblings partial sum balances
Total liabilities of the exchange

The prover relies on zkDataPrep for the Merkle Sum tree operations.

Circuit Design

Input	Description	Public or Private
rootHash	Root Hash of the Merkle Sum Tree publicly committed by the exchange	Public
username	The username (in BigInt format) of user to which the proof is being generated for	Private
balance	The balance of the user to which the proof is being generated for	Private
pathIndices[nLevels]	A bit array that contains the path to the user leaf inside the Merkle Sum Tree	Private
siblingHashes[nLevels]	Array of hashes of the siblings of the user leaf	Private
siblingsSums[nLevels]	Array of sum-balances of the siblings of the user leaf	Private
assetsSum	The total assets that the Exchange claims to have	Public

Output	Description	Public or Private
leafHash	Poseidon Hash `H(username,balance)`	Public (by default)

The ToLeafHash component performs the poseidon hash of the username and the balance and outputs the leafHash. The leafHash is then used as the first hash in the NextMerkleSumTreeLevel component.

The NextMerkleSumTreeLevel component recursively computes the current hash (for the first level it is the leafHash), the current sum (for the first level it is the balance), the current siblingHash and the current siblingSum. The output of the nextLevel component are the nextHash and the nextSum. These are calculated as follows:

nextHash = H(hash, sum, siblingHash, siblingSum) if the pathIndex is 0, where H is the poseidon hash function
nextHash = H(siblingHash, siblingSum, hash, sum) if the pathIndex is 1, where H is the poseidon hash function
nextSum = sum + siblingSum

After the last level is computed, the circuit checks that the nextHash is equal to the rootHash and that the nextSum is LessEqThan the assetsSum.

Further circuit components not shown in the circuit diagram are:

SafeSum, ensures that no overflow happens during the computation of the sum
SafeLessEqThan, safely compare two n-bit numbers avoiding overflows

Checks to be executed outside the circuit

A proof generated using the circuit, even if verified, doesn’t ensure that the prover is solvent. Further checks must be on the public signals of the circuit to ensure that the prover is solvent. These checks are:

The rootHash (input of the circuit) must be the root hash of the Merkle Sum Tree committed by the exchange on a Public Bulletin Board
The assetsSum (input of the circuit) must be the total assets of the exchange. The way in which the exchange generates its proof of assets is out of the scope of this project.
The leafHash (output of the circuit) must equal to H(username, balance) that contains the data of the user to which the proof is being generated for

Workflow of the Circom Compiler and it’s Dependencies

Required Dependency

circom

Trusted Setup for Multi-party Computation

The entire Trusted Setup is deployed over an S3 bucket on AWS, so that the same Trusted Setup can be re-used for secure MPC, while building the zkSNARK circuit it automatically downloads the setup files from AWS.

Setup

Installing the package

The package can be directly installed from the npm registry with the following command:
```
$ npm i @0xagnish/zkCircuits
```

Build

In order to compile the circuit, execute the trusted setup, generate the proof (and verify it) using groth16 as proving system run from the root directory:

$ npm run build

The script will:

Download the trusted Powers Of Tau setup generated from the Hermez Community
Do the trusted setup required for the groth16 proving system
Compile the circuit

Generate a witness based on a pre generated sample input. In order to generate other inputs you can use this program:


const { IncrementalMerkleSumTree } = require("ts-merkle-sum-tree")

...

proof = tree.createProofWithTargetSum(5, BigInt(125))

inputToCircuit = JSON.strigify(proof)

Generate the proof based on the witness
Verify the proof

Test

To run the tests, run the following command:

$ npm test

Benchmarks

All benchmarks are run on a Ubuntu 22 LTS, 8GB memory. The benchmark was run on a Merkle Sum Tree with 16 levels (2^16 leaves).

	groth16
Constraints	13892
Circuit compilation	2s
Witness generation	0s
Setup key generation	40s
Trusted setup phase 2 contribution	6s
Proving key size	12.3MB
Proving key verification	41s
Proving time	2s
Proof verification time	0s

Published Jul 11, 2023

Staring into @go_ethereum, zkEVM @cipherem, Rollups-as-a-service @trueZkAgnish Ghosh on Twitter