Data is only anonymous with respect to its context. In our project, we particularly highlight the problems in modern-day verification. With the uprising of Data-Privacy Laws it's getting excessively hard to abide by policies, and take decisions over the legitimacies of certain outcomes.
One such problem is the Proof of Solvency. It is a mechanism in which verifying organisations such as Auditing Companies / Public Forums shall have a way to validate that the total amount of assets in a private/public bank is more than its total liabilities.
Every time a bank/business goes bankrupt, a common question that arises is whether we can rely on external auditing companies, government authenticators, or special policies and methods are needed to validate the Proof of Solvency, but in that process we end up giving away a lot of innocent user-data to these agencies solely relying on their word, that the data won't get laundered.
There could be several ways to mask/privatise the user-data, but the best way to anonymise is not reveal it AT ALL. We are planning to achieve this with one of the marvels of modern-day cryptography, that is, Zero-Knowledge Proofs.
Zero-knowledge proofs (ZKPs) are a type of cryptographic protocol that allows one party (the prover) to prove to another party (the verifier) that a certain statement is true, without revealing any additional information beyond the validity of the statement. In other words, ZKPs allow you to prove that you know a piece of information without actually revealing what that information is.
Some properties of ZKPs include:
Completeness: If the statement is true, the verifier will always accept the proof provided by the prover.
Soundness: If the statement is false, no prover, no matter how powerful, can convince the verifier that it is true.
Zero-knowledge: The verifier learns nothing beyond the validity of the statement being proved.
Non-interactivity: The proof can be generated without interaction between the prover and the verifier, or with only a limited amount of interaction.
ZKPs have a wide range of applications, such as in blockchain technology, authentication systems, and secure communication protocols, where it is important to prove the authenticity of information without revealing sensitive details.
zk-SNARKs (Zero-Knowledge Succinct Non-Interactive Argument of Knowledge) are a specific type of ZKP that are designed to be highly efficient and allow for very fast verification of proofs.
One of the key advantages of zk-SNARKs is that they are able to prove the validity of a statement without requiring any interaction between the prover and the verifier. This means that once a proof has been generated, it can be easily verified by anyone without the need for further communication with the prover.
Another important feature of zk-SNARKs is that they are “succinct,” meaning that the proof itself is very small, usually just a few hundred bytes.
In simple terms, here the proof is nothing but a file containing a token of legitimacy, for the audited company (the Prover), generated in WebAssembly or a C++ Executable, that will be provided to the Verification API provided to the auditing (the Verifier) side. Thereby, the Verifier can get a authentication status whether the Proof gets verified or not without having to look into or audit the real data. Moreover, since zk-SNARKS are non-interactive, the Verifier never really has to interact with the Prover (or the real data) to authenticate.
Our Project folders are structured in the following way:
/zkExchange
zkDataPrep/
zkCircuits/
zkVerifier/
README/
-zkDataPrep
- An integral library written in Typescript, that decomposes the datasets by hashing and storing them in a Merkle Sum Tree, one of the fastest ways to prepare data before feeding them to ZKP modules.
-zkCircuits
- The primary module that feeds the user data into the cryptographic system, and thereby generates a proof which is furthermore entirely independent of the data.
-zkVerfier
- The verification library that will essentially work as a scalable API and shall be provided to whoever applies for a Proof of Solvency over the bank or agency.
-README
- Overview and Documentation
This is a Typescript (backend-friendly) library used to create Merkle Sum Trees where each node entry is username -> balance
. The root of this tree contains the aggregated sum of all the node entries, thereby, representing the total assets/liabilities of a firm.
But coming to the point.
A merkle sum tree is a binary Merkle Tree with the following properties:
h(username, balance)
. The sum is equal to the balance itself, at the node level.h(left_child.hash, left_child.sum, right_child.hash, right_child.sum)
. The sum is therefore equal to the sum of the sum of it’s children.merkleSumTree()
is a Typescript implementation of a Merkle Sum Tree and it provides all the functions to construct the above mentioned modified Merkle Tree by just ingesting data from a csv file, picking up only the username -> balance
.This diagram is a representation of a similar Merkle Sum Tree
In this implementation we’re using Poseidon Hash, a zkSNARK friendly hashing algorithm introduced in 2019, compatible with Secure Multi-Party Computation (MPC).
Permutation-based Design: Poseidon Hash is built upon a permutation-based design, utilizing a series of permutation rounds to process the input data. The permutation function employed in Poseidon is based on the Merkle-Damgård construction.
Customizable Parameters: Poseidon allows for customizable parameters, such as the number of permutation rounds, the number of words processed per round, and the size of the internal state. These parameters can be adjusted depending on the specific security requirements and performance considerations of the application.
Efficient Arithmetic Circuit Implementation: Poseidon is highly optimized for arithmetic circuits commonly used in zk-SNARKs. It exhibits efficient behavior when implemented within these circuits, offering faster computation times and reduced circuit sizes compared to other hash functions.
Resistance to Cryptographic Attacks: Poseidon has undergone extensive analysis and evaluation for security. It has been designed to resist known cryptographic attacks, including differential and linear attacks. However, as with any cryptographic primitive, ongoing analysis and scrutiny are essential to ensure its continued security.
Trusted Setup Compatibility: Poseidon is compatible with trusted setup procedures commonly used in certain zk-SNARK constructions. The structure of Poseidon allows for efficient and secure generation of the necessary cryptographic parameters during the trusted setup phase.
Multiple Variants: Poseidon comes in different variants, such as Poseidon-128, Poseidon-192, and Poseidon-256, which differ in the number of rounds and the width of the internal state. These variants offer flexibility in choosing the appropriate security level and performance trade-offs.
$ npm i @0xagnish/zk-data-prep
import { MerkleSumTree } from "0xagnish/zk-data-prep";
const pathToCsv = "test/entryPatterns/entry-16-valid.csv";
const tree = new MerkleSumTree(pathToCsv);
//constructs a tree using the given entries in the csv dataset
The entries contain the username parsed as BigInt. This transformation is needed in order to make the entries hashing friendly and later on operate with zkSNARKs.
const entries = tree.entries;
// [
// Entry {
// _usernameToBigInt: 7440338505707899769n,
// _balance: 7534n,
// _username: 'gAdsIaKy'
// },
// Entry {
// _usernameToBigInt: 6008493982388733799n,
// _balance: 2060n,
// _username: 'SbuqOZGg'
// },
// ...
// ]
const root = tree.root;
// {
// hash: 5256203632563331423195629050622063453704745190370457907459595269961493651429n,
// sum: 84359n
// }
Returns the index of an entry in the tree. If the entry does not exist in the tree, it returns -1.
const index = tree.indexOf("gAdsIaKy", BigInt(7534)); // 0
Creates a proof of membership for an entry identified by its index. The MerkleProof contains the path from the leaf to the root.
const proof = tree.createProof(0);
Verifies a proof and returns true or false. It verifies that a leaf is included in the tree and that the sum computed from the leaf to the root is equal to the total sum of the tree.
tree.verifyProof(proof);
To construct a Merkle Sum Tree with 262144 (2**18 leaves) it takes:
Time taken | OS/Kernel | RAM |
---|---|---|
108s | Ubuntu 22 LTS | 8 GB |
154s | Macbook Air M1 | 8 GB |
Circom, short for Circuit Compiler, is a domain-specific language (DSL) and compiler designed for creating arithmetic circuits. It is commonly used in the field of zero-knowledge proofs (ZKPs) and secure multiparty computation (MPC).
Arithmetic circuits are mathematical representations of computations, where inputs and outputs are represented as wires, and gates perform operations on these wires. Circom allows you to express complex computations as circuits in a high-level language, making it easier to reason about and analyze the behavior of these circuits.
Circom is often used in conjunction with other tools and libraries in the area of ZKPs, such as zk-SNARKs (zero-knowledge succinct non-interactive arguments of knowledge). These cryptographic constructions enable the verification of computations without revealing the inputs or intermediate values, providing privacy and security guarantees.
By using Circom, developers can define the desired computation, compile it into an arithmetic circuit, and then generate the necessary proofs or verification keys to interact with the circuit. This process allows for the creation of privacy-preserving applications and protocols where sensitive data can be processed securely without exposing its contents.
You can find the Research Paper for Circom here
Circuits for Proof Of Solvency.
The circuit, written in circom, enforces the rules that the Exchange must abide by when generating a Proof Of Solvency for a specific user.
The circuit checks that:
- A user-balance entry has been included in the Merkle Sum Tree
- The computation of the sum going from the user's entry to the root has been performed correctly
- No sum overflow happened during the computation
- The computed sum (namely the total liabilities of an exchange) is less or equal to the total sum of the assets of the exchange
The prover system guarantees credible and self-auditable proof while preserving the secrecy of the Exchange’s business information such as:
The prover relies on zkDataPrep for the Merkle Sum tree operations.
Input | Description | Public or Private |
---|---|---|
rootHash | Root Hash of the Merkle Sum Tree publicly committed by the exchange | Public |
username | The username (in BigInt format) of user to which the proof is being generated for | Private |
balance | The balance of the user to which the proof is being generated for | Private |
pathIndices[nLevels] | A bit array that contains the path to the user leaf inside the Merkle Sum Tree | Private |
siblingHashes[nLevels] | Array of hashes of the siblings of the user leaf | Private |
siblingsSums[nLevels] | Array of sum-balances of the siblings of the user leaf | Private |
assetsSum | The total assets that the Exchange claims to have | Public |
Output | Description | Public or Private |
---|---|---|
leafHash | Poseidon Hash H(username,balance) |
Public (by default) |
The ToLeafHash
component performs the poseidon hash of the username
and the balance
and outputs the leafHash
. The leafHash
is then used as the first hash
in the NextMerkleSumTreeLevel
component.
The NextMerkleSumTreeLevel
component recursively computes the current hash
(for the first level it is the leafHash
), the current sum
(for the first level it is the balance
), the current siblingHash
and the current siblingSum
. The output of the nextLevel component are the nextHash
and the nextSum
. These are calculated as follows:
nextHash = H(hash, sum, siblingHash, siblingSum)
if the pathIndex is 0, where H is the poseidon hash functionnextHash = H(siblingHash, siblingSum, hash, sum)
if the pathIndex is 1, where H is the poseidon hash functionnextSum = sum + siblingSum
After the last level is computed, the circuit checks that the nextHash
is equal to the rootHash
and that the nextSum
is LessEqThan
the assetsSum
.
Further circuit components not shown in the circuit diagram are:
SafeSum
, ensures that no overflow happens during the computation of the sumSafeLessEqThan
, safely compare two n-bit numbers avoiding overflowsA proof generated using the circuit, even if verified, doesn’t ensure that the prover is solvent. Further checks must be on the public signals of the circuit to ensure that the prover is solvent. These checks are:
rootHash
(input of the circuit) must be the root hash of the Merkle Sum Tree committed by the exchange on a Public Bulletin BoardassetsSum
(input of the circuit) must be the total assets of the exchange. The way in which the exchange generates its proof of assets is out of the scope of this project.leafHash
(output of the circuit) must equal to H(username, balance)
that contains the data of the user to which the proof is being generated forThe entire Trusted Setup is deployed over an S3 bucket
on AWS, so that the same Trusted Setup can be re-used for secure MPC, while building the zkSNARK circuit it automatically downloads the setup files from AWS.
$ npm i @0xagnish/zkCircuits
In order to compile the circuit, execute the trusted setup, generate the proof (and verify it) using groth16 as proving system run from the root directory:
$ npm run build
The script will:
Download the trusted Powers Of Tau setup generated from the Hermez Community
Do the trusted setup required for the groth16 proving system
Compile the circuit
Generate a witness based on a pre generated sample input. In order to generate other inputs you can use this program:
const { IncrementalMerkleSumTree } = require("ts-merkle-sum-tree")
...
proof = tree.createProofWithTargetSum(5, BigInt(125))
inputToCircuit = JSON.strigify(proof)
Generate the proof based on the witness
Verify the proof
To run the tests, run the following command:
$ npm test
All benchmarks are run on a Ubuntu 22 LTS, 8GB memory. The benchmark was run on a Merkle Sum Tree with 16 levels (2^16 leaves).
groth16 | |
---|---|
Constraints | 13892 |
Circuit compilation | 2s |
Witness generation | 0s |
Setup key generation | 40s |
Trusted setup phase 2 contribution | 6s |
Proving key size | 12.3MB |
Proving key verification | 41s |
Proving time | 2s |
Proof verification time | 0s |