Technical Background

Zero-knowledge Proof

A Zero-knowledge proof guarantees the validity or some characteristics of a statement without revealing the statement itself.

A Zero-knowledge protocol includes two main parties: the prover and verifier. They agree with each other on a public statement. However, the direct proof about this statement contains sensitive information that can not be shared between two parties. To deal with it, the prover creates an indirect proof which leaks none of these confidential details, while guaranteeing the statement is satisfied. Up until this point, there are many different methods to build such a protocol, but they must abide by 3 principles in common:

Correctness: if the input data satisfies the argument, the proof from it must be accepted by the verifier.
Soundness: if the input data violates the argument, the verifier must reject the proof from it.
Zero knowledge: the proof itself doesn't carry any information of the input data apart from the argument.

For some recent years, Zero-knowledge proof has been used widely for privacy-preserving in real-world applications, especially on Web3 platform, where the privacy of the users is highly vulnerable due to its transparent essence.

Groth16

Innovated in 2016 by Jens Groth, Groth16 is an implementation of ZK proof, which has been the most popular branch of ZKP due to some outstanding advantages:

Non-interactive: the proof generation process is executed merely on the side of the prover and does not require any mutual communication between the prover and verifier.
Succinctness: the proof size is fairly small as well as the verifier only has to execute a slight computation to verify the validity of the proof.

These edges of Groth16 make it perfectly suitable for blockchain-based applications, when most of the time, the verification must be conducted on the blockchain, which might cause additional fees if the protocol is interactive and the proof and verifying computation are quite verbose.

In Groth16, an argument about the input data is represented as the form of the circuits with a set of constraints, and the input data is called witnesses. Different inputs can satisfy the same argument, thus, a circuit can be used for different witnesses. The circuits later will be applied some cryptographic transformation to remove the trace of the witnesses and generate the proof. Inputs are divided into 2 categories: the private ones, and the public ones. Public inputs impact the succinctness of the verification, the more public inputs, the more computations the verifier must execute.

To achieve the succinctness, Groth16 requires a trusted setup for each circuit, which requires the liaison and trust between multiple parties. Practically, this process consumes a plethora of time and computation resources to make sure the protocol is confidently secure, which makes this algorithm only suitable for some systems with a few number of circuits.

Groth16 Batch Verification

In fact, using some cryptographic tricks, we can group a couple of proofs from the same circuit to verify at once. We only return true if all the proof in the batch is valid. This process tends to be conducts on-chain by the smart contract, thus, the efficiency of the verification is evaluated by the expense of gas as follow:

Number of ZK proofs

Total gas cost

Gas cost per proof

Every proof in the batch should be validated off-chain by a relayer first to filter out the fraud ones before being rolled up to the blockchain to make sure the batch will be passed, avoiding wasting gas cost.

ZK-Friendly Hash Function

Hash functions are the one-way data transformations in which the input data with an arbitrary length, through a series of complicated computations, produces a fixed output value. From only the output, it is infeasible to trace back to the original inputs.

Currently, some cryptographic hash functions such as Keccak, SHA, Blake are widely used in both smart contracts on blockchains and traditional computing environments. The optimization of such functions comes from the fact that they boil every value down to bits, which can be easily and quickly processed by the bit-based computation systems like EVM behind the smart contracts or most of the computers used in the present.

When it comes to arithmetization inside the ZK circuits, everything is counted as a number in a finite field, rather than bit. Therefore, bit-based hash functions turn out inefficient, and we have to design some hash functions that work directly with finite fields. This is when some modern ZK-friendly hash methods such as Poseidon, MIMC, Pedersen come into play. Among ZK-friendly hash functions, Poseidon stands out to be the most optimized option in many use cases. Using the \cite{sponge} construction, Poseidon allows arbitrary number of inputs as well as the outputs, enabling its versatility. That said, adjusting the number of inputs and outputs does affect the security of the function, thus the hash with a set of parameters should be used in consideration.

Merkle Tree

Merkle tree is a special hash-based data structure created by the scientist Ralph Merkle. Merkle tree has been the center of numerous cryptographic protocols, including EVM-based blockchain when the universal state of a block is computed as the Merkle hash of every leaf which contains the state of the balances or values in the smart contracts.

At its core, Merkle tree is an implementation of cryptographic accumulation. Given a set of different elements, the target is to design a protocol with 2 parties: prover and verifier, in which the prover, knowing an element which can either belong or not belong to the set, convinces the verifier about the inclusion or exclusion of that element in the given set. So why do we need the Merkle tree? The thing is, by and large, the verification should be succinct on the side of the verifier with the cost of additional computation on the side of the prover, for the same reason as Groth16, particularly for memory optimization. By using the Merkle tree, the verifier only has to store the root hash, instead of a whole set. In the next subsections, we will go over different forms of Merkle tree and the specified method for membership and non-membership verification of each one.

Fixed Merkle Tree

As arranging the elements in the set to an array, each two consecutive leaves are paired with each other, applied a hash function, producing an inner node. As a result, we have a new set of inner nodes, with the force remaining half in comparison with the initial set. Keeping the same order for the new-born inner nodes, we repeat the same process until only one node is left, which is called the root of the tree. If the input set has $2^n$ elements, the height of the tree will be $n$ . A set received after each transformation is indexed by a level, which is 0 for the initial set, and increases one through each next inner node set when we go up towards the root. An inner node which is the hash of 2 leaves or 2 inner nodes in the lower level is called the parent, while these 2 leaves or inner nodes are called the children of the higher node, and the sibling of each other.

The project uses the non-updateable version of the fixed Merkle tree, that means we must accumulate the whole set right from the beginning, as the tree is completed, there is no more insertion, update, or deletion. Due to this inflexibility, we can only use fixed merkle trees in a limited number of use cases.

Set membership

It is quite simple to prove the membership of a leaf in a tree. The prover only has to provide a proof containing the leaf itself, every sibling along the path from this leaf to the root, and an index path that indicates the position of the leaf. Receiving that proof, verifier conducts accumulating the leaf with each sibling in order, when the next sibling is hashed with the hash result of the previous sibling and the accumulator, given the initial assignment for the accumulator as the leaf. The final result of the accumulator must equal to the root for the proof to be valid. With the number of elements in the set as $2^n$ , according to the set membership algorithm, the verifier requires $n$ hashes, and $n$ data units to store the proof.

Set non-membership

Natively, a fixed Merkle tree does not support non-membership verification. We can adjust the order of the leaves as ascending or descending for this purpose, though. At first, the construction of the tree should be agreed and published among parties, with the position of each leaf determined by its position in chronological order, to guarantee the security of the process. The membership validation process is not affected by the order of the leaves, thus, it still remains the same. For non-membership, with a leaf which is excluded from the tree, there are 3 situations:

The value of the excluded leaf is less than any leaf in the tree: the prover picks the leftist (or rightist in descending order) leaf which has the smallest value, generating inclusion proof of it, and shows that the excluded leaf is less than it.
There are two consecutive leaves in the tree that the value of the excluded leaf falls between them: the prover picks these two leaves and provides the inclusion proofs for both of them, and then shows to the verifier that the value of the excluded leaf is between them.
The value of the excluded leaf is greater than any leaf in the tree: the prover picks the rightist leaf (or leftist in descending order) which has the greatest value, generating inclusion proof of it, and shows that the excluded leaf is less than it.

In the worst scenario, which is the second one, the verifier must conduct $2n$ hash functions, while with the first and third, they only have to do $n$ times.

Sparse Merkle Tree

Sparse Merkle tree has been utilized by a wide range of applications on the blockchain due to its prominent advantages compared to the former variant:

Supporting non-membership validation in nature with efficiency: we can proof the exclusion of a leaf in $o(n)$ in the worst-case scenario.
Continuous and sparse: continuous means it is updatable, we can add, change, delete the leaves, and then change the root, while sparse means when the number of current leaves is relatively small than the capability of the tree, the number of inner nodes is small as well, which optimized the data storage for the prover.

At its core, a sparse Merkle tree is just like a typical Merkle tree, except the index now becomes a part of the leaf's content. Even if two leaves share the same value, as long as their indexes are different, they are still non-identical. Moreover, when adding a leaf to a tree, that leaf does not have to be in the lowest level of the tree, in other words, the leaf will go from the root down, and stop as soon as it passes every inner node along the path. That optimizes the space for storing a tree, and also is the main reason why this tree is called "sparse".

The above diagram shows an SMT with the depth being 5. It only stores to the maximum level of 3, though. The membership proving and verifying processes are similar with the fixed variant. While with non-membership, we have to point out an existing leaf that either shares the same index but different value with the excluded leaf, or lies on the path from the root to it.

Quinary Sparse Merkle Tree

Utilizing the characteristic of Poseidon hasher, which has the capability of squeezing a set of inputs that has more than two elements quite effectively. In fact, the more the number of inputs, the smaller the number of ZK constraints the hash function consumed on each input, averagely.

This supports the idea of broadening the width of the SMT, considering the increase in the size of the proof.

Balancing between the expense of the ZK constraints and the size of the proof, a quinary SMT, which has the arity of 5, is quite optimal to be used in ZK circuits. With the integration of this tree, we can significantly reduce the number of constraints, resulting in better performance for the prover. In this protocol, for every SMT we use the quinary version with depth being 14 to achieve the similar capability with the binary version with depth being 32.

Signature

In the decentralized world, authentication tends to be implemented in the form of a digital signature. Owning a key pair including a private key and a public key, a user uses the private one to sign on a given message, producing a signature, then anyone else, knowing the public key, can validate the message with a signature is actually from that specific user.

The most popular signing mechanism utilized widely on blockchains currently is ECDSA, which uses the secp256k1 elliptic curve as a key space. However, when it comes to arithmetization inside the ZK circuit, secp256k1 turns out inefficient as the conflict between it and the Bn254 elliptic curve used by Groth16, some efforts to integrate secp256k1 cost at least roughly 200.000 constraints, which is impractical for application requiring proof generation on the client side.

To get around it, we have to use a friendly elliptic curve with ZK-circuit, which forms a cycle with Bn254 (the coordination field of points on Bn254 is the scalar field of Baby Jubjub in vice versa ), called Baby Jubjub.

Based on Baby Jubjub, there’s also an alternative signature algorithm for ECDSA, which is proven to be more secure than the ZK-circuit and only consumes a modest amount of constraints ( roughly 5000) as well, EDDSA. This algorithm is used in many ZK-based protocols in Web3 for authentication, including ZkSync, thanks to its compatibility and security in the ZK-circuit.

PreviousIdentity SDK NextProtocol Design

Last updated 1 year ago