Snarks based on pairing curves take ten to a hundred times longer to prove than snarks based on bulletproofs over ristretto25519
Pairing curve snarks are much smaller, \bigcirc(1) in size, and much faster to verify, while bulletproofs are \bigcirc(\log(n)) in size
So if one has a single snark that proves the entire blockchain and must be evaluated by everyone, the primary cost is proving, not verifying, so bulletproofs over ristretto25519 are the way to go.
In a blockchain with a single snark at the root of each block, one is proving verification of a proof. Verifying a snark within a snark is an enormously costly operation. Folding is much faster.
Pairing curves are better for a blockchain containing an enormous number of snarks, each of them proved by one person and verified by everyone.
A horde of libraries are rapidly appearing on GitHub, most of which have stupendously slow performance, can only generate proofs for absolutely trivial things, and take a very long time to do so.
Blog full of the latest hot stuff
[An Analysis of Polynomial Commitment Schemes: KZG10, IPA, FRI, and DARKS](https://medium.com/@ola_zkzkvm/an-analysis-of-polynomial-commitment-schemes-kzg10-ipa-fri-and-darks-a8f806bd3e12){target="_blank"}
Inner Product Arguments: a basic explanation of polynomial commitments on ordinary curves
Verification is linear in the length of the polynomial, and logarithmic in the number of polynomials, so you want a commitment to quite a lot of short fixed length polynomials. All the polynomials are of the same fixed length defined by the protocol, but the number of polynomials can be variable. Halo 2 can reference relative fields – you can have a proof that a value committed by polynomial N bears some relationship to the value in polynomial N+d
A whole lot of pairing curves have fallen to recent attacks.
The most efficient pairing curves still standing at the time this paper was written are the BLS12-381 curves (126 bits of security).
ZCash, Ethereum, Chia Network, and Algorand have all gone with BLS12-381, so these probably have the best developed libraries.
All the pairing curves are composite order, and there is no equivalent of ristretto to create a prime order curve out of them.
Instead, one must always check each point to make sure it is of the correct subgroup.
Faster Subgroup Checks for BLS12-381
Subgroup security in pairing-based cryptography
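The naive subgroup check is just scalar multiplication by the prime group order; here is a minimal sketch in Python using the py_ecc library, assuming its usual exports (the papers above are about doing this check faster than a full length scalar multiplication).

```python
# Naive subgroup membership check on BLS12-381 G1: multiply by the prime
# group order and see whether we land on the identity.  py_ecc's affine
# representation uses None for the point at infinity.
from py_ecc.bls12_381 import G1, multiply, curve_order

def in_prime_subgroup(P):
    return multiply(P, curve_order) is None

assert in_prime_subgroup(G1)   # the standard generator is in the right subgroup
```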
The IETF pairing curve paper has a list of libraries
Polynomial commitments on a pairing curve are a cryptographic primitive used in advanced protocols like zero-knowledge proofs (ZKPs), verifiable computation, and succinct arguments of knowledge (SNARKs). They allow a prover to commit to a polynomial and prove that the polynomial evaluates to certain values at specific points, or equivalently that the polynomial committed to has certain constraints, without revealing the polynomial itself.
So you represent the thing you want to commit to as a polynomial, and you can prove the thing you want to commit to has certain properties, without revealing other properties. The intended usage is that the blockchain becomes a verkle tree rather than a merkle tree, with polynomial commits replacing hashes: you can commit transactions to a blockchain, and prove that they are valid and legitimate commits, without piling huge amounts of data on the blockchain, without revealing everything to everyone, and without every peer having to receive, evaluate, and store every transaction.
A polynomial P(x) of degree d is represented as: P(x)=c_0+c_1x+c_2x^2+⋯+c_dx^d
The coefficients c_0, c_1, …, c_d are scalars, integers modulo q, where q is the order of the prime subgroup of the elliptic points that we will be using. q is a 255 bit prime, and the scalars are very large integers modulo that prime.
q=0x73EDA753299D7D483339D80809A1D80553BDA402FFFE5BFEFFFFFFFF00000001, and q-1 is divisible by 2^{32}, which means we can use the number theoretic transform on polynomials of degree 2^{n}-1, for degrees 1, 3, 7, 15, 31, \dots, 2^{32}-1. For polynomials that have 2^n coefficients, we can switch between representing a polynomial in terms of its coefficients, and representing it in terms of its values \large P\bigg((-1)^\frac{0}{2^n}\bigg),\;P\bigg((-1)^\frac{1}{2^n}\bigg),\;P\bigg((-1)^\frac{2}{2^n}\bigg),\;\dots\;P\bigg((-1)^\frac{2^n-1}{2^n}\bigg) In this representation, as the polynomial's values at the roots of minus one, we can multiply and divide polynomials in \bigcirc(2^n) operations, while in the coefficient representation it would require \bigcirc(2^{2n}) operations. We can also convert to and from the coefficient representation in \bigcirc(n2^n) operations.
If we chose some other set of points, for example 1, 2, \dots, N, we would have to use the Lagrange transform, rather than the number theoretic transform, which is slow.
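A minimal sketch of this representation in Python, using nothing but integer arithmetic: it checks that q-1 is divisible by 2^{32}, builds a toy domain of roots of minus one, and confirms that pointwise multiplication of two evaluation forms agrees with schoolbook multiplication of the coefficient forms. The generator 7 is the multiplicative generator commonly used for this field in BLS12-381 libraries; the code asserts the property it needs rather than trusting that choice.

```python
q = 0x73EDA753299D7D483339D80809A1D80553BDA402FFFE5BFEFFFFFFFF00000001
assert (q - 1) % 2**32 == 0                  # 2^32-th roots of unity exist in F_q

n = 3                                        # toy domain of 2^n = 8 points
zeta = pow(7, (q - 1) // 2**(n + 1), q)      # zeta = (-1)^(1/2^n)
assert pow(zeta, 2**n, q) == q - 1           # zeta^(2^n) = -1: a root of minus one
domain = [pow(zeta, j, q) for j in range(2**n)]

def evaluate(coeffs, x):                     # Horner's rule, mod q
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % q
    return acc

A = [3, 1, 4, 1]                             # degree 3, low order coefficient first
B = [2, 7, 1, 8]
A_evals = [evaluate(A, x) for x in domain]   # the evaluation representation
B_evals = [evaluate(B, x) for x in domain]

# Pointwise multiplication of the evaluation forms: 2^n multiplications.
AB_evals = [a * b % q for a, b in zip(A_evals, B_evals)]

# Schoolbook multiplication of the coefficient forms: about 2^(2n) multiplications.
AB = [0] * 7
for i, a in enumerate(A):
    for j, b in enumerate(B):
        AB[i + j] = (AB[i + j] + a * b) % q

# The product has degree 6 < 8, so its eight values on the domain determine it:
# the pointwise list really is the evaluation form of the product polynomial.
assert AB_evals == [evaluate(AB, x) for x in domain]
```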
In notation likely to be more intelligible to my engineer audience,
using the notation of a typed computer language, rather than the
notation of mathematics:
if x is of type 𝔽q, y is of type ℤq2,
P is of type G_1, Q is of type G_2
(x*P)×(y*Q)= x*y*(P×Q)
But, since you cannot divide by P, nor Q, nor P×Q, this comes in handy when you want to
prove something about a commit, without revealing the contents of the commit. You, the prover, know
x and y, everyone knows P and Q, and you show the verifier x*P and y*Q.
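A minimal check of that identity in Python with the py_ecc library, assuming its usual exports and its pairing(Q, P) argument order, with Q in G_2 and P in G_1; the unoptimized arithmetic is slow, so this is purely illustrative.

```python
from py_ecc.bls12_381 import G1, G2, multiply, pairing, curve_order

x, y = 1234, 56789                  # the prover's secret scalars
xP = multiply(G1, x)                # what the prover publishes: x*P
yQ = multiply(G2, y)                # and y*Q

# (x*P) x (y*Q) == x*y*(P x Q): the verifier checks a relationship between
# scalars it never sees.
assert pairing(yQ, xP) == pairing(G2, multiply(G1, x * y % curve_order))
```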
And just as a one hundred and sixty bit git hash can commit an arbitrarily large amount of data, a three hundred and eighty four bit polynomial commit, a point in G_1, a value of type G_1, can commit an arbitrarily large amount of data, and unlike a git hash, a three hundred and eighty four bit snark, also a point in G_1, a value of type G_1, can verify a proof of something arbitrarily complex about that arbitrarily large amount of data.
Thus for example a litecoin mweb utxo proves that it was the output of a valid transaction which was formed from valid inputs, and all its inputs similarly, all the way back to the beginning. (Unfortunately it does not prove no double spends, this being considerably harder. I address how to do that elsewhere.)
How to do a polynomial commit:
This does the same thing as a git commit, except it takes enormously longer to compute, and takes up more than three times as much space.
No one would ever use them, except that you can prove things about what was committed without revealing the commit, and without the verifier having to do the potentially enormous work of verifying the potentially enormous commit. Which is handy if the commit is a commit of a gigantic blockchain which is eating up ever larger disks and takes ever longer to download and verify.
The huge advantage of polynomial commits over hashes is that you can prove things about what is being committed, without needing to reveal what is committed, and thus without everyone needing to receive, evaluate, and store everything committed.
We want to prove that C(z_i)=0 for many i.
Instead of calculating the quotient polynomial Q(x)=\frac{C(x)}{x-z} as above, we instead calculate the quotient polynomial \displaystyle Q(x)=\frac{C(x)}{\prod_j{(x-z_j)}}
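Here is a minimal sketch of that check in Python with the py_ecc library, assuming its usual exports and its pairing(Q, P) argument order. The trusted setup secret s is picked in the clear, which a real ceremony must never do, and for brevity the committed polynomial is built as Q(x) times the vanishing polynomial rather than obtained by dividing; this only illustrates the pairing equation the verifier checks.

```python
from py_ecc.bls12_381 import G1, G2, add, multiply, pairing, curve_order as q

def poly_mul(a, b):                       # coefficient form multiplication mod q
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % q
    return out

def commit(poly, srs):                    # sum_i poly[i] * (s^i * generator)
    acc = None
    for c, base in zip(poly, srs):
        if c % q:
            acc = multiply(base, c) if acc is None else add(acc, multiply(base, c))
    return acc

# Toy trusted setup: powers of s in both groups.  Insecure, for illustration only.
s = 123456789
srs_g1 = [multiply(G1, pow(s, i, q)) for i in range(8)]
srs_g2 = [multiply(G2, pow(s, i, q)) for i in range(8)]

# Prover: C(x) vanishes at z = 3 and z = 5, so C(x) = Q(x) * (x-3)(x-5).
zeroes = [3, 5]
Z = [1]                                   # the vanishing polynomial prod_j (x - z_j)
for z in zeroes:
    Z = poly_mul(Z, [(-z) % q, 1])
Q = [7, 11, 13]                           # the quotient polynomial
C = poly_mul(Q, Z)

C_commit = commit(C, srs_g1)              # the commitment: one point of G_1
proof = commit(Q, srs_g1)                 # the evaluation proof: one point of G_1

# Verifier: C = Q * Z implies C(s) = Q(s) * Z(s), hence
#   e(C(s)*G1, G2) == e(Q(s)*G1, Z(s)*G2)
Z_commit_g2 = commit(Z, srs_g2)
assert pairing(G2, C_commit) == pairing(Z_commit_g2, proof)
print("many-zeroes proof verified")
```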
To prove with a zk-snark, we express the requirements on the data as an R1CS. R1CS stands for rank one constraint system. By introducing sufficient intermediate variables, often an enormous number of intermediate variables, we can convert a very broad range of constraints into a collection of rank one constraints, a rank one constraint system.
QAP stands for Quadratic Arithmetic Program, and we are going to convert a rank one constraint system into a quadratic arithmetic program: convert an R1CS to a QAP.
And then prove that the polynomial committed satisfies a quadratic arithmetic program, and thus represents data that satisfies a rank one constraint system, and thus satisfies whatever constraint the verifier insists on, without revealing anything about what is committed other than that it satisfies that constraint, and without the commitment taking up a lot of space.
This explanation is for the most part lifted from the qap post of the zero knowledge blog
We do this by R1CS if we want to prove by a snark (if we want to prove by folding, we use relaxed R1CS). We have to express the requirements on the values that we are not necessarily revealing as quadratic constraints. Which is apt to require a huge number of complicated and unobvious quadratic constraints to express something very simple.
\vec{w} is the witness, the set of values we want to prove something about. A rank one constraint is (\vec{a}.\vec{w})(\vec{b}.\vec{w})-\vec{c}.\vec{w}=0, where \vec{a}, \vec{b}, and \vec{c} are the actual constraint, known to both prover and verifier, and the same and unchanging for many different values of \vec{w} that the prover wants to prove something about. To express our actual constraints as quadratic constraints, we usually have to introduce a lot of intermediate values and a lot of quadratic constraints.
Thus for example, suppose our constraint is 0=w_2^3-w_1
This is not a rank one constraint, so we introduce two extra variables: w_0, which by convention is always equal to one, and w_3, which we will define by the extra constraints 0=w_0^2-w_0 and 0=w_2^2-w_3
Whereupon our original constraint becomes the rank one constraint
0=w_2*w_3-w_1
And our witness becomes
\vec{w}=\begin{bmatrix} 1&& w_1 && w_2 &&w_2^2\end{bmatrix}
We now have a rank one constraint system composed of three rank one constraints. The constraint vectors stack up into an array rather than a set of loose vectors, albeit we cannot do matrix operations on it, so mathematicians are not going to call it a matrix, but engineers get to call it an array.
0=(\vec{a_0}.\vec{w})(\vec{b_0}.\vec{w})-\vec{c_0}.\vec{w} \\
0=(\vec{a_1}.\vec{w})(\vec{b_1}.\vec{w})-\vec{c_1}.\vec{w} \\
0=(\vec{a_2}.\vec{w})(\vec{b_2}.\vec{w})-\vec{c_2}.\vec{w}
\begin{bmatrix} \vec{a_0} \\ \vec{a_1} \\ \vec{a_2} \end{bmatrix}=
\begin{bmatrix} 0&&0&&1&&0 \\ 1&&0&&0&&0\\ 0&&0&&1&&0 \end{bmatrix}
\begin{bmatrix} \vec{b_0} \\ \vec{b_1} \\ \vec{b_2} \end{bmatrix}=
\begin{bmatrix} 0&&0&&0&&1 \\ 1&&0&&0&&0\\ 0&&0&&1&&0 \end{bmatrix}
\begin{bmatrix} \vec{c_0} \\ \vec{c_1} \\ \vec{c_2} \end{bmatrix}=
\begin{bmatrix} 0&&1&&0&&0 \\ 1&&0&&0&&0\\ 0&&0&&0&&1 \end{bmatrix}
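A quick numeric sanity check of this little system, as a sketch in Python: any witness of the form [1, w_2^3, w_2, w_2^2] should satisfy all three rank one constraints.

```python
# The three rank one constraints above, checked against a sample witness.
A = [[0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 1, 0]]
B = [[0, 0, 0, 1], [1, 0, 0, 0], [0, 0, 1, 0]]
C = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]

w2 = 5
w = [1, w2**3, w2, w2**2]          # the witness [1, w_1, w_2, w_2^2] with w_1 = w_2^3

dot = lambda u, v: sum(x * y for x, y in zip(u, v))
for a, b, c in zip(A, B, C):
    assert dot(a, w) * dot(b, w) - dot(c, w) == 0
print("all rank one constraints satisfied")
```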
We are now going to convert them into polynomials, into a quadratic arithmetic program, which is going to make them enormously bigger, but with polynomials we can commit to, prove, and verify these gigantic constraint systems with snarks and polynomial commits.
We expand our witness \vec{w} with zeroes, and our constraints with constraints that force the corresponding elements of the witness to be zero, so that our witness has 2^n elements. In our toy example we do not need to do this, because it already has four, 2^2, elements.
We interpret our vectors as polynomials represented by their values at the roots of minus one in 𝔽q. P(ζ^0),\;P(ζ^1),\;P(ζ^2),\;P(ζ^3) where ζ^2=-1 in 𝔽q. Or as an engineer would tell it, ζ is a value of type 𝔽q, the integers modulo q, such that ζ^2=q-1.
If, on the other hand, we had 2^{20} constraints, we would be using ζ such that ζ^{2^{19}}=q-1, which is why I denoted it as ζ rather than i. Also i is misleading, for it is not a complex number, it is an ordinary value of type 𝔽q, an ordinary (though very large) integer modulo q.
We have to expand the rank one constraints to be a power of two, and in our toy example we have only three constraints, so we add an extra zero to the witness, and an extra constraint to prove that that value is zero, so we now have four constraints, one of which is trivial.
We now generate four polynomials A(x), B(x), C(x), and T(x) such that:
A(ζ^j)=\vec{a_j}.\vec{w} \\ B(ζ^j)=\vec{b_j}.\vec{w} \\ C(ζ^j)=\vec{c_j}.\vec{w} \\ T(ζ^j)=0
for as many j as we have rank one constraints.
We then have to convert each one to coefficient form. (Even though we need the evaluation form, because we intend to go on using the number theoretic transform to do polynomial multiplication and division, we have to convert them to coefficient form in order to interpolate the extra points needed, because A(x)B(x) has twice as many coefficients as we have constraints, so we are going to need twice as many polynomial values at roots of minus one.)
For the prover to prove to the verifier that he has \vec{w}, which satisfies the rank one constraint system, he has to prove that the polynomial A(x)*B(x)-C(x) has zeroes at ζ^j
And we can prove this as explained in Proving many zeroes of a single polynomial
The solution vector \vec{w} satisfies the R1CS if there exists a polynomial H(x) such that A(x)⋅B(x)−C(x)=H(x)⋅T(x)
So the prover has to construct, in coefficient form, the polynomial H(x)=\frac{A(x)*B(x)-C(x)}{T(x)}
It is a lot faster to transform the coefficient forms into evaluation form, a list of values of the polynomial evaluated at certain points, using the number theoretic transform, with twice as many roots of minus one this time, then do the multiplication and division in list-of-values form, then convert back to coefficient form so that we can use the SRS.
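Here is a tiny worked check of A(x)B(x)-C(x)=H(x)T(x) for the toy system above, as a sketch in Python over 𝔽_{17} rather than the BLS12-381 scalar field, using naive Lagrange interpolation and schoolbook division rather than the number theoretic transform, and padding with an all-zero fourth constraint rather than the extra zero witness element described above.

```python
p = 17
domain = [pow(4, j, p) for j in range(4)]          # 4 has order 4 mod 17, and 4^2 = -1

A = [[0,0,1,0], [1,0,0,0], [0,0,1,0], [0,0,0,0]]   # padded with a trivial constraint
B = [[0,0,0,1], [1,0,0,0], [0,0,1,0], [0,0,0,0]]
C = [[0,1,0,0], [1,0,0,0], [0,0,0,1], [0,0,0,0]]
w2 = 2
w = [1, pow(w2, 3, p), w2, pow(w2, 2, p)]          # witness [1, w2^3, w2, w2^2]

dot = lambda u, v: sum(x * y for x, y in zip(u, v)) % p

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

def poly_add(a, b):
    n = max(len(a), len(b))
    return [((a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)) % p
            for i in range(n)]

def lagrange(values):                              # interpolate over the domain
    acc = [0]
    for i, (xi, yi) in enumerate(zip(domain, values)):
        term = [1]
        for j, xj in enumerate(domain):
            if j != i:
                term = poly_mul(term, [(-xj) % p, 1])
                inv = pow((xi - xj) % p, -1, p)
                term = [c * inv % p for c in term]
        acc = poly_add(acc, [c * yi % p for c in term])
    return acc

Apoly = lagrange([dot(a, w) for a in A])           # A(zeta^j) = a_j . w, and so on
Bpoly = lagrange([dot(b, w) for b in B])
Cpoly = lagrange([dot(c, w) for c in C])
P = poly_add(poly_mul(Apoly, Bpoly), [(-c) % p for c in Cpoly])

T = [1]                                            # vanishing polynomial of the domain
for d in domain:
    T = poly_mul(T, [(-d) % p, 1])

# Long division of P by the monic polynomial T; the remainder is zero exactly
# when every rank one constraint is satisfied.
H, R = [0] * (len(P) - len(T) + 1), P[:]
for k in range(len(P) - len(T), -1, -1):
    H[k] = R[k + len(T) - 1]
    for i, t in enumerate(T):
        R[k + i] = (R[k + i] - H[k] * t) % p
assert all(r == 0 for r in R), "witness does not satisfy the R1CS"
print("A*B - C is divisible by T: witness accepted")
```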
Dory is also concretely efficient: Using one core and setting n = 2^{20}, commitments are 192 bytes. Evaluation proofs are 18 kB, requiring 3 s to generate and 25 ms to verify. For batches at n = 2^{20}, the marginal cost per evaluation is <1 kB communication, 300 ms for the Prover and 1 ms for the Verifier.
It seems to generate a verkle tree of polynomial commits (a verkle trie being the polynomial commitment equivalent of a merkle trie), and prove something about the preimage of the root – which sounds like exactly what the doctor ordered. You prove the preimage of each vertex on the path, and then prove the things about the leaf part of the pre-image.
Nova folds polynomial commits, and the accompanying proofs that the polynomial commits have the desired properties, into a single polynomial and a single proof, directly, rather than folding two snarks into one snark. Which means that the party doing the fold sees all, but can do the folding much faster.
The folded proof has to contain an additional proof that the folding was done correctly.
Plonk or Groth16 can be used as a proof system inside Nova. Recursion has a very low cost, but its native language is inherently relaxed R1CS, and plonkish and Groth16 get translated back and forth to relaxed R1CS.
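To make the folding concrete, here is a minimal sketch of the relaxed R1CS folding arithmetic over a small prime field, reusing the toy constraint system from above. It omits everything Nova actually carries around (the commitments to the witness and to the cross term, and the proof that the fold was done correctly); it only shows why the relaxed relation is closed under a random linear combination of two instances.

```python
p = 101
A = [[0,0,1,0], [1,0,0,0], [0,0,1,0]]
B = [[0,0,0,1], [1,0,0,0], [0,0,1,0]]
C = [[0,1,0,0], [1,0,0,0], [0,0,0,1]]

def mat_vec(M, z): return [sum(m * x for m, x in zip(row, z)) % p for row in M]
def had(u, v):     return [a * b % p for a, b in zip(u, v)]

def relaxed_ok(u, E, z):            # (A z) o (B z) == u*(C z) + E ?
    lhs = had(mat_vec(A, z), mat_vec(B, z))
    rhs = [(u * c + e) % p for c, e in zip(mat_vec(C, z), E)]
    return lhs == rhs

def witness(w2):                    # an ordinary witness [1, w2^3, w2, w2^2]
    return [1, pow(w2, 3, p), w2, pow(w2, 2, p)]

# Two ordinary satisfying instances are relaxed instances with u = 1, E = 0.
z1, z2 = witness(2), witness(3)
u1 = u2 = 1
E1 = E2 = [0, 0, 0]
assert relaxed_ok(u1, E1, z1) and relaxed_ok(u2, E2, z2)

# Fold them with a (supposedly random) challenge r.
r = 37
Az1, Bz1, Cz1 = mat_vec(A, z1), mat_vec(B, z1), mat_vec(C, z1)
Az2, Bz2, Cz2 = mat_vec(A, z2), mat_vec(B, z2), mat_vec(C, z2)
T = [(x + y - u1 * c2 - u2 * c1) % p          # the cross term
     for x, y, c1, c2 in zip(had(Az1, Bz2), had(Az2, Bz1), Cz1, Cz2)]

z = [(a + r * b) % p for a, b in zip(z1, z2)]
u = (u1 + r * u2) % p
E = [(e1 + r * t + r * r * e2) % p for e1, t, e2 in zip(E1, T, E2)]

# The folded triple still satisfies the relaxed relation, even though it is no
# longer an ordinary R1CS witness (u != 1, E != 0).
assert relaxed_ok(u, E, z)
print("folded instance satisfies the relaxed R1CS relation")
```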
Nova claims to be fast, is being frequently updated, needs no trusted setup, and other people are writing toy programs using Nova. YouTube videos report some real programs going into Nova – to reduce the horrific cost of snarks and recursive snarks.
Nova can be used, is intended to be used, and is being used as a preprocessing step to give you the best possible snark, but should be considered as standalone, as mimblewimble used polynomial commits alone.
And, using pairing based cryptography, a polynomial commit takes 384 bits.
And, at the root of one’s trie, one has a polynomial commit and a proof of the topmost vertex – so not zero knowledge for that vertex, but the topmost vertices do not contain any secret information. So Nova can do everything we need without actually using snarks.
Nova can generate a trie of polynomial commits, which function as hashes (using painfully slow pairing based cryptography).
The standard usage is incrementally verifiable computation, a linear chain, but to get full trie computation, you have many instances doing the heavy lifting, which communicate by “proof carrying data”
A YouTube video says Nova is bulletproofs modified from R1CS to relaxed R1CS, and that we can have a trie of provers. Well, if we have a trie of provers, why should not anyone who wants to inject a transaction be a prover? And if everyone is a prover, we need no snarks.
Everyone shares the proving that he alone has done, and the data that only he needs, encrypted in a form that only he can read, among thirty two or so neighbors, and similarly stuff that only two of them can read, or only four of them can read, in case he crashes, goes offline, and loses his state.
Nova requires the vm to be standardized and repetitious, proving in some sense the same thing for itself and its children.
Plonky had a special purpose hash, such that it was easy to produce recursive proofs about Merkle trees. I don’t know if Nova can do hashes with useful speed, or hashes at all, without which no recursive snark system is useful. But a polynomial commitment accomplishes the same thing as a hash, just much costlier to evaluate.
Nova is built out of commitments, which are about 256 times bigger than a hash. A Nova proof is a proof about a merkle tree of commitments. If we build our blockchain out of Nova commitments, it will be about a couple of hundred times larger than one built out of regular hashes, but will still only occupy about ten or twenty gigabytes of storage. Bandwidth limits restrict us to about twenty transactions a second, which is still faster than the bitcoin blockchain. Plus, when we hit ten or twenty transactions per second, we can shard the blockchain, which we can do because each shard can prove it is telling the truth about transactions, whereas with bitcoin, every peer has to evaluate every transaction, lest one shard conspire to cheat the others.
Nova does not appear to have a language. Representing a proof system as a Turing machine just seems like a bad idea. It is not a Turing machine. You don’t calculate a=b*c; you instead prove that a=b*c, when you already somehow knew a, b, and c. A Turing machine is a state machine. A proof system is not.
It is often said, and is in a sense true, that the prover produces a proof that for a given computation he knows an input such that after a correct execution of the computation he obtains a certain public output.
But no, he is not. The proof system proves that relationships hold between values. And because it can only prove certain rather arcane and special things about relationships between values, you have to compute a very large number of intermediate values such that the relationship you actually want to prove between the input and the output corresponds to simple relationships between these intermediate values. But computing those intermediate values belongs in another language, such as C++ or rust.
With Nova, we would get an algorithm such that you start out with your real input. You create a bunch of intermediate values in a standard language like C++ or rust, then you call the proof system to produce a data structure that can be used to prove relationships between your input and those intermediate values. Then you produce the next set of intermediate values, call your proof system to produce a data structure that can be used to prove the next set of relationships, fold those two proof generating data structures together, rinse and repeat, and at the end you generate a proof that the set of relationships the fold represents is valid. That is procedural, but expressing the relationships is not. Since your fold is the size of the largest circuit folded so far, you want the steps to be all of similar size.
Halo 2 is a general purpose snark library created for ZCash, replacing their earlier library and using a different curve. It directly supports performing an SHA2 hash inside the proof and verification. I don’t know how fast that is, and I did not immediately find any examples recursing over an SHA merkle tree and merkle chain.
This suggests a functional language (sql). There are, in reality, no purely functional languages for Turing machines. Haskell has its monads, sql has update, insert, and delete. But the natural implementation for a proof system would be a truly purely functional language, an sql without update, insert, or delete, without any operations that actually wrote anything to memory or disk, that simply defined relationships without a state machine that changes state to write data into memory consistent with those changes.
The proof script has to be intelligible, and the same for prover and verifier, the difference being that the prover interleaves the proof language with ordinary code in ordinary language, to produce the values that are going to be proven. The prover drives the script along with ordinary language code, and the verifier drives it along with different ordinary language code, but the proof definition that is common to both of them has no concept of being sequential and driven along, no concept that things are done in any particular order. It is a graph of relationships. The proof language, as is typical of purely functional languages, should consist of assertions about relationships between immutable data structures, without expressing the idea that some of these data structures were created at one time, and destroyed at another. Some of these values are defined recursively, which means that what is actually going to happen in practice is that they are going to be created by a loop, written in the ordinary procedural language such as Rust or C++, but the proof language should have no concept of that.
If the proof language asserts that 1 \leq n \land n<20 \implies f(n-1)= g(f(n)), the ordinary procedural language will likely need to generate the values of f(n) for n=1 to 19, and will need to cause the proof language to generate proofs for each value of n from 1 to 19, but the resulting proof will be independent of the order in which these proofs were generated.
Purely functional languages like sql do not prescribe an algorithm, and need a code generator that has to make guesses about what a good algorithm would be, as exemplified by sqlite’s likelihood, likely, and unlikely no-ops.
And with a proof system we will, at least at first, have the human make the algorithm, but if he changes the algorithm, while leaving the proof system language unchanged, the proofs will still work.
The open source nature of Plonky is … complicated. The repository on Github has been frozen for two years, so likely does not represent the good stuff.
The most advanced, and most useful for blockchains, zk-snark technology is polygon, which claims to have finally found the holy grail: the actually useful generation and verification of proofs of verification.
So that not only can Bob verify that Ann’s information is what she says it is without knowing that information, but Carol can verify that Bob verified it, and Dave can verify that Carol verified it.
Which gives us scaling. Bob can verify that several people’s transactions are valid, Carol can verify several Bobs, and Dave can verify several Carols.
I have seen no end of claims that zk-snark system can do so and so, when, though it can in principle do so and so, actually getting it to do so and so is very hard and they have not quite managed to get it quite working, or they have actually gotten it to work but there are a bunch of complicated gotchas that make it impractical, or unwise, or not very useful to do so and so.
But I have also seen a great deal of real progress in solving these problems, albeit the progress tends to be overpromised and underdelivered, but for all that, the progress is real and substantial.
Supposedly there is a language, R1CS, such that you can express a program that gives a true or false answer, such that Aurora can execute the program and generate a prover and a verifier.
According to starkware, they have the fastest proving time, but their proofs are rather large, 138KiB, Groth16 Snarks have the most compact proofs.
Not actually seeing it as a useful library yet that I could use, but more like a proof of principle that someone could build such a library.
To be actually useful, a zk-snark system needs to be a compiler, that
compiles a program written in what Starkware calls R1CS, and other
people are calling script, and generates two programs, a prover and a
verifier.
The prover operates on two blobs, the public blob and private blob, and
produces a boolean result, true or false, pass or fail, and a proof that it
did so.
The proof is approximately constant size, regardless of how much computation is required and regardless of how large the private blob was, but takes a very long time.
The verifier operates on the public blob and the proof, takes a short and approximately constant time to do so, regardless of how big the computation was and regardless of how big the private data was, and determines, with 2^{-126} likelihood of error, what result the prover got.
But at present I get the impression that neither script nor R1CS have any real existence, though I have seen a script language that operates on a stack, and, though it has no variables, can dupe any item on the stack to the top of the stack. It seems to have only been ever used to generate one prover and one verifier, because actually creating the prover and verifier still required some coding by hand. Also lacked certain control structures.
At present, people seem to be writing the prover and the verifier by hand, a very difficult operation with a very high likelihood of bugs. The prover and the verifier do very simple tasks like proving the encoded inputs to a transaction are greater than or equal to the encoded outputs and that no numeric underflow or overflow occurred.
Another problem is that we would really like the public data to be the root hash of a merkle tree, and no one seems to have a script language that contains a useful hash function. Starkware is built out of hash functions, but last time I looked, you could not call a hash function from R1CS.
We need a script language that can not merely add and subtract, but can also do hashes and elliptic point operations. zk-stark systems are built out of hashes and elliptic point operations, but it seems to be uphill trying to generate proofs that prove something about the results of hashes and elliptic point operations, making it very difficult to produce a proof that a pile of proofs in the pre-image of a merkle tree have been verified. I suspect that a prover might take a very very long time to produce such a proof.
The proofs are succinct, in that you can prove something about a gigantic pile of data and the size of the proof and the time taken to verify scarcely grows - about 128 KiB, for the smallest that anyone would care about, to utterly gigantic proofs. But proof generation is not all that fast, and grows with the matter to be proven, so to be useful for utterly gigantic proofs, you would need to be able to distribute proof generation over an enormous multitude of untrusting shards. Which you can obviously do by proving a verification. Not sure how long it takes to produce a proof that a large number of proofs were verified.
What you want is to be able to prove that a final hash is the root of an enormous merkle tree, some generalization of a Merkle-patricia tree, representing an immutable append only data structure consisting of a sequence of piles of transactions and the state generated by these transactions, that it represents a valid branch of a chain of signatures, and that the final state is correctly derived by applying the batch of transactions to the previous state.
And then you want to do this for states so enormous, and piles of transactions so enormous, that no one person has all of them.
And then you still have the problem of resolving forks.
You would like to have a blockchain of blockchains of blockchains, such that your state, and your transactions, are divided into a product of substates, with consensus on each substate advancing a bit ahead of the consensus on the combination of several substates, so that transactions within a substate finalize fast, but transactions between substates take longer. (because the number of forks of the product state is the product of the number of forks of each substate)
Each of the substates very quickly comes up with a proof that a transaction within a substate is valid, and quickly comes up with consensus as to which fork everyone is on. The proof for a transaction between substates is finalized quickly in the paying substate, and quickly affects the paying substate, but the transaction does not get included in the state that is a product of the receiving and paying substate for a while, does not get proven valid in the product substate for a while, and does not get included in the receiving substate till a bit after it is included in the product substate, whereupon it is in due course quickly proven to be a valid addition of value to the receiving substate.

So that the consensus problem remains manageable, we need insulation and delay between the states, so that the product state has its own pile of state, representing the delay between a transaction affecting the payer factor state, and the transaction affecting the payee factor state. A transaction has no immediate effect. The payer mutable substate changes in a way reflecting the transaction block at the next block boundary. And that change then has effect on the product mutable state at a subsequent product state block boundary, changing the shares possessed by the substate.
Which then has effect on the payee mutable substate at its next block boundary when the payee substate links back to the previous product state.
reaction.la gpg key 154588427F2709CD9D7146B01C99BB982002C39F
This work is licensed under the Creative Commons Attribution 4.0 International License.