zkalc's methodology
Disclaimer: zkalc aims to provide adequate results while being easy to use. Actual results will be different from our estimates.
On this page, we describe our benchmarking pipeline and how we derive our results. In short:
- For each supported operation, we run benchmarks that measure its performance. We use criterion.rs to take multiple samples and then select the average.
- We collect the benchmark results inside `frontend/data/` and make them freely available.
- For each operation, we fit a function to its benchmark results. We use linear interpolation inside the benchmark bounds and least squares regression outside the benchmark bounds.
- When a user queries zkalc for an operation of size $n$, we estimate its running time using the produced function.
We will now go deeper into the above process, also linking to the relevant places in the codebase.
Running benchmarks
Inside `backend/` we collect benchmarks for all the supported public-key operations under different configurations. All benchmarks were run with multithreading enabled and with the best settings available.
Consider the benchmarks for the arkworks library. At compile time, we enable the feature flags `parallel`, `std`, and `asm`. The microbenchmarks inside look a bit like the following:
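The snippet below is a hedged sketch rather than the exact code in `backend/`: it assumes arkworks 0.4-style APIs from the `ark-bls12-381` and `ark-std` crates, and the benchmark and function names are illustrative. It measures a single G1 scalar multiplication on BLS12-381 with criterion.

```rust
use ark_bls12_381::{Fr, G1Projective as G1};
use ark_std::UniformRand;
use criterion::{criterion_group, criterion_main, Criterion};

// Illustrative microbenchmark: time one G1 scalar multiplication.
fn bench_g1_scalar_mul(c: &mut Criterion) {
    let mut rng = ark_std::test_rng();
    // Sample a random group element and scalar once; criterion then takes
    // many timing samples of the multiplication itself.
    let g = G1::rand(&mut rng);
    let s = Fr::rand(&mut rng);
    c.bench_function("mul_G1", |b| b.iter(|| g * s));
}

criterion_group!(benches, bench_g1_scalar_mul);
criterion_main!(benches);
```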
Running `make` inside `backend/` will run every benchmark, storing results inside `perf/`, where they are post-processed and collected in our API format.
Collecting benchmarks
We store our benchmarking results in JSON inside `frontend/data`, and they are free for everyone to use.
Given a curve key `curve` (available in `curves.json`), a library key `lib` that supports the given curve (available in `libraries.json`), and a machine key `machine` (available in `machines.json`), it is possible to fetch the samples in the file `data/{curve}/{library}/{machine}.json`.
For instance, one such file reports the benchmarks for curve BLS12-381 using zkcrypto's pairing on an AWS m5.2xlarge instance.
Estimating the running time
Now we have benchmark data for every operation in the `perf/` directory. The next step is to fit a function $f$ to the benchmarks of every operation, so that when a user queries us for an operation with arbitrary size $n$, we can answer it by evaluating $f(n)$.
For simple operations like basic scalar multiplication and field addition (which are not amortized) we consider them to be sequential computations. That is, if a single scalar multiplication takes $x$ seconds, $n$ such operations will take $n \cdot x$ seconds. That results in a simple linear function $f(n) = n \cdot x$.
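As a purely illustrative example with made-up numbers: if one scalar multiplication takes $x = 50\,\mu\text{s}$, then $n = 1000$ of them are estimated at

$$f(1000) = 1000 \cdot 50\,\mu\text{s} = 50\,\text{ms}.$$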
More complicated operations like MSMs and pairing products are amortized, and their performance doesn't follow a simple linear curve. For such operations, we collect benchmark data for various sizes. For example, consider the figure below, which displays the benchmark data from an MSM operation across the benchmarked range of sizes (both axes are in log scale):
To answer user queries within the benchmark range, we perform linear interpolation over the benchmark data.
That is, for each pair of consecutive benchmark points $(n_i, t_i)$ and $(n_{i+1}, t_{i+1})$, we trace the line that goes through both points. We end up with a piecewise linear function that covers the entire benchmark range, as we can see in the figure below.
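To make this step concrete, here is a minimal sketch of the idea. It is written in Rust for consistency with the benchmark code (zkalc's actual implementation runs in JavaScript in the frontend), and all names and numbers are illustrative.

```rust
/// Estimate the running time of an operation of size `n` by piecewise linear
/// interpolation over benchmark samples `(size, seconds)`, sorted by size.
/// Returns `None` if `n` lies outside the benchmarked range.
fn interpolate(samples: &[(f64, f64)], n: f64) -> Option<f64> {
    // Find the two consecutive samples that bracket `n`.
    for window in samples.windows(2) {
        let (x0, y0) = window[0];
        let (x1, y1) = window[1];
        if x0 <= n && n <= x1 {
            // Line through (x0, y0) and (x1, y1), evaluated at n.
            let slope = (y1 - y0) / (x1 - x0);
            return Some(y0 + slope * (n - x0));
        }
    }
    None // outside the benchmarked range: fall back to extrapolation
}

fn main() {
    // Made-up MSM timings at sizes 2^10, 2^11, 2^12 (in seconds),
    // purely to show the shape of the computation.
    let samples = [(1024.0, 0.010), (2048.0, 0.018), (4096.0, 0.033)];
    println!("{:?}", interpolate(&samples, 3000.0));
}
```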
For user queries outside of the benchmarking range, we extrapolate via non-linear least squares. To make things more exciting, we decided that it should be done... in JavaScript, inside your browser.
Here is an example of the extrapolation behavior for MSMs outside of the benchmarking range (that is, beyond the largest benchmarked size):
In the specific case of MSMs, Pippenger's complexity is well known to be asymptotically $O\left(\frac{n}{\log n}\right)$. Hence, we use least squares to fit the data set to a function of the form $h(n) = a \cdot \frac{n}{\log n}$, solving for the coefficient $a$.
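As an illustration of this fitting step, here is a simplified sketch that fits only the single coefficient $a$ of $h(n) = a \cdot \frac{n}{\log n}$, for which least squares has a closed form. It is again written in Rust for consistency (zkalc's actual fit is non-linear, may use a richer model, and runs in JavaScript in the browser); the timings in `main` are made up.

```rust
/// Fit the coefficient `a` of the model h(n) = a * n / ln(n) to benchmark
/// samples `(size, seconds)` by least squares. Minimizing
/// sum_i (t_i - a * g_i)^2 with g_i = n_i / ln(n_i) yields the closed form
/// a = sum_i (g_i * t_i) / sum_i (g_i^2).
fn fit_msm_coefficient(samples: &[(f64, f64)]) -> f64 {
    let (mut num, mut den) = (0.0, 0.0);
    for &(n, t) in samples {
        let g = n / n.ln();
        num += g * t;
        den += g * g;
    }
    num / den
}

/// Extrapolate the running time of an MSM of size `n` beyond the benchmarks.
fn extrapolate_msm(samples: &[(f64, f64)], n: f64) -> f64 {
    fit_msm_coefficient(samples) * n / n.ln()
}

fn main() {
    // Made-up timings at sizes 2^16, 2^17, 2^18, purely to show the mechanics.
    let samples = [(65536.0, 0.35), (131072.0, 0.65), (262144.0, 1.20)];
    let n = (1u64 << 24) as f64;
    println!("estimate for 2^24: {:.2} s", extrapolate_msm(&samples, n));
}
```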
We do not expect extrapolation to faithfully follow the benchmarks. We believe, however, that the estimates provide a rough idea of how long an algorithm will take. In addition, it's worth pointing out that while we try to provide useful results for arbitrary operation sizes, we cannot guarantee that the requested machine will be able to handle the requested size. For example, a commodity laptop will likely run out of memory before it can compute a sufficiently large MSM.
At the end of this process, we have a piecewise function for each operation that we can evaluate both inside and outside the benchmarking range to answer user queries.
Hazmat
We provide an estimator function for everyone to use inside the browser's console. Just open Developer Tools and check out the function `estimator`.
There are still lots of ways we can improve zkalc. If you want to hack around (or you don't generally like opening the browser console when a stranger says so), check the TODO file to see how you can also help!