# zkEVM Halo2 GPU Prover

Snarkify's cuSnark library is a C++/CUDA project that exposes a set of API functions, with Rust bindings, designed to plug into the Halo2 proof system and replace various compute-intensive operations with a GPU-accelerated backend.

The following tables show the performance improvements obtained when all of cuSnark's optimizations are enabled. Each optimization's impact on end-to-end (e2e) proof time depends on the size of the proof, which can be characterized by the number of rows and columns in the proof's trace table. Two proofs with different dimensions were selected to show how that impact varies. The benchmarks were run on an AMD EPYC 7702 64-Core Processor with 4x NVIDIA GeForce RTX 3090 (24 GB) GPUs.
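To give a rough sense of scale for the two proofs benchmarked on this page, the sketch below multiplies rows by columns to get a cell count for each trace table. It assumes the quoted column count covers the whole table, which the page does not state; `trace_cells` is an illustrative helper, not part of cuSnark.

```python
# Trace dimensions quoted in the benchmark headings: (log2 of rows, columns).
PROOFS = {
    "aggregation": (25, 5),
    "chunk_inner": (20, 1135),
}


def trace_cells(log2_rows: int, columns: int) -> int:
    """Number of cells in a trace table with 2**log2_rows rows."""
    return (1 << log2_rows) * columns


if __name__ == "__main__":
    for name, (k, cols) in PROOFS.items():
        cells = trace_cells(k, cols)
        print(f"{name}: 2^{k} rows x {cols} columns = {cells:,} cells")
```

Under that assumption, the aggregation proof is tall and narrow while chunk\_inner is shorter but far wider, which is why the same stage can take a very different share of the total in each.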

### Proof 1 (aggregation): 2^25 rows, 5 columns

<table data-full-width="true"><thead><tr><th width="240">Proof Stage</th><th>CPU (s)</th><th width="148">CPU e2e %</th><th width="129">GPU (s)</th><th width="132">GPU e2e %</th><th>Speedup</th></tr></thead><tbody><tr><td>Initialization</td><td>1.40</td><td>0.64</td><td>1.40</td><td>4.63</td><td>1.00</td></tr><tr><td>Generate Instance</td><td>1.08</td><td>0.50</td><td>0.46</td><td>1.52</td><td>2.35</td></tr><tr><td>Generate Advice</td><td>6.99</td><td>3.21</td><td>2.73</td><td>9.03</td><td>2.56</td></tr><tr><td>Generate Lookups</td><td>2.22</td><td>1.02</td><td>1.88</td><td>6.22</td><td>1.18</td></tr><tr><td>Commit Permutations</td><td>24.01</td><td>11.03</td><td>10.72</td><td>35.45</td><td>2.24</td></tr><tr><td>Eval_h</td><td>67.40</td><td>30.95</td><td>6.30</td><td>20.83</td><td>10.70</td></tr><tr><td>Compute Evaluations</td><td>35.73</td><td>16.41</td><td>5.02</td><td>16.60</td><td>7.12</td></tr><tr><td>Multiopen</td><td>29.74</td><td>13.66</td><td>1.72</td><td>5.69</td><td>17.29</td></tr><tr><td><strong>Total</strong></td><td><strong>217.76</strong></td><td></td><td><strong>30.24</strong></td><td></td><td><strong>7.20</strong></td></tr></tbody></table>
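The derived columns in the table above appear to follow directly from the timings: each e2e percentage is a stage's time relative to the Total row, and each speedup is CPU time divided by GPU time. The sketch below reproduces the Eval_h row and the overall speedup from the Proof 1 numbers; the helper names are illustrative.

```python
def e2e_percent(stage_s: float, total_s: float) -> float:
    """Share of end-to-end time spent in one stage, as a percentage."""
    return round(100.0 * stage_s / total_s, 2)


def speedup(cpu_s: float, gpu_s: float) -> float:
    """How many times faster the GPU run is than the CPU run."""
    return round(cpu_s / gpu_s, 2)


# Values copied from the Proof 1 (aggregation) table, in seconds.
CPU_TOTAL_S, GPU_TOTAL_S = 217.76, 30.24
CPU_EVAL_H_S, GPU_EVAL_H_S = 67.40, 6.30

if __name__ == "__main__":
    print("Eval_h CPU e2e %:", e2e_percent(CPU_EVAL_H_S, CPU_TOTAL_S))
    print("Eval_h GPU e2e %:", e2e_percent(GPU_EVAL_H_S, GPU_TOTAL_S))
    print("Eval_h speedup:  ", speedup(CPU_EVAL_H_S, GPU_EVAL_H_S))
    print("Overall speedup: ", speedup(CPU_TOTAL_S, GPU_TOTAL_S))
```

Note that a stage's share of GPU time can rise even when the stage itself gets faster, because the stages around it shrink more.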

### Proof 2 (chunk\_inner): 2^20 rows, 1135 columns

<table data-full-width="true"><thead><tr><th width="246">Proof Stage</th><th>CPU (s)</th><th width="138">CPU e2e %</th><th width="99">GPU (s)</th><th width="199">GPU e2e %</th><th>Speedup</th></tr></thead><tbody><tr><td>Initialization</td><td>6.15</td><td>0.35</td><td>6.11</td><td>1.32</td><td>1.01</td></tr><tr><td>Generate Instance</td><td>0.05</td><td>0.00</td><td>0.13</td><td>0.03</td><td>0.38</td></tr><tr><td>Generate Advice</td><td>393.58</td><td>22.33</td><td>306.44</td><td>66.17</td><td>1.28</td></tr><tr><td>Generate Lookups</td><td>59.63</td><td>3.38</td><td>56.84</td><td>12.27</td><td>1.05</td></tr><tr><td>Commit Permutations</td><td>152.79</td><td>8.67</td><td>42.27</td><td>9.13</td><td>3.61</td></tr><tr><td>Eval_h</td><td>1115.43</td><td>63.28</td><td>36.19</td><td>7.81</td><td>30.82</td></tr><tr><td>Compute Evaluations</td><td>10.22</td><td>0.58</td><td>7.60</td><td>1.64</td><td>1.34</td></tr><tr><td>Multiopen</td><td>24.90</td><td>1.41</td><td>7.56</td><td>1.63</td><td>3.29</td></tr><tr><td><strong>Total</strong></td><td><strong>1762.75</strong></td><td></td><td><strong>463.13</strong></td><td></td><td><strong>3.81</strong></td></tr></tbody></table>
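In the Proof 2 table above, Generate Advice accounts for about two thirds of the remaining GPU time, so it bounds how much further the end-to-end figure can improve. The sketch below is a hypothetical what-if in the spirit of Amdahl's law, not a measured result: it recomputes the overall speedup if a single GPU stage were made faster by some extra factor, using the stage times from the table.

```python
# GPU stage times from the Proof 2 (chunk_inner) table, in seconds.
GPU_STAGE_S = {
    "Initialization": 6.11,
    "Generate Instance": 0.13,
    "Generate Advice": 306.44,
    "Generate Lookups": 56.84,
    "Commit Permutations": 42.27,
    "Eval_h": 36.19,
    "Compute Evaluations": 7.60,
    "Multiopen": 7.56,
}
CPU_TOTAL_S = 1762.75  # CPU total from the same table


def overall_speedup(stage: str, extra_factor: float) -> float:
    """Speedup over CPU if `stage` ran `extra_factor` times faster on GPU.

    Hypothetical projection: every other stage keeps its measured time.
    """
    stage_s = GPU_STAGE_S[stage]
    new_total_s = sum(GPU_STAGE_S.values()) - stage_s + stage_s / extra_factor
    return CPU_TOTAL_S / new_total_s


if __name__ == "__main__":
    print(f"as measured:        {overall_speedup('Generate Advice', 1.0):.2f}x")
    print(f"2x Generate Advice: {overall_speedup('Generate Advice', 2.0):.2f}x")
    print(f"2x Eval_h:          {overall_speedup('Eval_h', 2.0):.2f}x")
```

Doubling the speed of Generate Advice moves the projected total far more than doubling Eval_h, even though Eval_h has the largest per-stage speedup in the table.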

This document outlines the following GPU modules and the acceleration they provide for Halo2 proofs of various dimensions:

* [Multi-Scalar Multiplication (MSM)](/high-performance-zkp/zkevm-halo2-gpu-prover/msm.md)
* [Number Theoretic Transform (NTT)](/high-performance-zkp/zkevm-halo2-gpu-prover/ntt.md)
* [Polynomial Evaluation](/high-performance-zkp/zkevm-halo2-gpu-prover/quotient-polynomial-evaluation.md)
* [KZG Multiopen](/high-performance-zkp/zkevm-halo2-gpu-prover/kzg-multiopen.md)
* [Polynomial Inversion](/high-performance-zkp/zkevm-halo2-gpu-prover/polynomial-inversion.md)
* [Permutation Generation](/high-performance-zkp/zkevm-halo2-gpu-prover/permutation-generation.md)

