zkEVM Halo2 GPU Prover

Snarkify's cuSnark library is a C++/CUDA project that provides a set of API functions, with Rust bindings, designed to plug into the Halo2 proof system and replace several compute-intensive operations with a GPU-accelerated backend.

The following tables outline the performance improvements obtained when all of these optimizations are enabled. The impact each optimization has on end-to-end (e2e) proof time depends on the size of the proof, which can be characterized by the number of rows and columns in the proof's trace table. Two proofs with different dimensions were selected to show how the impact varies. These benchmarks were obtained on an AMD EPYC 7702 64-Core Processor with 4x NVIDIA GeForce RTX 3090 (24 GB) GPUs.
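As a rough illustration of how differently the two benchmarked proofs are shaped, the sketch below multiplies rows by columns for each. Treating the raw cell count as a size proxy is an assumption made here for illustration; it ignores column types and is not a metric the benchmarks report. The names `PROOFS` and `trace_cells` are hypothetical.

```python
# Trace dimensions as stated for the two benchmarked proofs.
PROOFS = {
    "aggregation": {"log2_rows": 25, "columns": 5},
    "chunk_inner": {"log2_rows": 20, "columns": 1135},
}


def trace_cells(log2_rows: int, columns: int) -> int:
    """Rows x columns, used only as a rough size proxy (an assumption)."""
    return (1 << log2_rows) * columns


if __name__ == "__main__":
    for name, dims in PROOFS.items():
        print(f"{name}: {trace_cells(**dims):,} cells")
```

Under this proxy, chunk_inner has roughly 7x as many cells as aggregation despite having 32x fewer rows, which may help explain why the stage-level speedups differ between the two proofs.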

Proof 1 (aggregation): 2^25 rows, 5 columns

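| Proof Stage         | CPU (s) | CPU e2e % | GPU (s) | GPU e2e % | Speedup |
|---------------------|---------|-----------|---------|-----------|---------|
| Initialization      | 1.40    | 0.64      | 1.40    | 4.63      | 1.00    |
| Generate Instance   | 1.08    | 0.50      | 0.46    | 1.52      | 2.35    |
| Generate Advice     | 6.99    | 3.21      | 2.73    | 9.03      | 2.56    |
| Generate Lookups    | 2.22    | 1.02      | 1.88    | 6.22      | 1.18    |
| Commit Permutations | 24.01   | 11.03     | 10.72   | 35.45     | 2.24    |
| Eval_h              | 67.40   | 30.95     | 6.30    | 20.83     | 10.70   |
| Compute Evaluations | 35.73   | 16.41     | 5.02    | 16.60     | 7.12    |
| Multiopen           | 29.74   | 13.66     | 1.72    | 5.69      | 17.29   |
| Total               | 217.76  |           | 30.24   |           | 7.20    |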
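The derived columns for Proof 1 appear to follow directly from the raw timings: each e2e % is a stage's time divided by that backend's total, and Speedup is CPU time divided by GPU time. The sketch below checks this against the Eval_h and Total rows. The function names are hypothetical, and the formulas are inferred from the numbers rather than stated by the source.

```python
def e2e_percent(stage_s: float, total_s: float) -> float:
    """Share of end-to-end proof time spent in one stage, in percent."""
    return 100.0 * stage_s / total_s


def speedup(cpu_s: float, gpu_s: float) -> float:
    """CPU time divided by GPU time for the same stage."""
    return cpu_s / gpu_s


# Values copied from the Proof 1 (aggregation) results, in seconds.
CPU_TOTAL_S, GPU_TOTAL_S = 217.76, 30.24
EVAL_H_CPU_S, EVAL_H_GPU_S = 67.40, 6.30

if __name__ == "__main__":
    print(f"Eval_h CPU e2e %: {e2e_percent(EVAL_H_CPU_S, CPU_TOTAL_S):.2f}")
    print(f"Eval_h GPU e2e %: {e2e_percent(EVAL_H_GPU_S, GPU_TOTAL_S):.2f}")
    print(f"Eval_h speedup:   {speedup(EVAL_H_CPU_S, EVAL_H_GPU_S):.2f}")
    print(f"Total speedup:    {speedup(CPU_TOTAL_S, GPU_TOTAL_S):.2f}")
```

Note that the listed CPU stage times sum to about 168.57 s, less than the 217.76 s total, so the percentages are evidently taken against the reported total rather than the sum of the listed stages.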

Proof 2 (chunk_inner): 2^20 rows, 1135 columns

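| Proof Stage         | CPU (s) | CPU e2e % | GPU (s) | GPU e2e % | Speedup |
|---------------------|---------|-----------|---------|-----------|---------|
| Initialization      | 6.15    | 0.35      | 6.11    | 1.32      | 1.01    |
| Generate Instance   | 0.05    | 0.00      | 0.13    | 0.03      | 0.38    |
| Generate Advice     | 393.58  | 22.33     | 306.44  | 66.17     | 1.28    |
| Generate Lookups    | 59.63   | 3.38      | 56.84   | 12.27     | 1.05    |
| Commit Permutations | 152.79  | 8.67      | 42.27   | 9.13      | 3.61    |
| Eval_h              | 1115.43 | 63.28     | 36.19   | 7.81      | 30.82   |
| Compute Evaluations | 10.22   | 0.58      | 7.60    | 1.64      | 1.34    |
| Multiopen           | 24.90   | 1.41      | 7.56    | 1.63      | 3.29    |
| Total               | 1762.75 |           | 463.13  |           | 3.81    |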
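One way to read the Proof 2 results is to ask which stage dominates once everything runs on the GPU path. The sketch below picks the largest GPU stage and computes a simple Amdahl-style ceiling: the e2e speedup that would result if every other stage took zero time. This is an interpretation of the published numbers, not an analysis from the source, and the names are hypothetical.

```python
# GPU stage times in seconds, copied from the Proof 2 (chunk_inner) results.
GPU_STAGE_S = {
    "Initialization": 6.11,
    "Generate Instance": 0.13,
    "Generate Advice": 306.44,
    "Generate Lookups": 56.84,
    "Commit Permutations": 42.27,
    "Eval_h": 36.19,
    "Compute Evaluations": 7.60,
    "Multiopen": 7.56,
}
CPU_TOTAL_S = 1762.75  # reported CPU end-to-end time for Proof 2


def dominant_stage(stage_times: dict) -> tuple:
    """Return (name, seconds) of the stage with the largest time."""
    return max(stage_times.items(), key=lambda kv: kv[1])


def speedup_ceiling(cpu_total_s: float, remaining_gpu_s: float) -> float:
    """E2e speedup if only `remaining_gpu_s` of GPU time were left."""
    return cpu_total_s / remaining_gpu_s


if __name__ == "__main__":
    name, seconds = dominant_stage(GPU_STAGE_S)
    ceiling = speedup_ceiling(CPU_TOTAL_S, seconds)
    print(f"Largest GPU stage: {name} ({seconds:.2f} s)")
    print(f"Speedup ceiling with all other stages free: {ceiling:.2f}x")
```

With these figures, Generate Advice is the largest remaining stage at about 66% of GPU e2e time, and it alone would cap the e2e speedup at roughly 5.75x, compared with the measured 3.81x.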

This document outlines the following GPU modules and the acceleration they provide for Halo2 proofs of various dimensions:
