Polynomial Inversion

Overview

At multiple points in the Halo2 proof generation process, large sets of finite field elements must be inverted, which can be a significant computational expense.

GPU Implementation

These inversions can be trivially implemented on the GPU by simply computing the inverse of each element concurrently.

Performance

The following table outlines the performance improvements yielded with just the cuSnark polynomial inversion employed, showing the (CPU baseline) and accelerated results for the different proof stages in a set of proofs of various sizes. Times are in seconds, obtained on a AMD EPYC 7702 64-Core Processor with 4x NVIDIA GeForce RTX 3090 (24 GB) GPUs.

Proof Stage
2^20 rows, 1135 columns
2^25 rows, 5 columns

Initialization

(6.04) 6.12

(1.40) 1.41

Generate Instance

(0.05) 0.05

(1.05) 1.08

Generate Advice

(381.78) 355.14

(6.68) 5.53

Generate Lookups

(57.99) 57.92

(2.10) 2.12

Commit Permutations

(146.59) 145.17

(23.72) 23.58

Eval_h

(1069.09) 1070.33

(66.58) 66.70

Compute Evaluations

(9.81) 9.85

(35.79) 35.39

Multiopen

(18.71) 18.93

(28.82) 28.69

Total

(1690.05) 1663.51

(166.15) 164.48

Last updated