Polynomial Inversion
Overview
At multiple points in the Halo2 proof generation process, large sets of finite field elements must be inverted, which can be a significant computational expense.
GPU Implementation
Each inversion is independent of the others, so the work maps directly onto the GPU: the inverse of every element is computed concurrently.
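The element-wise approach described above can be sketched on the CPU as follows. This is a minimal illustration, not the cuSnark kernel: the modulus `P` is a small stand-in prime rather than a Halo2 field modulus, the names `invert` and `invert_all` are hypothetical, and a thread pool stands in for GPU threads. Each inverse is computed with Fermat's little theorem, which is one common way to invert in a prime field; the source does not say which method the GPU code uses.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumption: an illustrative prime modulus (the Mersenne prime 2^61 - 1),
# not the modulus of any field actually used by Halo2.
P = 2**61 - 1


def invert(x: int) -> int:
    """Return the multiplicative inverse of x modulo P.

    Uses Fermat's little theorem: x^(P-2) = x^(-1) (mod P) for x != 0.
    """
    if x % P == 0:
        raise ZeroDivisionError("zero has no multiplicative inverse")
    return pow(x, P - 2, P)


def invert_all(elements):
    """Invert every element independently.

    No inversion depends on any other, so each one can be handed to its
    own worker. Here the workers are pool threads; on a GPU each element
    would be handled by its own GPU thread.
    """
    with ThreadPoolExecutor() as pool:
        return list(pool.map(invert, elements))


if __name__ == "__main__":
    xs = [1, 2, 3, 123456789]
    inverses = invert_all(xs)
    # Each product x * x^(-1) reduces to 1 modulo P.
    print([(x * y) % P for x, y in zip(xs, inverses)])
```

Because no element depends on another, the same structure carries over to a GPU launch with one thread per element, with no synchronization between threads.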
Performance
The following table shows the performance improvement obtained with only the cuSnark polynomial inversion enabled. For each proof stage, and for proofs of different sizes, it lists the CPU baseline time in parentheses followed by the accelerated time. Times are in seconds, measured on an AMD EPYC 7702 64-Core Processor with 4x NVIDIA GeForce RTX 3090 (24 GB) GPUs.
| Stage | Proof 1 | Proof 2 |
| --- | --- | --- |
| Initialization | (6.04) 6.12 | (1.40) 1.41 |
| Generate Instance | (0.05) 0.05 | (1.05) 1.08 |
| Generate Advice | (381.78) 355.14 | (6.68) 5.53 |
| Generate Lookups | (57.99) 57.92 | (2.10) 2.12 |
| Commit Permutations | (146.59) 145.17 | (23.72) 23.58 |
| Eval_h | (1069.09) 1070.33 | (66.58) 66.70 |
| Compute Evaluations | (9.81) 9.85 | (35.79) 35.39 |
| Multiopen | (18.71) 18.93 | (28.82) 28.69 |
| Total | (1690.05) 1663.51 | (166.15) 164.48 |
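To put the figures above in relative terms, the short calculation below computes the percentage reduction in time for the Generate Advice stage and for the total, using the values exactly as listed. The function name `reduction_percent` and the labels "first proof" and "second proof" are illustrative only and are not taken from the source.

```python
def reduction_percent(cpu_seconds: float, accelerated_seconds: float) -> float:
    """Percentage by which the accelerated time is lower than the CPU baseline."""
    return 100.0 * (cpu_seconds - accelerated_seconds) / cpu_seconds


# (CPU baseline, accelerated) pairs in seconds, copied from the table above.
timings = {
    "Generate Advice, first proof": (381.78, 355.14),
    "Generate Advice, second proof": (6.68, 5.53),
    "Total, first proof": (1690.05, 1663.51),
    "Total, second proof": (166.15, 164.48),
}

for label, (cpu, accelerated) in timings.items():
    print(f"{label}: {reduction_percent(cpu, accelerated):.2f}% less time")
```

On these numbers, almost all of the gain sits in Generate Advice (about 7% and 17% for the two proofs), while the totals improve by roughly 1.6% and 1.0%; the other stages change by amounts small enough to be consistent with run-to-run variation.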