This proposal creates a precompiled contract that performs NTT and Inverse NTT transformations. This provides a way to efficiently perform fast polynomial multiplication for post-quantum and STARK cryptography.
Motivation
With the recent advances in quantum computing, there are increased concerns for the quantum threat against Ethereum. Today ECDSA is the EOA signature algorithms, which is vulnerable to attacks by quantum computers. Efficient replacement algorithms use polynomial multiplication as the core operation. Once NTT and Inverse NTT are available, the remaining of the verification algorithm is trivial. Choosing to integrate NTT and InvNTT instead of a specific algorithm provides agility, as DILITHIUM or FALCON or any equivalent can be implemented with a modest cost from those operators. NTT is also of interest to speed-up STARK verifiers. This single operator would thus benefit to both the Ethereum scaling and post-quantum threat mitigation.
Specification
Constants
Name
Value
Comment
NTT_FW
TBD
precompile address
NTT_INV
TBD
precompile address
NTT_VECMULMOD
TBD
precompile address
NTT_VECADDMOD
TBD
precompile address
We introduce four separate precompiles to perform the following operations:
NTT_FW - to perform the forward NTT transformation (Negative wrap convolution) with a gas cost of 600 gas,
NTT_INV - to perform the inverse NTT transformation (Negative wrap convolution) with a gas cost of 600 gas,
NTT_VECMULMOD - to perform vectorized modular multiplication with a gas cost formula defined in the corresponding section,
NTT_VECADDMOD - to perform vectorized modular addition with a gas cost formula defined in the corresponding section.
Field parameters
The NTT_FW and NTT_INV are fully defined by the following set of parameters:
Let $R$ be a cyclotomic ring of the form $R=\mathbb F_q[X]/(X^n+1)$. In these notations,
$n$ is the degree and is a power of 2,
$\mathbb F_q$ is the prime field where $q \equiv 1 \mod 2n$,
$\omega$ is a $n$-th root of unity in $\mathbb F_q$,
$\psi$ is a $2n$-th root of unity in $\mathbb F_q$.
Any element $a \in R$ is a polynomial of degree at most $n-1$ with integer coefficients, written
as $a=\sum_{i=0}^{n-1} a_iX^i$
NTT_FW
The NTT transformation is described by the following algorithm.
Input: A vector $a = (a[0], a[1], \dots, a[n-1]) \in \mathbb F_q^n$ in standard order, where $q$ is a prime such that $q \equiv 1 \mod 2n$ and $n$ is a power of two, and a precomputed table $\Psi_\text{rev} \in \mathbb{Z}_q^n$ storing powers of $\psi$ in bit-reversed order.
Output: $a \leftarrow \text{NTT_FW}(a)$ in bit-reversed order.
t ← n
for m = 1 to n-1 by 2m do
t ← t / 2
for i = 0 to m-1 do
j1 ← 2 ⋅ i ⋅ t
j2 ← j1 + t - 1
S ← Ψrev[m + i]
for j = j1 to j2 do
U ← a[j]
V ← a[j + t] ⋅ S
a[j] ← (U + V) mod q
a[j + t] ← (U - V) mod q
end for
end for
end for
return a
NTT_INV
The Inverse NTT is described by the following algorithm.
Input: A vector $a = (a[0], a[1], \dots, a[n-1]) \in \mathbb F_q^n$ in bit-reversed order, where $q$ is a prime such that $q \equiv 1 \mod 2n$ and $n$ is a power of two, and a precomputed table $\Psi^{-1}_\text{rev} \in \mathbb F_q^n$ storing powers of $\psi^{-1}$ in bit-reversed order.
Output: $a \leftarrow \text{NTT_INV}(a)$ in standard order.
t ← 1
for m = n to 1 by m/2 do
j1 ← 0
h ← m / 2
for i = 0 to h-1 do
j2 ← j1 + t - 1
S ← Ψ⁻¹rev[h + i]
for j = j1 to j2 do
U ← a[j]
V ← a[j + t]
a[j] ← (U + V) mod q
a[j + t] ← (U - V) ⋅ S mod q
end for
j1 ← j1 + 2t
end for
t ← 2t
end for
for j = 0 to n-1 do
a[j] ← a[j] ⋅ n⁻¹ mod q
end for
return a
NTT_VECMULMOD
The NTT_VECMULMOD is functions similarly to SIMD, but operates with larger input and output sizes.
Input: Two vectors $a = (a[0], a[1], \dots, a[n-1]), b=(b[0], b[1], \dots, b[n-1]) \in \mathbb F_q^n$ where $n$ and $q$ are defined above.
Output: The element-wise product $(a[0]\cdot b[0] \mod q, a[1]\cdot b[1]\mod q, \dots, a[n-1]\cdot b[n-1] \mod q)$.
Gas cost: Denoting $k$ to be the smallest power of $2$ larger than $\log_2(q)$, the gas cost of this operation is $k\log_2(n) / 8$.
NTT_VECADDMOD
The NTT_VECMULMOD is similar to SIMD in the functioning, but operates with larger sizes in input and output.
Input: Two vectors $a = (a[0], a[1], \dots, a[n-1]), b=(b[0], b[1], \dots, b[n-1]) \in \mathbb F_q^n$ where $n$ and $q$ are defined above.
Output: The element-wise addition $(a[0]+ b[0] \mod q, a[1]+ b[1]\mod q, \dots, a[n-1]+ b[n-1] \mod q)$.
Gas cost: Denoting $k$ to be the smallest power of $2$ larger than $\log_2(q)$, the gas cost of this operation is $k\log_2(n) /32$.
Rationale
If $f$ and $g$ are two polynomials of $R$, then
$f\times g= \text{NTT_INV}(\text{NTT_VECMULMOD}(
\text{NTT_FW}(a), \text{NTT_FW}(b)))$ is equal to the product of $f$ and $g$ in $R$. The algorithm has a complexity of $n \log_2n$ rather than $n^2$ with the classical schoolbook multiplication algorithm.
Gas Cost Analysis
The gas cost model for EIP-7885 precompiles is designed to target approximately 50-55 mgas/s performance, consistent with existing precompiled contracts like ECRECOVER. Two implementations with different gas costs are available:
NTT Precompiles (0x12, 0x13) - Scheme-Specific Cost Model
Gas Costs by Implementation:
Scheme
Ring Degree
nocgo NTT_FW
nocgo NTT_INV
cgo NTT_FW
cgo NTT_INV
Falcon-512
512
790 gas
790 gas
500 gas
500 gas
Falcon-1024
1024
1,750 gas
1,750 gas
1,080 gas
1,080 gas
ML-DSA (Dilithium)
256
220 gas
270 gas
256 gas
340 gas
Pure Go (nocgo): Zero-dependency implementation.
liboqs (cgo): Offers 36-38% lower gas costs for NTT operations.
Rationale: Gas costs are calibrated per cryptographic scheme based on actual execution performance. NTT computation complexity varies by ring degree and modulus size, with costs optimized to maintain consistent throughput across different post-quantum schemes.
Vector Operations (0x14, 0x15) - Dynamic Cost Model
VECMULMOD Gas Cost: ceil(0.32 × N)
VECADDMOD Gas Cost: ceil(0.3 × N)
Cost Components:
Linear scaling with vector length N (ring degree)
VECMULMOD: 0.32 gas per element
VECADDMOD: 0.3 gas per element
Benchmark Validation: The dynamic cost model achieves target performance:
The gas cost formulas accurately reflect actual execution costs while maintaining ~50-55 mgas/s performance target across all tested cryptographic standards.
Fields of interest
The implementation applies for many fields of interest for cryptography. In particular, the design applies for:
FALCON: $q=3 \cdot 2^{12}+1$ (one of the NIST winners for post-quantum signature scheme),
DILITHIUM: $q=2^{23}-2^{13}+1$ (one of the NIST winners for post-quantum signature scheme),
KYBER: $q=13 \cdot 2^8+1$ (one of the NIST winners for post-quantum key encapsulation mechanism),
To illustrate the interest of the precompile, the assets provide the measured gas const for a single NTT and extrapolates the minimal gas cost taking into account the required number of NTT_FW and NTT_INV. The provided assets use pure Yul optimizations, with memory access hacks. It is unlikely that more than one order of magnitude could be spared on such a minimal code.
Use case
Parameters
single NTT gas cost
Required NTT(FW/INV)
Estimated NTT/Full cost
Falcon
$q=12289, n=512$
1.8 M
1 NTTFW+1 NTTINV
3.6 M
Dilithium
$q=2^{23}-2^{13}+1, n=256$
460K
4 NTTFW +1 NTTINV
2.3M
Falcon cost has been measured over a full implementation and is compliant to the estimation. Dilithium cost is evaluated assuming
This demonstrates that using pure solidity enables L2s with low gas fees to experiment with FALCON in the short term, whereas it is too expensive to do so on L1.
Adopting this EIP, the signature verification of Falcon can be reduced to 1500 gas, and a similar result is expected for Dilithium.
Adopting the hash function as a separate EIP would enable a gas verification cost of 2000 gas.
This is in line with the ratio looking at SUPERCOP implementations.
Native Client Implementation
Two implementations have been developed for op-geth with four precompiled contracts at addresses 0x12, 0x13, 0x14, and 0x15:
The benchmarks demonstrate that all NTT precompiles maintain competitive or better throughput compared to existing precompiled contracts like ECRECOVER, processing approximately 52-56 million gas per second. Both implementations (nocgo and cgo) provide efficient support for multiple post-quantum cryptographic schemes including Falcon-512/1024 and ML-DSA/Dilithium, with the liboqs variant offering ~36-38% lower gas costs for NTT operations.
End-to-End Signature Verification with Precompiles
Integration testing demonstrates the practical impact of NTT precompiles on full signature verification. Tests were conducted using modified ZKNOX implementations (ETHFALCON and ETHDILITHIUM) against an op-geth client with NTT precompile support.
Test Environment: OP-Stack testnet with NTT precompiles enabled.
Algorithm
Verification Method
Pure Solidity
With Precompile
Gas Saved
ETHFALCON (Falcon-512)
Direct
1,800,000
479,341
1,320,659 (73.4%)
ETHDILITHIUM (ML-DSA)
PKContract-based
9,746,410
7,618,412
2,127,998
ETHDILITHIUM (ML-DSA)
Direct (struct)
7,874,354
5,732,354
2,142,000
Key Findings:
ETHFALCON: Achieves 73.4% gas reduction with NTT precompiles, reducing signature verification cost from 1.8M gas to under 480K gas.
ETHDILITHIUM: NTT precompiles save over 2.1M gas per signature verification.
These results validate that NTT precompiles significantly reduce gas costs for post-quantum signature verification, making lattice-based cryptography more practical for Ethereum applications.
Backwards Compatibility
There are no backward compatibility questions.
Test Cases
The reference implementation includes comprehensive test coverage across multiple layers:
NTT Precompile Tests (0x12)
Malformed Input Tests - 8 test cases covering:
Invalid operation codes (values other than 0 or 1)
Invalid ring degrees (non-power-of-2, < 16)
Zero or non-NTT-friendly moduli (not congruent to 1 mod 2N)
Coefficients exceeding modulus bounds
Input length mismatches
Functional Tests:
Forward NTT transformation with ring degree 16, modulus 97
Inverse NTT transformation ensuring INTT(NTT(x)) = x
Round-trip validation for data integrity
Crypto Standards Benchmarks:
Falcon-512: n=512, q=12289
Kyber-128: n=256, q=3329
Dilithium-256: n=256, q=8380417
Vector Operations Tests (0x13, 0x14)
Unified Malformed Input Tests - 7 test cases covering:
Invalid ring degrees for both VECMULMOD and VECADDMOD
Zero or non-NTT-friendly moduli
Input length mismatches (expecting 2*N vectors)
Coefficient bounds validation
Functional Tests:
Element-wise multiplication: result[i] = (a[i] * b[i]) mod q
Element-wise addition: result[i] = (a[i] + b[i]) mod q
Validation with small test vectors (ring degree 16, modulus 97)
All implementations have been validated over a large base of reference vectors, and implementing both FALCON and DILITHIUM algorithms as demonstration of the usefulness of the precompile.