Lean4 formalization of

6 min read Original article ↗

How to read a cryptography paper if you are not an expert

Alice writes a cryptography paper. Charlie is not an expert. How can Charlie be sure that the paper is okay?

Alice can ask an expert Bob to review her paper. Charlie can rely on Bob's reputation to get some indirect confidence on Alice's paper. Can we do better?

There's a tool called Lean. Unlike Bob, Lean is not an expert. But it keeps asking questions until it internally completes a rigorous, small-stepped mathematical proof. The result can be published on GitHub. Usually it's called a Lean proof, or Lean formalization of the proof.

So, with the help of Lean, perhaps Charlie doesn't need to depend completely on the reputation of Bob.

What was formalized

Last October, my colleague Nico posted here Why Does FRI Work?. The post described the essence of the paper he wrote with his coauthors:

"A Simplified Round-by-round Soundness Proof of FRI" by Albert Garreta (Nethermind Research), Nicolas Mohnblatt (zkSecurity), and Benedikt Wagner (Ethereum Foundation).

Now it's formalized in Lean, except one result cited from WHIR paper about mutually correlated argument. The formalization was largely done with Harmonic's Aristotle and Claude Code.

The repository contains the Lean code for the definition and the theorems in the paper, including the proofs. The Lean type checker makes sure that the theorems are provable. The only possible room for error is misformalization: there's a possibility that the Lean statements do not exactly correspond to the statements in the paper. I need to run some examples in Lean and on paper to be completely sure that the formalization is correct. Instead I just compared the Lean code against the original paper. There are some non-critical differences between the paper and the formalization.

  • The formalization added a condition saying that the initial degree is bigger than zero. This is then used to show that the evaluation domains are nonempty. I am not sure whether this additional assumption is really needed. It doesn't matter in practice.
  • The formalization proves perfect completeness, which was not the main topic of the paper. The completeness result gives me some confidence that the formalization didn't accidentally create a verifier that refuses everything.
  • The definition of mutual correlated argument has been updated and getting stabilized over time. The formalization is based on a snapshot of the paper as of December 2025. I'm monitoring the authors' updates as well as Arklib's choices.

To be honest I wasn't expecting to find serious flaws. Indeed I didn't find any. The paper's acknowledgment mentions that Ariel Gabizon caught a mistake in an earlier draft. That gave me strong confidence in the correctness of the paper.

Lean formalization has an advantage that the correctness of the proofs is not dependent on somebody's rigor. Also, this paper formalization can be used as a specification for verifier implementations.

How it went

ItaLean 2025

I attended a workshop called ItaLean 2025. Shortly before the workshop, there were some announcements about papers getting formalized using AI agents. I asked Nico for suggestions to be formalized. He suggested his paper about FRI, so I printed it and flew to Bologna. I was showing this paper and I was talking about it. I received an API key for Harmonic's Aristotle. Many participants were trying Aristotle with their API keys that they received as participants of the workshop (Harmonic sponsored the event). I saw a mathematician saying "look at this interesting proof that Aristotle generated". That sounded promising. I asked Nico for the LaTeX source of his paper. I was ready to try Aristotle on the paper.

Formalization with Aristotle and Claude Code

The first time I asked Aristotle to formalize the paper, I waited several hours. I got some Lean code with a comment saying the budget was up. I asked Aristotle to continue and waited for several hours. I repeated this process for a while until Aristotle was not producing new statements anymore.

Then I started going back and forth between Aristotle and Claude Code. I asked Claude Code to break down lengthy proofs into lemmas, to state missing lemmas, and to add statements for completeness. I asked Aristotle to prove the conjectures that Claude Code stated. Aristotle sometimes gave counterexample to the Claude Code's conjectures. Aristotle sometimes proved the conjectures. The whole process took roughly a month and a half (I was never fully focused on this project).

For post-processing, I used Claude Code heavily. This involved removing lemmas that were never used, re-organizing lemmas into new file organization, and annotating the Lean code with references to specific locations of the paper.

What's next

The content of this project is very close to the FRI formalization project in Arklib. I think it's possible to adjust the definitions in this project so that the results can be incorporated into Arklib. The conventions in Arklib will have priority over whatever styles invented during this project.

We can also try instantiating this formalization and evaluating the soundness errors concretely.

Another direction is to prove that an implementation of a FRI verifier behaves the same way as the verifier in the paper. I'll be trying to figure out ways to get end-to-end results.

As for Charlie, he needs to learn to read Lean definitions and statements, though maybe not necessarily proofs. To be honest, he still needs to know a bit of cryptography to see that the results in Lean matter. But now, Charlie won't be completely at a loss. When some proof steps seem to be missing, the detailed lemmas in the Lean formalization can probably help fill the gap.

Thanks

This project started by Nico's initiative. He also gave me feedbacks on this post. The formalization project was possible because of access to Aristotle, which sometimes produced very involved proofs. ItaLean 2025 was great for interacting with other users and creators of Lean and autoformalization agents. Claude Code (even Sonnet 4.5) is surprisingly good at Lean. I think I'm very lucky to be around at this point in history.

The thumbnail image for this post was generated by Nano Banana Pro.