FormalInception Automated Theorem Proving

Benchmark result ยท June 2026

100% on miniF2F

Our pipeline solved all 244 problems, with every proof checked by the Lean 4 compiler.*

244 / 244
Problems closed
$86.56*
Total cost

FormalInception-ATP is an agent system built for automated theorem proving. We ran it over the miniF2F benchmark and it closed every one of the 244 problems, with each proof checked by the Lean 4 compiler.* As far as we know, nobody has reported a perfect score on this set before.

Each problem was given to the prover with only its informal and formal statements as input. The system relies solely on the Lean compiler, Mathlib search, and deterministic tactic search. It has no access to the web. We used Gemini 3.5 Flash and Gemini 3 Flash as the underlying models.

Every proof was checked with Comparator on Lean 4.29.0, and all of them passed. The full set of proofs is below, together with a script that reproduces the verification, so anyone can re-run it and confirm the claim for themselves.

We plan to open-source the pipeline soon, and will write up how it works in more detail shortly after.

01

Authors

02

Cite this result

@misc{formalinception2026minif2f, title = {FormalInception-ATP: 100% on miniF2F}, url = {https://formalinception.github.io/}, author = {Mateusz Miraszewski and Marcin Rolbiecki and Mateusz Kowalski and Hubert Wasilewski and Grzegorz Juszczyk}, month = {June}, year = {2026} }

* Disclaimer: The 244/244 result was not obtained in a single run of the system. Due to a limited compute budget, it is the union of several runs with different agent settings. Cost is at Google Cloud flex pricing.