Formalizing De Giorgi–Nash–Moser theory in Lean

Julia Kempe and I have just completed a Lean 4 formalization of De Giorgi–Nash–Moser theory. We formalized the full slate of interior regularity statements for weak solutions of divergence-form elliptic equations with merely bounded measurable (and not necessarily symmetric) coefficients, including local boundedness, the weak Harnack and Harnack inequalities, and finally interior Hölder regularity.

The Hölder estimate is one of the central results of mid-20th century analysis, and the ideas and methods developed by Ennio De Giorgi, John F. Nash Jr. and Jürgen Moser play a fundamental role in the modern theory of partial differential equations. These theorems are now part of the standard toolkit of PDE, but they are also technically demanding enough that, at least to me, machine-formalizing them had always seemed like a very distant prospect.

The project is now available as a Lean development built on top of Mathlib. It is fully formalized: sorry-free, and with no axioms beyond those of Lean and Mathlib, meaning that the theorems are unconditionally machine-checked. The repository contains about 56,000 lines of Lean code. It was “vibe-coded” by the two of us over a two-week period, using our personal LLM subscriptions. No human hand actually touched the Lean files. Instead, we supplied detailed blueprints of the proofs and talked our LLMs through the sticky points.

[Image: a faithful depiction of our formalization implementation]

We learned a lot in the process, but the most interesting lesson is that serious modern analysis can now be formalized in Lean much faster than I would have guessed, even by mathematicians who are not experienced Lean users.

In this post I want to explain what is in the formalization, why we decided to do it, and why we think the result is interesting not only as a verification artifact, but also as evidence that serious modern analysis is now within reach of formalization.

What is in the formalization?

The repository formalizes the weak-solution side of elliptic De Giorgi–Nash–Moser theory in Lean 4.

At the top level, the development proves the standard regularity chain for uniformly elliptic divergence-form equations with bounded measurable coefficients (a hedged Lean sketch of the final statement follows the list):

  1. local boundedness for subsolutions,
  2. weak Harnack for supersolutions,
  3. Harnack inequality for solutions,
  4. Hölder regularity for solutions.
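
To give a concrete sense of the endpoint, here is a minimal sketch of how a Harnack-type statement can be phrased in Lean. The predicates `UniformlyElliptic` and `IsWeakSolution` are hypothetical stand-ins, not the names used in our repository, and in the real theorem the constant depends only on the dimension and the ellipticity bounds rather than on the solution.

```lean
import Mathlib

open Metric

-- Hypothetical stand-ins for the development's actual definitions,
-- treated as opaque predicates so that only the statement shape is shown.
variable {n : ℕ}
variable (UniformlyElliptic : ((Fin n → ℝ) → Matrix (Fin n) (Fin n) ℝ) → Prop)
variable (IsWeakSolution : ((Fin n → ℝ) → Matrix (Fin n) (Fin n) ℝ) →
  ((Fin n → ℝ) → ℝ) → Set (Fin n → ℝ) → Prop)

/-- Shape of the Harnack inequality: a nonnegative weak solution on the
double ball has its supremum on the unit ball controlled by its infimum.
`sorry`: only the statement shape is being illustrated. -/
example (A : (Fin n → ℝ) → Matrix (Fin n) (Fin n) ℝ) (u : (Fin n → ℝ) → ℝ)
    (hA : UniformlyElliptic A)
    (hu : IsWeakSolution A u (ball 0 2))
    (hpos : ∀ x ∈ ball (0 : Fin n → ℝ) 2, 0 ≤ u x) :
    ∃ C > 0, ⨆ x ∈ ball (0 : Fin n → ℝ) 1, u x ≤
      C * ⨅ x ∈ ball (0 : Fin n → ℝ) 1, u x :=
  sorry
```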

To get there, the project also builds a substantial amount of analytic infrastructure. In particular, it includes Sobolev spaces on bounded domains built from weak derivatives, together with key structural results such as density, extension, and chain-rule theorems; weak-formulation machinery for elliptic PDE; and functional inequalities such as the Poincaré, Sobolev–Poincaré, and John–Nirenberg inequalities. On top of this, we formalized the De Giorgi iteration, Moser iteration for both subsolutions and supersolutions, the crossover estimate leading to the weak Harnack inequality, and finally the Harnack inequality and the oscillation-decay mechanism leading to Hölder regularity.

To our knowledge, this is also the first proof-assistant formalization of Sobolev spaces built from weak derivatives at this level of generality, and the first large-scale formalization of a modern PDE regularity theorem in this setting.

Why we decided to do this

Formalizing mathematics in Lean is a very exciting development and Mathlib is already a fantastic project. But analysts, and especially PDE people, have mostly been watching from the sidelines. Part of the reason is obvious: modern analysis looks like exactly the kind of subject that ought to be painful to formalize. There are many layers of infrastructure, a lot of quantitative bookkeeping, and very few explicit formulas to hide behind. We have to carry around measurability certificates, almost-everywhere statements, admissible test functions, and so on. If one had asked me not long ago whether a project like this was realistic, I would probably have said yes in principle, but only with a large investment of time and a very experienced Lean team. (As recently as January I gave it serious thought and decided it was NOT realistic.)

In fact, before starting, we asked Claude for a rough estimate of how long a project like this should take. Its answer was that, for a small team of dedicated and experienced Lean programmers, one should expect the work to take 2–4 years. Honestly, that did not sound unreasonable to me.

One of the reasons we wanted to do this project was to test whether that picture is still correct.

Evidently it is not.

One goal of this project was therefore simply to jump-start more analysis in Lean. I think a lot of analysts have had the feeling that formalization is exciting, that Mathlib is impressive, but that “our part of mathematics” is still too far away. At least from this experiment, I do not think that is true anymore.

What was difficult

The hard part was not writing down the De Giorgi and Moser iterations. The hard part was building enough of the surrounding analytic setup that the theorem statements and proofs could be expressed in a workable way.

1. Sobolev spaces and weak derivatives

If you want to formalize modern PDE theory, you need a robust way to talk about weak derivatives and Sobolev functions. This is not just a matter of introducing a definition and proving a few lemmas. One needs the right interfaces for witnesses, locality, restriction, approximation, and transport across domains.

In this project, a lot of care went into setting up the Sobolev layer so that downstream arguments could be expressed naturally. In retrospect, this was probably one of the most important architectural decisions in the whole development.
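
As an illustration of what this layer rests on, here is a minimal sketch of a weak partial derivative defined by integration against test functions. This is not the repository's actual definition (which also tracks integrability and works on more general domains); the name `HasWeakPartialDeriv` is illustrative.

```lean
import Mathlib

open MeasureTheory

/-- Sketch: `g` is the `i`-th weak partial derivative of `f` on `Ω` if
integration by parts against every smooth, compactly supported test
function supported in `Ω` holds with no boundary term. -/
def HasWeakPartialDeriv {n : ℕ} (Ω : Set (Fin n → ℝ))
    (f g : (Fin n → ℝ) → ℝ) (i : Fin n) : Prop :=
  ∀ φ : (Fin n → ℝ) → ℝ,
    ContDiff ℝ (⊤ : ℕ∞) φ → HasCompactSupport φ → tsupport φ ⊆ Ω →
      ∫ x in Ω, f x * fderiv ℝ φ x (Pi.single i 1) =
        - ∫ x in Ω, g x * φ x
```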

2. Truncations and admissible test functions

Many of the core PDE arguments proceed by inserting nonlinear expressions in u into the weak formulation. Formally, this means proving that various truncations and regularized powers are admissible test functions, and then controlling their weak gradients in the right way.

This is standard mathematics, but it is not lightweight mathematics. In a proof assistant, every step of this infrastructure has to be built explicitly.
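
As one concrete instance, here is a hedged sketch of the kind of lemma this requires, with `HasWeakGradient` as a hypothetical opaque predicate: truncation below a level `k` is again admissible, and its weak gradient is the original one cut off on the superlevel set. The repository's actual lemmas also carry measurability and integrability side conditions.

```lean
import Mathlib

-- Hypothetical predicate standing in for "du is the weak gradient of u on Ω".
variable {n : ℕ}
variable (HasWeakGradient : ((Fin n → ℝ) → ℝ) →
  ((Fin n → ℝ) → (Fin n → ℝ)) → Set (Fin n → ℝ) → Prop)

/-- Sketch: the truncation `max (u - k) 0` has weak gradient `du` on the
superlevel set `{u > k}` and `0` elsewhere. `sorry`: statement shape only. -/
example (Ω : Set (Fin n → ℝ)) (u : (Fin n → ℝ) → ℝ)
    (du : (Fin n → ℝ) → (Fin n → ℝ)) (k : ℝ)
    (hu : HasWeakGradient u du Ω) :
    HasWeakGradient (fun x => max (u x - k) 0)
      (fun x => if k < u x then du x else 0) Ω :=
  sorry
```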

3. Change of variables, extension, and localization

The unit-ball reduction is standard on paper, but formalizing it required a significant amount of geometric and analytic infrastructure: rescaling, extension operators, approximation, shell constructions, and change-of-variables estimates. Again, none of this is conceptually surprising, but all of it must actually exist in the library if the final PDE theorems are going to have a clean statement.
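
For instance, the heart of the unit-ball reduction is a transport statement of roughly the following shape (again with `IsWeakSolution` as a hypothetical opaque predicate). The point is that the rescaled coefficients have the same ellipticity constants, so every estimate proved on the unit ball scales back to the original ball.

```lean
import Mathlib

open Metric

variable {n : ℕ}
variable (IsWeakSolution : ((Fin n → ℝ) → Matrix (Fin n) (Fin n) ℝ) →
  ((Fin n → ℝ) → ℝ) → Set (Fin n → ℝ) → Prop)

/-- Sketch: solutions transport from `ball x₀ r` to the unit ball under
`x ↦ x₀ + r • x`; the new coefficient matrix is `A` precomposed with the
same affine map. `sorry`: statement shape only. -/
example (A : (Fin n → ℝ) → Matrix (Fin n) (Fin n) ℝ) (u : (Fin n → ℝ) → ℝ)
    (x₀ : Fin n → ℝ) (r : ℝ) (hr : 0 < r)
    (hu : IsWeakSolution A u (ball x₀ r)) :
    IsWeakSolution (fun x => A (x₀ + r • x)) (fun x => u (x₀ + r • x))
      (ball 0 1) :=
  sorry
```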

4. Iteration arguments

The De Giorgi and Moser iterations are elegant on paper, but they are very unforgiving formally. Every exponent, radius sequence, cutoff function, and quantitative inequality has to line up exactly. There is no room for the sort of handwaving one gets away with in blackboard proofs. This is precisely what makes the final result satisfying.
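
The numerical core of the De Giorgi scheme is a fast-convergence lemma for sequences, and even stating it shows how every constant must be pinned down. The following is a hedged sketch of one standard version (applied in the development to level-set energies); the explicit smallness threshold on `a 0` is exactly the kind of detail a blackboard proof waves away.

```lean
import Mathlib

/-- Sketch of the fast-convergence lemma behind De Giorgi iteration: if
`a (k+1) ≤ C * B^k * a k ^ (1 + α)` and `a 0` lies below an explicit
threshold, the sequence collapses to `0`. `sorry`: statement only. -/
example (C B α : ℝ) (hC : 0 < C) (hB : 1 < B) (hα : 0 < α)
    (a : ℕ → ℝ) (ha : ∀ k, 0 ≤ a k)
    (hrec : ∀ k, a (k + 1) ≤ C * B ^ k * a k ^ (1 + α))
    (hsmall : a 0 ≤ C ^ (-1 / α) * B ^ (-1 / α ^ 2)) :
    Filter.Tendsto a Filter.atTop (nhds 0) :=
  sorry
```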

5. Lean elaboration itself

Some of the main difficulties were not mathematical in the usual sense, but structural. Large formal proofs in analysis can put substantial pressure on Lean’s elaborator. If definitions are not packaged carefully, or if a proof is not decomposed at the right places, elaboration cost can explode. In practice, this means that part of the work is architectural: deciding how to split files, how to package helper lemmas, and how to present intermediate objects so that the system continues to behave well.
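
Two of the concrete habits that emerged, shown here on a toy example (the lemmas are toys; the pattern, not the mathematics, is the point): hoist intermediate facts out of monolithic proofs into small standalone lemmas, and scope a larger heartbeat budget to the one declaration that genuinely needs it rather than raising it file-wide.

```lean
import Mathlib

-- Pattern 1: instead of one large proof with deeply nested `have`s, hoist
-- intermediate facts into standalone lemmas, so that each one elaborates
-- in a small, stable context.
private lemma aux_sq_nonneg (x : ℝ) : 0 ≤ x ^ 2 := by positivity

private lemma aux_sum_nonneg (x : ℝ) (hx : 0 ≤ x) : 0 ≤ x ^ 2 + x := by
  have h := aux_sq_nonneg x
  linarith

-- Pattern 2: a local heartbeat budget for a single expensive declaration,
-- so that file-wide elaboration regressions still surface immediately.
set_option maxHeartbeats 800000 in
lemma final_bound (x : ℝ) (hx : 0 ≤ x) : 0 ≤ x ^ 2 + x + 1 := by
  have h := aux_sum_nonneg x hx
  linarith
```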

This is not a glamorous part of the story, but it matters. A project like this is not just a sequence of theorem statements and proofs. It is also a fairly delicate exercise in making a large formal development legible both to human readers and to Lean itself.

By the way, if you are wondering whether we stepped in ourselves to fix the Lean elaboration issues: this part was also done by the LLMs. We just scolded them when elaboration time was blowing up, and made them write down their elaboration-fixing tricks in markdown files so that we could remind them of those tricks later.

6. The Lp and Euclidean-space interfaces

Another persistent source of friction was the interface between the mathematics one wants to write and the abstractions that already exist in Lean and Mathlib. In particular, the Lp layer and the Euclidean-space layer were both more awkward for this project than one might naively expect. (Some advice: if you want to formalize some hard analysis, avoid using “EuclideanSpace” from Mathlib. In practice, it makes elaboration explode.)

Much of elliptic PDE wants to move freely between pointwise functions, almost-everywhere equivalence classes, weak derivatives, matrix-valued coefficients, and Euclidean vector calculus. All of this is mathematically standard, but formalizing it means crossing a lot of interfaces very explicitly. A surprising amount of work went into building lemmas and wrappers whose real purpose was not to prove a theorem directly, but to let the main PDE arguments pass cleanly from one formal setting to another. 
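
Here is a tiny but real instance of that tax, using Mathlib's actual `Lp` API: elements of `Lp` are almost-everywhere equivalence classes, so even the fact that the representative of `f + g` is the sum of the representatives is an almost-everywhere statement that has to be transported explicitly.

```lean
import Mathlib

open MeasureTheory

-- `Lp.coeFn_add` says the representative of `f + g` agrees a.e. with the
-- sum of the representatives; `filter_upwards` performs the a.e. transport.
example {α : Type*} [MeasurableSpace α] (μ : Measure α) (f g : Lp ℝ 2 μ) :
    ∀ᵐ x ∂μ, (f + g : Lp ℝ 2 μ) x = f x + g x := by
  filter_upwards [Lp.coeFn_add f g] with x hx
  exact hx
```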

7. Cleaning up the code

Once we had Lean code compiling that certified our theorems, we still had a huge mess. There were many vestigial lemmas that were no longer needed, plenty of duplicates, and the linters produced 739 warnings complaining that our code was not elegant. Some of our largest Lean files ran to 10,000 lines, which is much too long for an elegant code base (as our LLMs informed us). Our vibe-coded output was disgusting, so we spent the last few days just cleaning it up. To be precise, we yelled at the LLMs until they cleaned the code.

Most likely this part was done imperfectly, and we expect that Lean pros will still find our code lacking in elegance. (We did our best!)

Final thought

At least from my point of view, the main lesson of this project is not simply that one more theorem package has been checked by a proof assistant. It is that analysts probably should not be standing on the sidelines anymore.

Lean and Mathlib are already remarkable achievements. What seems different now is that, with the latest LLMs and sufficiently close supervision, it may be possible to bring a great deal of modern analysis into that ecosystem much faster than many of us would have guessed. Two weeks is obviously not a universal benchmark, and not every project will go like this one. But it is hard for me to look at this experiment and conclude anything other than that the situation has changed quite a lot, and very recently.
