Learn more

What is it?

We are developing a subset of English with formal syntax and formal semantics, as a medium for rigorous, human-readable formal proofs.

Common questions

What is the formalization gap?

When someone formalizes a proof in an interactive theorem prover such as Lean or Rocq, the computer checks that the proof given to it is logically valid. But it cannot check whether the formal statement actually corresponds to what the mathematician meant. That's the formalization gap: the risk that what got proved in the machine isn't really what was claimed in the paper.

Does that ever happen?

More often than you'd think! When Apéry's proof of the irrationality of ζ(3) was first formalized, the formal version initially proved the wrong statement. Lawrence Paulson, in his blog post The de Bruijn Criterion vs the LCF Architecture, argues that the most important thing that can go wrong with a formal proof is not a bug in the proof checker but a mistake in how the definitions and theorem statements were written down in the first place. If the translation is wrong, the proof is useless.

Don't large language models solve this translation problem?

On the contrary, AI makes the formalization gap worse. When human experts formalize a proof by hand, they bring mathematical understanding to the task. They can often tell when something has gone wrong because the formal proof stops following the expected line of reasoning. LLMs have no such safeguards. They can produce formal code that compiles and passes the proof checker, but they cannot ensure that the formal statement they wrote down actually means the same thing as the informal statement they were asked to prove. An LLM can silently change definitions, weaken hypotheses, or prove a trivially true version of a hard theorem, and the proof checker will happily accept all of these, because they are all logically valid.

How do you address the formalization gap?

The idea is simple: instead of writing in ordinary English and then asking the LLM to produce the formal counterpart, you write in a restricted subset of English that already has a precise, formal meaning. It reads like ordinary English mathematical prose, but every sentence has an unambiguous interpretation that a machine can parse directly. This means that there's no separate translation step where things can go wrong. What you write is what gets checked. The LLM has to work from what you actually wrote, rather than guessing at the intent, so we can avoid the hallucination problems that come up when large language models try to do mathematics on their own.

Do I have to learn a new language?

Not really. The language is English, just a carefully chosen subset of it that allows you to be precise about quantifier ordering and to avoid ambiguous pronoun references. Mathematicians already do much of this naturally when they write. The system comes with tools that flag ambiguities and help you stay within the subset, so the learning curve is gentle. The goal is to meet mathematicians where they are, rather than asking them to work in a completely artificial formal language.
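To see what quantifier ordering means in practice, consider the everyday sentence "every door is opened by a key." It has two readings that differ only in which quantifier comes first (this is a generic illustration of the ambiguity, not an example of this system's actual syntax):

```latex
% Reading 1: each door has its own key (for-all outscopes exists)
\forall d.\, \mathrm{Door}(d) \to \exists k.\, \big(\mathrm{Key}(k) \land \mathrm{Opens}(k, d)\big)

% Reading 2: one master key opens every door (exists outscopes for-all)
\exists k.\, \mathrm{Key}(k) \land \forall d.\, \big(\mathrm{Door}(d) \to \mathrm{Opens}(k, d)\big)
```

A controlled subset of English rules out this kind of ambiguity by requiring phrasings that pin down one reading, along the lines of "every door is opened by some key" versus "some key opens every door."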

Get in touch

Zoltan: zoltan@categorical-learning.net

Jerry: jerry@categorical-learning.net