(Verifiable) AI is a Compiler
Modern computers are magic rocks, and programming is the art of making magic rocks do useful things. The challenge is that magic rocks only understand magic rock language, i.e., machine code such as x86 or ARM. Magic rock language is far too cryptic for humans, so clever humans invented high-level programming languages such as C++ and Python. Compilers, then, are the translators that turn those high-level languages into machine code.
Compiler as a human-to-rock translator.
From the compiler’s perspective, a high-level program is a specification of what machine code to generate. The key correctness property that a compiler must satisfy is that every such translation is faithful: any two compiler invocations with the same input program must produce semantically equivalent machine code that fulfills the specification.
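Written as a formula (using a semantics function ⟦·⟧, which is my notation, not the post’s), the contract is that compiling never changes meaning:

```latex
% Compiler correctness: for every accepted source program p, the
% meaning of the compiled output equals the meaning of p itself.
% (\llbracket / \rrbracket require the stmaryrd package.)
\forall p.\;\; \llbracket \mathrm{compile}(p) \rrbracket \;=\; \llbracket p \rrbracket
```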
Programs as Imperative Specifications
To make that job tractable, compilers impose restrictions on what kinds of inputs they accept. In particular, the human programmer must describe precisely how a computation is performed.
For instance, in an imperative language like C++, the programmer can’t simply declare that they want the n-th Fibonacci number. Instead, they must explicitly specify control flow and memory updates. Functional languages such as OCaml are more declarative about control flow and state, but they still require the programmer to spell out the algorithm itself.
In short, even though they are more expressive, today’s high-level languages are still imperative at heart1. They resemble machine code far more than they resemble natural language.
Are LLMs Compilers?
It’s now 2025, the year of vibe coding. Rather than write C++, we describe what we want in natural language: “Build a service that transfers money between accounts, ensuring balances never go negative.” Seconds later, the code appears.
Technically, the LLM outputs a high-level programming language like C++, which is then passed to a traditional compiler. But conceptually, that step is incidental. From a wider perspective, the LLM here is a key part of the compiler toolchain that translates human intent into machine language.
The problem is that these new “LLM compilers” violate the most fundamental contract of any real compiler: semantic faithfulness. LLMs are probabilistic. The same prompt, run twice on a given LLM, can produce different code, with neither version guaranteed to match the user’s intent. It might produce code that looks plausible but behaves incorrectly.
Verifiable Reasoning LLMs are Compilers
This is where formal verification steps in. It lets us prove mathematically that a program satisfies its specification.
If we couple LLMs with verifiable reasoning, every piece of generated code would come with a proof that it satisfies the intended property. In the money-transfer example, the proof might state and verify that balances never go negative, even under concurrency and failure.
The result is verifiable program synthesis, a paradigm where the model not only writes code but also produces the evidence that the code is correct. Critically, AI is also perfectly suited to finding such evidence, a point I explained in a previous post.
In this new interface, programmers no longer write imperative instructions for how to compute. Instead, they write declarative specifications of what must hold2. The LLM serves as a reasoning engine that searches for implementations consistent with those specifications.
A New Programming Paradigm
Verifiable program synthesis lifts programming to a higher level of abstraction: programmers describe intent, the model synthesizes candidates, and formal verification rejects any incorrect result.
This does not mean that algorithmic thinking is obsolete. Programmers can, and indeed should, tell the LLM what algorithms or data structures to use, e.g., LRU or LFU policy for cache eviction, Paxos or Raft for state machine replication. But they’ll do so declaratively, by expressing the desired properties rather than wiring up the mechanics. The LLM-compiler stack then takes care of correctness.