TorchLean

1.1. What TorchLean Is

TorchLean is a Lean 4 project for neural network semantics, runnable examples, and verification artifacts. Its guiding choice is simple: the model we run, the model we inspect as a graph, and the model we mention in a theorem should be recognizably the same object. We therefore write supported neural network programs in Lean, keep their shapes and state visible, and connect execution-oriented representations to mathematical meanings.

TorchLean is meant to be useful to three overlapping communities:

  1. ML engineers who want familiar model-building, training, export, and inspection workflows.

  2. Verification researchers who want graph semantics, bound propagation, certificate checkers, and clear trust boundaries.

  3. Lean users who want definitions and theorems that refer to the same artifacts the examples run.

The broad user entry point is the root NN import. Narrower entry points include NN.Entrypoint.API, NN.Entrypoint.GraphSpec, NN.Entrypoint.IR, and NN.Entrypoint.Verification.

1.1.1. What TorchLean Covers

The codebase is broad, but the pieces serve a common purpose. TorchLean includes:

  • a Lean frontend with PyTorch vocabulary for layers, datasets, optimizers, losses, and training loops;

  • tensors indexed by shape, so many dimensional mistakes are caught while Lean checks the file;

  • scalar-generic examples that can run over ordinary Float, executable float32, or proof-side scalar domains;

  • a functional autograd runtime with eager tapes, optimizers, logging, and save/reload paths;

  • a canonical graph IR used by compilation, widgets, verifier passes, and proof statements;

  • GraphSpec, which lets architectures be described once and interpreted both as executable programs and as symbolic IR;

  • floating-point infrastructure split into executable IEEE behavior (IEEE32Exec), proof-side float32 (FP32), and bridge theorems or assumptions between them;

  • an optional CUDA backend for float32 training, with native kernels, cuBLAS/cuFFT integration, CPU stubs, sanitizer scripts, and explicit trust boundary documentation;

  • verification passes and checkers for IBP/CROWN bounds, JSON certificates, robustness margins, PINN residuals, ODE enclosures, and VNN-COMP artifacts;

  • examples for MLPs, CNNs, ResNets, ViTs, Transformers, GPT models, Mamba models, diffusion, FNO operator learning, PPO, and small data pipelines.

Not every item above has the same proof status. Some are executable infrastructure, some are checked artifacts, and some are proved Lean theorems. Keeping those roles separate is one of TorchLean's central invariants.

1.1.2. The Design Choice

Ordinary ML systems are often optimized around fast experimentation. That is the right default for many projects, but it can leave important facts implicit: tensor shapes, parameter layouts, training mode, random state, scalar semantics, and the exact graph a verifier checked. TorchLean makes a different tradeoff. We accept more typed structure at the boundary so that later tools can state precisely what they consumed and what they proved.

For example, a loss that expects logits and one-hot labels of shape [batch, classes] should not silently receive labels of shape [batch]. A matrix multiply whose right-hand dimension is 64 should not be applied to activations whose last dimension is 128. In a dynamic runtime, those errors may appear only after a particular batch reaches a particular kernel, or worse, they may be hidden by broadcasting. In TorchLean's typed tensor APIs, the shape is part of the tensor type, so the mismatch is rejected before the example is treated as a runnable model. If the intended design really is sparse labels, a squeeze, or a reshape, we name that design choice explicitly.
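As a hedged illustration of this discipline (the `Tensor` structure and `crossEntropy` name below are placeholders, not the exact TorchLean definitions), a shape-indexed tensor type turns the mismatch into a type error:

```lean
-- Illustrative sketch only: `Tensor` and `crossEntropy` here are
-- placeholders, not TorchLean's real API.
structure Tensor (α : Type) (shape : List Nat) where
  data : Array α

-- A loss that demands logits and one-hot labels of shape [batch, classes]:
def crossEntropy {batch classes : Nat}
    (logits labels : Tensor Float [batch, classes]) : Float :=
  0.0  -- body elided; only the shape discipline matters here

-- Applying `crossEntropy logits sparse` when `sparse` has type
-- `Tensor Float [batch]` fails to elaborate: the shapes disagree at the
-- type level, before anything runs.
```

If sparse labels really are intended, the user writes the conversion explicitly rather than relying on an implicit broadcast.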

The same discipline applies to semantic claims. A statement such as "this classifier is robust on this input box" becomes useful only when it says which graph, parameters, scalar interpretation, and checker result support the claim. Lean gives us a place to write that statement as a checked object rather than as a comment beside a script.

1.1.3. Three Views Of One Model

At the top level, TorchLean provides a facade with PyTorch vocabulary over a shared internal model. User code mostly starts from import NN and open NN.API, then works with familiar concepts: layers, datasets, optimizers, losses, and training loops.
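A minimal sketch of that flow, with the caveat that `mlp` and its signature are assumptions made for illustration rather than the verified API surface:

```lean
-- Hypothetical usage sketch; the constructor names are placeholders.
import NN
open NN.API

def model := mlp [784, 128, 10]   -- build with PyTorch-like vocabulary
-- The same `model` value can then be run eagerly, lowered to graph IR
-- for widgets and verifiers, or mentioned in a theorem statement.
```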

Under the hood, the same model appears in three representations:

  • Spec layer: the mathematical meaning of tensors, layers, losses, and model structure.

  • Graph IR: the op-tagged DAG shared by runtime tooling, widgets, export, and verification.

  • Runtime layer: eager or compiled execution, autograd, optimizers, logging, and optional CUDA.

This alignment has a simple goal: the model you run, the model you inspect, and the model you state theorems about are connected by explicit translations rather than by informal correspondence.

1.1.4. The Object We Keep Fixed

The central discipline is that a model is first a typed mathematical object. Informally, we keep returning to a shape like:

\mathrm{Model}_{\theta} : \mathrm{Tensor}(\alpha,s_{\mathrm{in}}) \longrightarrow \mathrm{Tensor}(\alpha,s_{\mathrm{out}})

The runtime may choose Float, executable IEEE32Exec, or CUDA-backed float32 buffers. A proof may instantiate the same architecture over real numbers, interval domains, or proof-side float32 models. The architecture statement should not change just because the execution substrate changes.
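One way to read that shape in Lean (a sketch with placeholder names; the actual definitions live in the spec layer):

```lean
-- Placeholder definitions for illustration only.
structure Tensor (α : Type) (shape : List Nat) where
  data : Array α

-- An architecture is a function between shape-indexed tensors,
-- polymorphic in the scalar type α:
def Model (α : Type) (sIn sOut : List Nat) : Type :=
  Tensor α sIn → Tensor α sOut

-- The same architecture term can be instantiated as `Model Float …` for
-- the host runtime, at an executable float32 type for IEEE behavior, or
-- at a proof-side scalar domain, without restating the architecture.
```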

That gives the whole project a repeated theorem pattern, where H(M, R) collects the hypotheses connecting a mathematical model M to its runtime realization R:

\begin{aligned} &H(M,R)\\ &\Longrightarrow\; \forall x,\; \mathrm{denote}(R)(x) \in \mathrm{safeSet}(\mathrm{denote}(M)(x)). \end{aligned}

For an exact theorem, safeSet is a singleton. For a finite precision or verifier theorem, it may be an interval, box, affine enclosure, or certified output region.
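A hedged Lean rendering of the pattern (the names `spec`, `run`, and `safe` stand in for the denotations and safe set of the actual library):

```lean
-- Illustrative skeleton only, not TorchLean's real theorem statement.
structure Instance (Input Output : Type) where
  spec : Input → Output            -- denote M, the mathematical model
  run  : Input → Output            -- denote R, the runtime realization
  safe : Output → Output → Prop    -- `safe y y'` means y' ∈ safeSet(y)

-- H(M, R) ⟹ ∀ x, denote(R)(x) ∈ safeSet(denote(M)(x))
def SoundPattern {Input Output : Type}
    (I : Instance Input Output) (H : Prop) : Prop :=
  H → ∀ x, I.safe (I.spec x) (I.run x)

-- Exact case: `safe y y'` is `y' = y`; verifier case: membership in an
-- interval, box, affine enclosure, or certified output region around `y`.
```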

Different chapters instantiate that pattern in different ways. Autograd statements relate a reverse pass to the derivative of a denotation. Runtime approximation statements relate executable values to real specifications. CROWN and certificate checkers establish bound properties of an IR graph. CUDA chapters name the native boundary when the implementation lives outside Lean.

1.1.5. Fast When Needed, Explicit When It Matters

TorchLean separates fast execution from proof, but it does not treat them as unrelated worlds. For prototyping, examples can use the host runtime or optional CUDA-backed float32 paths. For guarantees, we move to graph denotations, Float32 models, certificate checkers, and Lean theorems. The boundary is visible: some parts are proved in Lean, and some parts are external systems whose boundaries must be named and checked.

A typical runtime-only bug is a checkpoint whose parameter names load successfully in a script while one weight has been transposed to match a different convention. The model may still run if the surrounding dimensions happen to agree, but a verifier might then certify a graph-payload pair that is not the deployed computation. TorchLean's design pushes parameter payloads, graph lowering, and shape checks into explicit data so that this kind of agreement can be inspected and, where the library has the theorem support, proved.
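A sketch of the kind of explicit check this enables (hypothetical helper; the real loaders differ):

```lean
-- Illustrative only: a shape check over parameter payloads, so a
-- transposed [128, 64] weight cannot silently load where a [64, 128]
-- weight is expected.
def checkShape (name : String) (expected got : List Nat) :
    Except String Unit :=
  if expected == got then .ok ()
  else .error s!"{name}: expected shape {expected}, got {got}"
```

Because the expected shape is explicit data rather than a convention in a script, a transposed payload is a reportable mismatch instead of a silently accepted relayout.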

1.1.6. How The Book Is Organized

The rest of the guide follows the path a model takes through the project:

  • The introduction establishes the motivation and the shared mental model.

  • Building Models shows the public runnable path.

  • Runtime and Interop explains execution, autograd, checkpoints, and PyTorch-facing boundaries.

  • Semantics and Graphs fixes the graph and scalar meanings used by later tools.

  • Verification and Certificates explains proof obligations, checker outputs, and remaining assumptions.

  • Examples, Tools, and Conclusion tours the model zoo, widgets, command-line workflows, and the final guarantee story.