1.6. TorchLean vs. PyTorch🔗

PyTorch is an excellent system for broad, high performance neural network engineering. TorchLean has a different goal. The comparison in this chapter is narrower: TorchLean borrows familiar PyTorch vocabulary where that helps users write runnable models, and then adds Lean types, semantics, and proof or certificate hooks around the supported workflows.

The practical question is:

What extra structure do we need if a model is not only going to run, but also be inspected,

exported, imported, and connected to a formal claim?

1.6.1. Modules And Parameters🔗

In PyTorch, a model is usually an nn.Module. The module owns its parameters and buffers, and calling model(x) runs forward using that internal state. This is the right ergonomic choice for many training scripts.

# PyTorch
model = torch.nn.Sequential(
    torch.nn.Linear(10, 32),
    torch.nn.GELU(),
    torch.nn.Linear(32, 5),
)
out = model(x)

TorchLean keeps the authoring style familiar, but separates the model description from the parameter payload that will be executed, saved, lowered, or verified.

-- TorchLean
import NN

open NN.Tensor
open NN.API

def model : nn.M (nn.Sequential (Shape.Vec 10) (Shape.Vec 5)) :=
  nn.sequential![
    nn.linear 10 32 (pfx := Shape.scalar),
    nn.gelu,
    nn.linear 32 5 (pfx := Shape.scalar)
  ]

def built := nn.build 2026 model

Informally, the forward pass is a function of both the architecture and the parameters:

\operatorname{forward}(\operatorname{architecture},\theta,x)=y

That is the first major difference. In PyTorch, state_dict() exposes the parameters when needed. In TorchLean, the parameter bundle is already part of the ordinary data path. This makes it easier for graph lowering, checkpoint exchange, and theorem statements to refer to the same weights.

1.6.2. Shapes And Types🔗

PyTorch tensors are dynamically shaped. That flexibility is one of PyTorch's strengths: a script can build tensors from files, batch them in many ways, and dispatch to highly optimized kernels at runtime. The cost is that many shape mistakes are discovered only when a particular execution path hits a mismatched operation, or worse, when broadcasting performs a valid but unintended operation.

TorchLean uses Lean's dependent types for the core tensor APIs. A tensor carries both a scalar type and a shape:

\operatorname{Tensor}(\alpha,s)

This does not mean every possible data problem is solved statically. Files can still be malformed, an imported payload can still be rejected, and dynamic loaders can still fail. The difference is that once a tensor has entered the typed core, many dimensional contracts are checked before the runtime kernel is reached.

That tradeoff is intentional. TorchLean gives up some of Python's dynamism in exchange for a stronger statement about the computations that do elaborate.

1.6.3. Training And Autograd🔗

A standard PyTorch loop mutates optimizer state and accumulates gradients in tensor fields:

for step in range(steps):
    opt.zero_grad()
    pred = model(x)
    loss = mse(pred, y)
    loss.backward()
    opt.step()

TorchLean keeps the same conceptual rhythm, but the training state is explicit. Gradients are returned by the differentiation machinery, optimizer state is passed through the step, and logs are values that can be rendered or persisted.

(\theta,\mathrm{optState},\mathrm{rng},x,y) \longmapsto (\theta',\mathrm{optState}',\mathrm{rng}',\mathrm{report})

This is not a claim that TorchLean's trainer is a faster or more complete replacement for PyTorch's autograd engine. PyTorch's autograd is mature, broad, and deeply optimized. TorchLean's advantage is semantic access: for supported fragments, the backward pass can be related to a Lean specification and the resulting artifacts can be inspected inside the same formal environment.

For the public training surface, see the public API facade and the training runtime.

1.6.4. Graphs And Verification🔗

PyTorch has several graph-oriented tools, including tracing, FX, torch.export, and compiler pipelines. They are engineering tools for capture, transformation, optimization, and deployment. They are not, by themselves, Lean theorem statements about a model.

TorchLean's graph path is designed for a different use case. A supported model can be lowered to NN.IR.Graph, a graph whose nodes carry operation tags and a Lean denotation. Verifier passes and certificate checkers can then talk about that graph directly:

\operatorname{NN.IR.Graph.denoteAll}(g,\theta,x)

A robustness checker, for example, should not have to trust a training script. It should receive a graph, a parameter payload, an input region, and a certificate or bound propagation result whose meaning is defined in Lean.

This is where TorchLean most clearly differs from a PyTorch clone. The goal is not simply to run the same model syntax. The goal is to keep the runnable workflow connected to a semantic object that proofs and checkers can cite.

1.6.5. Import And Export🔗

PyTorch remains valuable at the boundary. It has the ecosystem for large datasets, pretrained checkpoints, distributed training, debugging tools, and deployment practices. TorchLean therefore supports interop, but it keeps the contract deliberately small.

The supported round-trip is family based:

choose a known architecture family,
agree on parameter names, order, and tensor shapes,
train or edit the payload in Python,
serialize the payload,
re-import it into Lean and check the names and shapes again.

The relevant APIs are PyTorch export, PyTorch import, and the round-trip examples. This is not a universal importer for arbitrary Python modules, and it is not meant to be. A narrow contract is easier to audit and easier to connect to verification.

1.6.6. Floating Point And Kernels🔗

PyTorch users normally rely on the behavior of the selected backend: CPU kernels, CUDA kernels, vendor libraries, compiler fusions, and device-specific floating-point choices. That is appropriate for practical ML engineering, but it leaves a verification question: which numeric semantics did the claim use?

TorchLean names the relevant layers separately:

real valued specifications for clean mathematical statements,
proof side float32 models such as FP32,
executable IEEE binary32 models such as IEEE32Exec,
native runtime kernels behind explicit assumptions.

This separation does not make native kernels disappear from the trust story. It makes the trust story precise. A theorem can say whether it is about real semantics, executable binary32 semantics, or a native backend related to those semantics by a stated bridge.

1.6.7. When To Use Which🔗

Use PyTorch when you need the full production ecosystem: broad model coverage, mature GPU kernels, distributed training, pretrained checkpoints, and standard deployment integrations.

Use TorchLean when the supported model family needs to be a Lean object: when shapes should be part of the type, when the graph and payload should be inspectable, when imported weights need a checked contract, or when a verifier result should connect to a formal statement.

The intended workflow can include both systems. A team might prototype or train in PyTorch, export a known payload, import it into TorchLean, inspect the graph, and check a robustness certificate. Or it might author a small model directly in TorchLean and keep the whole run inside Lean. In either case, TorchLean's contribution is not replacing PyTorch; it is adding typed, semantic, and proof-aware structure around the parts of the workflow that need it.