2.5. TorchLean API
So far, the guide has built tensors, layers, datasets, and training loops. This chapter gathers the public names behind that workflow so you can read TorchLean examples without getting lost in the repository tree.
For a first pass through TorchLean, read this as a guided API tour:
- `nn` builds models;
- `Data` builds datasets and loaders;
- `train` runs fitting, prediction, callbacks, and reports;
- `optim` configures updates;
- `autograd` exposes lower-level gradient tools when the training helper is not enough.
The first thing to keep in mind is the layering:
- `NN.API.Public` is the public user surface.
- `NN.API.Runtime` exposes the lower-layer executable runtime surface.
- `NN.API.TorchLean` is the namespace that the runtime surface re-exports for ordinary code.
- `NN.Entrypoint.*` modules provide narrow wrappers for specialized proof or runtime paths.
Rule of thumb: use `import NN` and `open NN.API` first. Drop to the runtime layer only when the runtime itself is the subject.
2.5.1. The First Line Of A Tutorial
The shortest useful setup is:
```lean
import NN
open NN.API

#check API.nn
#check API.train
#check API.optim
#check TorchLean.Session.new
```
That is enough for the ordinary tutorial path. Users should not have to choose between a dozen internal imports before they have built their first model.
2.5.2. The Public API
`NN.API.Public` is the canonical module for ordinary user code. It is designed to feel familiar to readers who know PyTorch:
- `API.nn` for layer builders and model constructors,
- `API.Data` for datasets, loaders, and small preprocessing helpers,
- `API.train` for fit/predict loops and training utilities,
- `API.optim` for optimizer configuration,
- `API.autograd` for gradient, VJP, and Jacobian helpers,
- `API.rand` for deterministic RNG helpers.
Tutorials, notebooks, and examples should prefer this layer. It does not hide the runtime; it gives the runtime a stable and legible public interface.
The public API is meant to preserve one simple invariant:
$$\text{public model code} \quad\leadsto\quad \{\text{runtime execution},\ \text{graph artifact},\ \text{proof statement}\}$$
The user should not need to rewrite the model to move from a small training run to graph inspection or to a verifier fixture. The lower layer may require extra hypotheses, but they should be hypotheses about the same model.
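As a sketch of what this looks like in practice (the builder names and all signatures below are illustrative assumptions, not the checked API):

```lean
import NN
open NN.API

-- One model definition, written once against the public surface.
-- `nn.sequential`, `nn.linear`, and `nn.relu` are assumed builder names.
def myModel := nn.sequential [nn.linear 4 16, nn.relu, nn.linear 16 2]

-- The same definition then feeds all three consumers of the invariant:
--   runtime execution:  fit `myModel` on a dataset via `train`
--   graph artifact:     run `myModel` under the compiled execution mode
--   proof statement:    state a theorem about the spec of `myModel`
-- Only the surrounding call changes; the model definition does not.
```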
Two parts of the public API deserve special emphasis because they have grown substantially:
- `API.Data` includes typed CSV/NPY readers, deterministic shuffling, minibatch loaders, and helper constructors for supervised and labeled datasets;
- `API.train` spans both full fitting helpers and manual step APIs that feel familiar from PyTorch.
2.5.3. Data And Transform Surface
This is the closest TorchLean analogue to `torch.utils.data`, plus a small, predictable preprocessing surface:
- file sources: `Data.TensorSource`, `Data.SupervisedSource`, `Data.LabeledSource`, `Data.TabularSupervisedSource`
- loading: `src.load` (on each source)
- deterministic minibatching: `Data.batchLoader` and `Data.BatchLoader.epoch`
- map-style transforms: `Data.Transforms` (combinators you can apply to samples or datasets)
The most important design decision is the boundary format: TorchLean expects numeric tensors in
.npy (and small numeric CSV for tabular data). For anything else, we usually convert once with Python
tools.
The intended data contract is:
$$\mathrm{loader}(\mathrm{file}) \in \mathrm{Except}\ \mathrm{String}\ \bigl(\mathrm{Tensor}(\alpha, s_x) \times \mathrm{Tensor}(\alpha, s_y)\bigr)$$
Loading can fail, and shape checks happen before a dataset becomes training data. That is less convenient than accepting every file and hoping for the best, but it gives later training and verification code a typed object instead of an unchecked blob.
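A minimal sketch of the contract in use (the constructor shape, the `IO`-wrapped `Except` result, and the batching parameters are assumptions used for illustration):

```lean
import NN
open NN.API

def loadAndIterate : IO Unit := do
  -- Build a file-backed source; the constructor shape is an assumption.
  let src := Data.SupervisedSource.mk "train_x.npy" "train_y.npy"
  -- Loading returns `Except String (x, y)`: it can fail, and shape checks
  -- run before anything becomes training data.
  match ← src.load with
  | .error msg => IO.eprintln s!"load failed: {msg}"
  | .ok ds =>
    -- Deterministic minibatching over the typed dataset.
    let loader := Data.batchLoader ds (batchSize := 32) (seed := 0)
    for _batch in Data.BatchLoader.epoch loader do
      pure ()  -- a training step would consume the batch here
```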
See the data contract documentation and the public tutorials that exercise this surface. This is worth spelling out because TorchLean examples now cover both small tutorial fixtures and data-backed training runs.
2.5.4. The Runtime Surface
`NN.API.Runtime` and `NN.API.TorchLean` expose the lower-layer machinery that underlies the public API. That surface includes:
- typed and untyped tensor operations,
- backend-neutral programs and compiled outputs,
- session and execution control,
- losses, norms, optimizers, and training helpers,
- the explicit `eager` versus `compiled` execution distinction.
The inventory matters because the runtime layer keeps executable semantics visible enough for graph inspection, verification, and interop work.
Typical names in that layer include:
- `add`, `matmul`, `reshape`, `transpose2d`, `broadcastTo`,
- `linear`, `conv2d`, `layer_norm`, `multi_head_attention`,
- `trainCycleSGD`, `trainCycleOptim`, `meanLoss`,
- `Backend`, `Options`, `Program`, `CompiledOut`.
Those names represent the lower half of the same public API:
$$\mathrm{Program}_\alpha \;\xrightarrow{\mathrm{run}}\; \bigl(\mathrm{value},\ \mathrm{tape/log/graph},\ \mathrm{state}'\bigr)$$
The returned artifact matters. It is what widgets display, what compiled execution replays, and what proof chapters relate back to the spec.
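That contract, sketched with the names above (the `run` call and the result fields are illustrative assumptions):

```lean
open NN.API.TorchLean

-- Hypothetical sketch: run a backend-neutral program and keep the artifact.
def runAndInspect (p : Program Float) (opts : Options) : IO Unit := do
  let out ← p.run opts
  -- `out.value` is the computed result; the accompanying tape/log/graph
  -- component is the replayable artifact described above.
  IO.println s!"value: {out.value}"
```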
2.5.5. Training, Reporting, And Schedulers
The training API is best understood as two compatible layers.
2.5.5.1. High-level fit helpers
- `train.fitDataset`
- `train.fitLoaderWith`
- `train.meanLossDataset`
- `train.predict`
These are the helpers most public tutorials use.
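The sketch below shows the intended shape of a call (the dataset type, keyword arguments, and helper signatures are assumptions for illustration, not checked against the API):

```lean
open NN.API

def quickFit (ds : Data.Dataset) : IO Unit := do
  let model := nn.mlp [4, 16, 2]   -- assumed builder name
  -- Fit against a whole dataset with a configured optimizer.
  let trained ← train.fitDataset model ds
      (optimizer := optim.sgd (lr := 0.01)) (epochs := 5)
  let _preds ← train.predict trained ds          -- predictions for inspection
  let loss ← train.meanLossDataset trained ds    -- evaluation with the same surface
  IO.println s!"mean loss: {loss}"
```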
2.5.5.2. Low-level manual-loop helpers
- `train.stepper`
- `train.step`
- `train.evalMode`
- `train.withMode`
Together these are the closest Lean analogue to a handwritten PyTorch training loop, as in the sketch below.
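This is a minimal sketch, assuming `train.stepper` bundles the model and optimizer into step state (the state shape, loop plumbing, and batching calls are illustrative assumptions):

```lean
open NN.API

def manualLoop (model : nn.Model) (ds : Data.Dataset) : IO nn.Model := do
  -- Assumed: `train.stepper` packages model + optimizer into mutable step state.
  let mut st ← train.stepper model (optim.sgd (lr := 0.01))
  for _epoch in [0:10] do
    for batch in Data.BatchLoader.epoch (Data.batchLoader ds (batchSize := 32) (seed := 0)) do
      -- One forward/backward/update, mirroring a handwritten PyTorch loop.
      st ← train.step st batch
  return st.model  -- assumed field carrying the updated parameters
```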
2.5.5.3. Hooks and reporting
- `train.Callbacks`
- `train.onTrainStart`, `train.onStep`, `train.onEpochEnd`, `train.onTrainEnd`
- `train.Report.*`
These entry points matter because a surprising amount of tutorial quality comes from reporting and monitoring code, not from the model definition alone.
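A sketch of a logging hook set, assuming `train.Callbacks` is a structure whose fields match the hook names above (the payload types and the `report.loss` field are assumptions):

```lean
open NN.API

def loggingCallbacks : train.Callbacks where
  onTrainStart := fun _ => IO.println "training started"
  onStep       := fun _ => pure ()
  onEpochEnd   := fun report => IO.println s!"epoch done, loss = {report.loss}"
  onTrainEnd   := fun _ => IO.println "training finished"
```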
2.5.5.4. Optimizers and scheduler math
- `optim.sgd`, `optim.adam`
- `API.TorchLean.Schedulers.constant`
- `API.TorchLean.Schedulers.step`
- `API.TorchLean.Schedulers.exponential`
The scheduler layer is intentionally small and pure: it gives higher-layer training code a stable place to compute learning-rate policies without entangling that logic with runtime state.
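Because the layer is pure, a schedule is just arithmetic over the step count. The standalone sketch below mirrors what `API.TorchLean.Schedulers.step` is described as computing, namely $\mathrm{lr}(t) = \mathrm{lr}_0 \cdot \gamma^{\lfloor t/\text{stepSize} \rfloor}$ (the actual signature in the API is an assumption):

```lean
-- A pure step schedule: decay the base rate by `gamma` every `stepSize` steps.
def stepSchedule (lr0 gamma : Float) (stepSize t : Nat) : Float :=
  lr0 * gamma ^ Float.ofNat (t / stepSize)

#eval stepSchedule 0.1 0.5 10 25   -- 0.1 * 0.5^2 = 0.025
```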
2.5.6. Execution Modes
TorchLean uses the same public model code with two execution styles:
- `eager` for debugging, gradients, and step-by-step inspection,
- `compiled` for a graph artifact that can be replayed or verified more directly.
That distinction matters because it keeps the choice of semantics explicit. The code does not change just because the backend changes; only the execution mode changes.
Device note: the same APIs also compose with an optional CUDA-backed float32 path when the project is built with `-K cuda=true`. Public tutorials often show `--cpu` and `--cuda` on the `torchlean` runner; the `NN.API` types do not change, only the runtime buffer implementation does. See Runtime and Autograd for the trust boundary and Example Walkthroughs for commands.
```sh
lake env lean --run NN/Examples/Quickstart/SimpleMlpTrain.lean -- --steps 200 --dtype float --backend compiled
```
The command above is a good reminder of one API invariant: the public API stays constant while the backend varies. Swapping `--backend eager` and `--backend compiled` shows the behavioral difference without rewriting the model.
2.5.7. When To Use A Lower Layer
Most users can stay at `import NN`. Use a lower layer when the lower layer is exactly what you are explaining:
- `NN.API.Public` for ordinary examples.
- `NN.API.Runtime` or `NN.API.TorchLean` only when the lower-layer API is the topic.
- `NN.Entrypoint.*` for a narrow import tied to a specific narrative (specs, runtime, floats, or verification).
This ordering keeps the manual coherent: the public API appears first, then the runtime assembly.
2.5.8. Model Families Beyond The Smallest Tutorials
The public tutorials understandably emphasize MLPs and small CNNs, but the API/runtime surface has grown further:
- residual CNN support via `nn.blocks.resnetBasicBlock`,
- transformer-style blocks via `nn.multiheadAttention`, `nn.layerNorm`, and `nn.blocks.transformerEncoderBlock`,
- GraphSpec-backed runtime model wrappers such as `resnet18Program`,
- verifier-friendly operator-learning work such as the `fno1d` runtime model wrapper.
Not all of these families are beginner examples, but they are part of the model building surface that advanced tutorials and experiments use.
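Since these builders live on the public surface, `#check` is the quickest way to confirm what a given build exposes; the names below are the ones listed above (no signatures are assumed):

```lean
import NN
open NN.API

#check nn.blocks.resnetBasicBlock
#check nn.multiheadAttention
#check nn.layerNorm
#check nn.blocks.transformerEncoderBlock
#check resnet18Program   -- GraphSpec-backed runtime wrapper
#check fno1d             -- operator-learning runtime wrapper
```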
2.5.9. GraphSpec And Tooling
Three adjacent namespaces are easy to overlook from `import NN` alone.
2.5.9.1. NN.Entrypoint.*
These are one-import wrappers for focused subsystems:
- `NN.Entrypoint.Spec`
- `NN.Entrypoint.Runtime`
- `NN.Entrypoint.Floats`
- `NN.Entrypoint.Verification`
- `NN.Entrypoint.GraphSpec`
They are especially useful in focused guide files or demos where the import surface should say what the file is about.
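For example, a GraphSpec-focused demo can begin with a single import; what exactly each wrapper re-exports is an assumption here, but the naming intent is visible at the top of the file:

```lean
-- A focused demo imports only the subsystem it is about; this wrapper is
-- assumed to re-export the GraphSpec surface and little else.
import NN.Entrypoint.GraphSpec
```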
2.5.9.2. NN.GraphSpec
GraphSpec is the typed architecture DSL described in its own guide. It is not the default public
API, but it is already connected to the runtime model zoo and to public demos such as
graphspec_mlp_demo.
2.5.9.3. Navigation tools
For exact declarations, use the generated API docs. For import structure, use the module graph. The guide keeps only the names needed to explain the main contracts, so it stays readable while the reference pages remain complete.
2.5.10. How the Pieces Fit Together
The split between public and runtime namespaces gives the project three practical benefits:
- tutorials stay concise,
- implementation details stay inspectable rather than hidden behind opaque objects,
- verification chapters can refer to the same names as the examples.
That is especially important for a Lean project. The manual is not just teaching an API; it is also teaching how the API relates to the underlying semantics and theorem statements.
See Training From Scratch for workflows, Example Walkthroughs for curated demos, and PyTorch Round-Trip then TorchLean vs PyTorch for interop.
2.5.11. References
- TorchLean paper: https://arxiv.org/abs/2602.22631
- Lean module-system reference: https://lean-lang.org/doc/reference/latest/Source-Files-and-Modules/
- PyTorch documentation for the corresponding surface concepts: https://pytorch.org/docs/stable/index.html