3.1. Execution Modes and Runners🔗

By this point in the guide, we have tensors, layers, datasets, and a small training loop. The next question is how the same model is actually run.

TorchLean keeps that choice explicit. A model definition does not secretly become a different mathematical object when we choose a backend, a scalar type, or a device. Those choices decide how the computation is executed and which runtime artifact is produced.

3.1.1. The Three Runtime Choices🔗

Most runnable examples answer three questions.

Which scalar type should the program use?
Which execution backend should run the graph?
Which device should hold the numeric buffers?

Typical command line flags look like:

lake env lean --run NN/Examples/Quickstart/SimpleMlpTrain.lean -- \
  --dtype float --backend eager --steps 100

lake env lean --run NN/Examples/Quickstart/SimpleMlpTrain.lean -- \
  --dtype float --backend compiled --steps 100

With CUDA enabled, model demos that support device buffers add a device choice:

lake exe torchlean mlp --cuda --steps 100

The important point is that these flags do not change the model architecture. They choose the runtime representation used to evaluate it.

3.1.2. What A Runner Owns🔗

The public training examples usually enter the runtime through train.run:

train.run task args (fun {α} _ _ _ _ runner rest => do
  ...
)

Read this callback as:

α is the concrete scalar type selected by the command line, such as Float or Float32.
runner is the instantiated runtime object for the task.
rest contains the remaining command line arguments after the common runtime flags have been parsed.

The runner owns the pieces a training loop needs:

initialized parameters and buffers,
compiled predictors for train and eval mode,
compiled losses for train and eval mode,
the current mode flag,
update helpers used by optimizers and fitting loops.

The main API surface is:

3.1.3. Eager Mode🔗

Eager mode is the easiest backend to reason about while debugging. It records operations as they run, keeps a tape of parent links and local reverse rules, and returns explicit gradients.

The closest PyTorch analogy is:

loss = model(x).loss(y)
loss.backward()

In TorchLean, the corresponding data remains explicit:

the tape is a Lean value,
gradients are returned as tensors or parameter bundles,
widgets can inspect the recorded graph.

Use eager mode when the goal is to understand one step, inspect a gradient, or explain what a small example is doing.

3.1.4. Compiled Mode🔗

Compiled mode builds a stable graph-shaped runtime artifact before repeated execution. It is the better default for longer runs over the same model because the graph structure is fixed once the program is compiled.

The closest PyTorch analogy is the intuition behind torch.compile: run the same model, but first turn its computation into a reusable graph. TorchLean's compiled path is more explicit because the artifact is part of the Lean runtime story and can be related to the IR and proof layers.

Use compiled mode when the model and loss are fixed and the training loop will run many steps or many batches.

3.1.5. Train Mode and Eval Mode🔗

Some layers behave differently while fitting than they do while evaluating. Dropout and batch normalization are the common examples. TorchLean keeps this state in the runner, rather than pretending that every layer is purely stateless at runtime.

The mental model is the same as PyTorch's model.train() and model.eval(), but the mode is part of the explicit runtime object.

3.1.6. Loaders and Epochs🔗

The loader API supports the familiar nested loop:

for epoch in epochs:
  for batch in loader:
    step(batch)

In TorchLean this appears as fitting helpers over datasets and loaders. The helper does not change the model definition; it only organizes repeated calls to the same runner.

The declarations to open when reading the code are:

train.FitConfig
train.LoaderFitConfig
train.fitDataset
train.fitLoader

Those live in NN.API.Runtime API, with public re-exports in NN.API.Public API.

3.1.7. Same Model, Different Artifact🔗

It helps to keep this small map in mind:

--dtype float versus --dtype float32 changes the scalar representation. Tensor shapes and model structure stay fixed.
--backend eager changes the runtime artifact to tape style execution. The public model definition stays fixed.
--backend compiled changes the runtime artifact to a reusable compiled graph. The public model definition stays fixed.
--cpu versus --cuda changes where supported numeric buffers live. The Lean specification and declared trust boundary stay fixed.

That separation is the reason TorchLean can be used as a tutorial framework, an executable experiment harness, and a verification codebase at the same time.

3.1.8. What To Read Next🔗

For a small runnable example, open SimpleMlpTrain. For minibatches and epochs, open MinibatchMlpTrain. For the declarations behind the runner, open NN.API.Runtime.