Public optimizers, losses, metrics, and training tools
This module contains the executable training API: optimizer configs, loss exports, metrics, callbacks, loaders, and module-level training loops.
Optimizer configs for the public training APIs.
These mirror common PyTorch optimizers (by name and default hyperparameters), but they produce a TorchLean trainer config rather than a mutable optimizer object.
PyTorch references:
torch.optim: https://pytorch.org/docs/stable/optim.html
Optimizer hyperparameter configuration for the supervised training helpers.
This configuration covers the optimizer choices exposed by the public training helpers. It mirrors
a few common PyTorch optimizers by name/defaults, but it does not try to cover the full option surface of
torch.optim.*.
SGD optimizer configuration.
Adam optimizer config, written optim.adam { lr := 1e-3 }.
AdamW optimizer config, written optim.adamw { lr := 1e-3, weightDecay := 0.01 }.
Optimizer algorithm accepted by simple CLI commands that expose a --optim flag.
Parse an optimizer name accepted by a command-line --optim flag.
Human-readable optimizer name used in logs.
Build a public optimizer config for this optimizer kind and learning rate.
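The three helpers above (parse a flag value, produce a log name, build a config from a kind and a learning rate) can be pictured with a small Python sketch. This is not TorchLean code: `OptimKind`, `OptimConfig`, `parse_optim`, `optim_name`, and `to_config` are hypothetical names, and the only default borrowed from a real library is the `weight_decay=0.01` default of `torch.optim.AdamW`. The point it illustrates is that the config is plain immutable data, not a stateful optimizer object.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class OptimKind(Enum):
    """Hypothetical stand-in for the optimizer kinds a --optim flag accepts."""
    SGD = "sgd"
    ADAM = "adam"
    ADAMW = "adamw"


@dataclass(frozen=True)
class OptimConfig:
    """Immutable hyperparameters: plain data, not a mutable optimizer object."""
    kind: OptimKind
    lr: float
    weight_decay: float = 0.0


def parse_optim(name: str) -> Optional[OptimKind]:
    """Parse a flag value; unknown names yield None instead of a guess."""
    try:
        return OptimKind(name.strip().lower())
    except ValueError:
        return None


def optim_name(kind: OptimKind) -> str:
    """Human-readable optimizer name for log lines."""
    return kind.value


def to_config(kind: OptimKind, lr: float) -> OptimConfig:
    """Build a config for this optimizer kind and learning rate."""
    # Assumption: only AdamW gets a nonzero default weight decay (0.01),
    # matching the torch.optim.AdamW default.
    weight_decay = 0.01 if kind is OptimKind.ADAMW else 0.0
    return OptimConfig(kind=kind, lr=lr, weight_decay=weight_decay)
```

Because the dataclass is frozen, a config can be shared between a logger and a trainer without either one mutating it.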
Reduction mode for losses that start as elementwise tensors.
PyTorch analogy: reduction="mean" or reduction="sum".
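As a rough illustration of what a reduction mode does to an elementwise loss, here is a minimal Python sketch. `Reduction` and `reduce_loss` are hypothetical names used only for this example; the behavior mirrors the `"mean"` and `"sum"` options named in the PyTorch analogy.

```python
from enum import Enum
from typing import Sequence


class Reduction(Enum):
    """Hypothetical reduction modes, named after the PyTorch strings."""
    MEAN = "mean"
    SUM = "sum"


def reduce_loss(elementwise: Sequence[float], mode: Reduction) -> float:
    """Collapse per-element losses into one scalar."""
    if len(elementwise) == 0:
        # Design choice of this sketch: refuse empty input rather than
        # divide by zero under MEAN.
        raise ValueError("cannot reduce an empty loss tensor")
    total = float(sum(elementwise))
    if mode is Reduction.SUM:
        return total
    return total / len(elementwise)
```

With `MEAN`, the scalar does not grow with the number of elements; with `SUM`, it does, which matters when the learning rate was tuned for one convention.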
Public training tools.
This namespace is the public training surface: it wires together
- a model (nn.Sequential)
- a loss (regression or classification)
- an optimizer config (API.optim)
- optional LR schedules
The API exposes a small set of reusable building blocks, so model commands can share the same training path while still making the model, loss, optimizer, and logging choices explicit.
Importantly, this module sits around the root public trainer facade:
- use TorchLean.Trainer.RunConfig for persistent runtime settings,
- use TorchLean.Trainer.TrainOptions for one training call,
- use Trainer.new ... followed by trainer.train ... as the normal quickstart path,
- use API.train.Advanced only when the dependent runner API is genuinely needed.
PyTorch Mapping
These definitions correspond to the training loop code you would typically write around:
- torch.optim.*
- forward pass + loss
- loss.backward() + optimizer step
- batching via torch.utils.data.DataLoader
This module is the advanced training layer underneath the Trainer facade.
New examples should prefer Trainer.new, trainer.train, and trained-handle methods. This namespace
remains useful for advanced runtime code that really does need direct steppers, epochs, or manual
reporting, but it is no longer the surface we teach first.
Metric Artifacts
The public training API also exposes TorchLean's metric artifact format. This is
the local equivalent of “log scalars during a run, then inspect them later”: write a JSON
TrainLog, view it with the training widgets, or adapt the JSON to an external tracker such as
Weights & Biases.
Advanced runner, callback, and custom-loop APIs.
This namespace keeps the dependent runtime surface available for CUDA entrypoints, custom loaders, RL/PDE streams, and proof-facing examples without making those names the default first stop for ordinary training code.
Advanced checked model-plus-loss package used by the direct runtime trainer APIs.
Ordinary user code should prefer Trainer.new and trainer.train.
Advanced instantiated executable training state used by the direct runtime trainer APIs.
Ordinary user code should prefer the higher-level public trainer object.
Advanced inner training-loop state used by the direct runtime trainer APIs.
Ordinary user code should prefer trainer.train unless it needs manual stepping.
Count correct predictions in a one-hot labeled batched dataset.
This is the minibatch analogue of accuracyOneHot: the task already has a leading dim0 batch axis,
so we score each row of the batch independently and accumulate totals.
Returns (correct, total) where total = batch * numBatches.
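The counting rule can be sketched in plain Python. This is an illustration under assumptions, not the library definition: batches are modeled as `(predictions, one_hot_targets)` pairs of nested lists, and `argmax` and `count_correct_one_hot` are hypothetical helper names.

```python
from typing import List, Sequence, Tuple

Row = Sequence[float]
Batch = Tuple[List[Row], List[Row]]  # (predicted scores, one-hot targets)


def argmax(row: Row) -> int:
    """Index of the largest entry; ties resolve to the first maximum."""
    return max(range(len(row)), key=lambda i: row[i])


def count_correct_one_hot(batches: Sequence[Batch]) -> Tuple[int, int]:
    """Score each row of each batch independently and accumulate totals."""
    correct = 0
    total = 0
    for predictions, targets in batches:
        for predicted_row, target_row in zip(predictions, targets):
            # A row is correct when the predicted class matches the class
            # marked by the one-hot target.
            correct += int(argmax(predicted_row) == argmax(target_row))
            total += 1
    return correct, total
```

When every batch has the same number of rows, `total` equals the batch size times the number of batches, which is the relationship stated above.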
Mean loss over an entire dataset, used by before/after training reports.
Callback event fired at the end of an epoch (how many steps ran).
Hooks for instrumenting callback-based training loops.
Callbacks are ordinary IO hooks. They can print progress, update an in-memory curve, sample CUDA
allocator state, or forward events to a project-specific metrics backend.
Called once before training starts.
Called after each training step.
- onEpochEnd : EpochEvent → IO Unit
Called after each epoch.
- onTrainEnd : TorchLean.Trainer.TrainReport α → IO Unit
Called once after training finishes.
No-op callbacks.
Combine two callback collections by running them in sequence.
∅ for callbacks: a no-op callback collection.
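The no-op collection and the sequencing combinator behave like the identity and the operation of a monoid. A minimal Python sketch of that shape follows. The class `Callbacks`, the method `then`, and the constant `NOOP` are hypothetical names, and the hook argument types are left open because this page does not show all four field signatures.

```python
from dataclasses import dataclass
from typing import Any, Callable

Hook = Callable[..., None]


def _ignore(*_args: Any) -> None:
    """Default hook: accept any event and do nothing."""


def _in_sequence(first: Hook, second: Hook) -> Hook:
    """Run `first`, then `second`, on the same event."""
    def run(*args: Any) -> None:
        first(*args)
        second(*args)
    return run


@dataclass(frozen=True)
class Callbacks:
    """Four hooks matching the four events described above."""
    on_train_begin: Hook = _ignore
    on_step_end: Hook = _ignore
    on_epoch_end: Hook = _ignore
    on_train_end: Hook = _ignore

    def then(self, other: "Callbacks") -> "Callbacks":
        """Combine two collections by running them in sequence."""
        return Callbacks(
            on_train_begin=_in_sequence(self.on_train_begin, other.on_train_begin),
            on_step_end=_in_sequence(self.on_step_end, other.on_step_end),
            on_epoch_end=_in_sequence(self.on_epoch_end, other.on_epoch_end),
            on_train_end=_in_sequence(self.on_train_end, other.on_train_end),
        )


# The empty collection: combining with it changes nothing observable.
NOOP = Callbacks()
```

A progress printer and a curve recorder can then be written separately and joined with `printer.then(recorder)`, with the left operand's hooks firing first.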
Build a training callback that samples the CUDA allocator at a fixed step cadence.
The callback owns a small IO.Ref for the previous sample, so examples can compose it with ordinary
loss-logging callbacks without threading allocator state through their training loops.
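The idea of a callback that owns its own previous-sample cell can be sketched in Python with a closure. Everything here is hypothetical: `every_n_steps_sampler`, the `read_bytes` function standing in for an allocator query, and the `report` sink are illustration names, and the sketch does not touch CUDA.

```python
from typing import Callable, Dict, Optional


def every_n_steps_sampler(
    every: int,
    read_bytes: Callable[[], int],
    report: Callable[[int, int, Optional[int]], None],
) -> Callable[[int], None]:
    """Build a step hook that samples a counter every `every` steps."""
    if every <= 0:
        raise ValueError("cadence must be positive")
    # Private mutable cell, playing the role of the small IO.Ref: the
    # training loop never sees or threads this state.
    previous: Dict[str, Optional[int]] = {"bytes": None}

    def on_step_end(step: int) -> None:
        if step % every != 0:
            return
        current = read_bytes()
        last = previous["bytes"]
        delta = None if last is None else current - last
        report(step, current, delta)
        previous["bytes"] = current

    return on_step_end
```

Since the state lives inside the closure, the returned hook composes with any other step hook without changing the loop's signature.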
Build callbacks that run at the end of each epoch.
Build callbacks that run once at the end of training, with the final report.
Step-indexed source of already-collated module inputs.
Data.batchLoader is the right interface when the data is a finite supervised dataset. Other
training jobs draw batches from a rule or an external source: replay buffers, collocation samplers,
synthetic scale inputs, or file-backed sequence windows. StepBatchStream is the direct stream
interface for those cases.
The stream is still fully typed: each produced sample is a TensorPack matching the module's
inputShapes. The training loop below is model-agnostic and only assumes that the module can run
forward and stepWith on those samples.
- sample : ℕ → IO (TensorPack α inputShapes)
Produce the input sample used at logical optimizer step step (the ℕ argument).
Constant stream for fixed-batch overfit runs and fixed-sample training jobs.
Build a stream from a pure step-indexed sample function.
Cycle through a nonempty list of samples.
This adapter lets list-backed datasets use the step-stream trainer. The explicit nonempty proof keeps empty datasets from turning into silent modulo-by-zero behavior.
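The three stream constructors (constant, pure function, cycling list) can be modeled in Python as functions from a step index to a sample. The names `const_stream`, `pure_stream`, and `cycle_stream` are hypothetical, and the runtime check in `cycle_stream` is only a dynamic stand-in for the explicit nonempty proof mentioned above.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")
Stream = Callable[[int], T]  # step index -> sample


def const_stream(sample: T) -> Stream:
    """Return the same sample at every step (fixed-batch overfit runs)."""
    return lambda _step: sample


def pure_stream(make_sample: Callable[[int], T]) -> Stream:
    """Wrap a pure step-indexed function as a stream."""
    return lambda step: make_sample(step)


def cycle_stream(samples: Sequence[T]) -> Stream:
    """Cycle through a nonempty list of samples by step index."""
    if len(samples) == 0:
        # Without this guard, `step % 0` would fail later, far from the
        # place where the empty dataset was introduced.
        raise ValueError("cycle_stream requires a nonempty list of samples")
    items = list(samples)
    return lambda step: items[step % len(items)]
```

In Python the emptiness check fails at construction time; a proof obligation moves the same failure to compile time.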
Run an action with the runner temporarily switched to value mode.
This is useful for "evaluate on a validation set during training" in callback-based loops.
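A temporary mode switch of this kind is commonly written as a scoped helper that restores the previous mode on exit. The Python sketch below shows that pattern only. `Runner`, its string-valued `mode` field, and `value_mode` are hypothetical, and restoring the mode even when the action raises is a design choice of the sketch, not a documented guarantee of the library.

```python
from contextlib import contextmanager
from typing import Iterator


class Runner:
    """Hypothetical runner with a single mode flag."""

    def __init__(self) -> None:
        self.mode = "train"


@contextmanager
def value_mode(runner: Runner) -> Iterator[Runner]:
    """Run the enclosed block with the runner switched to value mode."""
    previous = runner.mode
    runner.mode = "value"
    try:
        yield runner
    finally:
        # Restore whatever mode was active before, even on error.
        runner.mode = previous
```

Inside an epoch-end callback this reads as `with value_mode(runner): evaluate(...)`, after which training continues in the original mode.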
Mean loss for an already-instantiated scalar module over a typed minibatch loader.
This is the general streaming evaluation path used by the runtime examples. It is not
CIFAR-specific: any supervised task whose loss module consumes
[dim n σ, dim n τ] can use the same loader. The loader stores ordinary per-example samples
(x : σ, y : τ); this definition asks Data.epoch for raw minibatches and calls
Data.collateSupervised to build one shape-typed batch at a time.
Two details are important for larger examples:
- We force shuffle := false for evaluation so before/after metrics are deterministic.
- We do not call Data.BatchLoader.batchDataset, because that would materialize every collated minibatch at once. Streaming keeps the same API usable for image, sequence, and scientific ML examples where the batch tensors are much larger than small tabular datasets.
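The streaming evaluation described above can be sketched in Python with a generator, so that only one collated batch exists at a time. The names `iter_minibatches`, `mean_loss_streaming`, and `mse` are hypothetical. Two behaviors are assumptions of the sketch and are not stated on this page: a trailing partial batch is dropped (because the batch size is fixed), and the result is the mean of per-batch losses.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Sample = Tuple[float, float]  # one per-example (x, y) pair


def iter_minibatches(samples: Sequence[Sample], batch_size: int) -> Iterator[List[Sample]]:
    """Yield full raw minibatches in stored order (no shuffling for evaluation)."""
    for start in range(0, len(samples) - batch_size + 1, batch_size):
        yield list(samples[start:start + batch_size])


def mse(xs: List[float], ys: List[float]) -> float:
    """Example batch loss: mean squared difference."""
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mean_loss_streaming(
    samples: Sequence[Sample],
    batch_size: int,
    batch_loss: Callable[[List[float], List[float]], float],
) -> float:
    """Average the batch loss, collating one minibatch at a time."""
    total = 0.0
    batches = 0
    for raw in iter_minibatches(samples, batch_size):
        xs = [x for x, _ in raw]  # collate inputs for this batch only
        ys = [y for _, y in raw]  # collate targets for this batch only
        total += batch_loss(xs, ys)
        batches += 1
    if batches == 0:
        raise ValueError("dataset is smaller than one batch")
    return total / batches
```

Peak memory is bounded by one batch, regardless of dataset length, which is the reason for avoiding an all-batches-at-once helper.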
Mean loss over a typed minibatch loader through a train.Advanced.Runner.
This is the runner-facing form of meanLossModuleLoader. Use it when the example is built around
train.Advanced.run, task modes, and the proof-facing trainer abstraction. Use
meanLossModuleLoader directly when the example has already instantiated a runtime
TorchLean.Module.ScalarModule, which is the common fast path for CUDA examples.
One-hot accuracy over a typed minibatch loader without materializing all collated batches.
Train a runtime scalar module from a typed minibatch loader.
This is the shared "real epoch loop" for model examples that already have a runtime module, including CUDA runs. It mirrors the PyTorch structure:
- create an optimizer state for the module parameters;
- for each epoch, ask the general Data.batchLoader for shuffled raw batches;
- collate each raw batch into a shape-typed (xBatch, yBatch) sample;
- report the scalar loss through callbacks;
- run forward/backward/optimizer.step through TorchLean.Module.stepWith.
The function is polymorphic in the input shape σ, target shape τ, batch size n, scalar type
α, parameter shapes, and optimizer. It is not image-specific. CNN, ResNet, ViT, MLP,
sequence, operator-learning, and future model examples should all be able to use this path whenever
their supervised loss module has input shapes [dim n σ, dim n τ].
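The epoch loop structure listed above can be sketched end to end in pure Python. This is an illustration, not the library loop: `TinyModule` is a hypothetical one-parameter model with a hand-written gradient, `train_epochs` is a hypothetical driver, a trailing partial batch is dropped by assumption, and the loss passed to the callback is the value computed before the parameter update.

```python
import random
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, float]  # one per-example (x, y) pair


class TinyModule:
    """Hypothetical model y = w * x trained with plain SGD on squared error."""

    def __init__(self, lr: float) -> None:
        self.w = 0.0
        self.lr = lr

    def step_with(self, xs: List[float], ys: List[float]) -> float:
        """Forward, backward, and optimizer step; returns the pre-update loss."""
        n = len(xs)
        residuals = [self.w * x - y for x, y in zip(xs, ys)]
        loss = sum(r * r for r in residuals) / n
        grad = 2.0 * sum(r * x for r, x in zip(residuals, xs)) / n
        self.w -= self.lr * grad
        return loss


def train_epochs(
    module: TinyModule,
    samples: Sequence[Sample],
    batch_size: int,
    epochs: int,
    on_step_end: Callable[[int, float], None],
    seed: int = 0,
) -> int:
    """Run `epochs` full passes; return the number of optimizer updates."""
    rng = random.Random(seed)
    order = list(range(len(samples)))
    step = 0
    for _ in range(epochs):
        rng.shuffle(order)  # fresh shuffle of raw samples each epoch
        for start in range(0, len(order) - batch_size + 1, batch_size):
            raw = [samples[i] for i in order[start:start + batch_size]]
            xs = [x for x, _ in raw]  # collate inputs
            ys = [y for _, y in raw]  # collate targets
            loss = module.step_with(xs, ys)
            on_step_end(step, loss)  # report the scalar loss
            step += 1
    return step
```

Nothing in `train_epochs` depends on what a sample means; only `step_with` knows the model, which is the sense in which such a loop is not image-specific.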
Train a runtime scalar module for exactly steps optimizer updates.
trainModuleLoaderWith above is epoch-based: each unit means one full pass over the loader. This
variant is update-based, which is the convention used by runnable examples that expose a --steps
flag.
The loop still draws shuffled minibatches from Data.batchLoader epoch by epoch, but it stops as
soon as the requested number of optimizer updates has run. The returned loader carries the
loader state as moved forward by those updates, so callers can continue training from the next shuffled epoch if they want to.
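The difference between epoch-based and update-based loops can be made concrete with a small Python sketch. `Loader`, `next_epoch`, and `train_steps` are hypothetical names. The sketch assumes that a trailing partial batch is dropped and that stopping mid-epoch discards the rest of that epoch's batches.

```python
import random
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


class Loader:
    """Hypothetical loader whose shuffle state advances once per epoch."""

    def __init__(self, samples: Sequence[T], batch_size: int, seed: int = 0) -> None:
        self.samples = list(samples)
        self.batch_size = batch_size
        self.rng = random.Random(seed)
        self.epochs_started = 0

    def next_epoch(self) -> List[List[T]]:
        """Shuffle, then split into full batches; advances the loader state."""
        self.epochs_started += 1
        order = list(self.samples)
        self.rng.shuffle(order)
        size = self.batch_size
        return [order[i:i + size] for i in range(0, len(order) - size + 1, size)]


def train_steps(loader: Loader, steps: int, step_fn: Callable[[int, List[T]], None]) -> Loader:
    """Run exactly `steps` updates, drawing new epochs only as needed."""
    done = 0
    while done < steps:
        batches = loader.next_epoch()
        if not batches:
            raise ValueError("loader yields no full batches")
        for batch in batches:
            if done == steps:
                break  # stop as soon as the requested count is reached
            step_fn(done, batch)
            done += 1
    return loader  # same loader, with its shuffle state moved forward
```

With six samples and a batch size of two, seven updates span three epochs, and a later call on the returned loader starts from a new shuffle.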
Train a scalar module from a step-indexed batch stream.
This is the shared loop for workloads whose batches are produced step by step rather than by one
finite Data.batchLoader epoch:
- RL algorithms can sample replay or rollout batches,
- PDE examples can resample collocation points,
- generated workloads can stream synthetic inputs without storing a dataset.
The function is generic in inputShapes. It does not know whether the sample is
[x, y], [state, action, target], or []; it only asks the stream for the next typed input list
and then runs the same forward/backward/optimizer.step machinery as the loader-based trainer.
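A stream-driven loop needs only a step count and a function from step index to sample. The Python sketch below illustrates this with a resampling stream in the spirit of the collocation example. `train_stream` and `collocation_stream` are hypothetical names, and the per-step seeding is just one way to keep the sketch deterministic.

```python
import random
from typing import Callable, List, Optional, TypeVar

S = TypeVar("S")


def train_stream(
    step_with: Callable[[S], float],
    stream: Callable[[int], S],
    steps: int,
    on_step_end: Optional[Callable[[int, float], None]] = None,
) -> List[float]:
    """Ask the stream for one sample per step and run one update on it."""
    losses: List[float] = []
    for step in range(steps):
        sample = stream(step)     # the loop never inspects the sample
        loss = step_with(sample)  # forward/backward/step live in here
        if on_step_end is not None:
            on_step_end(step, loss)
        losses.append(loss)
    return losses


def collocation_stream(points_per_step: int) -> Callable[[int], List[float]]:
    """Resample points in [0, 1] at every step, seeded by the step index."""
    def sample(step: int) -> List[float]:
        rng = random.Random(step)
        return [rng.uniform(0.0, 1.0) for _ in range(points_per_step)]
    return sample
```

Swapping in a replay-buffer sampler or a synthetic generator changes only the `stream` argument; the loop body stays the same.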
Report-oriented stream-training entrypoint.
Callers pass the module, optimizer, runtime options, step count, and stream, and get standard before/after reporting plus CUDA memory watching.
Float stream trainer that records a per-step loss curve.
Generated and file-backed batches do not always have one finite loader to summarize. This entrypoint keeps their training curves in the same JSON format as the supervised examples.
Train from a runner-backed loader with explicit callbacks instead of inline printing in example code.
This is the runner-facing public path for PyTorch-style custom loops:
- keep the optimizer/scheduler logic in the library,
- inject logging, evaluation, and prediction reporting through callbacks.
This path keeps the Runner abstraction, including task modes and scheduler support. For
CUDA-heavy entrypoints that already have a TorchLean.Module.ScalarModule, prefer
trainModuleLoaderWith; both paths consume the same general API.Data.batchLoader.
Create a Stepper loop for a runner and optimizer (optionally with an LR scheduler).
This corresponds to the “inner training loop” state in typical PyTorch code: an optimizer state plus (optional) schedule state, ready to step on a batch.
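The "optimizer state plus optional schedule state" idea can be sketched in Python as a small object that tracks a step counter and consults a schedule when one is present. `Stepper` here is a hypothetical class with a single scalar parameter and a caller-supplied gradient; the schedule is modeled as a multiplier on the base learning rate, which is an assumption of the sketch.

```python
from typing import Callable, Optional


class Stepper:
    """Hypothetical inner-loop state: parameter, base LR, optional schedule."""

    def __init__(
        self,
        w: float,
        base_lr: float,
        schedule: Optional[Callable[[int], float]] = None,
    ) -> None:
        self.w = w
        self.base_lr = base_lr
        self.schedule = schedule
        self.step_count = 0

    def current_lr(self) -> float:
        """Base LR, scaled by the schedule multiplier when one is set."""
        if self.schedule is None:
            return self.base_lr
        return self.base_lr * self.schedule(self.step_count)

    def step(self, grad: float) -> float:
        """Apply one SGD update and advance the schedule; returns the LR used."""
        lr = self.current_lr()
        self.w -= lr * grad
        self.step_count += 1
        return lr
```

For example, `Stepper(1.0, 0.1, schedule=lambda k: 0.5 ** (k // 2))` halves the learning rate every two steps, while omitting `schedule` keeps it constant.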
Run one optimization step on a single supervised sample (one batch).
Run one epoch over a list of supervised samples, returning the per-step losses.
Small Reporting Helpers (IO)
These definitions factor out common "print a loss/accuracy table" patterns for runnable model
commands.
They do not affect semantics: they only call the underlying runner functions and print
human-facing summaries. Public examples should reach them through Trainer.Advanced only when the
ordinary Trainer.new / trainer.train surface is too small for the example.
Convenience: mean loss on a dataset, printed with a label.
Convenience: mean loss on a typed minibatch loader, streamed batch by batch.
Convenience: mean loss on a typed minibatch loader for an already-instantiated runtime module.
Use this in direct CUDA/runtime examples to avoid building a Runner only for logging. The data
path is still the same public loader path: Data.batchLoader plus Data.collateSupervised.
Report predicted classes on a list of named inputs.
Each entry is (name, x, expectedClass).
If includeLogits := true, also prints the raw model outputs.
Report predicted classes on a list of named inputs, for a batched model.
This expects inputs of the unbatched input shape σ and replicates each one across the batch
axis, then reports the prediction for row 0.
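The replicate-then-read-row-0 trick lets a model that only accepts fixed-size batches score a single input. A minimal Python sketch, with `predict_class_batched` as a hypothetical name and the model represented as any function from a list of rows to a list of logit rows:

```python
from typing import Callable, List, Sequence, Tuple

Row = List[float]


def predict_class_batched(
    model: Callable[[List[Row]], List[Row]],
    x: Sequence[float],
    batch_size: int,
) -> Tuple[int, Row]:
    """Replicate one unbatched input across the batch axis; report row 0."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batch = [list(x) for _ in range(batch_size)]  # every row is a copy of x
    logits = model(batch)
    row0 = logits[0]  # all rows agree, so row 0 is representative
    predicted = max(range(len(row0)), key=lambda i: row0[i])
    return predicted, row0
```

Returning the raw `row0` alongside the class index corresponds to the optional logits printout mentioned for the unbatched variant.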
Convenience: mean loss + one-hot accuracy on a dataset, printed with a label.
Batched variant of reportLossAccuracyOneHot.
Loader variant of reportLossAccuracyOneHotBatched, streaming through minibatches.
Train a runtime module for a fixed number of optimizer updates with the standard runtime reports.
This is the common path for direct-module training, not example-only code. It composes the generic step loop with before/after mean-loss reporting and CUDA allocator telemetry, while still accepting extra callbacks for projects that want their own metrics, validation, or tracing.
Float-specialized module training that also records a scalar loss curve.
The training loop itself is the same as trainModuleLoaderStepsReport; this entrypoint adds the
standard Curve callback used by JSON logs and website widgets.
Train a Float runtime module, write a standard scalar-curve log, and return the train report.
This is the high-level path used by runnable training commands. The caller provides the model, optimizer, loader, runtime options, and metadata notes; the library owns the callback composition, CUDA telemetry, before/after reports, and JSON curve emission.