TorchLean API

NN.API.Public.Training.Core

Public optimizers, losses, metrics, and training tools #

This module contains the executable training API: optimizer configs, loss exports, metrics, callbacks, loaders, and module-level training loops.

Optimizer configs for the public training APIs.

These mirror common PyTorch optimizers (by name and default hyperparameters), but they produce a TorchLean trainer config rather than a mutable optimizer object.

PyTorch references:

@[reducible, inline]

Optimizer hyperparameter configuration for the supervised training helpers.

This configuration covers the optimizer choices exposed by the public training helpers. It mirrors a few common PyTorch optimizers by name/defaults, but it does not try to cover the full option surface of torch.optim.*.

Instances For

    SGD optimizer configuration.

    • lr : Float

      Learning rate.

    • momentum : Float

      Momentum coefficient.

    Instances For
      @[implicit_reducible]

      Adam optimizer configuration.

      • lr : Float

        Learning rate.

      • beta1 : Float

        First moment coefficient.

      • beta2 : Float

        Second moment coefficient.

      • epsilon : Float

        Numerical stabilizer.

      Instances For
        @[implicit_reducible]

        AdamW optimizer configuration.

        Instances For

          SGD optimizer config, written optim.sgd { lr := 0.05 }.

          Instances For

            Momentum SGD optimizer config. Pass lr and optionally override momentum.

            Instances For

              Adam optimizer config, written optim.adam { lr := 1e-3 }.

              Instances For

                AdamW optimizer config, written optim.adamw { lr := 1e-3, weightDecay := 0.01 }.

                Instances For
                  @[reducible, inline]

                  Optimizer algorithm accepted by simple CLI commands that expose a --optim flag.

                  Instances For
                    @[reducible, inline]

                    Parse an optimizer name accepted by a command-line --optim flag.

                    Instances For

                      Human-readable optimizer name used in logs.

                      Instances For

                        Build a public optimizer config for this optimizer kind and learning rate.

                        Instances For
                          @[reducible, inline]

                          Reduction mode for losses that start as elementwise tensors.

                          PyTorch analogy: reduction="mean" or reduction="sum".

                          Instances For

                            Public training tools.

                            This namespace is the public training surface: it wires together

                            The API exposes a small set of reusable building blocks, so model commands can share the same training path while still making the model, loss, optimizer, and logging choices explicit.

                            Importantly, this module sits around the root public trainer facade:

                            PyTorch Mapping #

                            These definitions correspond to the training loop code you would typically write around:

                            This module is the advanced training layer underneath the Trainer facade.

                            New examples should prefer Trainer.new, trainer.train, and trained-handle methods. This namespace remains useful for advanced runtime code that really does need direct steppers, epochs, or manual reporting, but it is no longer the surface we teach first.

                            Metric Artifacts #

                            The public training API also exposes TorchLean's metric artifact format. This is the local equivalent of “log scalars during a run, then inspect them later”: write a JSON TrainLog, view it with the training widgets, or adapt the JSON to an external tracker such as Weights & Biases.

                            Advanced runner, callback, and custom-loop APIs.

                            This namespace keeps the dependent runtime surface available for CUDA entrypoints, custom loaders, RL/PDE streams, and proof-facing examples without making those names the default first stop for ordinary training code.

                            @[reducible, inline]

                            Advanced checked model-plus-loss package used by the direct runtime trainer APIs.

                            Ordinary user code should prefer Trainer.new and trainer.train.

                            Instances For
                              @[reducible, inline]

                              Advanced instantiated executable training state used by the direct runtime trainer APIs.

                              Ordinary user code should prefer the higher-level public trainer object.

                              Instances For
                                @[reducible, inline]

                                Advanced inner training-loop state used by the direct runtime trainer APIs.

                                Ordinary user code should prefer trainer.train unless it needs manual stepping.

                                Instances For
                                  def NN.API.train.Advanced.accuracyOneHotBatched {σ : Spec.Shape} {classes batch : } {task : Task (Spec.Shape.dim batch σ) (Spec.Shape.dim batch (Tensor.Shape.Vec classes))} {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (runner : Runner α task) (samples : List (sample.Batch α batch σ (Tensor.Shape.Vec classes))) :

                                  Count correct predictions in a one-hot labeled batched dataset.

                                  This is the minibatch analogue of accuracyOneHot: the task already has a leading dim0 batch axis, so we score each row of the batch independently and accumulate totals.

                                  Returns (correct, total) where total = batch * numBatches.

                                  Instances For
                                    def NN.API.train.Advanced.meanLossDataset {σ τ : Spec.Shape} {task : Task σ τ} {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] [ToString α] (runner : Runner α task) (dataset : Runtime.Autograd.Train.Dataset (SupervisedSample α σ τ)) :
                                    IO α

                                    Mean loss over an entire dataset, used by before/after training reports.

                                    Instances For

                                      Callback event fired after each training step.

                                      • epoch :

                                        Current epoch number.

                                      • step :

                                        Global optimizer-step counter.

                                      • loss : α

                                        Loss reported for this step.

                                      Instances For

                                        Callback event fired at the end of an epoch (how many steps ran).

                                        • epoch :

                                          Epoch number that just completed.

                                        • steps :

                                          Number of steps executed in the epoch.

                                        Instances For

                                          Hooks for instrumenting callback-based training loops.

                                          Callbacks are ordinary IO hooks. They can print progress, update an in-memory curve, sample CUDA allocator state, or forward events to a project-specific metrics backend.

                                          Instances For

                                            No-op callbacks.

                                            Instances For

                                              Combine two callback collections by running them in sequence.

                                              Instances For
                                                @[implicit_reducible]

                                                for callbacks: a no-op callback collection.

                                                @[implicit_reducible]

                                                Callbacks form a monoid under sequential composition.

                                                Build callbacks that run action once at the start of training.

                                                Instances For

                                                  Build callbacks that observe every training step.

                                                  Instances For

                                                    Build a training callback that samples the CUDA allocator at a fixed step cadence.

                                                    The callback owns a small IO.Ref for the previous sample, so examples can compose it with ordinary loss-logging callbacks without threading allocator state through their training loops.

                                                    Instances For

                                                      Build callbacks that run at the end of each epoch.

                                                      Instances For

                                                        Build callbacks that run once at the end of training, with the final report.

                                                        Instances For

                                                          Step-indexed source of already-collated module inputs.

                                                          Data.batchLoader is the right interface when the data is a finite supervised dataset. Other training jobs draw batches from a rule or an external source: replay buffers, collocation samplers, synthetic scale inputs, or file-backed sequence windows. StepBatchStream is the direct stream interface for those cases.

                                                          The stream is still fully typed: each produced sample is a TensorPack matching the module's inputShapes. The training loop below is model-agnostic and only assumes that the module can run forward and stepWith on those samples.

                                                          • sample : IO (TensorPack α inputShapes)

                                                            Produce the input sample used at logical optimizer step step.

                                                          Instances For
                                                            def NN.API.train.Advanced.StepBatchStream.fixed {α : Type} {inputShapes : List Spec.Shape} (x : TensorPack α inputShapes) :
                                                            StepBatchStream α inputShapes

                                                            Constant stream for fixed-batch overfit runs and fixed-sample training jobs.

                                                            Instances For
                                                              def NN.API.train.Advanced.StepBatchStream.ofFn {α : Type} {inputShapes : List Spec.Shape} (f : TensorPack α inputShapes) :
                                                              StepBatchStream α inputShapes

                                                              Build a stream from a pure step-indexed sample function.

                                                              Instances For
                                                                def NN.API.train.Advanced.StepBatchStream.cycle {α : Type} {inputShapes : List Spec.Shape} (xs : List (TensorPack α inputShapes)) (h : xs []) :
                                                                StepBatchStream α inputShapes

                                                                Cycle through a nonempty list of samples.

                                                                This adapter lets list-backed datasets use the step-stream trainer. The explicit nonempty proof keeps empty datasets from turning into silent modulo-by-zero behavior.

                                                                Instances For
                                                                  def NN.API.train.Advanced.withMode {σ τ : Spec.Shape} {task : Task σ τ} {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] {β : Type} (runner : Runner α task) (value : TorchLean.NN.Mode) (action : IO β) :
                                                                  IO β

                                                                  Run an action with the runner temporarily switched to value mode.

                                                                  This is useful for "evaluate on a validation set during training" in callback-based loops.

                                                                  Instances For

                                                                    Mean loss for an already-instantiated scalar module over a typed minibatch loader.

                                                                    This is the general streaming evaluation path used by the runtime examples. It is not CIFAR-specific: any supervised task whose loss module consumes [dim n σ, dim n τ] can use the same loader. The loader stores ordinary per-example samples (x : σ, y : τ); this definition asks Data.epoch for raw minibatches and calls Data.collateSupervised to build one shape-typed batch at a time.

                                                                    Two details are important for larger examples:

                                                                    • We force shuffle := false for evaluation so before/after metrics are deterministic.
                                                                    • We do not call Data.BatchLoader.batchDataset, because that would materialize every collated minibatch at once. Streaming keeps the same API usable for image, sequence, and scientific ML examples where the batch tensors are much larger than small tabular datasets.
                                                                    Instances For
                                                                      def NN.API.train.Advanced.meanLossBatchLoader {σ τ : Spec.Shape} {n : } {task : Task (Spec.Shape.dim n σ) (Spec.Shape.dim n τ)} {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] [ToString α] (runner : Runner α task) (loader : Data.BatchLoader α n σ τ) :
                                                                      IO α

                                                                      Mean loss over a typed minibatch loader through a train.Advanced.Runner.

                                                                      This is the runner-facing form of meanLossModuleLoader. Use it when the example is built around train.Advanced.run, task modes, and the proof-facing trainer abstraction. Use meanLossModuleLoader directly when the example has already instantiated a runtime TorchLean.Module.ScalarModule, which is the common fast path for CUDA examples.

                                                                      Instances For
                                                                        def NN.API.train.Advanced.accuracyOneHotBatchLoader {σ : Spec.Shape} {classes batch : } {task : Task (Spec.Shape.dim batch σ) (Spec.Shape.dim batch (Tensor.Shape.Vec classes))} {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (runner : Runner α task) (loader : Data.BatchLoader α batch σ (Tensor.Shape.Vec classes)) :

                                                                        One-hot accuracy over a typed minibatch loader without materializing all collated batches.

                                                                        Instances For

                                                                          Train a runtime scalar module from a typed minibatch loader.

                                                                          This is the shared "real epoch loop" for model examples that already have a runtime module, including CUDA runs. It mirrors the PyTorch structure:

                                                                          1. create an optimizer state for the module parameters;
                                                                          2. for each epoch, ask the general Data.batchLoader for shuffled raw batches;
                                                                          3. collate each raw batch into a shape-typed (xBatch, yBatch) sample;
                                                                          4. report the scalar loss through callbacks;
                                                                          5. run forward/backward/optimizer.step through TorchLean.Module.stepWith.

                                                                          The function is polymorphic in the input shape σ, target shape τ, batch size n, scalar type α, parameter shapes, and optimizer. It is not image-specific. CNN, ResNet, ViT, MLP, sequence, operator-learning, and future model examples should all be able to use this path whenever their supervised loss module has input shapes [dim n σ, dim n τ].

                                                                          Instances For

                                                                            Train a runtime scalar module for exactly steps optimizer updates.

                                                                            trainModuleLoaderWith above is epoch-based: each unit means one full pass over the loader. This variant is update-based, which is the convention used by runnable examples that expose a --steps flag.

                                                                            The loop still draws shuffled minibatches from Data.batchLoader epoch by epoch, but it stops as soon as the requested number of optimizer updates has run. The returned loader is the advanced loader state, so callers can continue training from the next shuffled epoch if they want to.

                                                                            Instances For
                                                                              def NN.API.train.Advanced.trainModuleStreamStepsWith {inputShapes paramShapes : List Spec.Shape} {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] [ToString α] (module : Runtime.Autograd.TorchLean.ScalarModule α paramShapes inputShapes) (optimizer : Runtime.Autograd.TorchLean.Optimizer α paramShapes) (steps : ) (stream : StepBatchStream α inputShapes) (callbacks : Callbacks α := Callbacks.empty) :

                                                                              Train a scalar module from a step-indexed batch stream.

                                                                              This is the shared loop for workloads whose batches are produced step by step rather than by one finite Data.batchLoader epoch:

                                                                              • RL algorithms can sample replay or rollout batches,
                                                                              • PDE examples can resample collocation points,
                                                                              • generated workloads can stream synthetic inputs without storing a dataset.

                                                                              The function is generic in inputShapes. It does not know whether the sample is [x, y], [state, action, target], or []; it only asks the stream for the next typed input list and then runs the same forward/backward/optimizer.step machinery as the loader-based trainer.

                                                                              Instances For
                                                                                def NN.API.train.Advanced.trainModuleStreamStepsReport {inputShapes paramShapes : List Spec.Shape} {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] [ToString α] (module : Runtime.Autograd.TorchLean.ScalarModule α paramShapes inputShapes) (optimizer : Runtime.Autograd.TorchLean.Optimizer α paramShapes) (opts : Runtime.Autograd.Torch.Options) (steps : ) (stream : StepBatchStream α inputShapes) (cudaMemWatch : := 0) (extraCallbacks : Callbacks α := Callbacks.empty) :

                                                                                Report-oriented stream-training entrypoint.

                                                                                Callers pass the module, optimizer, runtime options, step count, and stream, and get standard before/after reporting plus CUDA memory watching.

                                                                                Instances For
                                                                                  def NN.API.train.Advanced.trainModuleStreamStepsCurveFloat {inputShapes paramShapes : List Spec.Shape} (module : Runtime.Autograd.TorchLean.ScalarModule Float paramShapes inputShapes) (optimizer : Runtime.Autograd.TorchLean.Optimizer Float paramShapes) (opts : Runtime.Autograd.Torch.Options) (steps : ) (stream : StepBatchStream Float inputShapes) (cudaMemWatch : := 0) (extraCallbacks : Callbacks Float := Callbacks.empty) :

                                                                                  Float stream trainer that records a per-step loss curve.

                                                                                  Generated and file-backed batches do not always have one finite loader to summarize. This entrypoint keeps their training curves in the same JSON format as the supervised examples.

                                                                                  Instances For

                                                                                    Train from a runner-backed loader with explicit callbacks instead of inline printing in example code.

                                                                                    This is the runner-facing public path for PyTorch-style custom loops:

                                                                                    • keep the optimizer/scheduler logic in the library,
                                                                                    • inject logging, evaluation, and prediction reporting through callbacks.

                                                                                    This path keeps the Runner abstraction, including task modes and scheduler support. For CUDA-heavy entrypoints that already have a TorchLean.Module.ScalarModule, prefer trainModuleLoaderWith; both paths consume the same general API.Data.batchLoader.

                                                                                    Instances For
                                                                                      def NN.API.train.Advanced.stepper {σ τ : Spec.Shape} {task : Task σ τ} {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] [ToString α] [Runtime.Scalar α] (runner : Runner α task) (optimizer : optim.Optimizer) (scheduler : Option TorchLean.Schedulers.Config := none) :
                                                                                      IO (Stepper α task)

                                                                                      Create a Stepper loop for a runner and optimizer (optionally with an LR scheduler).

                                                                                      This corresponds to the “inner training loop” state in typical PyTorch code: an optimizer state plus (optional) schedule state, ready to step on a batch.

                                                                                      Instances For
                                                                                        def NN.API.train.Advanced.step {σ τ : Spec.Shape} {task : Task σ τ} {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (loop : Stepper α task) (sample : SupervisedSample α σ τ) :
                                                                                        IO α

                                                                                        Run one optimization step on a single supervised sample (one batch).

                                                                                        Instances For
                                                                                          def NN.API.train.Advanced.epoch {σ τ : Spec.Shape} {task : Task σ τ} {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (loop : Stepper α task) (samples : List (SupervisedSample α σ τ)) :
                                                                                          IO (List α)

                                                                                          Run one epoch over a list of supervised samples, returning the per-step losses.

                                                                                          Instances For

                                                                                            Small Reporting Helpers (IO) #

                                                                                            These definitions factor out common "print a loss/accuracy table" patterns for runnable model commands. They do not affect semantics: they only call the underlying runner functions and print human-facing summaries. Public examples should reach them through Trainer.Advanced only when the ordinary Trainer.new / trainer.train surface is too small for the example.

                                                                                            def NN.API.train.Advanced.Report.reportProbes {β : Type} (title : String) (probes : List β) (lineOf : βIO String) :

                                                                                            Print a titled list of named report lines.

                                                                                            Instances For
                                                                                              def NN.API.train.Advanced.Report.reportMeanLoss {σ τ : Spec.Shape} {task : Task σ τ} {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] [ToString α] (runner : Runner α task) (dataset : Runtime.Autograd.Train.Dataset (SupervisedSample α σ τ)) (label : String) :

                                                                                              Convenience: mean loss on a dataset, printed with a label.

                                                                                              Instances For
                                                                                                def NN.API.train.Advanced.Report.reportMeanLossLoader {σ τ : Spec.Shape} {batch : } {task : Task (Spec.Shape.dim batch σ) (Spec.Shape.dim batch τ)} {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] [ToString α] (runner : Runner α task) (loader : Data.BatchLoader α batch σ τ) (label : String) :

                                                                                                Convenience: mean loss on a typed minibatch loader, streamed batch by batch.

                                                                                                Instances For
                                                                                                  def NN.API.train.Advanced.Report.reportMeanLossModuleLoader {σ τ : Spec.Shape} {batch : } {paramShapes : List Spec.Shape} {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] [ToString α] (module : Runtime.Autograd.TorchLean.ScalarModule α paramShapes [Spec.Shape.dim batch σ, Spec.Shape.dim batch τ]) (loader : Data.BatchLoader α batch σ τ) (label : String) :

                                                                                                  Convenience: mean loss on a typed minibatch loader for an already-instantiated runtime module.

                                                                                                  Use this in direct CUDA/runtime examples to avoid building a Runner only for logging. The data path is still the same public loader path: Data.batchLoader plus Data.collateSupervised.

                                                                                                  Instances For
                                                                                                    def NN.API.train.Advanced.Report.reportClassProbes {σ : Spec.Shape} {classes : } {task : Task σ (Spec.Shape.dim classes Spec.Shape.scalar)} {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] [ToString α] (runner : Runner α task) (probes : List (String × Spec.Tensor α σ × )) (title : String := "predictions") (includeLogits : Bool := false) :

                                                                                                    Report predicted classes on a list of named inputs.

                                                                                                    Each entry is (name, x, expectedClass). If includeLogits := true, also prints the raw model outputs.

                                                                                                    Instances For
                                                                                                      def NN.API.train.Advanced.Report.reportClassProbesBatchedFromSingle {σ : Spec.Shape} {classes batch : } {task : Task (Spec.Shape.dim batch σ) (Spec.Shape.dim batch (Tensor.Shape.Vec classes))} {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] [ToString α] (runner : Runner α task) (probes : List (String × Spec.Tensor α σ × )) (title : String := "predictions") (includeLogits : Bool := false) :

                                                                                                      Report predicted classes on a list of named inputs, for a batched model.

                                                                                                      This expects inputs of the unbatched input shape σ and replicates each one across the batch axis, then reports the prediction for row 0.

                                                                                                      Instances For

                                                                                                        Convenience: mean loss + one-hot accuracy on a dataset, printed with a label.

                                                                                                        Instances For
                                                                                                          def NN.API.train.Advanced.Report.reportLossAccuracyOneHotBatched {σ : Spec.Shape} {classes batch : } {task : Task (Spec.Shape.dim batch σ) (Spec.Shape.dim batch (Tensor.Shape.Vec classes))} {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] [ToString α] (runner : Runner α task) (dataset : Runtime.Autograd.Train.Dataset (sample.Batch α batch σ (Tensor.Shape.Vec classes))) (label : String) :

                                                                                                          Batched variant of reportLossAccuracyOneHot.

                                                                                                          Instances For
                                                                                                            def NN.API.train.Advanced.Report.reportLossAccuracyOneHotLoader {σ : Spec.Shape} {classes batch : } {task : Task (Spec.Shape.dim batch σ) (Spec.Shape.dim batch (Tensor.Shape.Vec classes))} {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] [ToString α] (runner : Runner α task) (loader : Data.BatchLoader α batch σ (Tensor.Shape.Vec classes)) (label : String) :

                                                                                                            Loader variant of reportLossAccuracyOneHotBatched, streaming through minibatches.

                                                                                                            Instances For
                                                                                                              def NN.API.train.Advanced.trainModuleLoaderStepsReport {σ τ : Spec.Shape} {n : } {paramShapes : List Spec.Shape} {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] [ToString α] (module : Runtime.Autograd.TorchLean.ScalarModule α paramShapes [Spec.Shape.dim n σ, Spec.Shape.dim n τ]) (optimizer : Runtime.Autograd.TorchLean.Optimizer α paramShapes) (opts : Runtime.Autograd.Torch.Options) (steps : ) (loader : Data.BatchLoader α n σ τ) (cudaMemWatch : := 0) (extraCallbacks : Callbacks α := Callbacks.empty) :

                                                                                                              Train a runtime module for a fixed number of optimizer updates with the standard runtime reports.

                                                                                                              This is the common path for direct-module training, not example-only code. It composes the generic step loop with before/after mean-loss reporting and CUDA allocator telemetry, while still accepting extra callbacks for projects that want their own metrics, validation, or tracing.

                                                                                                              Instances For

                                                                                                                Float-specialized module training that also records a scalar loss curve.

                                                                                                                The training loop itself is the same as trainModuleLoaderStepsReport; this entrypoint adds the standard Curve callback used by JSON logs and website widgets.

                                                                                                                Instances For
                                                                                                                  def NN.API.train.Advanced.trainModuleLoaderStepsLoggedFloat {σ τ : Spec.Shape} {n : } {paramShapes : List Spec.Shape} (module : Runtime.Autograd.TorchLean.ScalarModule Float paramShapes [Spec.Shape.dim n σ, Spec.Shape.dim n τ]) (optimizer : Runtime.Autograd.TorchLean.Optimizer Float paramShapes) (opts : Runtime.Autograd.Torch.Options) (steps : ) (loader : Data.BatchLoader Float n σ τ) (log : LogDestination) (title : String) (notes : Array String := #[]) (seriesName : String := "loss") (cudaMemWatch : := 0) :

                                                                                                                  Train a Float runtime module, write a standard scalar-curve log, and return the train report.

                                                                                                                  This is the high-level path used by runnable training commands. The caller provides the model, optimizer, loader, runtime options, and metadata notes; the library owns the callback composition, CUDA telemetry, before/after reports, and JSON curve emission.

                                                                                                                  Instances For