TorchLean API

NN.Runtime.Autograd.Train.Core

Core training helpers #

These are small, reusable utilities that keep training scripts short and readable. They are lightweight and pure.

Prefix an error message with a caller-provided tag.

This is used throughout the training helpers to keep error messages readable when multiple subsystems can fail (tape execution, dataset loading, shape checks, etc.).
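A minimal sketch of the idea, assuming an Except String error channel; the name withTag and the exact message format here are illustrative, not necessarily the module's real definition:

-- Illustrative sketch: `withTag` and the `Except String` error channel are
-- assumptions about the shape of the API, not the module's actual code.
def withTag {α : Type} (tag : String) (e : Except String α) : Except String α :=
  match e with
  | .ok v    => .ok v
  | .error m => .error s!"{tag}: {m}"

#eval withTag "dataset" (.error "file not found" : Except String Nat)
-- Except.error "dataset: file not found"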


Typed extraction helpers #

The autograd engine stores values and gradients as Runtime.AnyTensor (shape tag + tensor). These helpers check shapes and give you back a typed tensor or a scalar value.

Read a typed gradient tensor Tensor a s from a gradient map keyed by node id.

The eager tape engine stores gradients in a shape-erased form (Runtime.AnyTensor a), so this helper performs the dynamic shape check and returns a typed tensor on success.
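To make the check concrete, here is a self-contained toy sketch; ToyAnyTensor and requireShaped are stand-ins for the real Runtime.AnyTensor machinery, and the real helper additionally hands back a typed Tensor a s on success rather than the erased form:

-- Toy model of the dynamic shape check; all names here are illustrative.
structure ToyAnyTensor (α : Type) where
  shape : List Nat   -- the shape tag stored alongside the payload
  data  : List α     -- stand-in for the actual tensor storage
deriving Repr

def requireShaped {α : Type} (expected : List Nat) (t : ToyAnyTensor α) :
    Except String (ToyAnyTensor α) :=
  if t.shape = expected then
    .ok t
  else
    .error s!"expected shape {expected}, got {t.shape}"

#eval requireShaped [2, 2] { shape := [2, 2], data := [1.0, 2.0, 3.0, 4.0] }
-- succeeds; asking for shape [3] on the same tensor would produce the error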


Read a typed forward value Tensor a s from a tape node id.

This is the value-side analogue of requireGradTensor: it performs a dynamic shape check on the shape tag stored in AnyTensor.


Read a scalar forward value from a tape node id.

This is a common pattern in training scripts where the loss is a scalar node.
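In the same toy setting (reusing ToyAnyTensor from the sketch above, and skipping the tape lookup by node id), the scalar read is just the rank-zero case of the shape check:

-- Rank-zero case of the shape check; `requireScalarToy` is illustrative only.
def requireScalarToy {α : Type} (t : ToyAnyTensor α) : Except String α :=
  match t.shape, t.data with
  | [], [v] => .ok v
  | _,  _   => .error s!"expected a scalar node, got shape {t.shape}"

#eval requireScalarToy { shape := [], data := [0.42] }
-- ok, returning 0.42 (e.g. a scalar loss read off the tape)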


SGD update helper #

This is a small tensor-level update used by many tests.

def Runtime.Autograd.Train.sgdUpdateTensor {a : Type} [Sub a] [Mul a] {s : Spec.Shape} (param grad : Spec.Tensor a s) (lr : a) :
    Spec.Tensor a s

Single-tensor SGD update rule.

Given a parameter tensor param, its gradient tensor grad, and a learning rate lr, compute:

param - lr * grad

This is plain SGD (no momentum, weight decay, etc.); higher-level optimizers live in NN.Runtime.Autograd.Train.Optim.
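As a concrete instance of the formula, here is the same rule written out pointwise on a plain list (sgdStepList is a made-up name, not part of the library):

-- The update rule spelled out elementwise; `sgdStepList` is illustrative only.
def sgdStepList {α : Type} [Sub α] [Mul α] (param grad : List α) (lr : α) : List α :=
  List.zipWith (fun p g => p - lr * g) param grad

#eval sgdStepList [1.0, 2.0, 3.0] [0.5, 0.5, 0.5] 0.1
-- each entry becomes p - 0.1 * g, e.g. 1.0 becomes 0.95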

theorem Runtime.Autograd.Train.sgdUpdateTensor_eq_optimSGD {a : Type} [Context a] [DecidableRel fun (x1 x2 : a) => x1 > x2] {s : Spec.Shape} (param grad : Spec.Tensor a s) (lr : a) :
    sgdUpdateTensor param grad lr = Optim.SGD.update { lr := lr } param grad

The lightweight SGD helper computes the same formula as the canonical pure optimizer.

We keep sgdUpdateTensor because its small algebraic signature (Sub/Mul) is convenient in tests; this theorem pins it to the canonical optimizer equation used by the runtime optimizer stack.