TorchLean API

NN.Runtime.Autograd.Train.Core

Core training helpers #

These are small, reusable utilities that keep training scripts short and readable. They are lightweight and pure.

Prefix an error message with a caller-provided tag.

This is used throughout the training helpers to keep error messages readable when multiple subsystems can fail (tape execution, dataset loading, shape checks, etc.).
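A minimal sketch of the idea, assuming an Except String error channel; the name withTag and the exact message format here are illustrative, not necessarily the module's real definition:

-- Illustrative sketch: `withTag` and the `Except String` error channel are
-- assumptions about the shape of the API, not the module's actual code.
def withTag {α : Type} (tag : String) (e : Except String α) : Except String α :=
  match e with
  | .ok v    => .ok v
  | .error m => .error s!"{tag}: {m}"

#eval withTag "dataset" (.error "file not found" : Except String Nat)
-- Except.error "dataset: file not found"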


Typed extraction helpers #

The autograd engine stores values and gradients as Runtime.AnyTensor (shape tag + tensor). These helpers check shapes and give you back a typed tensor or a scalar value.

Read a typed gradient tensor Tensor a s from a gradient map keyed by node id.

The eager tape engine stores gradients in a shape-erased form (Runtime.AnyTensor a), so this helper performs the dynamic shape check and returns a typed tensor on success.
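To make the check concrete, here is a self-contained toy sketch; ToyAnyTensor and requireShaped are stand-ins for the real Runtime.AnyTensor machinery, and the real helper additionally hands back a typed Tensor a s on success rather than the erased form:

-- Toy model of the dynamic shape check; all names here are illustrative.
structure ToyAnyTensor (α : Type) where
  shape : List Nat   -- the shape tag stored alongside the payload
  data  : List α     -- stand-in for the actual tensor storage
deriving Repr

def requireShaped {α : Type} (expected : List Nat) (t : ToyAnyTensor α) :
    Except String (ToyAnyTensor α) :=
  if t.shape = expected then
    .ok t
  else
    .error s!"expected shape {expected}, got {t.shape}"

#eval requireShaped [2, 2] { shape := [2, 2], data := [1.0, 2.0, 3.0, 4.0] }
-- succeeds; asking for shape [3] on the same tensor would produce the error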


Read a typed forward value Tensor a s from a tape node id.

This is the value-side analogue of requireGradTensor: it performs a dynamic shape check on the shape tag stored in AnyTensor.


Read a scalar forward value from a tape node id.

This is a common pattern in training scripts where the loss is a scalar node.
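In the same toy setting (reusing ToyAnyTensor from the sketch above, and skipping the tape lookup by node id), the scalar read is just the rank-zero case of the shape check:

-- Rank-zero case of the shape check; `requireScalarToy` is illustrative only.
def requireScalarToy {α : Type} (t : ToyAnyTensor α) : Except String α :=
  match t.shape, t.data with
  | [], [v] => .ok v
  | _,  _   => .error s!"expected a scalar node, got shape {t.shape}"

#eval requireScalarToy { shape := [], data := [0.42] }
-- ok, returning 0.42 (e.g. a scalar loss read off the tape)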


SGD update helper #

This is a small tensor-level update used by many tests.

def Runtime.Autograd.Train.sgdUpdateTensor {a : Type} [Sub a] [Mul a] {s : Spec.Shape} (param grad : Spec.Tensor a s) (lr : a) :
    Spec.Tensor a s

Single-tensor SGD update rule.

Given a parameter tensor param, its gradient tensor grad, and a learning rate lr, compute:

param - lr * grad

This is plain SGD (no momentum, weight decay, etc.); higher-level optimizers live in NN.Runtime.Autograd.Train.Optim.
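As a concrete instance of the formula, here is the same rule written out pointwise on a plain list (sgdStepList is a made-up name, not part of the library):

-- The update rule spelled out elementwise; `sgdStepList` is illustrative only.
def sgdStepList {α : Type} [Sub α] [Mul α] (param grad : List α) (lr : α) : List α :=
  List.zipWith (fun p g => p - lr * g) param grad

#eval sgdStepList [1.0, 2.0, 3.0] [0.5, 0.5, 0.5] 0.1
-- each entry becomes p - 0.1 * g, e.g. 1.0 becomes 0.95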

theorem Runtime.Autograd.Train.sgdUpdateTensor_eq_optimSGD {a : Type} [Context a] [DecidableRel fun (x1 x2 : a) => x1 > x2] {s : Spec.Shape} (param grad : Spec.Tensor a s) (lr : a) :
    sgdUpdateTensor param grad lr = Optim.SGD.update { lr := lr } param grad

The lightweight SGD helper computes the same formula as the canonical pure optimizer.

We keep sgdUpdateTensor because its small algebraic signature (Sub/Mul) is convenient in tests; this theorem pins it to the canonical optimizer equation used by the runtime optimizer stack.