Torch Trainer Helpers

Ops instances, parameter lists, and scalar trainer construction for the Torch-style runtime. This module bridges backend-generic model code and executable training loops.

@[reducible, inline]

abbrev Runtime.Autograd.Torch.Internal.EagerM (α : Type) :

Type → Type

Monad used for the eager Ops instance: it reads an Internal.EagerSession α and executes in IO.

This is the backend that makes Ops programs execute immediately by mutating a hidden runtime tape.
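
As a rough illustration of "read a session, execute in IO", the sketch below models the idea with a reader monad over IO. It is not the library's definition: ToySession, ToyEagerM, and record are hypothetical names, and the real Internal.EagerSession presumably carries far more state than a single array.

```lean
/-- Hypothetical stand-in for `Internal.EagerSession α`: a mutable tape of recorded values. -/
structure ToySession (α : Type) where
  tape : IO.Ref (Array α)

/-- Toy eager monad: read the session, run in `IO`. -/
abbrev ToyEagerM (α : Type) : Type → Type :=
  ReaderT (ToySession α) IO

/-- Push a value onto the hidden tape and return the index it was stored at. -/
def record {α : Type} (x : α) : ToyEagerM α Nat := do
  let s ← read
  let t ← s.tape.get
  s.tape.set (t.push x)
  pure t.size

/-- Run two recordings against the same session; the tape is mutated in place. -/
def demo : IO Unit := do
  let tape ← IO.mkRef (#[] : Array Float)
  let session : ToySession Float := { tape := tape }
  let i ← ReaderT.run (record (1.5 : Float)) session
  let j ← ReaderT.run (record (2.5 : Float)) session
  IO.println s!"recorded at indices {i} and {j}"
```

The caller never sees the tape directly; it only supplies the session, which is the sense in which the tape is "hidden".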

@[implicit_reducible]

instance Runtime.Autograd.Torch.instOpsEagerMOfTensorConv {α : Type} [Context α] [Internal.CudaBridge.TensorConv α] [DecidableEq Spec.Shape] :

Ops (Internal.EagerM α) α

Ops instance for the eager Torch-style runtime.

This interprets Ops primitives by immediately executing them against the hidden mutable tape in the current Internal.EagerSession.

@[implicit_reducible]

instance Runtime.Autograd.Torch.instOpsM {α : Type} [Context α] [DecidableEq Spec.Shape] {Γ : List Spec.Shape} :

Ops (Compiled.GraphM.M α Γ) α

Ops instance for the compiled graph-building monad GraphM.

This interprets Ops primitives by recording typed IR nodes (rather than executing immediately). See Runtime.Autograd.Compiled.GraphM and Torch.LinkedSession for how these graphs are later run.
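
The contrast between the eager and graph-building instances can be sketched with a miniature interface. Everything here (ToyOps, Node, prog, runEager, runGraph) is hypothetical and far smaller than the real Ops class; the point is only that one program can be interpreted either by executing immediately or by recording nodes.

```lean
/-- Hypothetical miniature of an `Ops`-style interface; `ρ` is the backend's reference type. -/
class ToyOps (m : Type → Type) (ρ : outParam Type) where
  const : Int → m ρ
  add : ρ → ρ → m ρ

/-- Eager interpretation: a reference is the value itself, computed immediately. -/
instance : ToyOps IO Int where
  const n := pure n
  add a b := pure (a + b)

/-- IR nodes recorded by the graph-building interpretation. -/
inductive Node where
  | const (n : Int)
  | add (a b : Nat)

/-- Graph interpretation: a reference is a node index; primitives only append nodes. -/
instance : ToyOps (StateM (Array Node)) Nat where
  const n := modifyGet fun (g : Array Node) => (g.size, g.push (Node.const n))
  add a b := modifyGet fun (g : Array Node) => (g.size, g.push (Node.add a b))

/-- Written once against the interface, usable with either instance. -/
def prog {m : Type → Type} {ρ : Type} [Monad m] [ToyOps m ρ] : m ρ := do
  let x ← ToyOps.const 2
  let y ← ToyOps.const 3
  ToyOps.add x y

/-- Executes the additions right away. -/
def runEager : IO Int := prog

/-- Records three nodes (two constants and one addition) without computing anything. -/
def runGraph : Array Node :=
  let (_, g) := Id.run (StateT.run (prog : StateM (Array Node) Nat) #[])
  g
```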

inductive Runtime.Autograd.Torch.ParamList (α : Type) :

List Spec.Shape → Type

Heterogeneous list of trainable parameters, indexed by a list of shapes.

This is the Torch front-end analogue of "a parameter vector" (like model.parameters() in PyTorch), but with shapes tracked at the type level.

nil {α : Type} : ParamList α []
cons {α : Type} {s : Spec.Shape} {ss : List Spec.Shape} : Param α s → ParamList α ss → ParamList α (s :: ss)
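
A self-contained toy version of the same pattern may help show what "indexed by a list of shapes" buys. ToyShape, ToyParam, ToyParamList, count, and mkPack are hypothetical stand-ins; the real Param and Spec.Shape are assumed to be richer.

```lean
/-- Toy shape: a list of dimensions (a stand-in for `Spec.Shape`). -/
abbrev ToyShape := List Nat

/-- Toy parameter: a mutable flat buffer tagged with its shape (a stand-in for `Param α s`). -/
structure ToyParam (α : Type) (s : ToyShape) where
  value : IO.Ref (Array α)

/-- Heterogeneous parameter list indexed by the list of shapes it contains. -/
inductive ToyParamList (α : Type) : List ToyShape → Type where
  | nil : ToyParamList α []
  | cons {s : ToyShape} {ss : List ToyShape} :
      ToyParam α s → ToyParamList α ss → ToyParamList α (s :: ss)

/-- Count the parameters by walking the list. -/
def ToyParamList.count {α : Type} {ss : List ToyShape} : ToyParamList α ss → Nat
  | .nil => 0
  | .cons _ ps => 1 + ps.count

/-- A two-parameter pack: the type records a 2x3 weight followed by a length-3 bias. -/
def mkPack : IO (ToyParamList Float [[2, 3], [3]]) := do
  let w ← IO.mkRef (#[0, 0, 0, 0, 0, 0] : Array Float)
  let b ← IO.mkRef (#[0, 0, 0] : Array Float)
  return ToyParamList.cons { value := w } (ToyParamList.cons { value := b } ToyParamList.nil)
```

Because the shape list is part of the type, a pack with the wrong number or order of parameters fails to type-check rather than failing at run time.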

def Runtime.Autograd.Torch.ParamList.subScaleMaterialize {α : Type} [Sub α] [Mul α] {s : Spec.Shape} :

Spec.Tensor α s → Spec.Tensor α s → α → Spec.Tensor α s

Materialize the SGD update v - lr * g in a single traversal.

This is used by sgdStepFast as a runtime-performance optimization to avoid building deep thunk chains when training for many steps.
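
The update itself is simple to state over plain lists. The sketch below is only an analogue: the real function operates on Spec.Tensor α s, and subScale is a hypothetical name.

```lean
/-- One pass over values and gradients, producing `v - lr * g` elementwise. -/
def subScale {α : Type} [Sub α] [Mul α] : List α → List α → α → List α
  | v :: vs, g :: gs, lr => (v - lr * g) :: subScale vs gs lr
  | _, _, _ => []

#eval subScale ([10, 20] : List Int) ([1, 2] : List Int) (3 : Int)  -- [7, 14]
```

Each output element is computed as the traversal reaches it, so no intermediate lr * g structure is kept around between steps.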

def Runtime.Autograd.Torch.ParamList.ofTList {α : Type} {ss : List Spec.Shape} (xs : TList α ss) :

IO (ParamList α ss)

Allocate a fresh ParamList from an initial TList of parameter tensors.

Each tensor becomes an IO.Ref so it can be updated by optimizer steps.

def Runtime.Autograd.Torch.ParamList.ofTListWithRequiresGrad {α : Type} {ss : List Spec.Shape} :

TList α ss → List Bool → IO (ParamList α ss)

Allocate a fresh ParamList from an initial TList of parameter tensors, with explicit requiresGrad flags.

Returns an error when the length of the flag list differs from the length of the parameter shape list.
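
The length check can be pictured with plain lists. This sketch assumes the error surfaces as a thrown IO error, which the description above does not confirm; withFlags and the message text are hypothetical.

```lean
/-- Pair each value with its `requiresGrad` flag, rejecting mismatched lengths. -/
def withFlags {α : Type} (xs : List α) (flags : List Bool) : IO (List (α × Bool)) := do
  if xs.length != flags.length then
    throw (IO.userError s!"expected {xs.length} requiresGrad flags, got {flags.length}")
  pure (xs.zip flags)
```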

def Runtime.Autograd.Torch.ParamList.values {α : Type} {ss : List Spec.Shape} :

ParamList α ss → IO (TList α ss)

Read the current parameter values as a TList aligned with the shape list.

def Runtime.Autograd.Torch.ParamList.valuesSynced {α : Type} [Internal.CudaBridge.TensorConv α] [DecidableEq Spec.Shape] {ss : List Spec.Shape} :

ParamList α ss → IO (TList α ss)

Read parameter values, synchronizing CUDA-resident mirrors first when necessary.

def Runtime.Autograd.Torch.ParamList.setValues {α : Type} {ss : List Spec.Shape} :

ParamList α ss → TList α ss → IO Unit

Overwrite the current parameter values from a TList aligned with the shape list.

def Runtime.Autograd.Torch.ParamList.sgdStep {α : Type} [Context α] {ss : List Spec.Shape} :

ParamList α ss → (lr : α) → TList α ss → IO Unit

Apply an SGD step p := p - lr * g to each parameter that has requiresGrad = true.

The gradient list gs (the final TList argument) must be aligned with the parameter shapes.
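
A scalar-only sketch of the same rule, with frozen parameters skipped. ScalarParam, toySgdStep, and sgdDemo are hypothetical; the real step works over a shape-indexed ParamList and TList rather than plain lists.

```lean
/-- Toy parameter: a mutable scalar plus a `requiresGrad` flag (hypothetical layout). -/
structure ScalarParam where
  value : IO.Ref Float
  requiresGrad : Bool

/-- Apply `p := p - lr * g` to every parameter whose flag is set; frozen ones are skipped. -/
def toySgdStep (ps : List ScalarParam) (lr : Float) (gs : List Float) : IO Unit := do
  for (p, g) in ps.zip gs do
    if p.requiresGrad then
      p.value.modify (fun v => v - lr * g)

/-- `a` is trainable and moves from 1.0 to 0.0; `b` is frozen and stays at 1.0. -/
def sgdDemo : IO (List Float) := do
  let a ← IO.mkRef (1.0 : Float)
  let b ← IO.mkRef (1.0 : Float)
  let ps : List ScalarParam := [⟨a, true⟩, ⟨b, false⟩]
  toySgdStep ps 0.5 [2.0, 2.0]
  pure [← a.get, ← b.get]
```

Unlike this list-based sketch, the typed version cannot be called with a gradient list of the wrong length, since both arguments share the index ss.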

def Runtime.Autograd.Torch.ParamList.sgdStepFast {α : Type} [Context α] {ss : List Spec.Shape} :

ParamList α ss → (lr : α) → TList α ss → IO Unit

Like sgdStep, but uses a fully materialized update (subScaleMaterialize) for speed.

This is a runtime performance knob; mathematically it is equivalent to sgdStep.

structure Runtime.Autograd.Torch.ScalarTrainer (α : Type) (paramShapes inputShapes : List Spec.Shape) :

Type

Bundle a scalar-loss training loop for a fixed parameter pack and input signature.

This is the low-level trainer object used by module-backed execution:

forward computes a scalar loss,
backward computes gradients w.r.t. parameters,
step applies an optimizer update (typically SGD),
getParams reads current parameter values.

params : ParamList α paramShapes
Mutable trainable parameter pack.
forward : Curried.Fn α inputShapes (IO (Spec.Tensor α Spec.Shape.scalar))
Compute the scalar loss for a curried input pack.
backward : Curried.Fn α inputShapes (IO (TList α paramShapes))
Compute gradients aligned with paramShapes for a curried input pack.
step : α → Curried.Fn α inputShapes (IO Unit)
Apply one SGD-style update for a curried input pack.
adamStep? : Option (α → α → α → α → Curried.Fn α inputShapes (IO Unit))
Optional Adam update path.
In eager CUDA mode this is a device-gradient/device-moment update path. Other backends expose none and should use the generic optimizer wrappers.
adamWStep? : Option (α → α → α → α → α → Curried.Fn α inputShapes (IO Unit))
Optional AdamW update path.
In eager CUDA mode this is a device-gradient/device-moment update path with decoupled weight decay. Other backends expose none and should use the generic optimizer wrappers.
getParams : IO (TList α paramShapes)
Read current parameter values, synchronizing device mirrors if needed.
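
To show how these fields are meant to be used together, here is a toy trainer with one scalar parameter and one scalar input, minimizing (w - x)². ToyTrainer, mkToyTrainer, and train are hypothetical; the real structure is indexed by shape lists and uses curried tensor arguments.

```lean
/-- Toy analogue of `ScalarTrainer` with a single scalar parameter and a single scalar input. -/
structure ToyTrainer where
  forward : Float → IO Float
  backward : Float → IO Float
  step : Float → Float → IO Unit
  getParam : IO Float

/-- Build a trainer for the loss `(w - x)^2`, whose gradient in `w` is `2 * (w - x)`. -/
def mkToyTrainer (w0 : Float) : IO ToyTrainer := do
  let w ← IO.mkRef w0
  let forward : Float → IO Float := fun x => do
    let v ← w.get
    pure ((v - x) * (v - x))
  let backward : Float → IO Float := fun x => do
    let v ← w.get
    pure (2.0 * (v - x))
  let step : Float → Float → IO Unit := fun lr x => do
    let g ← backward x
    w.modify (fun v => v - lr * g)
  return ({ forward := forward, backward := backward, step := step, getParam := w.get } : ToyTrainer)

/-- Fifty SGD steps toward the target 3.0; the parameter approaches 3.0. -/
def train : IO Float := do
  let t ← mkToyTrainer 0.0
  for _ in [0:50] do
    t.step 0.1 3.0
  t.getParam
```

The loop only calls step and getParam; forward is there for monitoring the loss, which mirrors the division of labor described above.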

def Runtime.Autograd.Torch.Internal.gradsOfRefs {α : Type} [DecidableEq Spec.Shape] {ss : List Spec.Shape} :

Array (AnyTensor α) → RefList (TensorRef α) ss → IO (TList α ss)

Extract gradients (as a typed TList) for a list of eager TensorRefs from a dense gradient array.

def Runtime.Autograd.Torch.Internal.useParams {α : Type} [CudaBridge.TensorConv α] [DecidableEq Spec.Shape] {ss : List Spec.Shape} :

ParamList α ss → EagerM α (RefList (TensorRef α) ss)

Record all parameters as tape leaves in an eager session, returning their corresponding TensorRefs.

This is the eager analogue of "using" a parameter pack during a forward pass.

def Runtime.Autograd.Torch.Internal.useInputs {α : Type} [CudaBridge.TensorConv α] [DecidableEq Spec.Shape] {ss : List Spec.Shape} :

TList α ss → EagerM α (RefList (TensorRef α) ss)

Record all input tensors as tape leaves in an eager session, returning their corresponding TensorRefs.

def Runtime.Autograd.Torch.scalarTrainer {α : Type} [Context α] [Internal.CudaBridge.TensorConv α] [DecidableEq Spec.Shape] {paramShapes inputShapes : List Spec.Shape} (opts : Options := { }) (initRequiresGrad : List Bool := List.replicate paramShapes.length true) (loss : {m : Type → Type} → [Monad m] → [inst : Ops m α] → CurriedRef (fun (s : Spec.Shape) => Ops.Ref m α s) (paramShapes ++ inputShapes) (m (Ops.Ref m α Spec.Shape.scalar))) :

Curried.Fn α paramShapes (IO (ScalarTrainer α paramShapes inputShapes))

Build a ScalarTrainer from an initial parameter pack and a backend-generic loss definition.

loss is written once against the Ops interface over a concatenated context paramShapes ++ inputShapes. Depending on opts.backend, we either:

compile the loss once (compiled backend), or
execute it eagerly by building a runtime tape each step (eager backend).
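
The "curried over paramShapes ++ inputShapes" part of the signature can be illustrated with a toy curried function type. Vec, CurriedFn, dotLoss, and lossAt are hypothetical and only approximate what Curried.Fn and CurriedRef are assumed to do.

```lean
/-- Toy tensor type indexed by a shape; here a shape is just a length. -/
abbrev Vec (n : Nat) : Type := Fin n → Float

/-- Curried function type over a list of shapes: one argument per shape, then the result. -/
def CurriedFn : List Nat → Type → Type
  | [], β => β
  | s :: ss, β => Vec s → CurriedFn ss β

/-- With parameter shapes `[2]` and input shapes `[2]`, a loss over `[2] ++ [2]`
    takes the parameter first and the input second. -/
def dotLoss : CurriedFn ([2] ++ [2]) Float :=
  fun w x => w 0 * x 0 + w 1 * x 1

/-- The ascription spells out the unfolded type before applying the function. -/
def lossAt : Float :=
  (dotLoss : Vec 2 → Vec 2 → Float) (fun _ => 1.0) (fun _ => 2.0)
```

Supplying only the leading parameter arguments leaves a function of the inputs, which matches the way scalarTrainer takes the initial parameters first and returns a trainer whose methods are curried over inputShapes.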

def Runtime.Autograd.Torch.scalarTrainer.gradsPrefix {α : Type} [DecidableEq Spec.Shape] {ss : List Spec.Shape} :

Array (AnyTensor α) → ℕ → IO (TList α ss)

TorchLean API

NN.Runtime.Autograd.Torch.Core.Trainer