Public autograd operations

This module contains the public gradient, VJP, Jacobian, JVP, and HVP APIs for models and pure one-argument tensor functions.

Autograd operations (grad/vjp/jacobian) over TorchLean programs.

This namespace is conceptually similar to PyTorch autograd combined with functorch/torch.func. It covers:

gradients of losses w.r.t. parameters and inputs
VJPs and Jacobians for analysis and verification tooling

PyTorch references:

Autograd: https://pytorch.org/docs/stable/autograd.html
torch.func (jacfwd/jacrev, etc.): https://pytorch.org/docs/stable/func.html


@[reducible, inline]

abbrev NN.API.autograd.model.Params {σ τ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) (α : Type) :

Type

Parameter pack type for a given model (a TensorPack over Seq.paramShapes).


@[reducible, inline]

abbrev NN.API.autograd.model.OutputLoss (τ υ : Spec.Shape) :

Type 1

Loss function over a model output and a target.

This is expressed in terms of RefTy so it works uniformly for eager execution and compiled execution.


@[reducible, inline]

abbrev NN.API.autograd.model.linearParams {α : Type} {inDim outDim seedW seedB : ℕ} (w : Spec.Tensor α (Tensor.Shape.Mat outDim inDim)) (b : Spec.Tensor α (Tensor.Shape.Vec outDim)) :

Params (TorchLean.Layers.linear inDim outDim seedW seedB) α

Pack explicit weight and bias tensors for a single Layers.linear model.


@[reducible, inline]

abbrev NN.API.autograd.model.OutputLoss.mse {τ : Spec.Shape} (reduction : TorchLean.Loss.Reduction := Runtime.Autograd.TorchLean.Loss.Reduction.mean) :

OutputLoss τ τ

Mean-squared error (MSE) loss between the prediction yhat and the target y.
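As a point of reference, the quantity an MSE loss computes can be sketched in plain Python. This is an illustration of the formula only, not TorchLean code: the helper names `mse` and `mse_grad` are hypothetical, tensors are modelled as flat lists of floats, and the `"sum"` option is an assumption borrowed from PyTorch's reduction convention (the source only shows that the default reduction is mean).

```python
def mse(yhat, y, reduction="mean"):
    """Squared-error loss between two equally sized flat lists of floats."""
    if len(yhat) != len(y):
        raise ValueError("yhat and y must have the same length")
    total = sum((p - t) ** 2 for p, t in zip(yhat, y))
    if reduction == "mean":
        return total / len(y)
    if reduction == "sum":  # assumption: mirrors PyTorch's reduction="sum"
        return total
    raise ValueError(f"unknown reduction: {reduction!r}")


def mse_grad(yhat, y, reduction="mean"):
    """Gradient of mse(yhat, y) with respect to yhat."""
    if reduction not in ("mean", "sum"):
        raise ValueError(f"unknown reduction: {reduction!r}")
    scale = 2.0 / len(y) if reduction == "mean" else 2.0
    return [scale * (p - t) for p, t in zip(yhat, y)]
```

With mean reduction the gradient carries a factor of 2/n, which is why switching the reduction rescales every gradient computed from this loss.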


@[reducible, inline]

abbrev NN.API.autograd.model.OutputLoss.crossEntropyOneHot {τ : Spec.Shape} (reduction : TorchLean.Loss.Reduction := Runtime.Autograd.TorchLean.Loss.Reduction.mean) :

OutputLoss τ τ

Cross-entropy loss between logits and one-hot targets. PyTorch analogue: nn.CrossEntropyLoss.
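For orientation, cross-entropy against a one-hot target can be sketched for a single sample in plain Python. The helper names below are hypothetical and this is not the TorchLean implementation; it only illustrates the standard formula, using a numerically stable log-softmax. Batch reduction is left out, which is an assumption made to keep the sketch small.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a flat list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]


def cross_entropy_one_hot(logits, target):
    """-sum_i target_i * log_softmax(logits)_i for one sample."""
    return -sum(t * lp for t, lp in zip(target, log_softmax(logits)))


def cross_entropy_grad_logits(logits, target):
    """Gradient w.r.t. the logits: softmax(logits) - target.

    Valid when the target entries sum to 1, as a one-hot vector does.
    """
    probs = [math.exp(lp) for lp in log_softmax(logits)]
    return [p - t for p, t in zip(probs, target)]
```

The gradient with respect to the logits has the closed form softmax(logits) minus target, which makes it a convenient hand check for an autograd result.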


@[reducible, inline]

abbrev NN.API.autograd.model.OutputLoss.detach {τ υ : Spec.Shape} (loss : OutputLoss τ υ) :

OutputLoss τ υ

Detach the model output before feeding it into a loss.

This is useful when you want to evaluate a loss purely as a metric, so that no gradient flows back through the model output.


def NN.API.autograd.model.gradParams {σ τ υ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) (loss : TorchLean.Autodiff.Model.OutputLoss τ υ) {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (params : TorchLean.Autodiff.Model.Params model α) (x : Spec.Tensor α σ) (target : Spec.Tensor α υ) :

IO (TorchLean.Autodiff.Model.Params model α)

Gradient of a model-loss w.r.t. the model parameters.

This is the common training use case (PyTorch analogue: loss.backward(), after which the resulting gradients drive a parameter update).
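To make "gradient of a model-loss w.r.t. the model parameters" concrete, here is a minimal hand-written sketch for a single linear layer y = W x + b with a mean-reduced MSE loss. It is plain Python, not TorchLean code; `linear` and `grad_params` are hypothetical helpers, and the result is returned as a (dW, db) pair to loosely mirror the idea of a parameter-shaped gradient pack.

```python
def linear(W, b, x):
    """y_i = sum_j W[i][j] * x[j] + b[i]."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]


def grad_params(W, b, x, target):
    """Gradient of mean((linear(W, b, x) - target)^2) w.r.t. (W, b)."""
    yhat = linear(W, b, x)
    n = len(yhat)
    # dL/dyhat for the mean-reduced squared error.
    dy = [2.0 * (p - t) / n for p, t in zip(yhat, target)]
    # Chain rule: dL/dW[i][j] = dy[i] * x[j], dL/db[i] = dy[i].
    dW = [[dyi * xj for xj in x] for dyi in dy]
    db = list(dy)
    return dW, db
```

A usage sketch: `dW, db = grad_params([[1, 2], [3, 4]], [0, 0], [1, 1], [3, 5])` yields gradients with the same shapes as `W` and `b`, which is the property the `Params model α` return type encodes at the type level.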


def NN.API.autograd.model.gradInputs {σ τ υ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) (loss : TorchLean.Autodiff.Model.OutputLoss τ υ) {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (params : TorchLean.Autodiff.Model.Params model α) (x : Spec.Tensor α σ) (target : Spec.Tensor α υ) :

IO (TensorPack α [σ, υ])

Gradient of the loss w.r.t. the inputs (x and target).


def NN.API.autograd.model.gradX {σ τ υ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) (loss : TorchLean.Autodiff.Model.OutputLoss τ υ) {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (params : TorchLean.Autodiff.Model.Params model α) (x : Spec.Tensor α σ) (target : Spec.Tensor α υ) :

IO (Spec.Tensor α σ)

Convenience: gradient of the loss w.r.t. x.


def NN.API.autograd.model.gradTarget {σ τ υ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) (loss : TorchLean.Autodiff.Model.OutputLoss τ υ) {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (params : TorchLean.Autodiff.Model.Params model α) (x : Spec.Tensor α σ) (target : Spec.Tensor α υ) :

IO (Spec.Tensor α υ)

Convenience: gradient of the loss w.r.t. the target argument.


structure NN.API.autograd.model.ValueAndGrads {σ τ υ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) (α : Type) :

Type

Forward+backward result for a scalar loss built from a model output.

PyTorch comparison: this is the "compute loss + backward" payload, but with shapes tracked.

value : Spec.Tensor α Spec.Shape.scalar
Scalar loss value at the current point.
dparams : TorchLean.Autodiff.Model.Params model α
Gradients w.r.t. the model parameters.
dx : Spec.Tensor α σ
Gradient w.r.t. the input x.
dtarget : Spec.Tensor α υ
Gradient w.r.t. the target.


def NN.API.autograd.model.valueAndGrads {σ τ υ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) (loss : TorchLean.Autodiff.Model.OutputLoss τ υ) {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (params : TorchLean.Autodiff.Model.Params model α) (x : Spec.Tensor α σ) (target : Spec.Tensor α υ) :

IO (ValueAndGrads model α)

Run loss(model(params, x), target) and compute gradients w.r.t.:

model parameters,
x,
target.

This hides the CompiledScalar/argument-pack boilerplate for the common "one sample" case.


def NN.API.autograd.model.valueAndGradParams {σ τ υ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) (loss : TorchLean.Autodiff.Model.OutputLoss τ υ) {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (params : TorchLean.Autodiff.Model.Params model α) (x : Spec.Tensor α σ) (target : Spec.Tensor α υ) :

IO (Spec.Tensor α Spec.Shape.scalar × TorchLean.Autodiff.Model.Params model α)

Return the scalar loss tensor together with gradients for the model parameters.


def NN.API.autograd.model.valueAndGradParamsScalar {σ τ υ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) (loss : TorchLean.Autodiff.Model.OutputLoss τ υ) {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (params : TorchLean.Autodiff.Model.Params model α) (x : Spec.Tensor α σ) (target : Spec.Tensor α υ) :

IO (α × TorchLean.Autodiff.Model.Params model α)

Like valueAndGradParams, but converts the 0-dimensional loss tensor to a scalar of type α.


def NN.API.autograd.model.valueAndGradX {σ τ υ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) (loss : TorchLean.Autodiff.Model.OutputLoss τ υ) {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (params : TorchLean.Autodiff.Model.Params model α) (x : Spec.Tensor α σ) (target : Spec.Tensor α υ) :

IO (Spec.Tensor α Spec.Shape.scalar × Spec.Tensor α σ)

Return (loss_value, grad_x).


def NN.API.autograd.model.valueAndGradTarget {σ τ υ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) (loss : TorchLean.Autodiff.Model.OutputLoss τ υ) {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (params : TorchLean.Autodiff.Model.Params model α) (x : Spec.Tensor α σ) (target : Spec.Tensor α υ) :

IO (Spec.Tensor α Spec.Shape.scalar × Spec.Tensor α υ)

Return (loss_value, grad_target).


def NN.API.autograd.model.vjpParams {σ τ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (params : TorchLean.Autodiff.Model.Params model α) (x : Spec.Tensor α σ) (seedOut : Spec.Tensor α τ) :

IO (TorchLean.Autodiff.Model.Params model α)

Vector-Jacobian product (VJP) w.r.t. model parameters.

This is the "grad of outputs back into parameters" primitive. It is useful for custom losses or analysis tooling when you already have a seed (cotangent) tensor seedOut of shape τ.
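As an illustration of what a parameter VJP computes, consider a single linear layer y = W x + b. For a seed vector v with the output's shape, the VJP is the gradient of the scalar v · y with respect to (W, b). The plain-Python sketch below uses the hypothetical helper `vjp_params`; it is not TorchLean code and only works for this one hand-derived layer.

```python
def vjp_params(x, seed_out):
    """VJP of y = W x + b w.r.t. (W, b) for the cotangent seed_out.

    d(seed_out . y)/dW[i][j] = seed_out[i] * x[j]
    d(seed_out . y)/db[i]    = seed_out[i]
    For a linear layer the result does not depend on the values of W or b.
    """
    dW = [[s * xj for xj in x] for s in seed_out]
    db = list(seed_out)
    return dW, db
```

Choosing the seed as the gradient of some scalar loss with respect to the output recovers the usual loss gradient, which is why a VJP is a convenient building block for custom losses.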


def NN.API.autograd.model.vjpInputs {σ τ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (params : TorchLean.Autodiff.Model.Params model α) (x : Spec.Tensor α σ) (seedOut : Spec.Tensor α τ) :

IO (TensorPack α [σ])

VJP w.r.t. the model input.

This returns a one-element TensorPack to match the general "inputs list" API shape. For the common case, use vjpInput to get the tensor directly.


def NN.API.autograd.model.vjpInput {σ τ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (params : TorchLean.Autodiff.Model.Params model α) (x : Spec.Tensor α σ) (seedOut : Spec.Tensor α τ) :

IO (Spec.Tensor α σ)

Vector-Jacobian product with respect to the single model input tensor.
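For a single linear layer y = W x + b, the input VJP has a simple closed form: it is the transpose of W applied to the seed. The following plain-Python sketch shows this with the hypothetical helper `vjp_input`; it illustrates the mathematics only and makes no claim about how TorchLean evaluates it.

```python
def vjp_input(W, seed_out):
    """VJP of y = W x + b w.r.t. x: result[j] = sum_i W[i][j] * seed_out[i]."""
    if len(W) != len(seed_out):
        raise ValueError("seed_out must have one entry per output coordinate")
    n_in = len(W[0])
    return [sum(W[i][j] * seed_out[i] for i in range(len(W))) for j in range(n_in)]
```

The result has the shape of the input x, not of the output, matching the `Spec.Tensor α σ` return type in the signature above.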


def NN.API.autograd.model.jacrevParams {σ τ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (params : TorchLean.Autodiff.Model.Params model α) (x : Spec.Tensor α σ) :

IO (Array (TorchLean.Autodiff.Model.Params model α))

Reverse-mode Jacobian (jacrev) of the model output w.r.t. parameters.

Returns an array of parameter-structured gradients: one entry per output coordinate. This mirrors the usual "jacrev returns a stack of per-output gradients" shape.
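One common way to obtain such a per-output array is to run one VJP per output coordinate with a one-hot seed. The sketch below does this for a single linear layer y = W x + b in plain Python. The helpers `vjp_params` and `jacrev_params` are hypothetical, and whether TorchLean builds its result this way internally is an assumption, not something the source states.

```python
def vjp_params(x, seed_out):
    """VJP of y = W x + b w.r.t. (W, b); independent of the values of W and b."""
    dW = [[s * xj for xj in x] for s in seed_out]
    db = list(seed_out)
    return dW, db


def jacrev_params(x, out_dim):
    """Reverse-mode Jacobian of y w.r.t. (W, b): one (dW, db) pair per output."""
    jac = []
    for i in range(out_dim):
        # A one-hot seed selects row i of the Jacobian.
        one_hot = [1.0 if k == i else 0.0 for k in range(out_dim)]
        jac.append(vjp_params(x, one_hot))
    return jac
```

Entry i of the result holds the gradient of output coordinate y[i] with respect to every parameter, so the cost grows with the number of outputs. This is the usual trade-off of reverse-mode Jacobians.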
