TorchLean API

NN.API.Public.Autograd.Core

Public autograd operations #

This module contains the public gradient, VJP, Jacobian, JVP, and HVP APIs for models and pure one-argument tensor functions.

Autograd operations (grad/vjp/jacobian) over TorchLean programs.

This namespace is conceptually similar to PyTorch autograd + functorch/torch.func:

PyTorch references:

@[reducible, inline]
abbrev NN.API.autograd.model.Params {σ τ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) (α : Type) :

Parameter pack type for a given model (a TensorPack over Seq.paramShapes).

Instances For
    @[reducible, inline]

    Loss function over a model output and a target.

    This is expressed in terms of RefTy so it works uniformly for eager execution and compiled execution.

    Instances For
      @[reducible, inline]
      abbrev NN.API.autograd.model.linearParams {α : Type} {inDim outDim seedW seedB : } (w : Spec.Tensor α (Tensor.Shape.Mat outDim inDim)) (b : Spec.Tensor α (Tensor.Shape.Vec outDim)) :
      Params (TorchLean.Layers.linear inDim outDim seedW seedB) α

      Pack explicit weight and bias tensors for a single Layers.linear model.

      Instances For
        @[reducible, inline]

        Mean-squared error loss (mse) between yhat and y.

        Instances For
          @[reducible, inline]

          Cross-entropy loss between logits and one-hot targets. PyTorch analogue: nn.CrossEntropyLoss.

          Instances For
            @[reducible, inline]

            Detach the model output before feeding it into a loss.

            This is useful when you want to compute a metric loss without backpropagating through it.

            Instances For

              Gradient of a model-loss w.r.t. the model parameters.

              This is the common training use case (PyTorch analogue: loss.backward() followed by parameter updates).

              Instances For
                def NN.API.autograd.model.gradInputs {σ τ υ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) (loss : TorchLean.Autodiff.Model.OutputLoss τ υ) {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (params : TorchLean.Autodiff.Model.Params model α) (x : Spec.Tensor α σ) (target : Spec.Tensor α υ) :
                IO (TensorPack α [σ, υ])

                Gradient of the loss w.r.t. the inputs (x and target).

                Instances For
                  def NN.API.autograd.model.gradX {σ τ υ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) (loss : TorchLean.Autodiff.Model.OutputLoss τ υ) {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (params : TorchLean.Autodiff.Model.Params model α) (x : Spec.Tensor α σ) (target : Spec.Tensor α υ) :
                  IO (Spec.Tensor α σ)

                  Convenience: gradient of the loss w.r.t. x.

                  Instances For

                    Convenience: gradient of the loss w.r.t. the target argument.

                    Instances For
                      structure NN.API.autograd.model.ValueAndGrads {σ τ υ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) (α : Type) :

                      Forward+backward result for a scalar loss built from a model output.

                      PyTorch comparison: this is the "compute loss + backward" payload, but with shapes tracked.

                      Instances For

                        Run loss(model(params, x), target) and compute gradients w.r.t:

                        • model parameters,
                        • x,
                        • target.

                        This hides the CompiledScalar/argument-pack boilerplate for the common "one sample" case.

                        Instances For

                          Return the scalar loss tensor together with gradients for the model parameters.

                          Instances For

                            valueAndGradParams, but convert the 0-dim loss tensor to a scalar α.

                            Instances For

                              Return (loss_value, grad_x).

                              Instances For

                                Return (loss_value, grad_target).

                                Instances For

                                  Vector-Jacobian product (VJP) w.r.t. model parameters.

                                  This is the "grad of outputs back into parameters" primitive. It is useful for custom losses or analysis tooling when you already have a seed tensor seedOut : τ.

                                  Instances For
                                    def NN.API.autograd.model.vjpInputs {σ τ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (params : TorchLean.Autodiff.Model.Params model α) (x : Spec.Tensor α σ) (seedOut : Spec.Tensor α τ) :

                                    VJP w.r.t. the model input.

                                    This returns a one-element TensorPack to match the general "inputs list" API shape. For the common case, use vjpInput to get the tensor directly.

                                    Instances For
                                      def NN.API.autograd.model.vjpInput {σ τ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (params : TorchLean.Autodiff.Model.Params model α) (x : Spec.Tensor α σ) (seedOut : Spec.Tensor α τ) :
                                      IO (Spec.Tensor α σ)

                                      Vector-Jacobian product with respect to the single model input tensor.

                                      Instances For

                                        Reverse-mode Jacobian (jacrev) of the model output w.r.t. parameters.

                                        Returns an array of parameter-structured gradients: one entry per output coordinate. This mirrors the usual "jacrev returns a stack of per-output gradients" shape.

                                        Instances For
                                          def NN.API.autograd.model.jvpParams {σ τ υ : Spec.Shape} (model : TorchLean.NN.Seq σ τ) (loss : TorchLean.Autodiff.Model.OutputLoss τ υ) {α : Type} [Semantics.Scalar α] [DecidableEq Spec.Shape] (params : TorchLean.Autodiff.Model.Params model α) (x : Spec.Tensor α σ) (target : Spec.Tensor α υ) (vparams : TorchLean.Autodiff.Model.Params model α) :
                                          IO α

                                          Jacobian-vector product (JVP) of a scalar loss w.r.t. parameters.

                                          This is the directional derivative in the direction vparams. Conceptually: d/dt loss(params + t*vparams, x, target) | t = 0.

                                          Instances For

                                            Hessian-vector product (HVP) of a scalar loss w.r.t. parameters.

                                            Returns a parameter-structured tensor list of the same shape as params.

                                            Instances For

                                              In PyTorch terms, this is the "functorch" style: differentiate plain functions, not modules.

                                              @[reducible, inline]

                                              Type of a pure tensor function expressed in RefTy form.

                                              This matches the calling convention expected by TorchLean.Program/autodiff compilation.

                                              Instances For

                                                Forward-mode Jacobian (jacfwd) for a pure tensor function.

                                                Instances For

                                                  Hessian for a scalar-valued function.

                                                  Instances For

                                                    Vector-Jacobian product (VJP) for a pure function.

                                                    Instances For

                                                      Reverse-mode Jacobian (jacrev) of a pure tensor function.

                                                      Returns the Jacobian rows as an array of doutput/dinput tensors.

                                                      Instances For

                                                        Gradient of a scalar-valued function w.r.t. its input.

                                                        Instances For

                                                          Return (value, grad) for a scalar-valued function at x.

                                                          Instances For

                                                            valueAndGrad, but convert the 0-dim value tensor to a scalar α.

                                                            Instances For