TorchLean API

NN.Runtime.Autograd.Engine.Core.Core

Engine Core #

A small dynamic (DAG) autograd engine.

This is intended to be the "runtime" counterpart to Spec.OpSpec: instead of manually writing backward passes end-to-end, you build a tape during a forward pass and then call backward.

Design goals:

Scope boundaries:

References (PyTorch / background reading):

Core Types #

The eager autograd engine is built out of a few small pieces:

@[reducible, inline]

Runtime error monad for the eager autograd engine.

We use plain Except String (instead of IO exceptions) so the tape constructors remain pure and easy to test. Front-ends that prefer exceptions can use okOrThrow.

Instances For

    Convert an Autograd.Result into an IO action by throwing IO.userError on failure.

    This is mainly used by the imperative Torch/TorchLean front-ends to keep their code readable.

    Instances For

      Pack a typed tensor as a runtime AnyTensor.

      This is the primary bridge between the dependently-typed Tensor α s world and the dynamic tape, which stores heterogeneous shapes in a single array.

      Instances For
        def Runtime.Autograd.AnyTensor.cast {α : Type} {s₂ : Spec.Shape} (t : AnyTensor α) (h : t.s = s₂) :
        Spec.Tensor α s₂

        Cast an AnyTensor to a specific shape, given an equality proof.

        This is used after dynamic shape checks (e.g. Tape.requireValue).

        Instances For

          Accumulate two AnyTensor values by elementwise addition, with a dynamic shape check.

          This is the heart of DAG support: if two different paths contribute gradients to the same parent, we sum the contributions.

          Instances For
            structure Runtime.Autograd.Node (α : Type) :

            A tape node representing a single tensor value in the recorded computation graph.

            Fields:

            • value: the forward value (shape-erased).
            • parents: ids of parent nodes in the tape.
            • backward: a local VJP rule. Given an upstream cotangent for value, it returns a list of (parentId, parentCotangent) contributions (one per parent, usually).

            PyTorch comparison: analogous to an autograd Function instance + saved tensors, but here we store the backward closure directly.

            • name : Option String

              Optional node name used for debugging and pretty-printing.

            • value : AnyTensor α

              Forward value computed at this node (shape-erased).

            • requires_grad : Bool

              Whether reverse-mode propagation should visit this node.

              If false, reverse-mode traversal skips this node and does not accumulate gradients into it.

            • parents : List

              Parent node ids (dependencies) in the tape.

            • backward : AnyTensor αResult (List ( × AnyTensor α))

              Local VJP rule for this node.

              Given an upstream cotangent for value, return a list of (parentId, parentCotangent) contributions. If multiple children contribute to the same parent, the engine will sum contributions via AnyTensor.add.

            Instances For
              structure Runtime.Autograd.Tape (α : Type) :

              Autograd tape: a grow-only array of nodes.

              Node ids are array indices (Nat). All ops append exactly one node and return its id. This makes it easy to implement reverse-mode by traversing ids in reverse order.

              • nodes : Array (Node α)

                Tape nodes in evaluation order.

                Node ids are array indices (Nat). Each tape op appends exactly one node and returns its id.

              Instances For

                Tape Construction #

                The Tape namespace provides pure constructors for building a recorded computation graph. Each op appends exactly one node and returns the updated tape plus the new node id.

                If you prefer an implicit tape-threading style, see NN.Runtime.Autograd.Engine.TapeM.

                Empty tape (no nodes).

                Instances For

                  Number of nodes stored in the tape.

                  Instances For
                    def Runtime.Autograd.Tape.getNode? {α : Type} (t : Tape α) (id : ) :

                    Read a node by id (returns none if out of bounds).

                    Instances For
                      def Runtime.Autograd.Tape.getValue? {α : Type} (t : Tape α) (id : ) :

                      Read just the stored forward value for a node id.

                      Instances For
                        def Runtime.Autograd.Tape.addNode {α : Type} (t : Tape α) (node : Node α) :

                        Append a node and return its id.

                        Invariant: the returned id is t.size, the pre-append size of the tape.

                        Instances For
                          @[simp]
                          theorem Runtime.Autograd.Tape.addNode_id {α : Type} (t : Tape α) (node : Node α) :
                          (t.addNode node).2 = t.size

                          addNode returns the current tape size as the fresh node id.

                          @[simp]
                          theorem Runtime.Autograd.Tape.size_addNode {α : Type} (t : Tape α) (node : Node α) :
                          (t.addNode node).1.size = t.size + 1

                          Appending a node increases the tape size by one.

                          def Runtime.Autograd.Tape.leaf {α : Type} {s : Spec.Shape} (t : Tape α) (value : Spec.Tensor α s) (name : Option String := none) (requires_grad : Bool := true) :

                          Add a leaf node (no parents).

                          PyTorch comparison: a tensor that enters the graph as a leaf (e.g. input or parameter value).

                          Instances For

                            Read a typed tensor value from a tape node id.

                            This is the main "dynamic check" boundary in the eager runtime:

                            • fails if the id is invalid, or
                            • fails if the stored runtime shape doesn't match the expected dependent shape s.
                            Instances For

                              Read a typed upstream gradient tensor from a runtime AnyTensor.

                              This is the backward analogue of Tape.requireValue: it checks that the upstream gradient has the expected shape τ and then performs the dependent cast.

                              Instances For
                                def Runtime.Autograd.Tape.unary {α : Type} [DecidableEq Spec.Shape] {σ τ : Spec.Shape} (t : Tape α) (opName : String) (xId : ) (forward : Spec.Tensor α σSpec.Tensor α τ) (backward : Spec.Tensor α σSpec.Tensor α τSpec.Tensor α σ) :

                                Generic constructor for unary ops.

                                You provide:

                                • forward : Tensor α σ → Tensor α τ
                                • backward : Tensor α σ → Tensor α τ → Tensor α σ (a VJP rule; note it may depend on the input)

                                The returned node stores the forward value and a backward closure that checks the upstream gradient's shape and returns the parent contribution.

                                Instances For