TorchLean API

NN.Proofs.Autograd.Tape.Core.Soundness

Soundness #

Tape-style (SSA/DAG) reverse-mode soundness for the proved-correct layer.

We model a dynamic graph as a sequence of nodes that may reference any previously computed values (so sharing/fan-out is allowed). For each node we assume a local JVP/VJP adjointness law, then prove the global reverse-mode accumulation algorithm is sound.

This is a proof-only layer; the runtime engine in NN.Runtime.Autograd.Engine is an executable implementation of the same idea.

PyTorch correspondence / citations #


theorem Proofs.Autograd.tensor_cast_shape_proof_irrel {α : Type} {s₁ s₂ : Spec.Shape} (t : Spec.Tensor α s₁) (p q : s₁ = s₂) :
p ▸ t = q ▸ t

Casting a tensor along an equality of shapes does not depend on which proof of the equality is used.
@[reducible, inline]
abbrev Proofs.Autograd.TList (ss : List Spec.Shape) : Type

A heterogeneous list of tensors indexed by a list of shapes.

This is the “typed context” used by the tape model: TList Γ stores one tensor for each shape in the list Γ.

PyTorch analogy: the tape stores “saved tensors”/intermediates for backward, but PyTorch stores them in an untyped runtime list; here the shapes are tracked in the type.

Implementation note: we reuse the type-level context container from the backend-generic tape development (Proofs.Autograd.Algebra.TList) and specialize it to Spec.Tensor. This avoids duplicating the basic “heterogeneous list indexed by shapes” encoding in two different places.

Instances For
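
To make the encoding concrete, here is a minimal standalone sketch of the same idea. All names in it (Shape, Tensor, the Float payload) are illustrative stand-ins for the library's Spec.Shape and Spec.Tensor, not its actual definitions; the later sketches in this section extend this toy development.

```lean
-- Minimal standalone sketch of a shape-indexed heterogeneous list.
-- `Shape` and `Tensor` are stand-ins, not the library's `Spec.Shape`/`Spec.Tensor`.
abbrev Shape := List Nat

/-- Stand-in tensor: flat data with a phantom shape index. -/
structure Tensor (s : Shape) where
  data : Array Float

/-- One tensor per shape in the index list; the shapes live in the type. -/
inductive TList : List Shape → Type where
  | nil  : TList []
  | cons : {s : Shape} → {ss : List Shape} → Tensor s → TList ss → TList (s :: ss)
```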
    @[reducible, inline]
    abbrev Proofs.Autograd.TList.nil : TList []

    Constructor aliases for the specialized TList.

    We reuse the inductive type from Proofs.Autograd.Algebra.TList, so its constructors are actually Proofs.Autograd.Algebra.TList.nil/cons. A few analytic files expect the shorter names Proofs.Autograd.TList.nil/cons, so we provide them here as abbreviations.

    Instances For
      @[reducible, inline]
      abbrev Proofs.Autograd.TList.cons {s : Spec.Shape} {ss : List Spec.Shape} (x : Spec.Tensor s) (xs : TList ss) :
      TList (s :: ss)

      Constructor alias for TList.cons specialized to Spec.Tensor.

      Instances For
        @[reducible, inline]
        abbrev Proofs.Autograd.TList.get {ss : List Spec.Shape} :
        TList ss → (i : Fin ss.length) → Spec.Tensor (ss.get i)

        Get the ith tensor from a context, with its shape tracked by List.get.

        Instances For
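
Continuing the standalone sketch, one possible implementation recurses on the context and the Fin index in lockstep; the result shape Tensor (ss.get i) is computed from the shape list, so no runtime check is needed:

```lean
-- Continuing the sketch: look up the i-th entry. The two Fin.mk proofs that
-- arise differ only propositionally, which Lean accepts by proof irrelevance.
def TList.get : {ss : List Shape} → TList ss → (i : Fin ss.length) → Tensor (ss.get i)
  | _, .cons x _,  ⟨0, _⟩     => x
  | _, .cons _ xs, ⟨i + 1, h⟩ => TList.get xs ⟨i, Nat.lt_of_succ_lt_succ h⟩
```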
          @[reducible, inline]

          All-zero context (fills each tensor entry with zeros).

          Instances For
            @[reducible, inline]
            abbrev Proofs.Autograd.TList.add {ss : List Spec.Shape} :
            TList ss → TList ss → TList ss

            Pointwise addition of two contexts of the same shape list.

            Instances For
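
In the sketch, the all-zero context and pointwise addition can be written as below; a per-shape zero tensor and an Add instance on the stand-in tensors are assumptions of the sketch:

```lean
-- Continuing the sketch: an all-zero context, given a zero tensor per shape.
def TList.zeros (zero : (s : Shape) → Tensor s) : (ss : List Shape) → TList ss
  | []      => .nil
  | s :: ss => .cons (zero s) (TList.zeros zero ss)

-- Pointwise addition: the shared index `ss` guarantees the entries line up.
def TList.add [∀ s, Add (Tensor s)] : {ss : List Shape} → TList ss → TList ss → TList ss
  | _, .nil,       .nil       => .nil
  | _, .cons x xs, .cons y ys => .cons (x + y) (TList.add xs ys)
```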
              @[reducible, inline]
              abbrev Proofs.Autograd.TList.snoc {τ : Spec.Shape} {ss : List Spec.Shape} :
              TList ss → Spec.Tensor τ → TList (ss ++ [τ])

              Append a tensor to the end of a context.

              Instances For
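
In the sketch, appending on the right is structural recursion on the prefix; the index ss ++ [τ] records the growth:

```lean
-- Continuing the sketch: append one tensor at the end, mirroring how a tape
-- appends one node result at a time.
def TList.snoc {τ : Shape} : {ss : List Shape} → TList ss → Tensor τ → TList (ss ++ [τ])
  | _, .nil,       t => .cons t .nil
  | _, .cons x xs, t => .cons x (TList.snoc xs t)
```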
                @[reducible, inline]

                Split a context of shape list ss ++ [τ] into its prefix and last tensor.

                Instances For
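
A sketch of the inverse direction; recursion is on the prefix shape list, which therefore appears explicitly in the match:

```lean
-- Continuing the sketch: split `prefix ++ [last]` back into its parts.
def TList.unsnoc {τ : Shape} : {ss : List Shape} → TList (ss ++ [τ]) → TList ss × Tensor τ
  | [],     .cons t .nil => (.nil, t)
  | _ :: _, .cons x xs   =>
    let (init, last) := TList.unsnoc xs
    (.cons x init, last)
```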

                  Dot product over contexts: sum of per-entry tensor dot products.

                  Informally: dotList xs ys is the “context inner product” used to state global adjointness for tape evaluation and backprop.

                  Instances For
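
In the sketch, with a per-shape dot product taken as a parameter:

```lean
-- Continuing the sketch: the context inner product, a sum of per-entry dots.
def TList.dotList (dot : {s : Shape} → Tensor s → Tensor s → Float) :
    {ss : List Shape} → TList ss → TList ss → Float
  | _, .nil,       .nil       => 0
  | _, .cons x xs, .cons y ys => dot x y + TList.dotList dot xs ys
```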
                    @[reducible, inline]
                    abbrev Proofs.Autograd.TList.cast {ss₁ ss₂ : List Spec.Shape} (h : ss₁ = ss₂) (xs : TList ss₁) :
                    TList ss₂

                    Cast a context along an equality of shape lists.

                    Instances For
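
In the sketch this is just transport along the equality:

```lean
-- Continuing the sketch: `▸` rewrites the type of `xs` along `h`.
def TList.cast {ss₁ ss₂ : List Shape} (h : ss₁ = ss₂) (xs : TList ss₁) : TList ss₂ :=
  h ▸ xs
```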
                      theorem Proofs.Autograd.TList.dotList_cast_left {ss₁ ss₂ : List Spec.Shape} (h : ss₁ = ss₂) (x : TList ss₁) (y : TList ss₂) :
                      (cast h x).dotList y = x.dotList (cast h.symm y)

                      Casting the left argument of a context dot product equals casting the right argument along the symmetric equality.

                      dotList is linear in its right argument with respect to TList.add.

                      Informally: ⟪x, y + z⟫ = ⟪x, y⟫ + ⟪x, z⟫ for contexts.

                      theorem Proofs.Autograd.TList.dotList_snoc {ss : List Spec.Shape} {τ : Spec.Shape} (x y : TList ss) (a b : Spec.Tensor τ) :
                      (x.snoc a).dotList (y.snoc b) = x.dotList y + Spec.dot a b

                      dotList respects appending: dot of two snoced contexts splits into prefix + last entry.

                      Informally: ⟪(x,a), (y,b)⟫ = ⟪x,y⟫ + ⟪a,b⟫.

                      theorem Proofs.Autograd.TList.unsnoc_snoc {ss : List Spec.Shape} {τ : Spec.Shape} (xs : TList ss) (t : Spec.Tensor τ) :
                      (xs.snoc t).unsnoc = (xs, t)

                      unsnoc is a left inverse of snoc.

                      theorem Proofs.Autograd.TList.snoc_unsnoc {ss : List Spec.Shape} {τ : Spec.Shape} (xs : TList (ss ++ [τ])) :
                      xs.unsnoc.1.snoc xs.unsnoc.2 = xs

                      snoc is a left inverse of unsnoc; equivalently, unsnoc is a right inverse of snoc.

                      Dotting any tensor with a zero-filled tensor gives 0.

                      This is the tensor-level fact used to show that “one-hot” cotangents behave as expected.

                      dotList x 0 = 0 for the all-zero context.

                      An index into a heterogeneous context, carrying a proof of the expected shape.

                      This lets us talk about “the ith saved tensor has shape s” without losing the shape invariant.

                      Instances For
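
One way to realize this in the standalone sketch is a de Bruijn-style index whose type carries the shape it points at; the library's actual Idx may instead package a Fin with an equality proof:

```lean
-- Continuing the sketch: a context pointer that remembers, in its type,
-- the shape of the entry it designates.
inductive Idx : List Shape → Shape → Type where
  | here  : {s : Shape} → {ss : List Shape} → Idx (s :: ss) s
  | there : {s t : Shape} → {ss : List Shape} → Idx ss s → Idx (t :: ss) s
```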
                        def Proofs.Autograd.getIdx {Γ : List Spec.Shape} {s : Spec.Shape} (xs : TList Γ) (idx : Idx Γ s) :
                        Spec.Tensor s

                        Read an element from a context using an index with an attached shape proof.

                        Instances For
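
In the sketch, reading through such an index needs no shape check, because the types line up by construction:

```lean
-- Continuing the sketch: follow the index; each `.there` skips one entry.
def TList.getIdx {s : Shape} : {Γ : List Shape} → TList Γ → Idx Γ s → Tensor s
  | _, .cons x _,  .here      => x
  | _, .cons _ xs, .there idx => TList.getIdx xs idx
```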

                          Build a sparse context with a single nonzero entry at idx and zeros elsewhere.

                          This is used to express “one-hot” cotangents when proving local-to-global backprop correctness.

                          Instances For
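
In the sketch, a one-hot context places v at the index and the assumed zero tensor everywhere else:

```lean
-- Continuing the sketch: zero everywhere except at `idx`.
def TList.single {s : Shape} (zero : (s : Shape) → Tensor s) :
    {Γ : List Shape} → Idx Γ s → Tensor s → TList Γ
  | _, .here,      v => .cons v (TList.zeros zero _)
  | _, .there idx, v => .cons (zero _) (TList.single zero idx v)
```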
                            theorem Proofs.Autograd.TList.dotList_single {Γ : List Spec.Shape} {s : Spec.Shape} (dx : TList Γ) (idx : Idx Γ s) (v : Spec.Tensor s) :
                            dx.dotList (single idx v) = Spec.dot (getIdx dx idx) v

                            single idx v is the “one-hot” context with value v at idx, and zeros elsewhere.

                            This lemma says the context dot product against single idx v picks out the corresponding entry of dx:

                            ⟪dx, single idx v⟫ = ⟪dx[idx], v⟫.

                            A node with local JVP/VJP and an adjointness proof against the tensor dot product.

                            Instances For
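
A plausible shape for such a node in the standalone sketch: it consumes the whole previously computed context, produces one tensor, and carries the local adjointness law as data. The library's actual Node may differ, e.g. in how it references parents:

```lean
-- Continuing the sketch: a node bundles forward function, JVP, VJP, and the
-- local law ⟪jvp x dx, dy⟫ = ⟪dx, vjp x dy⟫ that soundness lifts globally.
structure Node (dot : {s : Shape} → Tensor s → Tensor s → Float)
    (Γ : List Shape) (τ : Shape) where
  fwd : TList Γ → Tensor τ
  jvp : TList Γ → TList Γ → Tensor τ     -- primal context, tangent context
  vjp : TList Γ → Tensor τ → TList Γ     -- primal context, output cotangent
  adjoint : ∀ (x dx : TList Γ) (dy : Tensor τ),
    dot (jvp x dx) dy = TList.dotList dot dx (vjp x dy)
```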

                              A tape/SSA graph: nodes are appended in topological order and may reference any previous value.

                              Instances For
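
In the sketch, a tape is a snoc-list of nodes; the node appended after intermediates ss sees the context Γ ++ ss, so it may read the inputs and every earlier value (fan-out is free):

```lean
-- Continuing the sketch: nodes are appended in topological order.
inductive Graph (dot : {s : Shape} → Tensor s → Tensor s → Float)
    (Γ : List Shape) : List Shape → Type where
  | nil  : Graph dot Γ []
  | snoc : {ss : List Shape} → {τ : Shape} →
      Graph dot Γ ss → Node dot (Γ ++ ss) τ → Graph dot Γ (ss ++ [τ])
```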
                                def Proofs.Autograd.Graph.eval {Γ ss : List Spec.Shape} (g : Graph Γ ss) (x : TList Γ) :
                                TList (Γ ++ ss)

                                Evaluate a tape/graph, returning the full context (inputs ++ intermediates).

                                Instances For
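
In the sketch, evaluation threads the growing context through the tape; the ▸ casts do the Γ ++ [] = Γ and associativity bookkeeping that TList.cast and dotList_cast_left handle in the library:

```lean
-- Continuing the sketch: run the tape left to right, snoc-ing each node's
-- output onto the context of everything computed so far.
def Graph.eval {dot : {s : Shape} → Tensor s → Tensor s → Float} {Γ : List Shape} :
    {ss : List Shape} → Graph dot Γ ss → TList Γ → TList (Γ ++ ss)
  | _, .nil,      x => (List.append_nil Γ).symm ▸ x
  | _, .snoc g n, x =>
    let ctx := Graph.eval g x            -- inputs ++ intermediates so far
    List.append_assoc Γ _ _ ▸ TList.snoc ctx (n.fwd ctx)
```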
                                  def Proofs.Autograd.Graph.jvpCtx {Γ ss : List Spec.Shape} (g : Graph Γ ss) (x dx : TList Γ) :
                                  TList (Γ ++ ss)

                                  Evaluate the JVP (“forward-mode tangent”) of a graph, producing tangents for all values in the extended context Γ ++ ss.

                                  Instances For
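
The forward-mode pass in the sketch mirrors eval, carrying one tangent per computed value:

```lean
-- Continuing the sketch: each node's jvp consumes the primal and tangent
-- contexts of everything before it.
def Graph.jvpCtx {dot : {s : Shape} → Tensor s → Tensor s → Float} {Γ : List Shape} :
    {ss : List Shape} → Graph dot Γ ss → TList Γ → TList Γ → TList (Γ ++ ss)
  | _, .nil,      _, dx => (List.append_nil Γ).symm ▸ dx
  | _, .snoc g n, x, dx =>
    let primals  := Graph.eval g x
    let tangents := Graph.jvpCtx g x dx
    List.append_assoc Γ _ _ ▸ TList.snoc tangents (n.jvp primals tangents)
```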
                                    def Proofs.Autograd.Graph.backpropCtx {Γ ss : List Spec.Shape} (g : Graph Γ ss) (x : TList Γ) (seed : TList (Γ ++ ss)) :
                                    TList Γ

                                    Reverse-mode backpropagation on a tape/graph, returning gradients for the inputs Γ.

                                    This is the proof model of what PyTorch calls “running backward” starting from an output seed cotangent and accumulating gradients at shared parents.

                                    Instances For
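
A sketch of the reverse pass: walk the tape from the last node backwards, pop its cotangent, send it to all parents through the node's vjp, and accumulate with TList.add; the accumulation step is where gradients from shared parents (fan-out) get summed. Re-running Graph.eval at each step keeps the sketch short; a real implementation evaluates once and saves the intermediates, which is exactly the role of the tape's saved context:

```lean
-- Continuing the sketch: reverse accumulation over the tape.
def Graph.backpropCtx [∀ s, Add (Tensor s)]
    {dot : {s : Shape} → Tensor s → Tensor s → Float} {Γ : List Shape} :
    {ss : List Shape} → Graph dot Γ ss → TList Γ → TList (Γ ++ ss) → TList Γ
  | _, .nil,      _, seed => List.append_nil Γ ▸ seed
  | _, .snoc g n, x, seed =>
    let primals := Graph.eval g x
    -- split off the cotangent of the last node ...
    let (rest, dy) := TList.unsnoc ((List.append_assoc Γ _ _).symm ▸ seed)
    -- ... push it to the parents and accumulate (fan-out sums here)
    Graph.backpropCtx g x (TList.add rest (n.vjp primals dy))
```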
                                      theorem Proofs.Autograd.Graph.backprop_correct {Γ ss : List Spec.Shape} (g : Graph Γ ss) (x dx : TList Γ) (seed : TList (Γ ++ ss)) :
                                      (g.jvpCtx x dx).dotList seed = dx.dotList (g.backpropCtx x seed)

                                      Global tape soundness: if each node satisfies a local JVP/VJP adjointness law, then the global reverse-mode accumulation algorithm (backpropCtx) is correct.

                                      Informally: for any input perturbation dx and any output seed cotangent seed,

                                      ⟪JVP(g, x, dx), seed⟫ = ⟪dx, backprop(g, x, seed)⟫.

                                      This is the formal analogue of PyTorch’s guarantee that backward() computes vector–Jacobian products and accumulates them through a dynamic DAG/tape.
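
In essence, the inductive step of the proof is a three-line calculation with the lemmas above. Split the tangent context as (dc, dy) with dy = jvp(x, dc), and the seed as (c̄, ȳ); then

⟪(dc, dy), (c̄, ȳ)⟫ = ⟪dc, c̄⟫ + ⟪dy, ȳ⟫ (by dotList_snoc)
= ⟪dc, c̄⟫ + ⟪dc, vjp(x, ȳ)⟫ (by the node's local adjointness law)
= ⟪dc, c̄ + vjp(x, ȳ)⟫ (by linearity of dotList in its right argument),

and the right-hand side is the dot product against exactly the seed handed to the shorter tape, so the induction hypothesis closes the proof.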