TorchLean API

NN.Proofs.Autograd.Tape.Core.FDeriv

FDeriv #

Analytic (HasFDerivAt/fderiv) correctness for tape-style SSA/DAG graphs.

NN/Proofs/Autograd/Tape/Core/Soundness.lean proves the global JVP/VJP adjointness law for DAG graphs against the tensor dot product.

This file adds the analytic upgrade (spec-level, over ℝ).

PyTorch correspondence / citations #

noncomputable def Proofs.Autograd.toVecT {s : Spec.Shape} (t : Spec.Tensor s) :
Vec s.size

Vectorize a tensor by flattening it (spec flattening order) and then using the Euclidean equivalence.

Instances For
    noncomputable def Proofs.Autograd.ofVecT {s : Spec.Shape} (v : Vec s.size) :
    Spec.Tensor s

    Inverse of toVecT: interpret a vector as a tensor of shape s.

    Instances For
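
The pair toVecT/ofVecT is, conceptually, the identity on the underlying coordinate data. Below is a minimal self-contained sketch of the pattern over mathlib's EuclideanSpace, with a toy Tensor type standing in for Spec.Tensor; the names Vec, Tensor, toVecT', ofVecT' are illustrative, not TorchLean's definitions.

```lean
import Mathlib

-- Toy model: a tensor is (already) its flattened coordinate function.
-- The real `Spec.Tensor` and its flattening order live in the TorchLean spec.
abbrev Vec (n : ℕ) : Type := EuclideanSpace ℝ (Fin n)

structure Tensor (n : ℕ) where
  coord : Fin n → ℝ

-- Vectorize: take the coordinates, then impose the Euclidean (L²) structure.
noncomputable def toVecT' {n : ℕ} (t : Tensor n) : Vec n :=
  WithLp.toLp 2 t.coord

-- Inverse: read the coordinates back off.
noncomputable def ofVecT' {n : ℕ} (v : Vec n) : Tensor n :=
  ⟨WithLp.ofLp v⟩

-- The round trip is definitionally the identity.
example {n : ℕ} (t : Tensor n) : ofVecT' (toVecT' t) = t := rfl
```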

      Total number of scalar coordinates in a heterogeneous context shape list.

      Instances For
        @[reducible, inline]

        A vectorized context: one Euclidean vector containing all TList Γ entries concatenated.

        Instances For
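
A hedged sketch of the size bookkeeping, with ℕ standing in for Spec.Shape (each shape contributes its size in scalars); ctxSize' and CtxVec' below are illustrative stand-ins for the TorchLean definitions.

```lean
import Mathlib

-- Each shape contributes `size` scalar coordinates; a context's size is the sum.
def ctxSize' : List ℕ → ℕ
  | [] => 0
  | s :: Γ => s + ctxSize' Γ

-- A vectorized context: one Euclidean vector of the total size.
abbrev CtxVec' (Γ : List ℕ) := EuclideanSpace ℝ (Fin (ctxSize' Γ))

example : ctxSize' [2, 3, 4] = 9 := rfl
```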
          noncomputable def Proofs.Autograd.vecOfFun {n : ℕ} (f : Fin n → ℝ) :
          Vec n

          Build a Euclidean vector from its coordinate function.

          This is the small helper used throughout the file when reindexing vectors across context concatenation and shape casts.

          Instances For
            @[simp]
            theorem Proofs.Autograd.vecOfFun_apply {n : ℕ} (f : Fin n → ℝ) (i : Fin n) :
            (vecOfFun f).ofLp i = f i
            @[simp]
            theorem Proofs.Autograd.vecOfFun_eta {n : ℕ} (v : Vec n) :
            (vecOfFun fun (i : Fin n) => v.ofLp i) = v
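
Over EuclideanSpace, "build a vector from its coordinate function" is just mathlib's WithLp.toLp 2, and both simp lemmas above are then rfl-true. A sketch in the toy model, not TorchLean's definition:

```lean
import Mathlib

-- `vecOfFun_apply`, in the toy model: reading a coordinate back is `rfl`.
example {n : ℕ} (f : Fin n → ℝ) (i : Fin n) :
    WithLp.ofLp (WithLp.toLp 2 f : EuclideanSpace ℝ (Fin n)) i = f i := rfl

-- `vecOfFun_eta`, in the toy model: rebuilding from the coordinates is `rfl`.
example {n : ℕ} (v : EuclideanSpace ℝ (Fin n)) :
    (WithLp.toLp 2 fun i => WithLp.ofLp v i) = v := rfl
```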

            Flatten a typed context TList Γ into one big Euclidean vector.

            Unlike PyTorch’s runtime “saved tensor list”, this is an actual typed isomorphism: shapes are tracked in Γ, so the split points are definitional from ctxSize.

            Instances For

              Inverse of flattenCtx: split a CtxVec Γ back into a TList Γ.

              Instances For
                noncomputable def Proofs.Autograd.castVec {n m : ℕ} (h : n = m) :
                Vec n → Vec m

                Cast a Vec n to Vec m along an equality, by reindexing coordinates.

                Instances For
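
A sketch of the construction over mathlib primitives (castVec' is an illustrative name): reindex every coordinate along Fin.cast h.symm.

```lean
import Mathlib

-- Reindex coordinates along the equality; no data moves, only indices.
noncomputable def castVec' {n m : ℕ} (h : n = m)
    (v : EuclideanSpace ℝ (Fin n)) : EuclideanSpace ℝ (Fin m) :=
  WithLp.toLp 2 fun i => WithLp.ofLp v (Fin.cast h.symm i)

-- At `h = rfl` the reindexing is definitionally the identity (`castVec_rfl`).
example {n : ℕ} (v : EuclideanSpace ℝ (Fin n)) : castVec' rfl v = v := rfl
```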
                  @[simp]
                  theorem Proofs.Autograd.castVec_apply {n m : ℕ} (h : n = m) (v : Vec n) (i : Fin m) :
                  (castVec h v).ofLp i = v.ofLp (Fin.cast ⋯ i)
                  @[simp]
                  theorem Proofs.Autograd.castVec_rfl {n : ℕ} (v : Vec n) :
                  castVec ⋯ v = v
                  @[simp]
                  theorem Proofs.Autograd.castVec_add {n m : ℕ} (h : n = m) (u v : Vec n) :
                  castVec h (u + v) = castVec h u + castVec h v
                  @[simp]
                  theorem Proofs.Autograd.castVec_smul {n m : ℕ} (h : n = m) (r : ℝ) (v : Vec n) :
                  castVec h (r • v) = r • castVec h v
                  @[simp]
                  theorem Proofs.Autograd.castVec_castVec {n m k : ℕ} (h₁ : n = m) (h₂ : m = k) (v : Vec n) :
                  castVec h₂ (castVec h₁ v) = castVec ⋯ v
                  theorem Proofs.Autograd.inner_castVec_castVec {n m : ℕ} (h : n = m) (x y : Vec n) :
                  inner (castVec h x) (castVec h y) = inner x y

                  castVec preserves the Euclidean inner product.

                  This is the core “cast isometry” lemma used throughout the vectorized graph development.
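
The reason casts are isometries: the inner product is a finite coordinate sum, and sums are invariant under reindexing along a bijection. In mathlib vocabulary (a sketch, not the TorchLean proof):

```lean
import Mathlib

-- Reindexing a finite sum along the bijection `finCongr h : Fin n ≃ Fin m`
-- leaves it unchanged; the cast-isometry lemma is this fact applied to the
-- coordinatewise products making up the inner product.
example {n m : ℕ} (h : n = m) (f : Fin m → ℝ) :
    ∑ i : Fin n, f (Fin.cast h i) = ∑ j : Fin m, f j :=
  Fintype.sum_equiv (finCongr h) _ _ (fun _ => rfl)
```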

                  theorem Proofs.Autograd.sum_spec_dim {n : ℕ} {s : Spec.Shape} (values : Fin n → Spec.Tensor s) :
                  (Spec.Tensor.dim values).sumSpec = ∑ i : Fin n, (values i).sumSpec

                  sum_spec over an outer dimension is a sum over slices.

                  This tensor-level “Fubini rule” is used to relate Spec.dot to Euclidean inner products after vectorization.
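
The finite "Fubini rule" in mathlib form, which is what the slice decomposition boils down to:

```lean
import Mathlib

-- Summing over a product index equals an iterated sum over slices.
example {n m : ℕ} (g : Fin n × Fin m → ℝ) :
    ∑ p : Fin n × Fin m, g p = ∑ i : Fin n, ∑ j : Fin m, g (i, j) :=
  Fintype.sum_prod_type
```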

                  theorem Proofs.Autograd.toVecT_dim_apply {n : ℕ} {s : Spec.Shape} (hmpos : 0 < s.size) (f : Fin n → Spec.Tensor s) (p : Fin n × Fin s.size) :

                  Coordinate characterization of toVecT on a tensor .dim n s.

                  Informally, the vectorization order is the standard product order induced by finProdFinEquiv.
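
Mathlib's finProdFinEquiv : Fin m × Fin n ≃ Fin (m * n) fixes the flattening order. A small concrete instance, checked by decide:

```lean
import Mathlib

-- The pair (1, 2) in `Fin 2 × Fin 3` lands at flat index 5.
example : (finProdFinEquiv ((1 : Fin 2), (2 : Fin 3))).val = 5 := by decide
```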

                  theorem Proofs.Autograd.inner_toVecT_dim {n : ℕ} {s : Spec.Shape} (a b : Fin n → Spec.Tensor s) :
                  inner (toVecT (Spec.Tensor.dim a)) (toVecT (Spec.Tensor.dim b)) = ∑ i : Fin n, inner (toVecT (a i)) (toVecT (b i))

                  toVecT turns dot products on .dim n s into sums of Euclidean inner products over slices.

                  Main compatibility lemma: tensor dot equals Euclidean inner product of vectorizations.

                  This is the bridge between soundness.lean (stated using Spec.dot) and the analytic theorems here (stated using Euclidean inner).

                  noncomputable def Proofs.Autograd.appendVec {m n : ℕ} (a : Vec m) (b : Vec n) :
                  Vec (m + n)

                  Concatenate two Euclidean vectors using Fin.append.

                  Instances For
                    theorem Proofs.Autograd.inner_append {m n : ℕ} (a c : Vec m) (b d : Vec n) :
                    inner (appendVec a b) (appendVec c d) = inner a c + inner b d

                    Inner product of concatenated vectors splits as a sum of inner products.

                    TList.dotList equals Euclidean inner product of flattenCtx.

                    This shows that the “context inner product” used in tape soundness is exactly the Euclidean inner product on the vectorized context representation.

                    noncomputable def Proofs.Autograd.castCtxVec {Γ₁ Γ₂ : List Spec.Shape} (h : Γ₁ = Γ₂) :
                    CtxVec Γ₁ → CtxVec Γ₂

                    Cast a vectorized context along an equality of shape lists (reindexing coordinates).

                    Instances For
                      @[simp]
                      theorem Proofs.Autograd.castCtxVec_cast {Γ₁ Γ₂ Γ₃ : List Spec.Shape} (h₁ : Γ₁ = Γ₂) (h₂ : Γ₂ = Γ₃) (v : CtxVec Γ₁) :
                      castCtxVec h₂ (castCtxVec h₁ v) = castCtxVec ⋯ v

                      The next few lemmas are bookkeeping for splitting/concatenating vectorized contexts. They are “obvious” from the list structure of Γ, but it is useful to expose them as named facts so that the calculus proofs later can use them without redoing shape arithmetic.
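
At the index level this bookkeeping reduces to two mathlib facts about Fin.append: prefix indices read from the first block, the rest from the second.

```lean
import Mathlib

-- Indices in the prefix read from the first block …
example {m n : ℕ} (u : Fin m → ℝ) (v : Fin n → ℝ) (i : Fin m) :
    Fin.append u v (Fin.castAdd n i) = u i :=
  Fin.append_left u v i

-- … and indices past the prefix read from the second block.
example {m n : ℕ} (u : Fin m → ℝ) (v : Fin n → ℝ) (j : Fin n) :
    Fin.append u v (Fin.natAdd m j) = v j :=
  Fin.append_right u v j
```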

                      theorem Proofs.Autograd.inner_castCtxVec {Γ₁ Γ₂ : List Spec.Shape} (h : Γ₁ = Γ₂) (x : CtxVec Γ₁) (y : CtxVec Γ₂) :
                      inner (castCtxVec h x) y = inner x (castCtxVec ⋯ y)

                      castCtxVec is inner-product preserving (up to flipping the cast on the other argument).

                      ctxSize respects list append (sizes add).

                      Specialized ctxSize_append for snoc (Γ ++ [τ]).

                      noncomputable def Proofs.Autograd.snocCtx {Γ : List Spec.Shape} {τ : Spec.Shape} (ctx : CtxVec Γ) (t : Vec τ.size) :
                      CtxVec (Γ ++ [τ])

                      Append one tensor-vector block to a vectorized context.

                      Instances For
                        noncomputable def Proofs.Autograd.unsnocCtx {Γ : List Spec.Shape} {τ : Spec.Shape} (ctx : CtxVec (Γ ++ [τ])) :
                        CtxVec Γ × Vec τ.size

                        Inverse of snocCtx: split CtxVec (Γ ++ [τ]) into its prefix and last block.

                        Instances For
                          theorem Proofs.Autograd.unsnocCtx_snocCtx {Γ : List Spec.Shape} {τ : Spec.Shape} (ctx : CtxVec Γ) (t : Vec τ.size) :
                          unsnocCtx (snocCtx ctx t) = (ctx, t)

                          unsnocCtx (snocCtx ctx t) = (ctx, t).

                          theorem Proofs.Autograd.snocCtx_unsnocCtx {Γ : List Spec.Shape} {τ : Spec.Shape} (ctx : CtxVec (Γ ++ [τ])) :
                          snocCtx (unsnocCtx ctx).1 (unsnocCtx ctx).2 = ctx

                          snocCtx (unsnocCtx ctx) = ctx.

                          noncomputable def Proofs.Autograd.Node.forwardVec {Γ : List Spec.Shape} {τ : Spec.Shape} (node : Node Γ τ) :
                          CtxVec Γ → Vec τ.size

                          Vectorized forward map of a tape Node: CtxVec Γ → Vec (Shape.size τ).

                          Instances For
                            noncomputable def Proofs.Autograd.Node.jvpVec {Γ : List Spec.Shape} {τ : Spec.Shape} (node : Node Γ τ) :
                            CtxVec Γ → CtxVec Γ → Vec τ.size

                            Vectorized JVP of a tape Node: the node-level forward-mode action on tangents.

                            Instances For
                              noncomputable def Proofs.Autograd.Node.vjpVec {Γ : List Spec.Shape} {τ : Spec.Shape} (node : Node Γ τ) :
                              CtxVec Γ → Vec τ.size → CtxVec Γ

                              Vectorized VJP of a tape Node: pushes a cotangent vector back to the input context.

                              Instances For
                                theorem Proofs.Autograd.Node.correct_inner {Γ : List Spec.Shape} {τ : Spec.Shape} (node : Node Γ τ) (ctxV dctxV : CtxVec Γ) (δV : Vec τ.size) :
                                inner (node.jvpVec ctxV dctxV) δV = inner dctxV (node.vjpVec ctxV δV)

                                Vectorized form of Node.correct (adjointness law).

                                Statement: ⟪jvp(x,dx), δ⟫ = ⟪dx, vjp(x,δ)⟫.
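
For a linear map this law is exactly the defining property of the Hilbert-space adjoint; in mathlib it is ContinuousLinearMap.adjoint_inner_right. A sketch on plain Euclidean spaces, not the TorchLean statement:

```lean
import Mathlib

open RealInnerProductSpace in
example {n m : ℕ} (A : EuclideanSpace ℝ (Fin n) →L[ℝ] EuclideanSpace ℝ (Fin m))
    (dx : EuclideanSpace ℝ (Fin n)) (δ : EuclideanSpace ℝ (Fin m)) :
    ⟪A dx, δ⟫_ℝ = ⟪dx, ContinuousLinearMap.adjoint A δ⟫_ℝ :=
  (ContinuousLinearMap.adjoint_inner_right A dx δ).symm
```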

                                def Proofs.Autograd.Graph.evalVec {Γ ss : List Spec.Shape} (g : Graph Γ ss) (xV : CtxVec Γ) :
                                CtxVec (Γ ++ ss)

                                Vectorized evaluation of a tape Graph.

                                Returns a CtxVec (Γ ++ ss) containing the original inputs and all intermediate node outputs.

                                Instances For
                                  def Proofs.Autograd.Graph.jvpVec {Γ ss : List Spec.Shape} (g : Graph Γ ss) (xV dxV : CtxVec Γ) :
                                  CtxVec (Γ ++ ss)

                                  Vectorized JVP for a whole graph: forward-mode derivative of evalVec.

                                  Instances For
                                    def Proofs.Autograd.Graph.backpropVec {Γ ss : List Spec.Shape} (g : Graph Γ ss) (xV : CtxVec Γ) (seedV : CtxVec (Γ ++ ss)) :
                                    CtxVec Γ

                                    Vectorized reverse-mode accumulation (VJP) for a whole graph.

                                    seedV is a cotangent for the entire Γ ++ ss context (inputs plus intermediates), matching the global tape soundness theorem.

                                    Instances For

                                      The next theorem is exactly soundness.lean rewritten into Euclidean vector form. It is the key input to later “backprop = (fderiv eval)†” proofs.

                                      theorem Proofs.Autograd.Graph.backprop_correct_inner {Γ ss : List Spec.Shape} (g : Graph Γ ss) (xV dxV : CtxVec Γ) (seedV : CtxVec (Γ ++ ss)) :
                                      inner (g.jvpVec xV dxV) seedV = inner dxV (g.backpropVec xV seedV)

                                      Vectorized tape soundness: ⟪jvp, seed⟫ = ⟪dx, backprop seed⟫.

                                      structure Proofs.Autograd.NodeFDerivCorrect {Γ : List Spec.Shape} {τ : Spec.Shape} (node : Node Γ τ) :

                                      Per-node analytic correctness assumption: JVP is the Fréchet derivative.

                                      This is the hypothesis that upgrades the dot-level soundness theorem into an fderiv statement.

                                      Instances For
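
A hedged sketch of the shape of such a structure (FDerivCorrect below is illustrative, with plain Euclidean types in place of Node):

```lean
import Mathlib

-- A global differentiability witness plus "the JVP is that derivative".
structure FDerivCorrect {n m : ℕ}
    (f : EuclideanSpace ℝ (Fin n) → EuclideanSpace ℝ (Fin m))
    (jvp : EuclideanSpace ℝ (Fin n) → EuclideanSpace ℝ (Fin n) →
      EuclideanSpace ℝ (Fin m)) : Prop where
  differentiable : Differentiable ℝ f
  jvp_eq : ∀ x dx, jvp x dx = fderiv ℝ f x dx
```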

                                        Graph predicate: every node satisfies NodeFDerivCorrect.

                                        Instances For
                                          structure Proofs.Autograd.NodeFDerivCorrectAt {Γ : List Spec.Shape} {τ : Spec.Shape} (node : Node Γ τ) (xV : CtxVec Γ) :

                                          Pointwise per-node analytic correctness.

                                          Used when a node is only differentiable under side conditions at a particular basepoint xV (e.g. inv, sqrt, log, or piecewise ops).

                                          Instances For
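
The pointwise pattern in miniature: y ↦ y⁻¹ only has a derivative away from 0, so an inv node can satisfy at best a pointwise hypothesis at nonzero basepoints.

```lean
import Mathlib

-- `hasDerivAt_inv` needs the side condition `x ≠ 0` at the basepoint.
example {x : ℝ} (hx : x ≠ 0) :
    HasDerivAt (fun y : ℝ => y⁻¹) (-(x ^ 2)⁻¹) x :=
  hasDerivAt_inv hx
```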
                                            def Proofs.Autograd.NodeFDerivCorrect.at {Γ : List Spec.Shape} {τ : Spec.Shape} {node : Node Γ τ} (hn : NodeFDerivCorrect node) (xV : CtxVec Γ) :
                                            NodeFDerivCorrectAt node xV

                                            Specialize a global NodeFDerivCorrect proof to a particular basepoint.

                                            This is the common “turn an everywhere-differentiable node into a pointwise differentiable node” adapter used when assembling GraphFDerivCorrectAt proofs.

                                            Instances For
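
In mathlib vocabulary, the adapter is the observation that Differentiable is, by definition, pointwise differentiability everywhere:

```lean
import Mathlib

example {f : ℝ → ℝ} (h : Differentiable ℝ f) (x : ℝ) :
    DifferentiableAt ℝ f x :=
  h x
```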

                                              Pointwise graph predicate: every node is differentiable at the actual intermediate values.

                                              Note the recursion uses Graph.evalVec to compute the basepoint for each successive node.

                                              Instances For
                                                noncomputable def Proofs.Autograd.Graph.appendCLM (m n : ℕ) :
                                                Vec m × Vec n →L[ℝ] Vec (m + n)

                                                Fin.append packaged as a continuous linear map on Euclidean vectors.

                                                Instances For
                                                  noncomputable def Proofs.Autograd.Graph.castCLM {n m : ℕ} (h : n = m) :
                                                  Vec n →L[ℝ] Vec m

                                                  castVec packaged as a continuous linear map (finite-dimensional, hence continuous).

                                                  Instances For
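
The "finite-dimensional, hence continuous" step is mathlib's LinearMap.toContinuousLinearMap. A sketch (clmOfLinear is an illustrative name):

```lean
import Mathlib

-- On a finite-dimensional domain, every linear map is automatically
-- continuous, so it upgrades to a continuous linear map.
noncomputable def clmOfLinear {n m : ℕ}
    (f : EuclideanSpace ℝ (Fin n) →ₗ[ℝ] EuclideanSpace ℝ (Fin m)) :
    EuclideanSpace ℝ (Fin n) →L[ℝ] EuclideanSpace ℝ (Fin m) :=
  f.toContinuousLinearMap
```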

                                                    Continuous linear map version of snocCtx (concatenation + cast).

                                                    Instances For
                                                      theorem Proofs.Autograd.Graph.hasFDerivAt_evalVec_and_jvp {Γ ss : List Spec.Shape} (g : Graph Γ ss) (hg : GraphFDerivCorrect g) (xV : CtxVec Γ) :
                                                      ∃ (D : CtxVec Γ →L[ℝ] CtxVec (Γ ++ ss)), HasFDerivAt g.evalVec D xV ∧ ∀ (dxV : CtxVec Γ), g.jvpVec xV dxV = D dxV

                                                      Main induction: evalVec is differentiable and its derivative agrees with jvpVec.

                                                      This is the technical heart of the jvp = fderiv theorem.
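
The inductive step composes per-node derivatives; at the mathlib level the kernel of that step is the chain rule:

```lean
import Mathlib

-- `HasFDerivAt.comp`: derivatives of composites compose.
example {f g : ℝ → ℝ} {Df Dg : ℝ →L[ℝ] ℝ} {x : ℝ}
    (hg : HasFDerivAt g Dg (f x)) (hf : HasFDerivAt f Df x) :
    HasFDerivAt (g ∘ f) (Dg.comp Df) x :=
  hg.comp x hf
```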

                                                      Convenience corollaries:

                                                      Once we know evalVec satisfies HasFDerivAt with derivative jvpVec, the rest is immediate: jvpVec = fderiv, and then backpropVec = (fderiv evalVec)† by the inner-product characterization of adjoints.
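
The first step of that upgrade is mathlib's HasFDerivAt.fderiv: once a candidate derivative is certified, fderiv is pinned down.

```lean
import Mathlib

example {f : ℝ → ℝ} {D : ℝ →L[ℝ] ℝ} {x : ℝ} (h : HasFDerivAt f D x) :
    fderiv ℝ f x = D :=
  h.fderiv
```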

                                                      theorem Proofs.Autograd.Graph.jvpVec_eq_fderiv {Γ ss : List Spec.Shape} (g : Graph Γ ss) (hg : GraphFDerivCorrect g) (xV dxV : CtxVec Γ) :
                                                      g.jvpVec xV dxV = (fderiv ℝ g.evalVec xV) dxV

                                                      Under GraphFDerivCorrect, the graph JVP equals the Fréchet derivative fderiv of evalVec.

                                                      theorem Proofs.Autograd.Graph.backpropVec_eq_adjoint_fderiv {Γ ss : List Spec.Shape} (g : Graph Γ ss) (hg : GraphFDerivCorrect g) (xV : CtxVec Γ) (seedV : CtxVec (Γ ++ ss)) :
                                                      g.backpropVec xV seedV = ContinuousLinearMap.adjoint (fderiv ℝ g.evalVec xV) seedV

                                                      Main analytic theorem: backpropVec equals the adjoint of the derivative of evalVec.

                                                      This is the proof-level formalization of “reverse-mode computes a VJP”, stated as an equality of linear maps in a Euclidean space.
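
The inner-product characterization used in this final step, in mathlib form: any map related to B by the adjointness law is ContinuousLinearMap.adjoint B.

```lean
import Mathlib

open RealInnerProductSpace in
example {n m : ℕ}
    (B : EuclideanSpace ℝ (Fin n) →L[ℝ] EuclideanSpace ℝ (Fin m))
    (A : EuclideanSpace ℝ (Fin m) →L[ℝ] EuclideanSpace ℝ (Fin n))
    (h : ∀ x y, ⟪A x, y⟫_ℝ = ⟪x, B y⟫_ℝ) :
    A = ContinuousLinearMap.adjoint B :=
  (ContinuousLinearMap.eq_adjoint_iff A B).mpr h
```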

                                                      theorem Proofs.Autograd.Graph.hasFDerivAt_evalVec_and_jvp_at {Γ ss : List Spec.Shape} (g : Graph Γ ss) (xV : CtxVec Γ) :
                                                      ∀ (a : GraphFDerivCorrectAt g xV), ∃ (D : CtxVec Γ →L[ℝ] CtxVec (Γ ++ ss)), HasFDerivAt g.evalVec D xV ∧ ∀ (dxV : CtxVec Γ), g.jvpVec xV dxV = D dxV

                                                      Pointwise induction: evalVec is differentiable at xV, and its derivative agrees with jvpVec.

                                                      This is the version used for graphs involving non-smooth or partial primitives, where we only assume differentiability at the values encountered during execution.

                                                      Pointwise corollaries: these mirror jvpVec_eq_fderiv and backpropVec_eq_adjoint_fderiv, but only require GraphFDerivCorrectAt at the specific execution point.

                                                      theorem Proofs.Autograd.Graph.jvpVec_eq_fderiv_at {Γ ss : List Spec.Shape} (g : Graph Γ ss) (xV dxV : CtxVec Γ) :
                                                      ∀ (a : GraphFDerivCorrectAt g xV), g.jvpVec xV dxV = (fderiv ℝ g.evalVec xV) dxV

                                                      Pointwise version of jvpVec_eq_fderiv.

                                                      theorem Proofs.Autograd.Graph.backpropVec_eq_adjoint_fderiv_at {Γ ss : List Spec.Shape} (g : Graph Γ ss) (xV : CtxVec Γ) (seedV : CtxVec (Γ ++ ss)) :
                                                      ∀ (a : GraphFDerivCorrectAt g xV), g.backpropVec xV seedV = ContinuousLinearMap.adjoint (fderiv ℝ g.evalVec xV) seedV

                                                      Pointwise version of backpropVec_eq_adjoint_fderiv.