TorchLean API

NN.Proofs.RuntimeApprox.NF.BackwardOps

BackwardOps #

NF (rounded) backend: backward/VJP runtime→spec approximation lemmas.

This file instantiates the backend-agnostic reverse framework NN.Proofs.RuntimeApprox.Graph.BackwardApprox for the proof-relevant rounded runtime NF.

Scope:

Notes:

PyTorch correspondence / citations #

This provides per-op VJP rules and error bounds analogous to what PyTorch Autograd computes during backward passes on a computation graph. https://pytorch.org/docs/stable/autograd.html

Map of this file #

References #

Sparse Contexts For Local VJPs #

A TList filled with zeros (shape-wise), used to build sparse contexts for local VJPs.

Instances For
    def Proofs.RuntimeApprox.TList.setIdx {α : Type} [Zero α] {Γ : List Spec.Shape} {s : Spec.Shape} :
    Idx Γ s → Spec.Tensor α s → TList α Γ

    Set a single Idx position in a TList, filling all other entries with zeros.

    Instances For
      def Proofs.RuntimeApprox.TList.set2Idx {α : Type} [Zero α] [Add α] {Γ : List Spec.Shape} {s₁ s₂ : Spec.Shape} :
      Idx Γ s₁ → Spec.Tensor α s₁ → Idx Γ s₂ → Spec.Tensor α s₂ → TList α Γ

      Set two indices; if they coincide, add the contributions at that position.
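      The coincidence behavior can be illustrated with a simplified stand-in (hypothetical names, a plain `List Int` instead of a shape-indexed `TList`):

```lean
-- Illustration only: a simplified model of `TList.set2Idx` over `List Int`,
-- with positions as plain `Nat` indices. When the two positions coincide,
-- the contributions are added, matching the docstring above.
def set2Sketch (n i : Nat) (x : Int) (j : Nat) (y : Int) : List Int :=
  (List.range n).map fun k =>
    (if k = i then x else 0) + (if k = j then y else 0)

#eval set2Sketch 4 1 5 1 7  -- coinciding positions: [0, 12, 0, 0]
#eval set2Sketch 4 0 5 2 7  -- distinct positions:   [5, 0, 7, 0]
```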

      Instances For

        An EList filled with zeros, used for sparse error-bound contexts.

        Instances For

          Set a single Idx position in an EList, filling all other entries with zeros.

          Instances For
            def Proofs.RuntimeApprox.EList.set2Idx {Γ : List Spec.Shape} {s₁ s₂ : Spec.Shape} :
            Idx Γ s₁ → ℝ → Idx Γ s₂ → ℝ → ℝ → EList Γ

            Set two indices in an EList; if they coincide, use the supplied combined value eBoth.

            Instances For
              def Proofs.RuntimeApprox.TList.set3IdxNe {α : Type} [Zero α] [Add α] {Γ : List Spec.Shape} {s₁ s₂ s₃ : Spec.Shape} (a : Idx Γ s₁) :
              Spec.Tensor α s₁ → (b : Idx Γ s₂) → Spec.Tensor α s₂ → (c : Idx Γ s₃) → Spec.Tensor α s₃ → a.i ≠ b.i → a.i ≠ c.i → b.i ≠ c.i → TList α Γ

              Set three indices when the positions are pairwise distinct.

              This avoids any context-wise addition: only the three targeted positions are written, and all others are 0. This is important for NF, where even x + 0 would incur rounding.

              Instances For
                def Proofs.RuntimeApprox.EList.set3IdxNe {Γ : List Spec.Shape} {s₁ s₂ s₃ : Spec.Shape} (a : Idx Γ s₁) :
                ℝ → (b : Idx Γ s₂) → ℝ → (c : Idx Γ s₃) → ℝ → a.i ≠ b.i → a.i ≠ c.i → b.i ≠ c.i → EList Γ

                Error list for TList.set3IdxNe: set the three designated positions, 0 elsewhere.

                Instances For

                  NF Backend Instantiation #

                  theorem Proofs.RuntimeApprox.NFBackend.approxT_fill_const {β : TorchLean.Floats.NeuralRadix} {fexp : ℤ → ℤ} {rnd : ℝ → ℤ} {cS : ℝ} {cR : TorchLean.Floats.NF β fexp rnd} {eps : ℝ} (h : |toSpec cR - cS| ≤ eps) {s : Spec.Shape} :
                  approxT toSpec (Spec.fill cS s) (Spec.fill cR s) eps
                  theorem Proofs.RuntimeApprox.NFBackend.idx_shape_eq_of_i_eq {Γ : List Spec.Shape} {s₁ s₂ : Spec.Shape} (a : Idx Γ s₁) (b : Idx Γ s₂) (h : a.i = b.i) :
                  s₁ = s₂
                  def Proofs.RuntimeApprox.NFBackend.tensorCastOfIdxEq {α : Type} {Γ : List Spec.Shape} {s₁ s₂ : Spec.Shape} (a : Idx Γ s₁) (b : Idx Γ s₂) (h : a.i = b.i) :
                  Spec.Tensor α s₂ → Spec.Tensor α s₁

                  Cast a tensor across a shape equality induced by equal Idx positions.

                  Given a : Idx Γ s₁, b : Idx Γ s₂, and h : a.i = b.i, this produces a function Tensor α s₂ → Tensor α s₁ that casts along the implied equality s₁ = s₂.
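                  The underlying transport is ordinary casting along a type-family equality; a generic sketch (names hypothetical):

```lean
-- Schematic: transport along a shape equality, generic over any type family T.
-- `tensorCastOfIdxEq` is this pattern specialized to `Spec.Tensor α`, with the
-- equality `s₁ = s₂` obtained from `idx_shape_eq_of_i_eq`.
def castAlong {Shape : Type} (T : Shape → Type) {s₁ s₂ : Shape}
    (h : s₁ = s₂) : T s₂ → T s₁ :=
  fun t => cast (congrArg T h).symm t
```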

                  Instances For
                    theorem Proofs.RuntimeApprox.NFBackend.approxT_tensor_cast {β : TorchLean.Floats.NeuralRadix} {fexp : ℤ → ℤ} {rnd : ℝ → ℤ} {s t : Spec.Shape} (h : s = t) {xS : Spec.SpecTensor s} {xR : Spec.Tensor (TorchLean.Floats.NF β fexp rnd) s} {eps : ℝ} (hx : approxT toSpec xS xR eps) :
                    theorem Proofs.RuntimeApprox.NFBackend.approxCtx_set2Idx_ne {β : TorchLean.Floats.NeuralRadix} {fexp : ℤ → ℤ} [TorchLean.Floats.NeuralValidExp fexp] {rnd : ℝ → ℤ} [TorchLean.Floats.NeuralValidRndToNearest rnd] {Γ : List Spec.Shape} {s₁ s₂ : Spec.Shape} (a : Idx Γ s₁) (b : Idx Γ s₂) {t₁S : Spec.SpecTensor s₁} {t₁R : Spec.Tensor (TorchLean.Floats.NF β fexp rnd) s₁} {eps₁ : ℝ} {t₂S : Spec.SpecTensor s₂} {t₂R : Spec.Tensor (TorchLean.Floats.NF β fexp rnd) s₂} {eps₂ : ℝ} (h₁ : approxT toSpec t₁S t₁R eps₁) (h₂ : approxT toSpec t₂S t₂R eps₂) (hne : a.i ≠ b.i) :
                    approxCtx toSpec (TList.set2Idx a t₁S b t₂S) (TList.set2Idx a t₁R b t₂R) (EList.set2Idx a eps₁ b eps₂ 0)
                    theorem Proofs.RuntimeApprox.NFBackend.approxCtx_set3Idx_ne {β : TorchLean.Floats.NeuralRadix} {fexp : ℤ → ℤ} [TorchLean.Floats.NeuralValidExp fexp] {rnd : ℝ → ℤ} [TorchLean.Floats.NeuralValidRndToNearest rnd] {Γ : List Spec.Shape} {s₁ s₂ s₃ : Spec.Shape} (a : Idx Γ s₁) (b : Idx Γ s₂) (c : Idx Γ s₃) {t₁S : Spec.SpecTensor s₁} {t₁R : Spec.Tensor (TorchLean.Floats.NF β fexp rnd) s₁} {eps₁ : ℝ} {t₂S : Spec.SpecTensor s₂} {t₂R : Spec.Tensor (TorchLean.Floats.NF β fexp rnd) s₂} {eps₂ : ℝ} {t₃S : Spec.SpecTensor s₃} {t₃R : Spec.Tensor (TorchLean.Floats.NF β fexp rnd) s₃} {eps₃ : ℝ} (h₁ : approxT toSpec t₁S t₁R eps₁) (h₂ : approxT toSpec t₂S t₂R eps₂) (h₃ : approxT toSpec t₃S t₃R eps₃) (hab : a.i ≠ b.i) (hac : a.i ≠ c.i) (hbc : b.i ≠ c.i) :
                    approxCtx toSpec (TList.set3IdxNe a t₁S b t₂S c t₃S hab hac hbc) (TList.set3IdxNe a t₁R b t₂R c t₃R hab hac hbc) (EList.set3IdxNe a eps₁ b eps₂ c eps₃ hab hac hbc)

                    Context-wise addition bound (NF runtime vs spec).

                    This produces an EList of linf_norm bounds for adding two contexts elementwise, and is used when reverse-mode accumulation must combine contributions from multiple consumers.

                    Instances For
                      theorem Proofs.RuntimeApprox.NFBackend.approxCtx_add {β : TorchLean.Floats.NeuralRadix} {fexp : ℤ → ℤ} [TorchLean.Floats.NeuralValidExp fexp] {rnd : ℝ → ℤ} [TorchLean.Floats.NeuralValidRndToNearest rnd] {Δ : List Spec.Shape} (xS yS : TList Spec.SpecScalar Δ) (xR yR : TList (TorchLean.Floats.NF β fexp rnd) Δ) (epsx epsy : EList Δ) :
                      approxCtx toSpec xS xR epsx → approxCtx toSpec yS yR epsy → approxCtx toSpec (TList.add xS yS) (TList.add xR yR) (ctxAddBound epsx epsy xR yR)

                      Soundness of context-wise addition under approxCtx.

                      If xS ~ xR ± epsx and yS ~ yR ± epsy, then (xS + yS) ~ (xR + yR) with error bounded by ctxAddBound epsx epsy xR yR.
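                      Schematically, the per-entry bound combines the triangle inequality with a rounding term for the runtime addition, which is why `ctxAddBound` takes the runtime contexts `xR`, `yR`. A plausible shape, writing `fl` for the rounded NF add (an assumption about the exact form, not a statement of the definition):

```latex
\|(x_S + y_S) - \mathrm{fl}(x_R + y_R)\|_\infty
  \;\le\; \epsilon_x + \epsilon_y
  \;+\; \|\mathrm{fl}(x_R + y_R) - (x_R + y_R)\|_\infty
```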

                      Reverse node for addition: z = a + b.

                      VJP is (δ ↦ (δ, δ)), with a special case when a and b are the same context index.

                      Instances For

                        Reverse node for subtraction: z = a - b.

                        VJP is (δ ↦ (δ, -δ)), with a special case when a and b are the same context index.

                        Instances For

                          Reverse node for multiplication: z = a * b.

                          VJP is (δ ↦ (δ*b, δ*a)), with rounding-aware bounds produced by the NF backend.
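                          For reference, the real-valued scalar rules these three arithmetic nodes instantiate (the NF versions layer rounding-error terms on top):

```latex
z = a + b:\quad \delta_a = \delta,\qquad \delta_b = \delta \\
z = a - b:\quad \delta_a = \delta,\qquad \delta_b = -\delta \\
z = a \cdot b:\quad \delta_a = \delta\, b,\qquad \delta_b = \delta\, a
```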

                          Instances For

                            Reverse node for scaling by a constant: z = c * a.

                            Instances For

                              Reverse node for negation: z = -a.

                              Instances For

                                Reverse node for exp.

                                Instances For

                                  Reverse node for tanh.

                                  Instances For

                                    Reverse node for sigmoid.

                                    Instances For

                                      Reverse node for softplus.

                                      Instances For

                                         Reverse node for log with an explicit stabilization parameter ε (to avoid evaluating log at or near 0).

                                        Instances For

                                          Reverse node for the scalar logistic-compatible softmax node, using the analytic ℝ derivative plus NF error bounds.

                                          Instances For

                                            Reverse node for ReLU, using the standard piecewise derivative/VJP.
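                                             The real derivatives underlying the elementwise reverse nodes above (exp through ReLU), which the NF error bounds wrap; σ denotes the sigmoid:

```latex
\begin{aligned}
\exp &: \delta_a = \delta\, e^{a} &
\tanh &: \delta_a = \delta\,(1 - \tanh^2 a) \\
\sigma &: \delta_a = \delta\,\sigma(a)\,(1-\sigma(a)) &
\mathrm{softplus} &: \delta_a = \delta\,\sigma(a) \\
\mathrm{ReLU} &: \delta_a = \delta\,[a > 0]
\end{aligned}
```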

                                            Instances For

                                              Reverse node for reduction sum, sending the upstream gradient back along the broadcasted shape.

                                              Instances For

                                                Reverse node for matrix-vector multiplication (mat_vec_mul_spec).

                                                VJP uses the standard adjoint identities: δW = δ ⊗ x and δx = Wᵀ δ (expressed in tensor form), with NF error bounds layered over the primitive ops.
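                                                 In matrix notation, for y = W x with upstream cotangent δ:

```latex
\delta_W = \delta \otimes x = \delta\, x^{\mathsf T}, \qquad
\delta_x = W^{\mathsf T}\, \delta
```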

                                                Instances For

                                                  Reverse node for matrix multiplication (mat_mul_spec).

                                                  VJP uses the standard identities δA = δC * Bᵀ and δB = Aᵀ * δC (in appropriate shapes), with NF error bounds layered over the primitive ops.
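                                                   In matrix notation:

```latex
C = A B \;\Rightarrow\; \delta_A = \delta_C\, B^{\mathsf T}, \qquad
\delta_B = A^{\mathsf T}\, \delta_C
```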

                                                  Instances For
                                                    theorem Proofs.RuntimeApprox.NFBackend.backprop_approx {β : TorchLean.Floats.NeuralRadix} {fexp : ℤ → ℤ} [TorchLean.Floats.NeuralValidExp fexp] {rnd : ℝ → ℤ} [TorchLean.Floats.NeuralValidRndToNearest rnd] {Γ ss : List Spec.Shape} (g : RevGraph toSpec Γ ss) (xS : TList Spec.SpecScalar Γ) (xR : TList (TorchLean.Floats.NF β fexp rnd) Γ) (epsIn : EList Γ) (seedS : TList Spec.SpecScalar (Γ ++ ss)) (seedR : TList (TorchLean.Floats.NF β fexp rnd) (Γ ++ ss)) (epsSeed : EList (Γ ++ ss)) :
                                                    approxCtx toSpec xS xR epsIn → approxCtx toSpec seedS seedR epsSeed → approxCtx toSpec (g.backpropSpec xS seedS) (g.backpropRuntime xR seedR) (g.backpropBounds epsIn xR epsSeed seedR fun {Δ : List Spec.Shape} => ctxAddBound)

                                                    End-to-end NF reverse-mode soundness for a well-typed reverse graph.

                                                    This is the main composition theorem: if each node in the graph has a sound RevNode instance, then the whole backpropagated context is an approxCtx enclosure of the spec backpropagation.