TorchLean API

NN.Proofs.Autograd.FDeriv.Core

FDeriv Core #

HasFDerivAt-level (analytic) soundness for the proved-correct autograd layer.

This file starts by connecting our tensor dot to the Euclidean-space inner product, then proves a first end-to-end theorem for a 2-layer MLP (Linear → ReLU → Linear): the spec-level backward pass computes the adjoint of the true Fréchet derivative of the forward map (mlp_backward_eq_adjoint_fderiv).

Notes:

PyTorch correspondence / citations #

@[reducible, inline]

Abbreviation for the Euclidean-space equivalence Vec n ≃ (Fin n → ℝ).

Instances For

    Dot-product vs. Euclidean inner product #

    To connect OpSpecCorrect (stated with the tensor dot product) to fderiv and adjoints (stated with Euclidean inner products), we prove that Spec.dot agrees with inner after vectorization.

    @[simp]

    toVecE is defined via EuclideanSpace.equiv; this lemma exposes the underlying coordinates.
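    The Mathlib equivalence the docstring refers to exists as `EuclideanSpace.equiv`; a minimal standalone check (assuming `Vec n` abbreviates `EuclideanSpace ℝ (Fin n)`, as the surrounding definitions suggest):

```lean
import Mathlib

-- `EuclideanSpace ℝ (Fin n)` (the doc's `Vec n`) viewed as plain functions
-- `Fin n → ℝ` via Mathlib's `EuclideanSpace.equiv`.
example {n : ℕ} : EuclideanSpace ℝ (Fin n) ≃L[ℝ] (Fin n → ℝ) :=
  EuclideanSpace.equiv (Fin n) ℝ
```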

    @[simp]

    Coordinate evaluation of toVecE.

    For 1D scalar tensors, Spec.dot agrees with the Euclidean inner product on Vec n after converting via toVecE.
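    A sketch of what this bridge lemma's statement plausibly looks like (not compilable outside the TorchLean sources; the binder details are illustrative, and `sorry` stands in for the file's actual proof):

```lean
-- Assumes the `Proofs.Autograd` context of this file and
-- `open RealInnerProductSpace` for the ⟪·, ·⟫_ℝ notation.
example {n : ℕ} (t u : Spec.Tensor (Spec.Shape.dim n Spec.Shape.scalar)) :
    Spec.dot t u = ⟪toVecE t, toVecE u⟫_ℝ := by
  sorry
```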

    Coordinate formula for tensor addition under Spec.toVec.

    Vectorization commutes with tensor addition.

    Vectorization commutes with elementwise mapping: toVecE (map_spec f t) is f applied to each coordinate of Spec.toVec t.

    Vectorization of relu_deriv_spec: the derivative mask is ReLU’s scalar derivative applied coordinatewise.

    View a matrix-shaped tensor W : Tensor ℝ (m×n) as a Mathlib Matrix (Fin m) (Fin n) ℝ.

    This is just the coordinate function Spec.get2.

    Instances For
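    A hypothetical transcription of `tensorToMatrix` as described above (not compilable outside the TorchLean sources; in particular, the nested `Spec.Shape.dim` encoding of a 2-D shape is an assumption inferred from the 1-D shapes used later in this file):

```lean
-- `Matrix.of` is just the identity bundled as an equiv; the content is the
-- coordinate function `Spec.get2` mentioned in the docstring.
noncomputable def tensorToMatrixSketch {m n : ℕ}
    (W : Spec.Tensor (Spec.Shape.dim m (Spec.Shape.dim n Spec.Shape.scalar))) :
    Matrix (Fin m) (Fin n) ℝ :=
  Matrix.of fun i j => Spec.get2 W i j
```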
      noncomputable def Proofs.Autograd.matCLM {m n : ℕ} (W : Matrix (Fin m) (Fin n) ℝ) :
      Vec n →L[ℝ] Vec m

      The matrix–vector multiplication map as a continuous linear map on Euclidean vectors.

      This is the Euclidean-space version of the tensor op mat_vec_mul_spec.

      Instances For
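      For intuition, here is one standard way to build such a map in plain Mathlib (a sketch with the hypothetical name `matVecCLM`; not necessarily how `matCLM` is defined in the file):

```lean
import Mathlib

-- View the matrix as a linear map between Euclidean spaces, then upgrade it to
-- a continuous linear map, which is automatic in finite dimension.
noncomputable def matVecCLM {m n : ℕ} (W : Matrix (Fin m) (Fin n) ℝ) :
    EuclideanSpace ℝ (Fin n) →L[ℝ] EuclideanSpace ℝ (Fin m) :=
  LinearMap.toContinuousLinearMap (Matrix.toEuclideanLin W)
```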

        Vectorization commutes with matrix–vector multiplication: toVecE (mat_vec_mul_spec A v) = (matCLM (tensorToMatrix A)) (toVecE v).

        noncomputable def Proofs.Autograd.affine {inDim outDim : ℕ} (W : Matrix (Fin outDim) (Fin inDim) ℝ) (b : Vec outDim) :
        Vec inDim → Vec outDim

        Affine map x ↦ W x + b on Euclidean vectors.

        This is the vector-space analogue of Spec.linear_spec.

        Instances For
          theorem Proofs.Autograd.hasFDerivAt_affine {inDim outDim : ℕ} (W : Matrix (Fin outDim) (Fin inDim) ℝ) (b : Vec outDim) (x : Vec inDim) :
          HasFDerivAt (affine W b) (matCLM W) x

          affine is Fréchet-differentiable with derivative W (as a CLM), since it is linear + constant.
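          The "linear + constant" argument in plain Mathlib terms (a standalone illustration, not the file's proof):

```lean
import Mathlib

-- A continuous linear map is its own Fréchet derivative, and adding a constant
-- does not change the derivative.
example {E F : Type*} [NormedAddCommGroup E] [NormedSpace ℝ E]
    [NormedAddCommGroup F] [NormedSpace ℝ F]
    (A : E →L[ℝ] F) (b : F) (x : E) :
    HasFDerivAt (fun v => A v + b) A x :=
  A.hasFDerivAt.add_const b
```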

          Vectorization of Spec.linear_spec is the Euclidean affine map built from the same weights/bias.

          def Proofs.Autograd.reluFun {n : ℕ} (x : Fin n → ℝ) :
          Fin n → ℝ

          Coordinatewise ReLU on Fin n → ℝ (function-space representation).

          Instances For
            noncomputable def Proofs.Autograd.reluVec {n : ℕ} (x : Vec n) :
            Vec n

            ReLU as a map on Euclidean vectors (coordinatewise max x 0).

            This is the Euclidean-space analogue of Spec.relu_op.forward.

            Instances For
              noncomputable def Proofs.Autograd.reluFunDeriv {n : ℕ} (x : Fin n → ℝ) :
              (Fin n → ℝ) →L[ℝ] Fin n → ℝ

              Derivative candidate for the coordinatewise ReLU function on Fin n → ℝ, expressed as a diagonal scaling map by the scalar derivative mask.

              Instances For
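              A generic diagonal-scaling construction in plain Mathlib, of the kind `reluFunDeriv` uses with the ReLU derivative mask (a sketch with the hypothetical name `diagCLM`; the file's actual definition may be organized differently):

```lean
import Mathlib

-- Multiply coordinate `i` by the fixed scalar `d i`, packaged as a continuous
-- linear map on `Fin n → ℝ` (continuity is automatic in finite dimension).
noncomputable def diagCLM {n : ℕ} (d : Fin n → ℝ) :
    (Fin n → ℝ) →L[ℝ] (Fin n → ℝ) :=
  LinearMap.toContinuousLinearMap
    (LinearMap.pi fun i => d i • LinearMap.proj i)
```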
                noncomputable def Proofs.Autograd.reluDerivCLM {n : ℕ} (x : Vec n) :
                Vec n →L[ℝ] Vec n

                Transport reluFunDeriv to Vec n via EuclideanSpace.equiv.

                Instances For

                  ReLU is not differentiable at 0. We therefore assume a “no kinks” hypothesis that every coordinate of x is nonzero.

                  theorem Proofs.Autograd.hasFDerivAt_reluVec {n : ℕ} (x : Vec n) (hx : ∀ (i : Fin n), x.ofLp i ≠ 0) :
                  HasFDerivAt reluVec (reluDerivCLM x) x

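                  A standalone Mathlib illustration of why the nonzero hypothesis is needed: away from 0, ReLU agrees with a linear function near the point (the identity, for positive inputs), so it is differentiable there; at 0 the one-sided slopes 0 and 1 disagree.

```lean
import Mathlib

open Filter Topology

-- For x > 0, `max · 0` coincides with the identity on a neighbourhood of x,
-- so its derivative there is 1.
example (x : ℝ) (hx : 0 < x) : HasDerivAt (fun y : ℝ => max y 0) 1 x := by
  have h : (fun y : ℝ => max y 0) =ᶠ[𝓝 x] id := by
    filter_upwards [Ioi_mem_nhds hx] with y hy
    rw [Set.mem_Ioi] at hy
    simpa using max_eq_left hy.le
  exact (hasDerivAt_id x).congr_of_eventuallyEq h
```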
                  noncomputable def Proofs.Autograd.mlpVec {inDim hidDim outDim : ℕ} (l1 : Spec.LinearSpec inDim hidDim) (l2 : Spec.LinearSpec hidDim outDim) :
                  Vec inDim → Vec outDim

                  2-layer MLP forward map on Euclidean vectors:

                  x ↦ affine W2 b2 (relu (affine W1 b1 x)).

                  Instances For
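                  A hypothetical transcription of the forward map above (not compilable outside the TorchLean sources; the weights and bias are extracted with `tensorToMatrix`/`toVecE` exactly as in the hypothesis of `hasFDerivAt_mlpVec` below, but the file's actual definition may differ in detail):

```lean
noncomputable def mlpVecSketch {inDim hidDim outDim : ℕ}
    (l1 : Spec.LinearSpec inDim hidDim) (l2 : Spec.LinearSpec hidDim outDim) :
    Vec inDim → Vec outDim :=
  fun x =>
    -- second affine layer applied to the ReLU of the first affine layer
    affine (tensorToMatrix l2.weights) (toVecE l2.bias)
      (reluVec (affine (tensorToMatrix l1.weights) (toVecE l1.bias) x))
```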
                    noncomputable def Proofs.Autograd.mlpDeriv {inDim hidDim outDim : ℕ} (l1 : Spec.LinearSpec inDim hidDim) (l2 : Spec.LinearSpec hidDim outDim) (x : Vec inDim) :
                    Vec inDim →L[ℝ] Vec outDim

                    Closed-form derivative (as a continuous linear map) of mlpVec at x.

                    This is the chain rule composition: W2 ∘ ReLU'(z1) ∘ W1.

                    Instances For
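                    The Mathlib chain rule that assembles such a composite derivative (a standalone illustration):

```lean
import Mathlib

-- Composing maps composes their Fréchet derivatives as continuous linear maps.
example {E F G : Type*} [NormedAddCommGroup E] [NormedSpace ℝ E]
    [NormedAddCommGroup F] [NormedSpace ℝ F]
    [NormedAddCommGroup G] [NormedSpace ℝ G]
    {f : E → F} {g : F → G} {f' : E →L[ℝ] F} {g' : F →L[ℝ] G} {x : E}
    (hf : HasFDerivAt f f' x) (hg : HasFDerivAt g g' (f x)) :
    HasFDerivAt (g ∘ f) (g'.comp f') x :=
  hg.comp x hf
```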
                      theorem Proofs.Autograd.hasFDerivAt_mlpVec {inDim hidDim outDim : ℕ} (l1 : Spec.LinearSpec inDim hidDim) (l2 : Spec.LinearSpec hidDim outDim) (x : Vec inDim) (hx : ∀ (i : Fin hidDim), have W1 := tensorToMatrix l1.weights; have b1 := toVecE l1.bias; (affine W1 b1 x).ofLp i ≠ 0) :
                      HasFDerivAt (mlpVec l1 l2) (mlpDeriv l1 l2 x) x

                      Fréchet differentiability of the 2-layer MLP (Linear → ReLU → Linear) under a “no kinks” hypothesis.

                      Because ReLU is not differentiable at 0, we assume all pre-activation coordinates z1ᵢ are nonzero.

                      noncomputable def Proofs.Autograd.mlpOp {inDim hidDim outDim : ℕ} (l1 : Spec.LinearSpec inDim hidDim) (l2 : Spec.LinearSpec hidDim outDim) :

                      The spec-level MLP as a composed Spec.OpSpec:

                      linear l1 then relu then linear l2.

                      Instances For
                        noncomputable def Proofs.Autograd.mlpCorrect {inDim hidDim outDim : ℕ} (l1 : Spec.LinearSpec inDim hidDim) (l2 : Spec.LinearSpec hidDim outDim) :

                        The proved-correct MLP OpSpecCorrect, built by composing the primitive correctness lemmas.

                        This provides the dot-level adjointness statement: ⟪JVP,δ⟫ = ⟪dx,VJP⟫.

                        Instances For
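                        A standalone Mathlib restatement of what the dot-level adjointness above means on Euclidean vectors: moving a continuous linear map (standing in for the JVP) to the other side of the inner product turns it into its adjoint (the VJP).

```lean
import Mathlib

open RealInnerProductSpace

-- ⟪J dx, δ⟫ = ⟪dx, J† δ⟫ for a continuous linear map J between Euclidean spaces.
example {m n : ℕ}
    (J : EuclideanSpace ℝ (Fin n) →L[ℝ] EuclideanSpace ℝ (Fin m))
    (dx : EuclideanSpace ℝ (Fin n)) (δ : EuclideanSpace ℝ (Fin m)) :
    ⟪J dx, δ⟫_ℝ = ⟪dx, (ContinuousLinearMap.adjoint J) δ⟫_ℝ := by
  rw [ContinuousLinearMap.adjoint_inner_right]
```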
                          theorem Proofs.Autograd.toVec_mlp_jvp {inDim hidDim outDim : ℕ} (l1 : Spec.LinearSpec inDim hidDim) (l2 : Spec.LinearSpec hidDim outDim) (x dx : Spec.Tensor (Spec.Shape.dim inDim Spec.Shape.scalar)) :
                          toVecE ((mlpCorrect l1 l2).jvp x dx) = (mlpDeriv l1 l2 (toVecE x)) (toVecE dx)

                          Identify the OpSpecCorrect JVP for the MLP with the analytic derivative mlpDeriv, after vectorizing tensors to Euclidean vectors.

                          theorem Proofs.Autograd.mlp_backward_eq_adjoint_fderiv {inDim hidDim outDim : ℕ} (l1 : Spec.LinearSpec inDim hidDim) (l2 : Spec.LinearSpec hidDim outDim) (x : Spec.Tensor (Spec.Shape.dim inDim Spec.Shape.scalar)) (hx : ∀ (i : Fin hidDim), have W1 := tensorToMatrix l1.weights; have b1 := toVecE l1.bias; (affine W1 b1 (toVecE x)).ofLp i ≠ 0) (δ : Spec.Tensor (Spec.Shape.dim outDim Spec.Shape.scalar)) :
                          toVecE ((mlpOp l1 l2).backward x δ) = (vjp (mlpVec l1 l2) (toVecE x)) (toVecE δ)

                          End-to-end analytic soundness for the 2-layer MLP OpSpec:

                          the OpSpec.backward returned by the spec-level reverse-mode rule equals the adjoint of the true Fréchet derivative of the forward map (i.e. the analytic VJP), after vectorization.

                          This is the proof-side analogue of PyTorch’s claim that loss.backward() computes the correct VJP for the composed model, assuming the primitive backward rules are correct.
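                          One natural analytic reading of that claim as a definition (a sketch with the hypothetical name `vjpSketch`; the `vjp` appearing in the theorem above may be set up differently in the TorchLean sources):

```lean
import Mathlib

-- The analytic VJP at x: the adjoint of the Fréchet derivative of f at x.
noncomputable def vjpSketch {n m : ℕ}
    (f : EuclideanSpace ℝ (Fin n) → EuclideanSpace ℝ (Fin m))
    (x : EuclideanSpace ℝ (Fin n)) :
    EuclideanSpace ℝ (Fin m) →L[ℝ] EuclideanSpace ℝ (Fin n) :=
  ContinuousLinearMap.adjoint (fderiv ℝ f x)
```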