TorchLean API

NN.Proofs.RuntimeApprox.NF.ConvBackward

Conv2D Backward Approximation #

NF (rounded) backend: Conv2D backward (VJP) runtime→spec approximation.

This file proves soundness of explicit bounds for the three Conv2D gradients computed by Spec.conv2d_backward_spec: the kernel gradient, the bias gradient, and the input gradient.

The file is intentionally explicit because each gradient has a different nested-indexing pattern. The important public objects are the tensor-level bounds (conv2d*BoundTensor), the approximation theorems (approxT_conv2d_*_deriv_spec), and conv2dRevNode, which packages Conv2D as a RevNode so it composes via RevGraph.backprop_approx.

PyTorch analogue: these are the VJP/gradient computations produced by Autograd for Conv2D. https://pytorch.org/docs/stable/autograd.html https://pytorch.org/docs/stable/generated/torch.nn.functional.conv2d.html
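For orientation, these are the standard Conv2D VJP formulas, written here in conventional notation (the spec's exact index conventions may differ). With zero-padded input $\hat{x}$, kernel $w$, stride $s$, padding $p$, and upstream gradient $\delta$:

```latex
% Bias gradient: sum the upstream gradient over all output positions.
\frac{\partial L}{\partial b_{o}} = \sum_{u,v} \delta_{o,u,v}

% Kernel gradient: correlate the upstream gradient with the padded input.
\frac{\partial L}{\partial w_{o,c,d_i,d_j}}
  = \sum_{u,v} \delta_{o,u,v}\, \hat{x}_{c,\;u s - p + d_i,\;v s - p + d_j}

% Input gradient: each input pixel collects every (o,u,v,d_i,d_j) that maps to it.
\frac{\partial L}{\partial x_{c,i,j}}
  = \sum_{\substack{o,u,v,d_i,d_j \\ u s - p + d_i = i,\; v s - p + d_j = j}}
      \delta_{o,u,v}\, w_{o,c,d_i,d_j}
```

The three different summation shapes above are why each gradient below gets its own pointwise bound and soundness theorem.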


theorem Proofs.RuntimeApprox.NFBackend.entry_eq_scalar_get_at_or_zero4 {α : Type} [Zero α] {n1 n2 n3 n4 : ℕ} (t : Spec.Tensor α (Spec.Shape.dim n1 (Spec.Shape.dim n2 (Spec.Shape.dim n3 (Spec.Shape.dim n4 Spec.Shape.scalar))))) (i1 : Fin n1) (i2 : Fin n2) (i3 : Fin n3) (i4 : Fin n4) :
(match match match match t with | Spec.Tensor.dim f => f i1 with | Spec.Tensor.dim g => g i2 with | Spec.Tensor.dim h => h i3 with | Spec.Tensor.dim k => k i4) = Spec.Tensor.scalar (Spec.getAtOrZero t [i1, i2, i3, i4])
noncomputable def Proofs.RuntimeApprox.NFBackend.conv2dBiasPointBound {β : TorchLean.Floats.NeuralRadix} {fexp : ℤ → ℤ} [TorchLean.Floats.NeuralValidExp fexp] {rnd : ℝ → ℤ} {outC kH kW stride padding inH inW : ℕ} (δR : Spec.MultiChannelImage outC (conv2dOutH inH kH stride padding) (conv2dOutW inW kW stride padding) (TorchLean.Floats.NF β fexp rnd)) (epsδ : ℝ) (out_ch : Fin outC) : ℝ

Pointwise error bound for the Conv2D bias gradient (NF runtime vs spec).

This bound is a replay of the bias-gradient summation with per-term error epsδ coming from the grad_output approximation hypothesis.
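As an illustration of this "replay" style (this is a sketch, not the TorchLean definition), a plain sum with a per-term error of epsδ accumulates at most one epsδ per summand:

```lean
import Mathlib

-- Illustrative sketch only, not the TorchLean definition: the bias gradient
-- is a plain sum of grad_output entries, one per output position, so a
-- per-entry error of epsδ accumulates to (number of summands) * epsδ.
noncomputable def biasPointBoundSketch (outH outW : ℕ) (epsδ : ℝ) : ℝ :=
  (outH * outW : ℝ) * epsδ
```

The real conv2dBiasPointBound replays the summation structurally rather than counting terms, which keeps the bound definitionally aligned with the spec's recursion.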

    theorem Proofs.RuntimeApprox.NFBackend.approx_conv2d_bias_point {β : TorchLean.Floats.NeuralRadix} {fexp : ℤ → ℤ} [TorchLean.Floats.NeuralValidExp fexp] {rnd : ℝ → ℤ} [TorchLean.Floats.NeuralValidRndToNearest rnd] {inC outC kH kW stride padding inH inW : ℕ} {h1 : inC ≠ 0} {h2 : kH ≠ 0} {h3 : kW ≠ 0} {kernelS : Spec.Tensor ℝ (Spec.Shape.dim outC (Spec.Shape.dim inC (Spec.Shape.dim kH (Spec.Shape.dim kW Spec.Shape.scalar))))} {kernelR : Spec.Tensor (TorchLean.Floats.NF β fexp rnd) (Spec.Shape.dim outC (Spec.Shape.dim inC (Spec.Shape.dim kH (Spec.Shape.dim kW Spec.Shape.scalar))))} {biasS : Spec.Tensor ℝ (Spec.Shape.dim outC Spec.Shape.scalar)} {biasR : Spec.Tensor (TorchLean.Floats.NF β fexp rnd) (Spec.Shape.dim outC Spec.Shape.scalar)} {inputS : Spec.MultiChannelImage inC inH inW ℝ} {inputR : Spec.MultiChannelImage inC inH inW (TorchLean.Floats.NF β fexp rnd)} {δS : Spec.MultiChannelImage outC (conv2dOutH inH kH stride padding) (conv2dOutW inW kW stride padding) ℝ} {δR : Spec.MultiChannelImage outC (conv2dOutH inH kH stride padding) (conv2dOutW inW kW stride padding) (TorchLean.Floats.NF β fexp rnd)} {epsδ : ℝ} (hδ : approxT toSpec δS δR epsδ) (out_ch : Fin outC) :
    have layerS := { kernel := kernelS, bias := biasS }; have layerR := { kernel := kernelR, bias := biasR }; |toSpec (Spec.getAtOrZero (Spec.conv2dBiasDerivSpec layerR inputR δR) [out_ch]) - Spec.getAtOrZero (Spec.conv2dBiasDerivSpec layerS inputS δS) [out_ch]| ≤ conv2dBiasPointBound δR epsδ out_ch

    Soundness of the Conv2D bias-gradient pointwise bound.

    Given approxT for grad_output, this shows the spec bias-gradient entry is approximated by the NF runtime entry within conv2dBiasPointBound.
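The underlying estimate is the usual triangle inequality over the summation: if each runtime term $\tilde{\delta}_k$ satisfies $|\tilde{\delta}_k - \delta_k| \le \varepsilon_\delta$, then

```latex
\Bigl|\sum_{k=1}^{n} \tilde{\delta}_k - \sum_{k=1}^{n} \delta_k\Bigr|
  \le \sum_{k=1}^{n} \bigl|\tilde{\delta}_k - \delta_k\bigr|
  \le n\,\varepsilon_\delta
```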

    noncomputable def Proofs.RuntimeApprox.NFBackend.conv2dBiasBoundTensor {β : TorchLean.Floats.NeuralRadix} {fexp : ℤ → ℤ} [TorchLean.Floats.NeuralValidExp fexp] {rnd : ℝ → ℤ} {outC kH kW stride padding inH inW : ℕ} (δR : Spec.MultiChannelImage outC (conv2dOutH inH kH stride padding) (conv2dOutW inW kW stride padding) (TorchLean.Floats.NF β fexp rnd)) (epsδ : ℝ) : Spec.Tensor ℝ (Spec.Shape.dim outC Spec.Shape.scalar)

    Tensor-shaped bias-gradient bound.

    This packages conv2dBiasPointBound into a Tensor so later approxT statements can use linfNorm to obtain a single scalar error bound.

      noncomputable def Proofs.RuntimeApprox.NFBackend.conv2dKernelPointBound {β : TorchLean.Floats.NeuralRadix} {fexp : ℤ → ℤ} [TorchLean.Floats.NeuralValidExp fexp] {rnd : ℝ → ℤ} {inC outC kH kW stride padding inH inW : ℕ} (inputR : Spec.MultiChannelImage inC inH inW (TorchLean.Floats.NF β fexp rnd)) (δR : Spec.MultiChannelImage outC (conv2dOutH inH kH stride padding) (conv2dOutW inW kW stride padding) (TorchLean.Floats.NF β fexp rnd)) (epsX epsδ : ℝ) (out_ch : Fin outC) (in_ch : Fin inC) (di : Fin kH) (dj : Fin kW) : ℝ

      Pointwise error bound for the Conv2D kernel gradient (NF runtime vs spec).

      The kernel-gradient entry accumulates products of padded input values and upstream gradients. The bound is a replay of this accumulation with per-term errors derived from epsX and epsδ.
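Each product term can be controlled by the standard add-and-subtract split (one common form; the file's exact per-term bound may differ). If $|\tilde{x} - x| \le \varepsilon_X$ and $|\tilde{\delta} - \delta| \le \varepsilon_\delta$, then

```latex
\bigl|\tilde{x}\tilde{\delta} - x\delta\bigr|
  \le |\tilde{x}|\,\bigl|\tilde{\delta} - \delta\bigr| + |\delta|\,\bigl|\tilde{x} - x\bigr|
  \le |\tilde{x}|\,\varepsilon_\delta + \bigl(|\tilde{\delta}| + \varepsilon_\delta\bigr)\,\varepsilon_X
```

Note the right-hand side only mentions runtime magnitudes $|\tilde{x}|, |\tilde{\delta}|$, which matches the fact that conv2dKernelPointBound takes the runtime tensors inputR and δR as arguments.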

        theorem Proofs.RuntimeApprox.NFBackend.approx_conv2d_kernel_point {β : TorchLean.Floats.NeuralRadix} {fexp : ℤ → ℤ} [TorchLean.Floats.NeuralValidExp fexp] {rnd : ℝ → ℤ} [TorchLean.Floats.NeuralValidRndToNearest rnd] {inC outC kH kW stride padding inH inW : ℕ} {h1 : inC ≠ 0} {h2 : kH ≠ 0} {h3 : kW ≠ 0} {kernelS : Spec.Tensor ℝ (Spec.Shape.dim outC (Spec.Shape.dim inC (Spec.Shape.dim kH (Spec.Shape.dim kW Spec.Shape.scalar))))} {kernelR : Spec.Tensor (TorchLean.Floats.NF β fexp rnd) (Spec.Shape.dim outC (Spec.Shape.dim inC (Spec.Shape.dim kH (Spec.Shape.dim kW Spec.Shape.scalar))))} {biasS : Spec.Tensor ℝ (Spec.Shape.dim outC Spec.Shape.scalar)} {biasR : Spec.Tensor (TorchLean.Floats.NF β fexp rnd) (Spec.Shape.dim outC Spec.Shape.scalar)} {inputS : Spec.MultiChannelImage inC inH inW ℝ} {inputR : Spec.MultiChannelImage inC inH inW (TorchLean.Floats.NF β fexp rnd)} {δS : Spec.MultiChannelImage outC (conv2dOutH inH kH stride padding) (conv2dOutW inW kW stride padding) ℝ} {δR : Spec.MultiChannelImage outC (conv2dOutH inH kH stride padding) (conv2dOutW inW kW stride padding) (TorchLean.Floats.NF β fexp rnd)} {epsX epsδ : ℝ} (hX : approxT toSpec inputS inputR epsX) (hδ : approxT toSpec δS δR epsδ) (out_ch : Fin outC) (in_ch : Fin inC) (di : Fin kH) (dj : Fin kW) :
        have layerS := { kernel := kernelS, bias := biasS }; have layerR := { kernel := kernelR, bias := biasR }; |toSpec (Spec.getAtOrZero (Spec.conv2dKernelDerivSpec layerR inputR δR) [out_ch, in_ch, di, dj]) - Spec.getAtOrZero (Spec.conv2dKernelDerivSpec layerS inputS δS) [out_ch, in_ch, di, dj]| ≤ conv2dKernelPointBound inputR δR epsX epsδ out_ch in_ch di dj

        Soundness of the Conv2D kernel-gradient pointwise bound.

        Given approxT hypotheses for the input and upstream gradient (grad_output), this shows the spec kernel-gradient entry is approximated by the NF runtime entry within conv2dKernelPointBound.

        noncomputable def Proofs.RuntimeApprox.NFBackend.conv2dKernelBoundTensor {β : TorchLean.Floats.NeuralRadix} {fexp : ℤ → ℤ} [TorchLean.Floats.NeuralValidExp fexp] {rnd : ℝ → ℤ} {inC outC kH kW stride padding inH inW : ℕ} (inputR : Spec.MultiChannelImage inC inH inW (TorchLean.Floats.NF β fexp rnd)) (δR : Spec.MultiChannelImage outC (conv2dOutH inH kH stride padding) (conv2dOutW inW kW stride padding) (TorchLean.Floats.NF β fexp rnd)) (epsX epsδ : ℝ) : Spec.Tensor ℝ (Spec.Shape.dim outC (Spec.Shape.dim inC (Spec.Shape.dim kH (Spec.Shape.dim kW Spec.Shape.scalar))))

        Tensor-shaped kernel-gradient bound.

        This packages conv2dKernelPointBound into the full 4D kernel-tensor shape so later approxT lemmas can use a single global bound via linfNorm.
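A generic analogue of this packaging step (illustrative names, not TorchLean's API): a family of per-index bounds collapses to a single scalar by taking its maximum, which then dominates every entrywise estimate at once.

```lean
import Mathlib

-- Illustrative analogue of the linfNorm packaging, not TorchLean's API:
-- take the maximum over a nonempty family of per-index bounds. Any
-- entrywise bound b i then implies the single uniform bound linfOfBounds b.
noncomputable def linfOfBounds {n : ℕ} (b : Fin (n + 1) → ℝ) : ℝ :=
  Finset.univ.sup' ⟨0, Finset.mem_univ 0⟩ b
```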

          theorem Proofs.RuntimeApprox.NFBackend.approxT_conv2d_bias_deriv_spec {β : TorchLean.Floats.NeuralRadix} {fexp : ℤ → ℤ} [TorchLean.Floats.NeuralValidExp fexp] {rnd : ℝ → ℤ} [TorchLean.Floats.NeuralValidRndToNearest rnd] {inC outC kH kW stride padding inH inW : ℕ} {h1 : inC ≠ 0} {h2 : kH ≠ 0} {h3 : kW ≠ 0} {kernelS : Spec.Tensor ℝ (Spec.Shape.dim outC (Spec.Shape.dim inC (Spec.Shape.dim kH (Spec.Shape.dim kW Spec.Shape.scalar))))} {kernelR : Spec.Tensor (TorchLean.Floats.NF β fexp rnd) (Spec.Shape.dim outC (Spec.Shape.dim inC (Spec.Shape.dim kH (Spec.Shape.dim kW Spec.Shape.scalar))))} {biasS : Spec.Tensor ℝ (Spec.Shape.dim outC Spec.Shape.scalar)} {biasR : Spec.Tensor (TorchLean.Floats.NF β fexp rnd) (Spec.Shape.dim outC Spec.Shape.scalar)} {inputS : Spec.MultiChannelImage inC inH inW ℝ} {inputR : Spec.MultiChannelImage inC inH inW (TorchLean.Floats.NF β fexp rnd)} {δS : Spec.MultiChannelImage outC (conv2dOutH inH kH stride padding) (conv2dOutW inW kW stride padding) ℝ} {δR : Spec.MultiChannelImage outC (conv2dOutH inH kH stride padding) (conv2dOutW inW kW stride padding) (TorchLean.Floats.NF β fexp rnd)} {epsδ : ℝ} (hδ : approxT toSpec δS δR epsδ) :
          have layerS := { kernel := kernelS, bias := biasS }; have layerR := { kernel := kernelR, bias := biasR }; have outS := Spec.conv2dBiasDerivSpec layerS inputS δS; have outR := Spec.conv2dBiasDerivSpec layerR inputR δR; have bT := conv2dBiasBoundTensor δR epsδ; approxT toSpec outS outR (linfNorm bT)

          Tensor-level approxT bound for the Conv2D bias gradient.

          This lifts approx_conv2d_bias_point entrywise and packages the error into linfNorm (conv2dBiasBoundTensor ...).

          theorem Proofs.RuntimeApprox.NFBackend.approxT_conv2d_kernel_deriv_spec {β : TorchLean.Floats.NeuralRadix} {fexp : ℤ → ℤ} [TorchLean.Floats.NeuralValidExp fexp] {rnd : ℝ → ℤ} [TorchLean.Floats.NeuralValidRndToNearest rnd] {inC outC kH kW stride padding inH inW : ℕ} {h1 : inC ≠ 0} {h2 : kH ≠ 0} {h3 : kW ≠ 0} {kernelS : Spec.Tensor ℝ (Spec.Shape.dim outC (Spec.Shape.dim inC (Spec.Shape.dim kH (Spec.Shape.dim kW Spec.Shape.scalar))))} {kernelR : Spec.Tensor (TorchLean.Floats.NF β fexp rnd) (Spec.Shape.dim outC (Spec.Shape.dim inC (Spec.Shape.dim kH (Spec.Shape.dim kW Spec.Shape.scalar))))} {biasS : Spec.Tensor ℝ (Spec.Shape.dim outC Spec.Shape.scalar)} {biasR : Spec.Tensor (TorchLean.Floats.NF β fexp rnd) (Spec.Shape.dim outC Spec.Shape.scalar)} {inputS : Spec.MultiChannelImage inC inH inW ℝ} {inputR : Spec.MultiChannelImage inC inH inW (TorchLean.Floats.NF β fexp rnd)} {δS : Spec.MultiChannelImage outC (conv2dOutH inH kH stride padding) (conv2dOutW inW kW stride padding) ℝ} {δR : Spec.MultiChannelImage outC (conv2dOutH inH kH stride padding) (conv2dOutW inW kW stride padding) (TorchLean.Floats.NF β fexp rnd)} {epsX epsδ : ℝ} (hX : approxT toSpec inputS inputR epsX) (hδ : approxT toSpec δS δR epsδ) :
          have layerS := { kernel := kernelS, bias := biasS }; have layerR := { kernel := kernelR, bias := biasR }; have outS := Spec.conv2dKernelDerivSpec layerS inputS δS; have outR := Spec.conv2dKernelDerivSpec layerR inputR δR; have bT := conv2dKernelBoundTensor inputR δR epsX epsδ; approxT toSpec outS outR (linfNorm bT)

          Tensor-level approxT bound for the Conv2D kernel gradient.

          This lifts approx_conv2d_kernel_point entrywise and packages the error into linfNorm (conv2dKernelBoundTensor ...).

          noncomputable def Proofs.RuntimeApprox.NFBackend.conv2dInputPointBound {β : TorchLean.Floats.NeuralRadix} {fexp : ℤ → ℤ} [TorchLean.Floats.NeuralValidExp fexp] {rnd : ℝ → ℤ} {inC outC kH kW stride padding inH inW : ℕ} (kernelR : Spec.Tensor (TorchLean.Floats.NF β fexp rnd) (Spec.Shape.dim outC (Spec.Shape.dim inC (Spec.Shape.dim kH (Spec.Shape.dim kW Spec.Shape.scalar))))) (δR : Spec.MultiChannelImage outC (conv2dOutH inH kH stride padding) (conv2dOutW inW kW stride padding) (TorchLean.Floats.NF β fexp rnd)) (epsK epsδ : ℝ) (in_ch : Fin inC) (i : Fin inH) (j : Fin inW) : ℝ

          Pointwise error bound for the Conv2D input gradient (NF runtime vs spec).

          The input-gradient entry accumulates contributions from all output channels and spatial positions that “hit” the given input coordinate under the stride/padding relation. The bound is a replay of that accumulation with per-term errors derived from epsK and epsδ.

            noncomputable def Proofs.RuntimeApprox.NFBackend.conv2dInputBoundTensor {β : TorchLean.Floats.NeuralRadix} {fexp : ℤ → ℤ} [TorchLean.Floats.NeuralValidExp fexp] {rnd : ℝ → ℤ} {inC outC kH kW stride padding inH inW : ℕ} (kernelR : Spec.Tensor (TorchLean.Floats.NF β fexp rnd) (Spec.Shape.dim outC (Spec.Shape.dim inC (Spec.Shape.dim kH (Spec.Shape.dim kW Spec.Shape.scalar))))) (δR : Spec.MultiChannelImage outC (conv2dOutH inH kH stride padding) (conv2dOutW inW kW stride padding) (TorchLean.Floats.NF β fexp rnd)) (epsK epsδ : ℝ) : Spec.MultiChannelImage inC inH inW ℝ

            Tensor-shaped input-gradient bound.

            This packages conv2dInputPointBound into the full input image shape so later approxT lemmas can use a single global bound via linfNorm.

              theorem Proofs.RuntimeApprox.NFBackend.approx_conv2d_input_point {β : TorchLean.Floats.NeuralRadix} {fexp : ℤ → ℤ} [TorchLean.Floats.NeuralValidExp fexp] {rnd : ℝ → ℤ} [TorchLean.Floats.NeuralValidRndToNearest rnd] {inC outC kH kW stride padding inH inW : ℕ} {h1 : inC ≠ 0} {h2 : kH ≠ 0} {h3 : kW ≠ 0} {kernelS : Spec.Tensor ℝ (Spec.Shape.dim outC (Spec.Shape.dim inC (Spec.Shape.dim kH (Spec.Shape.dim kW Spec.Shape.scalar))))} {kernelR : Spec.Tensor (TorchLean.Floats.NF β fexp rnd) (Spec.Shape.dim outC (Spec.Shape.dim inC (Spec.Shape.dim kH (Spec.Shape.dim kW Spec.Shape.scalar))))} {biasS : Spec.Tensor ℝ (Spec.Shape.dim outC Spec.Shape.scalar)} {biasR : Spec.Tensor (TorchLean.Floats.NF β fexp rnd) (Spec.Shape.dim outC Spec.Shape.scalar)} {inputS : Spec.MultiChannelImage inC inH inW ℝ} {inputR : Spec.MultiChannelImage inC inH inW (TorchLean.Floats.NF β fexp rnd)} {δS : Spec.MultiChannelImage outC (conv2dOutH inH kH stride padding) (conv2dOutW inW kW stride padding) ℝ} {δR : Spec.MultiChannelImage outC (conv2dOutH inH kH stride padding) (conv2dOutW inW kW stride padding) (TorchLean.Floats.NF β fexp rnd)} {epsK epsδ : ℝ} (hK : approxT toSpec kernelS kernelR epsK) (hδ : approxT toSpec δS δR epsδ) (in_ch : Fin inC) (i : Fin inH) (j : Fin inW) :
              have layerS := { kernel := kernelS, bias := biasS }; have layerR := { kernel := kernelR, bias := biasR }; |toSpec (Spec.getAtOrZero (Spec.conv2dInputDerivSpec layerR inputR δR) [in_ch, i, j]) - Spec.getAtOrZero (Spec.conv2dInputDerivSpec layerS inputS δS) [in_ch, i, j]| ≤ conv2dInputPointBound kernelR δR epsK epsδ in_ch i j

              Soundness of the Conv2D input-gradient pointwise bound.

              Given approxT hypotheses for the kernel and upstream gradient (grad_output), this shows the spec input-gradient entry is approximated by the NF runtime entry within conv2dInputPointBound.

              theorem Proofs.RuntimeApprox.NFBackend.approxT_conv2d_input_deriv_spec {β : TorchLean.Floats.NeuralRadix} {fexp : ℤ → ℤ} [TorchLean.Floats.NeuralValidExp fexp] {rnd : ℝ → ℤ} [TorchLean.Floats.NeuralValidRndToNearest rnd] {inC outC kH kW stride padding inH inW : ℕ} {h1 : inC ≠ 0} {h2 : kH ≠ 0} {h3 : kW ≠ 0} {kernelS : Spec.Tensor ℝ (Spec.Shape.dim outC (Spec.Shape.dim inC (Spec.Shape.dim kH (Spec.Shape.dim kW Spec.Shape.scalar))))} {kernelR : Spec.Tensor (TorchLean.Floats.NF β fexp rnd) (Spec.Shape.dim outC (Spec.Shape.dim inC (Spec.Shape.dim kH (Spec.Shape.dim kW Spec.Shape.scalar))))} {biasS : Spec.Tensor ℝ (Spec.Shape.dim outC Spec.Shape.scalar)} {biasR : Spec.Tensor (TorchLean.Floats.NF β fexp rnd) (Spec.Shape.dim outC Spec.Shape.scalar)} {inputS : Spec.MultiChannelImage inC inH inW ℝ} {inputR : Spec.MultiChannelImage inC inH inW (TorchLean.Floats.NF β fexp rnd)} {δS : Spec.MultiChannelImage outC (conv2dOutH inH kH stride padding) (conv2dOutW inW kW stride padding) ℝ} {δR : Spec.MultiChannelImage outC (conv2dOutH inH kH stride padding) (conv2dOutW inW kW stride padding) (TorchLean.Floats.NF β fexp rnd)} {epsK epsδ : ℝ} (hK : approxT toSpec kernelS kernelR epsK) (hδ : approxT toSpec δS δR epsδ) :
              have layerS := { kernel := kernelS, bias := biasS }; have layerR := { kernel := kernelR, bias := biasR }; have outS := Spec.conv2dInputDerivSpec layerS inputS δS; have outR := Spec.conv2dInputDerivSpec layerR inputR δR; have bT := conv2dInputBoundTensor kernelR δR epsK epsδ; approxT toSpec outS outR (linfNorm bT)

              Tensor-level approxT bound for the Conv2D input gradient.

              This lifts approx_conv2d_input_point entrywise and packages the error into linfNorm (conv2dInputBoundTensor ...).

              theorem Proofs.RuntimeApprox.NFBackend.idx_i_ne_of_shape_ne {Γ : List Spec.Shape} {s₁ s₂ : Spec.Shape} (a : Idx Γ s₁) (b : Idx Γ s₂) (hs : s₁ ≠ s₂) :
              a.i ≠ b.i
              noncomputable def Proofs.RuntimeApprox.NFBackend.conv2dRevNode {β : TorchLean.Floats.NeuralRadix} {fexp : ℤ → ℤ} [TorchLean.Floats.NeuralValidExp fexp] {rnd : ℝ → ℤ} [TorchLean.Floats.NeuralValidRndToNearest rnd] {Γ : List Spec.Shape} {inC outC kH kW stride padding inH inW : ℕ} {h1 : inC ≠ 0} {h2 : kH ≠ 0} {h3 : kW ≠ 0} (kernelIdx : Idx Γ (Spec.Shape.dim outC (Spec.Shape.dim inC (Spec.Shape.dim kH (Spec.Shape.dim kW Spec.Shape.scalar))))) (biasIdx : Idx Γ (Spec.Shape.dim outC Spec.Shape.scalar)) (inputIdx : Idx Γ (Spec.Shape.dim inC (Spec.Shape.dim inH (Spec.Shape.dim inW Spec.Shape.scalar)))) :
              RevNode toSpec Γ (Spec.Shape.dim outC (Spec.Shape.dim (conv2dOutH inH kH stride padding) (Spec.Shape.dim (conv2dOutW inW kW stride padding) Spec.Shape.scalar)))

              Package Conv2D forward + backward (VJP) as a RevNode for RevGraph.backprop_approx.

              The node uses the forward bound from conv2d_forward and the three gradient bounds proved in this file (kernel/bias/input) to provide a compositional reverse-mode approximation theorem.
