TorchLean API

NN.Spec.Layers.Conv

Conv (generic N-D, spec layer) #

This file defines a channels-first N-D convolution and transpose-convolution spec over a single sample (no batch dim), generalizing Conv1D/Conv2D/Conv3D and ConvTranspose{d}d to an arbitrary spatial rank d.

PyTorch analogy: this corresponds to torch.nn.Conv{d}d with:

For each axis a : Fin d:

out[a] = (in[a] + 2*padding[a] - kernel[a]) / stride[a] + 1

The weight tensor has shape (outC × inC × kernel[0] × ... × kernel[d-1]) and the bias has shape (outC).

Implementation notes:

Small generic helpers #

def Spec.Private.tensorOfFnList {α : Type} (dims : List ) (f : List α) :
Instances For
    def Spec.Private.foldlIndices {β : Type} (dims : List ) (init : β) (f : βList β) :
    β
    Instances For
      def Spec.Private.foldlIndices.go {β : Type} (f : βList β) (dims prefixRev : List ) (acc : β) :
      β
      Instances For
        def Spec.Private.mkInputIdx? (outIdx kIdx stride padding : List ) :

        Given:

        • an output index tuple outIdx,
        • a kernel index tuple kIdx,
        • per-axis stride and padding, compute the corresponding input index tuple (into the unpadded input), or return none if we are in the left/top/front padding region on some axis.

        Right/bottom/back padding is handled by get_at_or_zero when the computed index is out of bounds.

        Instances For
          def Spec.Private.mkTransposeInputIdx? (outIdx kIdx stride padding : List ) :

          Given:

          • an output index tuple outIdx,
          • a kernel index tuple kIdx,
          • per-axis stride and padding, compute the corresponding input index tuple for transpose convolution, or none if the equality out + padding = in * stride + k cannot be satisfied on some axis.

          Implementation detail: for each axis we solve

          in = (out + padding - k) / stride

          and require divisibility (% stride = 0) plus out + padding ≥ k. Out-of-bounds input indices are handled by get_at_or_zero at the call site.

          Instances For
            def Spec.Private.matchesInputPos (outIdx kIdx stride padding inIdx : List ) :
            Instances For

              Spec definition #

              structure Spec.ConvSpec (d inC outC : ) (kernel stride padding : Vector d) (α : Type) :

              Parameters for a generic N-D convolution (weights + bias), channels-first.

              Instances For
                def Spec.convOutDim (inDim kDim stride padding : ) :

                Output size along a single axis (PyTorch formula, floor division).

                Instances For
                  def Spec.convOutSpatial {d : } (inSpatial kernel stride padding : Vector d) :

                  Output spatial sizes (Vector Nat d).

                  Instances For
                    def Spec.convOutShape {d : } (inSpatial kernel stride padding : Vector d) :

                    Output spatial shape Shape.ofList [out0, ..., out(d-1)].

                    Instances For
                      def Spec.convMultiOutShape {d : } (_inC outC : ) (inSpatial kernel stride padding : Vector d) :

                      Output shape including channels: Shape.ofList (outC :: [out0, ..., out(d-1)]).

                      Instances For
                        def Spec.convSpec {α : Type} [Context α] {d inC outC : } {kernel stride padding inSpatial : Vector d} (layer : ConvSpec d inC outC kernel stride padding α) (input : Tensor α (Shape.ofList (inC :: inSpatial.toList))) :
                        Tensor α (Shape.ofList (outC :: (convOutSpatial inSpatial kernel stride padding).toList))

                        Generic N-D convolution forward pass on a single channels-first input (no batch dimension).

                        Mathematically, for output channel oc and output spatial index o : Vector Nat d:

                        y[oc,o] = Σ_{ic, k} x_pad[ic, o*stride + k] * W[oc,ic,k] + b[oc]

                        where k ranges over the kernel window and x_pad is input with zero-padding.

                        Instances For
                          def Spec.convKernelDerivSpec {α : Type} [Context α] {d inC outC : } {kernel stride padding inSpatial : Vector d} (_layer : ConvSpec d inC outC kernel stride padding α) (input : Tensor α (Shape.ofList (inC :: inSpatial.toList))) (grad_output : Tensor α (Shape.ofList (outC :: (convOutSpatial inSpatial kernel stride padding).toList))) :
                          Tensor α (Shape.ofList (outC :: inC :: kernel.toList))

                          Gradient of convolution output w.r.t. the kernel weights (given grad_output).

                          Instances For
                            def Spec.convBiasDerivSpec {α : Type} [Context α] {d inC outC : } {kernel stride padding inSpatial : Vector d} (_layer : ConvSpec d inC outC kernel stride padding α) (_input : Tensor α (Shape.ofList (inC :: inSpatial.toList))) (grad_output : Tensor α (Shape.ofList (outC :: (convOutSpatial inSpatial kernel stride padding).toList))) :

                            Gradient of convolution output w.r.t. the bias (sum over spatial positions).

                            Instances For
                              def Spec.convInputDerivSpec {α : Type} [Context α] {d inC outC : } {kernel stride padding inSpatial : Vector d} (layer : ConvSpec d inC outC kernel stride padding α) (_input : Tensor α (Shape.ofList (inC :: inSpatial.toList))) (grad_output : Tensor α (Shape.ofList (outC :: (convOutSpatial inSpatial kernel stride padding).toList))) :
                              Tensor α (Shape.ofList (inC :: inSpatial.toList))

                              Gradient of convolution output w.r.t. the input (the "input-gradient" / transpose-convolution map).

                              This mirrors conv{1,2,3}d_input_deriv_spec but for arbitrary spatial rank d.

                              Instances For
                                def Spec.convBackwardSpec {α : Type} [Context α] {d inC outC : } {kernel stride padding inSpatial : Vector d} (layer : ConvSpec d inC outC kernel stride padding α) (input : Tensor α (Shape.ofList (inC :: inSpatial.toList))) (grad_output : Tensor α (Shape.ofList (outC :: (convOutSpatial inSpatial kernel stride padding).toList))) :
                                Tensor α (Shape.ofList (outC :: inC :: kernel.toList)) × Tensor α (Shape.dim outC Shape.scalar) × Tensor α (Shape.ofList (inC :: inSpatial.toList))

                                Convolution backward pass: returns (dKernel, dBias, dInput).

                                Instances For

                                  ConvTranspose (generic N-D) #

                                  structure Spec.ConvTransposeSpec (d inC outC : ) (kernel stride padding : Vector d) (α : Type) :

                                  Parameters for a generic N-D transpose convolution (weights + bias), channels-first.

                                  PyTorch analogy: this is torch.nn.ConvTranspose{d}d with:

                                  • output_padding = 0,
                                  • dilation = 1,
                                  • groups = 1,
                                  • per-axis stride and padding,
                                  • and weight layout (inC, outC, k0, ..., k(d-1)).
                                  Instances For
                                    def Spec.convTransposeOutDim (inDim kDim stride padding : ) :

                                    Output size along a single axis for transpose convolution (output_padding = 0).

                                    Instances For
                                      def Spec.convTransposeOutSpatial {d : } (inSpatial kernel stride padding : Vector d) :

                                      Output spatial sizes (Vector Nat d) for transpose convolution (output_padding = 0).

                                      Instances For
                                        def Spec.convTransposeOutShape {d : } (inSpatial kernel stride padding : Vector d) :

                                        Output spatial shape Shape.ofList [out0, ..., out(d-1)] (transpose convolution).

                                        Instances For
                                          def Spec.convTransposeMultiOutShape {d : } (_inC outC : ) (inSpatial kernel stride padding : Vector d) :

                                          Output shape including channels: Shape.ofList (outC :: [out0, ..., out(d-1)]).

                                          Instances For
                                            def Spec.convTransposeSpec {α : Type} [Context α] {d inC outC : } {kernel stride padding inSpatial : Vector d} (layer : ConvTransposeSpec d inC outC kernel stride padding α) (input : Tensor α (Shape.ofList (inC :: inSpatial.toList))) :
                                            Tensor α (Shape.ofList (outC :: (convTransposeOutSpatial inSpatial kernel stride padding).toList))

                                            Generic N-D transpose convolution forward pass on a single channels-first input (no batch dim).

                                            For output channel oc and output spatial index o : Vector Nat d we define:

                                            y[oc,o] = Σ_{ic, k} x[ic, (o + padding - k) / stride] * W[ic,oc,k] + b[oc]

                                            where each axis must satisfy out + padding ≥ k and divisibility by stride (% stride = 0).

                                            Instances For
                                              def Spec.convTransposeKernelDerivSpec {α : Type} [Context α] {d inC outC : } {kernel stride padding inSpatial : Vector d} (input : Tensor α (Shape.ofList (inC :: inSpatial.toList))) (grad_output : Tensor α (Shape.ofList (outC :: (convTransposeOutSpatial inSpatial kernel stride padding).toList))) :
                                              Tensor α (Shape.ofList (inC :: outC :: kernel.toList))

                                              Gradient of transpose convolution output w.r.t. the kernel weights (given grad_output).

                                              Instances For
                                                def Spec.convTransposeBiasDerivSpec {α : Type} [Context α] {d outC : } {kernel stride padding inSpatial : Vector d} (grad_output : Tensor α (Shape.ofList (outC :: (convTransposeOutSpatial inSpatial kernel stride padding).toList))) :

                                                Gradient of transpose convolution output w.r.t. the bias (sum over spatial positions).

                                                Instances For
                                                  def Spec.convTransposeInputDerivSpec {α : Type} [Context α] {d inC outC : } {kernel stride padding inSpatial : Vector d} (weights : Tensor α (Shape.ofList (inC :: outC :: kernel.toList))) (grad_output : Tensor α (Shape.ofList (outC :: (convTransposeOutSpatial inSpatial kernel stride padding).toList))) :
                                                  Tensor α (Shape.ofList (inC :: inSpatial.toList))

                                                  Gradient of transpose convolution output w.r.t. the input (given grad_output).

                                                  Instances For
                                                    def Spec.convTransposeBackwardSpec {α : Type} [Context α] {d inC outC : } {kernel stride padding inSpatial : Vector d} (layer : ConvTransposeSpec d inC outC kernel stride padding α) (input : Tensor α (Shape.ofList (inC :: inSpatial.toList))) (grad_output : Tensor α (Shape.ofList (outC :: (convTransposeOutSpatial inSpatial kernel stride padding).toList))) :
                                                    Tensor α (Shape.ofList (inC :: outC :: kernel.toList)) × Tensor α (Shape.dim outC Shape.scalar) × Tensor α (Shape.ofList (inC :: inSpatial.toList))

                                                    Transpose convolution backward pass: returns (dKernel, dBias, dInput).

                                                    Instances For

                                                      2D specializations (compat) #

                                                      TorchLean also exposes compatibility names for 2D convolution specs. They are implemented as specializations of the generic N-D convolution spec above.

                                                      @[reducible, inline]
                                                      abbrev Spec.Conv2DSpec (inC outC kH kW stride padding : ) (α : Type) (h1 : inC 0) (h2 : kH 0) (h3 : kW 0) :

                                                      Parameters for a 2D convolution: this is ConvSpec specialized to d = 2.

                                                      Instances For
                                                        def Spec.conv2dOutShape (inH inW kH kW stride padding : ) :

                                                        Output spatial shape (outH,outW) for a Conv2D with given hyperparameters.

                                                        Instances For
                                                          def Spec.conv2dMultiOutShape (_inC outC inH inW kH kW stride padding : ) :

                                                          Output shape including channels: (outC,outH,outW).

                                                          Instances For
                                                            theorem Spec.Private.conv2d_multi_out_shape_eq (outC inH inW kH kW stride padding : ) :
                                                            Shape.ofList (outC :: (convOutSpatial #v[inH, inW] #v[kH, kW] #v[stride, stride] #v[padding, padding]).toList) = Shape.dim outC (Shape.dim ((inH + 2 * padding - kH) / stride + 1) (Shape.dim ((inW + 2 * padding - kW) / stride + 1) Shape.scalar))
                                                            def Spec.conv2dSpec {α : Type} [Context α] {inC outC kH kW stride padding inH inW : } {h1 : inC 0} {h2 : kH 0} {h3 : kW 0} (layer : Conv2DSpec inC outC kH kW stride padding α h1 h2 h3) (input : MultiChannelImage inC inH inW α) :
                                                            MultiChannelImage outC ((inH + 2 * padding - kH) / stride + 1) ((inW + 2 * padding - kW) / stride + 1) α

                                                            Conv2D forward pass on a single image C×H×W (no batch dimension).

                                                            PyTorch note: this matches the usual Conv2d definition (cross-correlation form, i.e. the kernel is not spatially flipped).

                                                            Instances For
                                                              def Spec.conv2dKernelDerivSpec {α : Type} [Context α] {inC outC kH kW stride padding inH inW : } {h1 : inC 0} {h2 : kH 0} {h3 : kW 0} (layer : Conv2DSpec inC outC kH kW stride padding α h1 h2 h3) (input : MultiChannelImage inC inH inW α) (grad_output : MultiChannelImage outC ((inH + 2 * padding - kH) / stride + 1) ((inW + 2 * padding - kW) / stride + 1) α) :

                                                              Gradient of Conv2D output w.r.t. the kernel weights (given grad_output).

                                                              Instances For
                                                                def Spec.conv2dBiasDerivSpec {α : Type} [Context α] {inC outC kH kW stride padding inH inW : } {h1 : inC 0} {h2 : kH 0} {h3 : kW 0} (layer : Conv2DSpec inC outC kH kW stride padding α h1 h2 h3) (input : MultiChannelImage inC inH inW α) (grad_output : MultiChannelImage outC ((inH + 2 * padding - kH) / stride + 1) ((inW + 2 * padding - kW) / stride + 1) α) :

                                                                Gradient of Conv2D output w.r.t. the bias (sum over spatial positions).

                                                                Instances For
                                                                  def Spec.conv2dInputDerivSpec {α : Type} [Context α] {inC outC kH kW stride padding inH inW : } {h1 : inC 0} {h2 : kH 0} {h3 : kW 0} (layer : Conv2DSpec inC outC kH kW stride padding α h1 h2 h3) (input : MultiChannelImage inC inH inW α) (grad_output : MultiChannelImage outC ((inH + 2 * padding - kH) / stride + 1) ((inW + 2 * padding - kW) / stride + 1) α) :
                                                                  MultiChannelImage inC inH inW α

                                                                  Gradient of Conv2D output w.r.t. the input image.

                                                                  Instances For
                                                                    def Spec.conv2dBackwardSpec {α : Type} [Context α] {inC outC kH kW stride padding inH inW : } {h1 : inC 0} {h2 : kH 0} {h3 : kW 0} (layer : Conv2DSpec inC outC kH kW stride padding α h1 h2 h3) (input : MultiChannelImage inC inH inW α) (grad_output : MultiChannelImage outC ((inH + 2 * padding - kH) / stride + 1) ((inW + 2 * padding - kW) / stride + 1) α) :

                                                                    Conv2D backward pass: returns (dKernel, dBias, dInput).

                                                                    Instances For

                                                                      ConvTranspose2D (compat) #

                                                                      We keep the transpose-convolution names for compatibility, but implement them as specializations of the generic N-D transpose convolution spec above.

                                                                      @[reducible, inline]
                                                                      abbrev Spec.ConvTransposeKernel (outC inC kH kW : ) (α : Type) :

                                                                      Kernel layout for transpose-convolution: (inC, outC, kH, kW).

                                                                      Instances For
                                                                        structure Spec.ConvTranspose2DSpec (inC outC kH kW stride padding : ) (α : Type) (h1 : inC > 0) (h2 : kH 0) (h3 : kW 0) :

                                                                        Parameters for a 2D transpose convolution.

                                                                        Instances For
                                                                          def Spec.convTranspose2dOutShape (inH inW kH kW stride padding : ) :

                                                                          Output spatial shape (outH,outW) for ConvTranspose2d (with output_padding = 0).

                                                                          Instances For
                                                                            def Spec.convTranspose2dMultiOutShape (_inC outC inH inW kH kW stride padding : ) :

                                                                            Output shape including channels: (outC,outH,outW).

                                                                            Instances For
                                                                              theorem Spec.Private.conv_transpose2d_multi_out_shape_eq (outC inH inW kH kW stride padding : ) :
                                                                              Shape.ofList (outC :: (convTransposeOutSpatial #v[inH, inW] #v[kH, kW] #v[stride, stride] #v[padding, padding]).toList) = Shape.dim outC (Shape.dim ((inH - 1) * stride - 2 * padding + kH) (Shape.dim ((inW - 1) * stride - 2 * padding + kW) Shape.scalar))
                                                                              def Spec.convTranspose2dSpec {α : Type} [Context α] {inC outC kH kW stride padding inH inW : } {h1 : inC > 0} {h2 : kH 0} {h3 : kW 0} (layer : ConvTranspose2DSpec inC outC kH kW stride padding α h1 h2 h3) (input : MultiChannelImage inC inH inW α) :
                                                                              MultiChannelImage outC ((inH - 1) * stride - 2 * padding + kH) ((inW - 1) * stride - 2 * padding + kW) α

                                                                              ConvTranspose2D forward pass on a single image C×H×W (no batch dimension).

                                                                              This is written as an output-indexed sum (no in-place updates), matching the standard definition.

                                                                              Instances For
                                                                                def Spec.convTranspose2dWeightsDerivSpec {α : Type} [Context α] {inC outC kH kW stride padding inH inW : } {_h1 : inC > 0} {_h2 : kH 0} {_h3 : kW 0} (input : MultiChannelImage inC inH inW α) (grad_output : MultiChannelImage outC ((inH - 1) * stride - 2 * padding + kH) ((inW - 1) * stride - 2 * padding + kW) α) :
                                                                                ConvTransposeKernel outC inC kH kW α

                                                                                Gradient of ConvTranspose2D output w.r.t. the kernel weights.

                                                                                Instances For
                                                                                  def Spec.convTranspose2dBiasDerivSpec {α : Type} [Context α] {_inC outC kH kW stride padding inH inW : } (grad_output : MultiChannelImage outC ((inH - 1) * stride - 2 * padding + kH) ((inW - 1) * stride - 2 * padding + kW) α) :

                                                                                  Gradient of ConvTranspose2D output w.r.t. the bias (sum over spatial positions).

                                                                                  Instances For
                                                                                    def Spec.convTranspose2dInputDerivSpec {α : Type} [Context α] {inC outC kH kW stride padding inH inW : } {_h1 : inC > 0} {_h2 : kH 0} {_h3 : kW 0} (weights : ConvTransposeKernel outC inC kH kW α) (grad_output : MultiChannelImage outC ((inH - 1) * stride - 2 * padding + kH) ((inW - 1) * stride - 2 * padding + kW) α) :
                                                                                    MultiChannelImage inC inH inW α

                                                                                    Gradient of ConvTranspose2D output w.r.t. the input image.

                                                                                    Instances For
                                                                                      def Spec.convTranspose2dBackwardSpec {α : Type} [Context α] {inC outC kH kW stride padding inH inW : } {h1 : inC > 0} {h2 : kH 0} {h3 : kW 0} (layer : ConvTranspose2DSpec inC outC kH kW stride padding α h1 h2 h3) (input : MultiChannelImage inC inH inW α) (grad_output : MultiChannelImage outC ((inH - 1) * stride - 2 * padding + kH) ((inW - 1) * stride - 2 * padding + kW) α) :
                                                                                      ConvTransposeKernel outC inC kH kW α × Tensor α (Shape.dim outC Shape.scalar) × MultiChannelImage inC inH inW α

                                                                                      ConvTranspose2D backward pass: returns (dKernel, dBias, dInput).

                                                                                      Instances For

                                                                                        Friendly aliases #

                                                                                        We keep the _spec suffixes as the canonical names, but provide a few shorthands so call sites can read more like PyTorch-style pseudocode.

                                                                                        @[reducible, inline]
                                                                                        abbrev Spec.conv {α : Type} [Context α] {d inC outC : } {kernel stride padding inSpatial : Vector d} (layer : ConvSpec d inC outC kernel stride padding α) (input : Tensor α (Shape.ofList (inC :: inSpatial.toList))) :
                                                                                        Tensor α (Shape.ofList (outC :: (convOutSpatial inSpatial kernel stride padding).toList))

                                                                                        Alias for conv_spec.

                                                                                        Instances For
                                                                                          @[reducible, inline]
                                                                                          abbrev Spec.convBackward {α : Type} [Context α] {d inC outC : } {kernel stride padding inSpatial : Vector d} (layer : ConvSpec d inC outC kernel stride padding α) (input : Tensor α (Shape.ofList (inC :: inSpatial.toList))) (grad_output : Tensor α (Shape.ofList (outC :: (convOutSpatial inSpatial kernel stride padding).toList))) :
                                                                                          Tensor α (Shape.ofList (outC :: inC :: kernel.toList)) × Tensor α (Shape.dim outC Shape.scalar) × Tensor α (Shape.ofList (inC :: inSpatial.toList))

                                                                                          Alias for conv_backward_spec.

                                                                                          Instances For
                                                                                            @[reducible, inline]
                                                                                            abbrev Spec.convKernelDeriv {α : Type} [Context α] {d inC outC : } {kernel stride padding inSpatial : Vector d} (_layer : ConvSpec d inC outC kernel stride padding α) (input : Tensor α (Shape.ofList (inC :: inSpatial.toList))) (grad_output : Tensor α (Shape.ofList (outC :: (convOutSpatial inSpatial kernel stride padding).toList))) :
                                                                                            Tensor α (Shape.ofList (outC :: inC :: kernel.toList))

                                                                                            Alias for conv_kernel_deriv_spec.

                                                                                            Instances For
                                                                                              @[reducible, inline]
                                                                                              abbrev Spec.convBiasDeriv {α : Type} [Context α] {d inC outC : } {kernel stride padding inSpatial : Vector d} (_layer : ConvSpec d inC outC kernel stride padding α) (_input : Tensor α (Shape.ofList (inC :: inSpatial.toList))) (grad_output : Tensor α (Shape.ofList (outC :: (convOutSpatial inSpatial kernel stride padding).toList))) :

                                                                                              Alias for conv_bias_deriv_spec.

                                                                                              Instances For
                                                                                                @[reducible, inline]
                                                                                                abbrev Spec.convInputDeriv {α : Type} [Context α] {d inC outC : } {kernel stride padding inSpatial : Vector d} (layer : ConvSpec d inC outC kernel stride padding α) (_input : Tensor α (Shape.ofList (inC :: inSpatial.toList))) (grad_output : Tensor α (Shape.ofList (outC :: (convOutSpatial inSpatial kernel stride padding).toList))) :
                                                                                                Tensor α (Shape.ofList (inC :: inSpatial.toList))

                                                                                                Alias for conv_input_deriv_spec.

                                                                                                Instances For