TorchLean API

NN.Spec.Layers.Pooling

Pooling layers (spec layer) #

This file defines a small set of pooling operators in TorchLean's spec layer.

PyTorch analogies are noted in each docstring below (e.g. F.max_pool2d for the hard max-pooling spec).

We also include a smooth log-sum-exp surrogate for max pooling. It is useful when you want an everywhere-differentiable approximation for proofs or analysis, without changing the rest of the pooling API.

structure Spec.MaxPool2DSpec (kH kW stride : ℕ) (h1 : kH ≠ 0) (h2 : kW ≠ 0) (hStride : stride ≠ 0) :

MaxPool2d configuration.

The spec uses a fixed kernel (kH,kW) and a single stride value (applied to both height and width). We require kH ≠ 0, kW ≠ 0, and stride ≠ 0 so windows are nonempty and the output-shape arithmetic is well-defined.

PyTorch analogy: F.max_pool2d(x, kernel_size=(kH,kW), stride=stride).

  • kernelHeight : ℕ

    Kernel height.

  • kernelWidth : ℕ

    Kernel width.

  • stride : ℕ

    Stride.

Instances For
    structure Spec.AvgPool2DSpec (kH kW stride : ℕ) (h1 : kH ≠ 0) (h2 : kW ≠ 0) (hStride : stride ≠ 0) :

    AvgPool2d configuration.

    We treat the pooling window as kH*kW elements and divide by that count. This corresponds to PyTorch's default behavior when no padding is present.

    • kernelHeight : ℕ

      Kernel height.

    • kernelWidth : ℕ

      Kernel width.

    • stride : ℕ

      Stride.

    Instances For
      def Spec.pool2dOutShape (inH inW kH kW stride : ℕ) :

      Output shape for a 2D pooling op (single-channel) with no padding.

      This uses the standard "valid" pooling formula:

      outH = floor((inH - kH)/stride) + 1, outW = floor((inW - kW)/stride) + 1.

      PyTorch analogy: ceil_mode=false with no padding.
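As a quick sanity check of this formula, here is a minimal Python sketch; the helper name pool2d_out_shape is illustrative and not part of TorchLean:

```python
def pool2d_out_shape(in_h, in_w, k_h, k_w, stride):
    """'Valid' pooling output size: floor((in - k)/stride) + 1 per axis."""
    return ((in_h - k_h) // stride + 1, (in_w - k_w) // stride + 1)

# A 2x2 kernel with stride 2 halves a 6x8 input.
print(pool2d_out_shape(6, 8, 2, 2, 2))  # (3, 4)
```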

      Instances For
        def Spec.pool2dMultiOutShape (inC inH inW kH kW stride : ℕ) :

        Output shape for multi-channel 2D pooling (channels preserved).

        Instances For
          def Spec.pool2dOutShapePad (inH inW kH kW stride padding : ℕ) :

          Output shape for a 2D pooling op (single-channel) with symmetric padding.

          padding means we use the usual PyTorch output-size formula for an input extended by padding cells on each side. Hard max-pooling ignores padded cells (the PyTorch -∞ convention), while average-pooling below explicitly includes padded zeros.
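The padded variant only changes the numerator of the formula above. A small Python sketch (helper name illustrative, not a TorchLean definition):

```python
def pool2d_out_shape_pad(in_h, in_w, k_h, k_w, stride, padding):
    """Output size with symmetric padding: floor((in + 2p - k)/stride) + 1."""
    out_h = (in_h + 2 * padding - k_h) // stride + 1
    out_w = (in_w + 2 * padding - k_w) // stride + 1
    return (out_h, out_w)

# padding=1 keeps a 4x4 input at 4x4 under a 3x3 kernel with stride 1.
print(pool2d_out_shape_pad(4, 4, 3, 3, 1, 1))  # (4, 4)
```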

          Instances For
            def Spec.pool2dMultiOutShapePad (inC inH inW kH kW stride padding : ℕ) :

            Output shape for multi-channel 2D pooling with symmetric padding (channels preserved).

            Instances For

              Smooth max pooling #

              max_pool2d_spec uses max and is non-differentiable (ties and kink points).

              For proofs that need everywhere differentiability, we provide a smooth surrogate based on log-sum-exp over each pooling window:

              smooth_max(x₁,…,xₙ) = (1 / β) * log (∑ exp (β * xᵢ))

              This is the standard log-sum-exp surrogate and is intended for β ≠ 0.
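The surrogate can be sketched in plain Python; the max-subtraction below is the usual numerical-stability trick and is an implementation choice of this sketch, not necessarily of the spec:

```python
import math

def smooth_max(xs, beta):
    """Log-sum-exp surrogate: (1/beta) * log(sum(exp(beta * x)))."""
    m = max(xs)  # subtract the max before exponentiating, for stability
    return m + math.log(sum(math.exp(beta * (x - m)) for x in xs)) / beta

xs = [0.1, 0.7, 0.3]
for beta in (1.0, 10.0, 100.0):
    print(beta, smooth_max(xs, beta))
# As beta grows, the value approaches max(xs) = 0.7 from above.
```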

              def Spec.smoothMaxPool2dSpec {α : Type} [Context α] {kH kW inH inW stride : ℕ} {h1 : kH ≠ 0} {h2 : kW ≠ 0} {hStride : stride ≠ 0} (layer : MaxPool2DSpec kH kW stride h1 h2 hStride) (beta : α) (input : Image inH inW α) :
              Image ((inH - kH) / stride + 1) ((inW - kW) / stride + 1) α

              Smooth max-pooling (single-channel) using a log-sum-exp surrogate.

              This is useful in proof settings that want a differentiable alternative to max_pool2d_spec. For large beta, the output approaches hard max pooling.

              Instances For
                def Spec.smoothMaxPool2dMultiSpec {α : Type} [Context α] {kH kW inH inW inC stride : ℕ} {h1 : kH ≠ 0} {h2 : kW ≠ 0} {hStride : stride ≠ 0} (layer : MaxPool2DSpec kH kW stride h1 h2 hStride) (beta : α) (input : MultiChannelImage inC inH inW α) :
                MultiChannelImage inC ((inH - kH) / stride + 1) ((inW - kW) / stride + 1) α

                Smooth max-pooling (multi-channel): apply smooth_max_pool2d_spec per channel.

                Instances For
                  def Spec.smoothMaxPool2dJvpSpec {α : Type} [Context α] {kH kW inH inW stride : ℕ} {h1 : kH ≠ 0} {h2 : kW ≠ 0} {hStride : stride ≠ 0} (layer : MaxPool2DSpec kH kW stride h1 h2 hStride) (beta : α) (input tangent : Image inH inW α) :
                  Image ((inH - kH) / stride + 1) ((inW - kW) / stride + 1) α

                  Forward-mode JVP for smooth max-pooling (single-channel).

                  For each pooling window this is the differential of the log-sum-exp surrogate, Σᵢ softmax(beta*xᵢ) * dxᵢ. This mirrors the VJP weights below but pushes an input tangent forward instead of pulling an output cotangent backward.
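The window-level derivative Σᵢ softmax(beta·xᵢ)·dxᵢ can be checked against a finite difference in plain Python; all names here are illustrative, not TorchLean definitions:

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def smooth_max(xs, beta):
    m = max(xs)
    return m + math.log(sum(math.exp(beta * (x - m)) for x in xs)) / beta

def smooth_max_jvp(xs, dxs, beta):
    """Directional derivative of smooth_max: sum_i softmax(beta*x)_i * dx_i."""
    w = softmax([beta * x for x in xs])
    return sum(wi * dxi for wi, dxi in zip(w, dxs))

xs, dxs, beta, eps = [0.2, 0.5, 0.1], [1.0, -2.0, 0.5], 4.0, 1e-6
fd = (smooth_max([x + eps * d for x, d in zip(xs, dxs)], beta)
      - smooth_max(xs, beta)) / eps
print(abs(fd - smooth_max_jvp(xs, dxs, beta)) < 1e-4)  # True
```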

                  Instances For
                    def Spec.smoothMaxPool2dMultiJvpSpec {α : Type} [Context α] {kH kW inH inW inC stride : ℕ} {h1 : kH ≠ 0} {h2 : kW ≠ 0} {hStride : stride ≠ 0} (layer : MaxPool2DSpec kH kW stride h1 h2 hStride) (beta : α) (input tangent : MultiChannelImage inC inH inW α) :
                    MultiChannelImage inC ((inH - kH) / stride + 1) ((inW - kW) / stride + 1) α

                    Multi-channel JVP for smooth max-pooling (channel-wise application).

                    Instances For
                      def Spec.maxPool2dSpec {α : Type} [Context α] {kH kW inH inW stride : ℕ} {h1 : kH ≠ 0} {h2 : kW ≠ 0} {hStride : stride ≠ 0} (layer : MaxPool2DSpec kH kW stride h1 h2 hStride) (input : Image inH inW α) :
                      Image ((inH - kH) / stride + 1) ((inW - kW) / stride + 1) α

                      MaxPool2d forward pass (single-channel).

                      This takes the maximum over each kH×kW window sampled with the given stride. The return type encodes the standard output spatial size formula.

                      Instances For
                        def Spec.maxPool2dMultiSpec {α : Type} [Context α] {kH kW inH inW inC stride : ℕ} {h1 : kH ≠ 0} {h2 : kW ≠ 0} {hStride : stride ≠ 0} (layer : MaxPool2DSpec kH kW stride h1 h2 hStride) (input : MultiChannelImage inC inH inW α) :
                        MultiChannelImage inC ((inH - kH) / stride + 1) ((inW - kW) / stride + 1) α

                        MaxPool2d forward pass (multi-channel): apply max_pool2d_spec per channel.

                        Instances For
                          def Spec.maxPool2dJvpSpec {α : Type} [Context α] {kH kW inH inW stride : ℕ} {h1 : kH ≠ 0} {h2 : kW ≠ 0} {hStride : stride ≠ 0} (layer : MaxPool2DSpec kH kW stride h1 h2 hStride) (input tangent : Image inH inW α) :
                          Image ((inH - kH) / stride + 1) ((inW - kW) / stride + 1) α

                          Forward-mode JVP for hard max-pooling (single-channel).

                          The tangent is read at the argmax chosen by the primal input. At ties the first row-major maximizer is used, matching maxPool2dBackwardSpec and PyTorch's index convention.
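A one-window Python sketch of this rule (illustrative names; a tie between two maxima resolves to the first row-major position):

```python
def argmax_first(window):
    """Index of the first row-major maximizer (ties -> earliest index)."""
    flat = [v for row in window for v in row]
    return flat.index(max(flat))

def max_pool_window_jvp(window, tangent):
    """Read the tangent at the argmax chosen by the primal window."""
    i = argmax_first(window)
    flat_t = [v for row in tangent for v in row]
    return flat_t[i]

window = [[3.0, 5.0], [5.0, 1.0]]   # tie between positions 1 and 2
tangent = [[0.1, 0.2], [0.3, 0.4]]
print(max_pool_window_jvp(window, tangent))  # 0.2 (first maximizer wins)
```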

                          Instances For
                            def Spec.maxPool2dMultiJvpSpec {α : Type} [Context α] {kH kW inH inW inC stride : ℕ} {h1 : kH ≠ 0} {h2 : kW ≠ 0} {hStride : stride ≠ 0} (layer : MaxPool2DSpec kH kW stride h1 h2 hStride) (input tangent : MultiChannelImage inC inH inW α) :
                            MultiChannelImage inC ((inH - kH) / stride + 1) ((inW - kW) / stride + 1) α

                            Multi-channel JVP for hard max-pooling (channel-wise application).

                            Instances For
                              def Spec.avgPool2dSpec {α : Type} [Context α] {kH kW inH inW stride : ℕ} {h1 : kH ≠ 0} {h2 : kW ≠ 0} {hStride : stride ≠ 0} (layer : AvgPool2DSpec kH kW stride h1 h2 hStride) (input : Image inH inW α) :
                              Image ((inH - kH) / stride + 1) ((inW - kW) / stride + 1) α

                              AvgPool2d forward pass (single-channel).

                              We sum all values in the window and divide by kH*kW. PyTorch analogy: avg_pool2d; its count_include_pad flag only matters for padded pooling, so in the unpadded case this matches the usual definition.
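A minimal unpadded Python sketch of this sum-then-divide definition (illustrative, not the TorchLean implementation):

```python
def avg_pool2d(img, k, stride):
    """Sum each k x k window and divide by k*k (no padding)."""
    h, w = len(img), len(img[0])
    oh, ow = (h - k) // stride + 1, (w - k) // stride + 1
    return [[sum(img[i * stride + a][j * stride + b]
                 for a in range(k) for b in range(k)) / (k * k)
             for j in range(ow)] for i in range(oh)]

print(avg_pool2d([[1.0, 2.0], [3.0, 4.0]], 2, 1))  # [[2.5]]
```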

                              Instances For
                                def Spec.avgPool2dMultiSpec {α : Type} [Context α] {kH kW inH inW inC stride : ℕ} (h1 : kH ≠ 0) (h2 : kW ≠ 0) {hStride : stride ≠ 0} (layer : AvgPool2DSpec kH kW stride h1 h2 hStride) (input : MultiChannelImage inC inH inW α) :
                                MultiChannelImage inC ((inH - kH) / stride + 1) ((inW - kW) / stride + 1) α

                                AvgPool2d forward pass (multi-channel): apply avg_pool2d_spec per channel.

                                Instances For
                                  structure Spec.AdaptiveAvgPool2DSpec (outH outW : ℕ) :

                                  Spec record for adaptive average pooling to a fixed output size.

                                    • outputHeight : ℕ

                                      Output height.

                                    • outputWidth : ℕ

                                      Output width.

                                  Instances For
                                    structure Spec.AdaptiveMaxPool2DSpec (outH outW : ℕ) :

                                    Spec record for adaptive max pooling to a fixed output size.

                                      • outputHeight : ℕ

                                        Output height.

                                      • outputWidth : ℕ

                                        Output width.

                                    Instances For

                                      Adaptive pooling #

                                      PyTorch defines adaptive pooling by partitioning the input into out bins. For output index i, the pooling region along an axis of size in is [start, end) with start = ⌊i * in / out⌋ and end = ⌈(i + 1) * in / out⌉.

                                      This matters when in is not divisible by out: region sizes vary by at most 1.

                                      def Spec.adaptiveStart (inSize outSize i : ℕ) :

                                      Adaptive-pooling region start index: floor(i * in / out) (PyTorch definition).

                                      Instances For
                                        def Spec.adaptiveEnd (inSize outSize i : ℕ) :

                                        Adaptive-pooling region end index: ceil((i+1) * in / out) (PyTorch definition).

                                        Instances For
                                          def Spec.adaptiveAvgPool2dSpec {α : Type} [Context α] {inH inW inC : ℕ} (outH outW : ℕ) (_layer : AdaptiveAvgPool2DSpec outH outW) (input : MultiChannelImage inC inH inW α) (_h_inH : inH > 0 := by norm_num) (_h_inW : inW > 0 := by norm_num) (_h_outH : outH > 0 := by norm_num) (_h_outW : outW > 0 := by norm_num) :
                                          MultiChannelImage inC outH outW α

                                          AdaptiveAvgPool2d forward pass.

                                          Unlike fixed-kernel pooling, adaptive pooling chooses a window for each output position so that the whole input is covered by outH×outW bins. This follows the PyTorch start/end formula (see the section comment above).

                                          Instances For
                                            def Spec.adaptiveMaxPool2dSpec {α : Type} [Context α] {inH inW inC : ℕ} (outH outW : ℕ) (_layer : AdaptiveMaxPool2DSpec outH outW) (input : MultiChannelImage inC inH inW α) (_h_inH : inH > 0 := by norm_num) (_h_inW : inW > 0 := by norm_num) (_h_outH : outH > 0 := by norm_num) (_h_outW : outW > 0 := by norm_num) :
                                            MultiChannelImage inC outH outW α

                                            AdaptiveMaxPool2d forward pass (same binning as adaptive avg, but with max instead of mean).

                                            We intentionally do not use a numeric sentinel value to seed the max fold; we seed from the first element of the region via getValueAtPosition. That keeps the spec meaningful across different scalar backends.

                                            Instances For
                                              def Spec.maxPool2dBackwardSpec {α : Type} [Context α] {kH kW inH inW stride : ℕ} {h1 : kH ≠ 0} {h2 : kW ≠ 0} {hStride : stride ≠ 0} (_layer : MaxPool2DSpec kH kW stride h1 h2 hStride) (input : Image inH inW α) (grad_output : Image ((inH - kH) / stride + 1) ((inW - kW) / stride + 1) α) :
                                              Image inH inW α

                                              Backward/VJP for max_pool2d_spec.

                                              This propagates each output gradient to the argmax location inside the corresponding window. Tie-breaking: if multiple values in the window are equal to the maximum, we keep the first position in row-major order (same convention as PyTorch's max-pool indices).
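A one-window Python sketch of this VJP rule (illustrative names; the gradient lands on the first row-major maximizer only):

```python
def max_pool_window_vjp(window, grad_out):
    """Send grad_out to the first row-major maximizer; zeros elsewhere."""
    flat = [v for row in window for v in row]
    i = flat.index(max(flat))  # .index picks the first maximizer on ties
    grads = [0.0] * len(flat)
    grads[i] = grad_out
    n = len(window[0])
    return [grads[r * n:(r + 1) * n] for r in range(len(window))]

window = [[3.0, 5.0], [5.0, 1.0]]  # tied maxima; first (row-major) wins
print(max_pool_window_vjp(window, 1.0))  # [[0.0, 1.0], [0.0, 0.0]]
```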

                                              Instances For
                                                def Spec.maxPool2dMultiBackwardSpec {α : Type} [Context α] {kH kW inH inW inC stride : ℕ} {h1 : kH ≠ 0} {h2 : kW ≠ 0} {hStride : stride ≠ 0} (layer : MaxPool2DSpec kH kW stride h1 h2 hStride) (input : MultiChannelImage inC inH inW α) (grad_output : MultiChannelImage inC ((inH - kH) / stride + 1) ((inW - kW) / stride + 1) α) :
                                                MultiChannelImage inC inH inW α

                                                Multi-channel max-pooling backward (channel-wise application of max_pool2d_backward_spec).

                                                Instances For
                                                  def Spec.avgPool2dBackwardSpec {α : Type} [Context α] {kH kW inH inW stride : ℕ} (_h1 : kH ≠ 0) (_h2 : kW ≠ 0) {hStride : stride ≠ 0} (_layer : AvgPool2DSpec kH kW stride _h1 _h2 hStride) (grad_output : Image ((inH - kH) / stride + 1) ((inW - kW) / stride + 1) α) :
                                                  Image inH inW α

                                                  Backward/VJP for avg_pool2d_spec (single-channel).

                                                  Each output gradient is evenly distributed across its corresponding input window.
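For one k×k window the even split is simply grad/(k*k) per cell, as in this illustrative Python fragment:

```python
def avg_pool_window_vjp(k, grad_out):
    """Each output gradient is split evenly over its k x k input window."""
    g = grad_out / (k * k)
    return [[g] * k for _ in range(k)]

print(avg_pool_window_vjp(2, 1.0))  # [[0.25, 0.25], [0.25, 0.25]]
```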

                                                  Instances For

                                                    Padded pooling (symmetric padding) #

                                                    For max-pooling, padded locations are not real input elements and are ignored when selecting the maximum. This is the scalar-polymorphic way to model PyTorch's -∞ max-pool padding without adding a backend-specific infinity constant to Context α.

                                                    For average pooling, this corresponds to including padded zeros in the average (PyTorch's default count_include_pad = true).
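The two conventions can be contrasted on a corner window in plain Python (all names illustrative; None stands in for TorchLean's "no real input cell here"):

```python
def padded_window_values(img, top, left, k, pad):
    """Window values over a padded image; None marks padding cells."""
    h, w = len(img), len(img[0])
    vals = []
    for a in range(k):
        for b in range(k):
            r, c = top + a - pad, left + b - pad
            vals.append(img[r][c] if 0 <= r < h and 0 <= c < w else None)
    return vals

img = [[-2.0, -3.0], [-4.0, -5.0]]
vals = padded_window_values(img, 0, 0, 2, 1)  # corner window, padding=1
# Max pooling ignores padding cells (PyTorch's -inf convention)...
print(max(v for v in vals if v is not None))       # -2.0
# ...while average pooling counts them as zeros (count_include_pad=True).
print(sum(v or 0.0 for v in vals) / len(vals))     # -0.5
```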

                                                    def Spec.unpadImage {α : Type} [Context α] {H W padding : ℕ} (img : Image (H + 2 * padding) (W + 2 * padding) α) :
                                                    Image H W α

                                                    Remove symmetric zero-padding from a single-channel image.

                                                    Instances For
                                                      def Spec.unpadMultiChannel {α : Type} [Context α] {C H W padding : ℕ} (img : MultiChannelImage C (H + 2 * padding) (W + 2 * padding) α) :
                                                      MultiChannelImage C H W α

                                                      Remove symmetric zero-padding from a multi-channel image (channel-wise unpad_image).

                                                      Instances For
                                                        def Spec.maxPool2dMultiSpecPad {α : Type} [Context α] {kH kW inH inW inC stride padding : ℕ} {h1 : kH ≠ 0} {h2 : kW ≠ 0} {hStride : stride ≠ 0} (layer : MaxPool2DSpec kH kW stride h1 h2 hStride) (input : MultiChannelImage inC inH inW α) :
                                                        MultiChannelImage inC ((inH + 2 * padding - kH) / stride + 1) ((inW + 2 * padding - kW) / stride + 1) α

                                                        Multi-channel max-pooling forward pass with PyTorch-style padding (-∞ outside bounds).

                                                        Instances For
                                                          def Spec.maxPool2dMultiJvpSpecPad {α : Type} [Context α] {kH kW inH inW inC stride padding : ℕ} {h1 : kH ≠ 0} {h2 : kW ≠ 0} {hStride : stride ≠ 0} (layer : MaxPool2DSpec kH kW stride h1 h2 hStride) (input tangent : MultiChannelImage inC inH inW α) :
                                                          MultiChannelImage inC ((inH + 2 * padding - kH) / stride + 1) ((inW + 2 * padding - kW) / stride + 1) α

                                                          Forward-mode JVP for padded hard max-pooling.

                                                          Padding cells are ignored exactly as in maxPool2dMultiSpecPad, so the tangent is taken from the primal winner among real input locations only. If a window contains no real input cells, the forward value and tangent are both 0.

                                                          Instances For
                                                            def Spec.avgPool2dMultiSpecPad {α : Type} [Context α] {kH kW inH inW inC stride padding : ℕ} (h1 : kH ≠ 0) (h2 : kW ≠ 0) {hStride : stride ≠ 0} (layer : AvgPool2DSpec kH kW stride h1 h2 hStride) (input : MultiChannelImage inC inH inW α) :
                                                            MultiChannelImage inC ((inH + 2 * padding - kH) / stride + 1) ((inW + 2 * padding - kW) / stride + 1) α

                                                            Multi-channel average pooling forward pass with symmetric zero padding.

                                                            Instances For
                                                              def Spec.maxPool2dMultiBackwardSpecPad {α : Type} [Context α] {kH kW inH inW inC stride padding : ℕ} {h1 : kH ≠ 0} {h2 : kW ≠ 0} {hStride : stride ≠ 0} (layer : MaxPool2DSpec kH kW stride h1 h2 hStride) (input : MultiChannelImage inC inH inW α) (grad_output : MultiChannelImage inC ((inH + 2 * padding - kH) / stride + 1) ((inW + 2 * padding - kW) / stride + 1) α) :
                                                              MultiChannelImage inC inH inW α

                                                              Multi-channel max-pooling backward pass with PyTorch-style padding (-∞ outside bounds).

                                                              Instances For
                                                                def Spec.avgPool2dMultiBackwardSpecPad {α : Type} [Context α] {kH kW inH inW inC stride padding : ℕ} (h1 : kH ≠ 0) (h2 : kW ≠ 0) {hStride : stride ≠ 0} (layer : AvgPool2DSpec kH kW stride h1 h2 hStride) (grad_output : MultiChannelImage inC ((inH + 2 * padding - kH) / stride + 1) ((inW + 2 * padding - kW) / stride + 1) α) :
                                                                MultiChannelImage inC inH inW α

                                                                Multi-channel average-pooling backward pass with symmetric padding (backprop then unpad).

                                                                Instances For
                                                                  def Spec.smoothMaxPool2dBackwardSpec {α : Type} [Context α] {kH kW inH inW stride : ℕ} {h1 : kH ≠ 0} {h2 : kW ≠ 0} {hStride : stride ≠ 0} (_layer : MaxPool2DSpec kH kW stride h1 h2 hStride) (beta : α) (input : Image inH inW α) (grad_output : Image ((inH - kH) / stride + 1) ((inW - kW) / stride + 1) α) :
                                                                  Image inH inW α

                                                                  Backward/VJP for smooth_max_pool2d_spec (log-sum-exp surrogate).

                                                                  Instances For
                                                                    def Spec.smoothMaxPool2dMultiBackwardSpec {α : Type} [Context α] {kH kW inH inW inC stride : ℕ} {h1 : kH ≠ 0} {h2 : kW ≠ 0} {hStride : stride ≠ 0} (layer : MaxPool2DSpec kH kW stride h1 h2 hStride) (beta : α) (input : MultiChannelImage inC inH inW α) (grad_output : MultiChannelImage inC ((inH - kH) / stride + 1) ((inW - kW) / stride + 1) α) :
                                                                    MultiChannelImage inC inH inW α

                                                                    Multi-channel backward for smooth_max_pool2d_multi_spec (apply per channel).

                                                                    Instances For

                                                                      Generic N-D pooling (channels-first, no batch) #

                                                                      These operators generalize the existing 2D pooling specs to an arbitrary spatial rank d.

                                                                      Conventions: tensors are channels-first with no batch axis, and kernel, stride, and padding are given per spatial axis, ordered outermost to innermost.

                                                                      PyTorch comparisons throughout this section are conceptual and assume the batch axis has been dropped.

                                                                      Layer configs + output shapes #

                                                                      structure Spec.MaxPoolSpec (d : ℕ) (kernel stride padding : Vector ℕ d) (hKernel : ∀ (i : Fin d), kernel.get i ≠ 0) (hStride : ∀ (i : Fin d), stride.get i ≠ 0) :

                                                                      Kernel/stride/padding configuration for N-D max pooling.

                                                                      • kernelSizes : Vector ℕ d

                                                                        Kernel sizes per spatial axis (outermost to innermost).

                                                                      • strideSizes : Vector ℕ d

                                                                        Strides per spatial axis (outermost to innermost).

                                                                      • paddingSizes : Vector ℕ d

                                                                        Symmetric zero padding per spatial axis (outermost to innermost).

                                                                      Instances For
                                                                        structure Spec.AvgPoolSpec (d : ℕ) (kernel stride padding : Vector ℕ d) (hKernel : ∀ (i : Fin d), kernel.get i ≠ 0) (hStride : ∀ (i : Fin d), stride.get i ≠ 0) :

                                                                        Kernel/stride/padding configuration for N-D average pooling.

                                                                        • kernelSizes : Vector ℕ d

                                                                          Kernel sizes per spatial axis (outermost to innermost).

                                                                        • strideSizes : Vector ℕ d

                                                                          Strides per spatial axis (outermost to innermost).

                                                                        • paddingSizes : Vector ℕ d

                                                                          Symmetric zero padding per spatial axis (outermost to innermost).

                                                                        Instances For
                                                                          def Spec.poolOutSpatial {d : ℕ} (inSpatial kernel stride : Vector ℕ d) :

                                                                          "Valid" output spatial sizes (no padding): out = (in - k) / stride + 1 per axis.

                                                                          Instances For
                                                                            def Spec.poolOutSpatialPad {d : ℕ} (inSpatial kernel stride padding : Vector ℕ d) :

                                                                            Padded output spatial sizes: out = (in + 2*pad - k) / stride + 1 per axis.

                                                                            Instances For
                                                                              def Spec.poolOutShape {d : ℕ} (inSpatial kernel stride : Vector ℕ d) :

                                                                              Output shape for single-channel N-D pooling (no padding).

                                                                              Instances For
                                                                                def Spec.poolMultiOutShape {d : ℕ} (inC : ℕ) (inSpatial kernel stride : Vector ℕ d) :

                                                                                Output shape for channels-first N-D pooling (no padding; channels preserved).

                                                                                Instances For
                                                                                  def Spec.poolOutShapePad {d : ℕ} (inSpatial kernel stride padding : Vector ℕ d) :

                                                                                  Output shape for single-channel N-D pooling with symmetric padding.

                                                                                  Instances For
                                                                                    def Spec.poolMultiOutShapePad {d : ℕ} (inC : ℕ) (inSpatial kernel stride padding : Vector ℕ d) :

                                                                                    Output shape for channels-first N-D pooling with symmetric padding (channels preserved).

                                                                                    Instances For
                                                                                      def Spec.Private.tensorOfDims {α : Type} (dims : List ℕ) (f : List ℕ → α) :
                                                                                      Instances For
                                                                                        def Spec.Private.foldlIndices' {β : Type} (dims : List ℕ) (init : β) (f : β → List ℕ → β) :
                                                                                        β
                                                                                        Instances For
                                                                                          def Spec.Private.paddedCoords? (outIdxs winIdxs stride : List ℕ) :
                                                                                          Instances For
                                                                                            def Spec.Private.unpadCoords? (padded padding : List ℕ) :
                                                                                            Instances For
                                                                                                def Spec.Private.getPaddedAverageInputVal {α : Type} [Context α] {d : ℕ} {inSpatial : Vector ℕ d} (input : Tensor α (Shape.ofList inSpatial.toList)) (outIdxs winIdxs stride padding : List ℕ) :
                                                                                                α

                                                                                                Input lookup for average/smooth pooling.

                                                                                                For average-style pooling, padded cells contribute numeric zero and are still counted by the denominator chosen by the surrounding pooling spec. We keep this separate from getPaddedMaxInputVal?, where padded cells must be ignored rather than treated as zero.

                                                                                                Instances For
                                                                                                  def Spec.Private.getPaddedMaxInputVal? {α : Type} [Context α] {d : ℕ} {inSpatial : Vector ℕ d} (input : Tensor α (Shape.ofList inSpatial.toList)) (outIdxs winIdxs stride padding : List ℕ) :
                                                                                                  Option α

                                                                                                  Input lookup for hard max-pooling.

                                                                                                  Unlike average-pooling, max-pooling should not insert a numeric zero for padded cells: PyTorch's max-pool semantics treat padding as -∞. TorchLean keeps the spec scalar-polymorphic by returning none for padded coordinates and letting the max fold ignore them.

                                                                                                  Instances For
                                                                                                      def Spec.Private.maxPoolValue {α : Type} [Context α] {d : ℕ} {inSpatial : Vector ℕ d} (input : Tensor α (Shape.ofList inSpatial.toList)) (outIdxs kernel stride padding : List ℕ) :
                                                                                                      α
                                                                                                      Instances For
                                                                                                        def Spec.Private.maxPoolJvpValue {α : Type} [Context α] {d : ℕ} {inSpatial : Vector ℕ d} (input tangent : Tensor α (Shape.ofList inSpatial.toList)) (outIdxs kernel stride padding : List ℕ) :
                                                                                                        α

                                                                                                        Directional derivative of hard max-pooling for one N-D window.

                                                                                                        The derivative is taken along the same winner selected by maxPoolValue. At ties we keep the first winner in row-major order, matching the VJP convention below and PyTorch's index convention.

                                                                                                        Instances For
                                                                                                          def Spec.Private.avgPoolValue {α : Type} [Context α] {d : ℕ} {inSpatial : Vector ℕ d} (input : Tensor α (Shape.ofList inSpatial.toList)) (outIdxs kernel stride padding : List ℕ) :
                                                                                                          α
                                                                                                          Instances For
                                                                                                            def Spec.Private.smoothMaxPoolValue {α : Type} [Context α] {d : ℕ} {inSpatial : Vector ℕ d} (beta : α) (input : Tensor α (Shape.ofList inSpatial.toList)) (outIdxs kernel stride padding : List ℕ) :
                                                                                                            α
                                                                                                            Instances For
                                                                                                              def Spec.Private.smoothMaxPoolJvpValue {α : Type} [Context α] {d : ℕ} {inSpatial : Vector ℕ d} (beta : α) (input tangent : Tensor α (Shape.ofList inSpatial.toList)) (outIdxs kernel stride padding : List ℕ) :
                                                                                                              α

Directional derivative of the smooth log-sum-exp pooling value.

For y = beta⁻¹ log Σᵢ exp(beta*xᵢ), the directional derivative is Σᵢ softmax(beta*x)ᵢ * dxᵢ, using the same zero-padding convention as smoothMaxPoolValue.

Instances For
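The value and its directional derivative can be sketched on plain `Float` lists (helper names hypothetical): the JVP is the softmax-weighted sum of the tangents over the window.

```lean
-- Illustration: log-sum-exp over one window and its directional derivative.
def smoothMax (beta : Float) (xs : List Float) : Float :=
  let z := (xs.map fun x => Float.exp (beta * x)).foldl (· + ·) 0.0
  Float.log z / beta

def smoothMaxJvp (beta : Float) (xs dxs : List Float) : Float :=
  let ws := xs.map fun x => Float.exp (beta * x)
  let z := ws.foldl (· + ·) 0.0
  ((ws.zip dxs).map fun (w, dx) => (w / z) * dx).foldl (· + ·) 0.0
```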

Forward (single-channel spatial tensor) #

def Spec.maxPoolSpatialSpec {α : Type} [Context α] {d : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (_layer : MaxPoolSpec d kernel stride padding hKernel hStride) (input : Tensor α (Shape.ofList inSpatial.toList)) :
Tensor α (Shape.ofList (poolOutSpatialPad inSpatial kernel stride padding).toList)

N-D max pooling on a spatial tensor (no explicit channel axis).

Instances For
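The output-shape arithmetic along one dimension, assuming poolOutSpatialPad applies the padded formula from the header (out = ⌊(in + 2·pad − k)/stride⌋ + 1) per axis; Nat division is floor division:

```lean
-- One-dimensional sketch of the padded output-size formula.
def poolOutPad (inDim k stride pad : Nat) : Nat :=
  (inDim + 2 * pad - k) / stride + 1

#eval poolOutPad 7 3 2 0  -- 3: windows start at 0, 2, 4
#eval poolOutPad 7 3 2 1  -- 4: padding adds one cell on each side
```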
def Spec.maxPoolSpatialJvpSpec {α : Type} [Context α] {d : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (_layer : MaxPoolSpec d kernel stride padding hKernel hStride) (input tangent : Tensor α (Shape.ofList inSpatial.toList)) :
Tensor α (Shape.ofList (poolOutSpatialPad inSpatial kernel stride padding).toList)

Forward-mode JVP for N-D hard max-pooling on a spatial tensor.

The derivative follows the same primal argmax as maxPoolSpatialSpec; at ties it keeps the first row-major maximizer. This is the correct directional derivative for TorchLean's chosen subgradient convention and matches the VJP tie policy.

Instances For
def Spec.avgPoolSpatialSpec {α : Type} [Context α] {d : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (_layer : AvgPoolSpec d kernel stride padding hKernel hStride) (input : Tensor α (Shape.ofList inSpatial.toList)) :
Tensor α (Shape.ofList (poolOutSpatialPad inSpatial kernel stride padding).toList)

N-D average pooling on a spatial tensor (no explicit channel axis).

Instances For

Backward (single-channel spatial tensor) #

These are the VJPs of the forward pooling specs above.

Conventions: max pooling routes each output gradient to its window's argmax (first winner in row-major order at ties), and average pooling distributes it uniformly over the window.

def Spec.maxPoolSpatialBackwardSpec {α : Type} [Context α] {d : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (layer : MaxPoolSpec d kernel stride padding hKernel hStride) (input : Tensor α (Shape.ofList inSpatial.toList)) (grad_output : Tensor α (Shape.ofList (poolOutSpatialPad inSpatial kernel stride padding).toList)) :
Tensor α (Shape.ofList inSpatial.toList)

Backward/VJP for maxPoolSpatialSpec.

Each output gradient is propagated to the argmax location in the corresponding input window. Ties keep the first position in row-major order.

Instances For
def Spec.avgPoolSpatialBackwardSpec {α : Type} [Context α] {d : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (_layer : AvgPoolSpec d kernel stride padding hKernel hStride) (grad_output : Tensor α (Shape.ofList (poolOutSpatialPad inSpatial kernel stride padding).toList)) :
Tensor α (Shape.ofList inSpatial.toList)

Backward/VJP for avgPoolSpatialSpec (single-channel).

Each output gradient is distributed evenly across its kernel window.

Instances For
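The even distribution can be sketched in one line (hypothetical helper, plain `Float` lists): an upstream gradient `g` for a window of `n = kH*kW` cells comes back as `g / n` at every cell.

```lean
-- Illustration: the VJP of an n-cell average sends g/n to each cell.
def avgWindowVJP (g : Float) (n : Nat) : List Float :=
  List.replicate n (g / n.toFloat)
```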

Forward (channels-first: C × spatial...) #

def Spec.maxPoolSpec {α : Type} [Context α] {d C : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (layer : MaxPoolSpec d kernel stride padding hKernel hStride) (input : Tensor α (Shape.ofList (C :: inSpatial.toList))) :
Tensor α (Shape.ofList (C :: (poolOutSpatialPad inSpatial kernel stride padding).toList))

N-D max pooling on a channels-first tensor: shape [C] ++ spatial.

Instances For
def Spec.maxPoolJvpSpec {α : Type} [Context α] {d C : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (layer : MaxPoolSpec d kernel stride padding hKernel hStride) (input tangent : Tensor α (Shape.ofList (C :: inSpatial.toList))) :
Tensor α (Shape.ofList (C :: (poolOutSpatialPad inSpatial kernel stride padding).toList))

N-D hard max-pool JVP on a channels-first tensor (channel-wise application).

Instances For
def Spec.avgPoolSpec {α : Type} [Context α] {d C : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (layer : AvgPoolSpec d kernel stride padding hKernel hStride) (input : Tensor α (Shape.ofList (C :: inSpatial.toList))) :
Tensor α (Shape.ofList (C :: (poolOutSpatialPad inSpatial kernel stride padding).toList))

N-D average pooling on a channels-first tensor: shape [C] ++ spatial.

Instances For

Backward (channels-first: C × spatial...) #

def Spec.maxPoolBackwardSpec {α : Type} [Context α] {d C : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (layer : MaxPoolSpec d kernel stride padding hKernel hStride) (input : Tensor α (Shape.ofList (C :: inSpatial.toList))) (grad_output : Tensor α (Shape.ofList (C :: (poolOutSpatialPad inSpatial kernel stride padding).toList))) :
Tensor α (Shape.ofList (C :: inSpatial.toList))

Multi-channel VJP for maxPoolSpec (applies the spatial backward per channel).

Instances For
def Spec.avgPoolBackwardSpec {α : Type} [Context α] {d C : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (layer : AvgPoolSpec d kernel stride padding hKernel hStride) (grad_output : Tensor α (Shape.ofList (C :: (poolOutSpatialPad inSpatial kernel stride padding).toList))) :
Tensor α (Shape.ofList (C :: inSpatial.toList))

Multi-channel VJP for avgPoolSpec (applies the spatial backward per channel).

Instances For

Smooth max pooling (log-sum-exp surrogate) #

def Spec.smoothMaxPoolSpatialSpec {α : Type} [Context α] {d : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (_layer : MaxPoolSpec d kernel stride padding hKernel hStride) (beta : α) (input : Tensor α (Shape.ofList inSpatial.toList)) :
Tensor α (Shape.ofList (poolOutSpatialPad inSpatial kernel stride padding).toList)

Smooth log-sum-exp max pooling on a spatial tensor (no explicit channel axis).

Instances For
def Spec.smoothMaxPoolSpatialJvpSpec {α : Type} [Context α] {d : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (_layer : MaxPoolSpec d kernel stride padding hKernel hStride) (beta : α) (input tangent : Tensor α (Shape.ofList inSpatial.toList)) :
Tensor α (Shape.ofList (poolOutSpatialPad inSpatial kernel stride padding).toList)

Forward-mode JVP for N-D smooth max-pooling on a spatial tensor.

For the log-sum-exp surrogate this is the softmax-weighted sum of the input tangent over each window. It is the forward-mode counterpart of smoothMaxPoolSpatialBackwardSpec.

Instances For
def Spec.smoothMaxPoolSpec {α : Type} [Context α] {d C : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (layer : MaxPoolSpec d kernel stride padding hKernel hStride) (beta : α) (input : Tensor α (Shape.ofList (C :: inSpatial.toList))) :
Tensor α (Shape.ofList (C :: (poolOutSpatialPad inSpatial kernel stride padding).toList))

Smooth log-sum-exp max pooling on a channels-first tensor (channel-wise application).

Instances For
def Spec.smoothMaxPoolJvpSpec {α : Type} [Context α] {d C : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (layer : MaxPoolSpec d kernel stride padding hKernel hStride) (beta : α) (input tangent : Tensor α (Shape.ofList (C :: inSpatial.toList))) :
Tensor α (Shape.ofList (C :: (poolOutSpatialPad inSpatial kernel stride padding).toList))

N-D smooth max-pool JVP on a channels-first tensor (channel-wise application).

Instances For

Smooth max pooling backward #

def Spec.smoothMaxPoolSpatialBackwardSpec {α : Type} [Context α] {d : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (_layer : MaxPoolSpec d kernel stride padding hKernel hStride) (beta : α) (input : Tensor α (Shape.ofList inSpatial.toList)) (grad_output : Tensor α (Shape.ofList (poolOutSpatialPad inSpatial kernel stride padding).toList)) :
Tensor α (Shape.ofList inSpatial.toList)

Backward/VJP for smoothMaxPoolSpatialSpec (log-sum-exp surrogate).

For a window x₁,…,xₙ, the surrogate is

y = (1/beta) * log(∑ᵢ exp(beta*xᵢ)),

and the VJP distributes the upstream gradient proportionally to exp(beta*xᵢ), i.e. with softmax(beta*x) weights.

Instances For
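The proportional distribution can be sketched on plain `Float` lists (helper names hypothetical): since ∂y/∂xᵢ = exp(beta*xᵢ)/∑ⱼ exp(beta*xⱼ), the VJP scales the softmax weights by the upstream gradient.

```lean
-- Illustration: softmax weights of one window, then the VJP for gradient g.
def softmaxWeights (beta : Float) (xs : List Float) : List Float :=
  let ws := xs.map fun x => Float.exp (beta * x)
  let z := ws.foldl (· + ·) 0.0
  ws.map (· / z)

def smoothMaxWindowVJP (beta : Float) (xs : List Float) (g : Float) : List Float :=
  (softmaxWeights beta xs).map (g * ·)
```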
def Spec.smoothMaxPoolBackwardSpec {α : Type} [Context α] {d C : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (layer : MaxPoolSpec d kernel stride padding hKernel hStride) (beta : α) (input : Tensor α (Shape.ofList (C :: inSpatial.toList))) (grad_output : Tensor α (Shape.ofList (C :: (poolOutSpatialPad inSpatial kernel stride padding).toList))) :
Tensor α (Shape.ofList (C :: inSpatial.toList))

Multi-channel VJP for smoothMaxPoolSpec (applies the spatial backward per channel).

Instances For

Friendly aliases #

@[reducible, inline]
abbrev Spec.maxPool {α : Type} [Context α] {d C : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (layer : MaxPoolSpec d kernel stride padding hKernel hStride) (input : Tensor α (Shape.ofList (C :: inSpatial.toList))) :
Tensor α (Shape.ofList (C :: (poolOutSpatialPad inSpatial kernel stride padding).toList))

Alias for maxPoolSpec.

Instances For
@[reducible, inline]
abbrev Spec.avgPool {α : Type} [Context α] {d C : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (layer : AvgPoolSpec d kernel stride padding hKernel hStride) (input : Tensor α (Shape.ofList (C :: inSpatial.toList))) :
Tensor α (Shape.ofList (C :: (poolOutSpatialPad inSpatial kernel stride padding).toList))

Alias for avgPoolSpec.

Instances For
@[reducible, inline]
abbrev Spec.smoothMaxPool {α : Type} [Context α] {d C : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (layer : MaxPoolSpec d kernel stride padding hKernel hStride) (beta : α) (input : Tensor α (Shape.ofList (C :: inSpatial.toList))) :
Tensor α (Shape.ofList (C :: (poolOutSpatialPad inSpatial kernel stride padding).toList))

Alias for smoothMaxPoolSpec.

Instances For
@[reducible, inline]
abbrev Spec.maxPoolBackward {α : Type} [Context α] {d C : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (layer : MaxPoolSpec d kernel stride padding hKernel hStride) (input : Tensor α (Shape.ofList (C :: inSpatial.toList))) (grad_output : Tensor α (Shape.ofList (C :: (poolOutSpatialPad inSpatial kernel stride padding).toList))) :
Tensor α (Shape.ofList (C :: inSpatial.toList))

Alias for maxPoolBackwardSpec.

Instances For
@[reducible, inline]
abbrev Spec.avgPoolBackward {α : Type} [Context α] {d C : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (layer : AvgPoolSpec d kernel stride padding hKernel hStride) (grad_output : Tensor α (Shape.ofList (C :: (poolOutSpatialPad inSpatial kernel stride padding).toList))) :
Tensor α (Shape.ofList (C :: inSpatial.toList))

Alias for avgPoolBackwardSpec.

Instances For
@[reducible, inline]
abbrev Spec.smoothMaxPoolBackward {α : Type} [Context α] {d C : ℕ} {inSpatial kernel stride padding : Vector ℕ d} {hKernel : ∀ (i : Fin d), kernel.get i ≠ 0} {hStride : ∀ (i : Fin d), stride.get i ≠ 0} (layer : MaxPoolSpec d kernel stride padding hKernel hStride) (beta : α) (input : Tensor α (Shape.ofList (C :: inSpatial.toList))) (grad_output : Tensor α (Shape.ofList (C :: (poolOutSpatialPad inSpatial kernel stride padding).toList))) :
Tensor α (Shape.ofList (C :: inSpatial.toList))

Alias for smoothMaxPoolBackwardSpec.

Instances For