TorchLean API

NN.Spec.Layers.Utils

Image/tensor utilities (spec layer) #

Convenience aliases and helpers for 2‑D images (H×W) and multi‑channel images (C×H×W), plus padding and window‑extraction utilities used by conv/pooling layers.

@[reducible, inline]
abbrev Spec.Image (H W : ) (α : Type) :

A 2-D image tensor of shape [H, W].

Instances For
    @[reducible, inline]
    abbrev Spec.MultiChannelImage (C H W : ) (α : Type) :

    A C-channel image tensor of shape [C, H, W] (channels-first, like PyTorch NCHW without N).

    Instances For
      @[reducible, inline]
      abbrev Spec.Signal (L : ) (α : Type) :

      A 1-D signal tensor of shape [L].

      Instances For
        @[reducible, inline]
        abbrev Spec.MultiChannelSignal (C L : ) (α : Type) :

        A C-channel 1-D signal tensor of shape [C, L] (channels-first).

        Instances For
          @[reducible, inline]
          abbrev Spec.Volume (D H W : ) (α : Type) :

          A 3-D volume tensor of shape [D, H, W].

          Instances For
            @[reducible, inline]
            abbrev Spec.MultiChannelVolume (C D H W : ) (α : Type) :

            A C-channel 3-D volume tensor of shape [C, D, H, W] (channels-first).

            Instances For
              def Spec.rwMultiChannelImage {α : Type} {C1 C2 H1 H2 W1 W2 : } (img : MultiChannelImage C1 H1 W1 α) (h1 : C1 = C2) (h2 : H1 = H2) (h3 : W1 = W2) :
              MultiChannelImage C2 H2 W2 α

              Cast a MultiChannelImage along definitional equalities of its channel/height/width indices.

              This is a dependent-type convenience: it does not change the underlying tensor data, only the type-level shape indices.

              Instances For
                def Spec.rwMultiChannelImageExplicit {α : Type} {C1 H1 W1 : } (C2 H2 W2 : ) (img : MultiChannelImage C1 H1 W1 α) (h1 : C1 = C2) (h2 : H1 = H2) (h3 : W1 = W2) :
                MultiChannelImage C2 H2 W2 α

                Explicit-argument version of rw_multi_channel_image.

                This is occasionally convenient when elaboration has trouble inferring C2/H2/W2 from context.

                Instances For
                  def Spec.getValueAtPosition {α : Type} [Context α] {H W : } (img : Image H W α) (x y : ) :

                  Read pixel (x, y) from an Image, returning 0 when out of bounds.

                  This helper is used by window-extraction and padding utilities for conv/pooling specs.

                  Instances For
                    theorem Spec.get_at_or_zero_getValueAtPosition {α : Type} [Context α] {H W : } (img : Image H W α) (x y : ) :

                    getValueAtPosition agrees with the generic list-indexing helper get_at_or_zero.

                    In particular, reading a scalar via the specialized (x, y) accessor is the same as reading with indices [x, y], where both return 0 out of bounds.

                    def Spec.extractWindow {α : Type} [Context α] {H W : } (kW kH : ) (img : Image H W α) (start_i start_j : ) :

                    Extract a kH × kW patch from an image starting at (start_i, start_j).

                    Out-of-bounds pixels are treated as 0, matching the behavior of getValueAtPosition. This is spec-level "im2col"-style logic (cf. PyTorch nn.Unfold, conceptually).

                    Instances For
                      def Spec.padMultiChannel {α : Type} [Context α] {inC inH inW : } (img : MultiChannelImage inC inH inW α) (padding : ) :
                      MultiChannelImage inC (inH + 2 * padding) (inW + 2 * padding) α

                      Zero-pad a channels-first image by padding pixels on each spatial axis.

                      This is the spec analogue of torch.nn.functional.pad (with constant 0 padding). The output shape is [inC, inH + 2*padding, inW + 2*padding].

                      Instances For
                        theorem Spec.get_at_or_zero_pad_multi_channel {α : Type} [Context α] {inC inH inW padding : } (img : MultiChannelImage inC inH inW α) (c : Fin inC) (p q : ) :
                        getAtOrZero (padMultiChannel img padding) [c, p, q] = if _h : p < padding q < padding then 0 else getAtOrZero img [c, p - padding, q - padding]

                        Characterization lemma for pad_multi_channel under list-indexing (get_at_or_zero).

                        Reading the padded tensor at [c, p, q] yields 0 in the top/left padding region, and otherwise reads the original tensor at [c, p - padding, q - padding] (with out-of-bounds falling back to 0 on both sides).

                        theorem Spec.get_at_or_zero_pad_multi_channel_shift {α : Type} [Context α] {inC inH inW padding : } (img : MultiChannelImage inC inH inW α) (c : Fin inC) (i : Fin inH) (j : Fin inW) :
                        getAtOrZero (padMultiChannel img padding) [c, i + padding, j + padding] = getAtOrZero img [c, i, j]

                        Index-shift lemma for pad_multi_channel.

                        If (i, j) is in-bounds for the original image, then reading the padded image at (i + padding, j + padding) returns the same value.

                        def Spec.extractMultiWindow {α : Type} [Context α] {inC kH kW inH inW padding : } (img : MultiChannelImage inC (inH + 2 * padding) (inW + 2 * padding) α) (start_i start_j : ) :

                        Extract a kH × kW window from each channel of a channels-first image.

                        The input is typically a padded image, and the result has shape [inC, kH, kW].

                        Instances For
                          def Spec.padChannelsZero {α : Type} [Zero α] {inChannels outChannels height width : } (_h : inChannels outChannels) (img : MultiChannelImage inChannels height width α) :
                          MultiChannelImage outChannels height width α

                          Increase the channel dimension by zero-padding extra channels.

                          This is used in some ResNet-style skip connections when inChannels < outChannels. Existing channels are copied; newly introduced channels are identically zero.

                          Instances For
                            def Spec.channelIdentity {α : Type} {channels height width : } (img : MultiChannelImage channels height width α) :
                            MultiChannelImage channels height width α

                            Identity on MultiChannelImage (useful as a "no-op" branch in higher-level specs).

                            Instances For
                              def Spec.setValueAtPosition {α : Type} {H W : } (img : Image H W α) (x y : ) (value : α) :
                              Image H W α

                              Write a value at pixel (x, y) if it is in-bounds; otherwise return the original image.

                              This uses update_tensor_spec under the hood and is intended for small spec-level utilities.

                              Instances For
                                def Spec.addValueAtPosition {α : Type} [Add α] {H W : } (img : Image H W α) (x y : ) (value : α) :
                                Image H W α

                                Add value to pixel (x, y) if it is in-bounds; otherwise return the original image.

                                This is a small helper for accumulation-style specs (e.g. naive convolution).

                                Instances For
                                  def Spec.createZeroImage {α : Type} [Zero α] (H W : ) :
                                  Image H W α

                                  Construct an H × W image filled with zeros.

                                  Instances For
                                    def Spec.createZeroMultiChannelImage (α : Type) [Zero α] (C H W : ) :

                                    Construct a C × H × W channels-first image filled with zeros.

                                    Instances For