Image/tensor utilities (spec layer)

Convenience aliases and helpers for 1-D signals (L), 2-D images (H×W), and 3-D volumes (D×H×W), each with a channels-first multi-channel variant, plus the padding, window-extraction, and pixel-update utilities used by conv/pooling layers.

@[reducible, inline]
abbrev Spec.Image (H W : ℕ) (α : Type) : Type

A 2-D image tensor of shape [H, W].

@[reducible, inline]
abbrev Spec.MultiChannelImage (C H W : ℕ) (α : Type) : Type

A C-channel image tensor of shape [C, H, W] (channels-first, like PyTorch NCHW without N).
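To make the channels-first convention concrete, here is a minimal Lean sketch. It assumes a toy model in which a tensor is just a function on in-bounds indices; the `ToyShapes` namespace and everything in it are hypothetical stand-ins for illustration, not the library's actual `Tensor`-based aliases.

```lean
namespace ToyShapes

/-- Toy stand-in (assumption): a 2-D image is a function on in-bounds indices. -/
abbrev Image (H W : Nat) (α : Type) : Type := Fin H → Fin W → α

/-- Channels-first: the channel index comes before the spatial indices. -/
abbrev MultiChannelImage (C H W : Nat) (α : Type) : Type := Fin C → Image H W α

/-- A 3-channel 2×2 image whose entry at (c, i, j) is `100 * c + 10 * i + j`. -/
def rgb : MultiChannelImage 3 2 2 Nat :=
  fun c i j => 100 * c.val + 10 * i.val + j.val

/-- Fixing the channel index yields a plain 2-D image. -/
def lastChannel : Image 2 2 Nat := rgb ⟨2, by decide⟩

#eval lastChannel ⟨1, by decide⟩ ⟨0, by decide⟩  -- 210

end ToyShapes
```

Because both aliases are reducible abbreviations in this sketch, a multi-channel image can be applied to a channel index directly, with no explicit unfolding.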

@[reducible, inline]
abbrev Spec.Signal (L : ℕ) (α : Type) : Type

A 1-D signal tensor of shape [L].

@[reducible, inline]
abbrev Spec.MultiChannelSignal (C L : ℕ) (α : Type) : Type

A C-channel 1-D signal tensor of shape [C, L] (channels-first).

@[reducible, inline]
abbrev Spec.Volume (D H W : ℕ) (α : Type) : Type

A 3-D volume tensor of shape [D, H, W].

@[reducible, inline]
abbrev Spec.MultiChannelVolume (C D H W : ℕ) (α : Type) : Type

A C-channel 3-D volume tensor of shape [C, D, H, W] (channels-first).

def Spec.rwMultiChannelImage {α : Type} {C1 C2 H1 H2 W1 W2 : ℕ} (img : MultiChannelImage C1 H1 W1 α) (h1 : C1 = C2) (h2 : H1 = H2) (h3 : W1 = W2) :
    MultiChannelImage C2 H2 W2 α

Cast a MultiChannelImage along propositional equalities (the proofs h1, h2, h3) of its channel/height/width indices.

This is a dependent-type convenience: it does not change the underlying tensor data, only the type-level shape indices.
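The casting pattern can be sketched in a few lines of Lean. This is an illustration under a toy model (images as functions on `Fin` indices); `ToyCast.rwImage` and `ToyCast.normalize` are hypothetical names, and the library's own implementation may differ.

```lean
namespace ToyCast

/-- Toy stand-in (assumption): a 2-D image is a function on in-bounds indices. -/
abbrev Image (H W : Nat) (α : Type) : Type := Fin H → Fin W → α

/-- Transport an image along proofs that its shape indices are equal. -/
def rwImage {α : Type} {H1 H2 W1 W2 : Nat}
    (img : Image H1 W1 α) (h1 : H1 = H2) (h2 : W1 = W2) : Image H2 W2 α := by
  subst h1
  subst h2
  exact img

/-- `0 + n` is not syntactically `n`, so the type checker needs a proof to
    accept the image at shape `[n, n]`. -/
def normalize {n : Nat} (img : Image (0 + n) n Nat) : Image n n Nat :=
  rwImage img (Nat.zero_add n) rfl

/-- With trivial proofs, the cast returns the image unchanged. -/
theorem rwImage_rfl {α : Type} {H W : Nat} (img : Image H W α) :
    rwImage img rfl rfl = img := rfl

end ToyCast
```

The cast only changes the type-level indices: once the equalities are substituted, the body is the original image.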

def Spec.rwMultiChannelImageExplicit {α : Type} {C1 H1 W1 : ℕ} (C2 H2 W2 : ℕ) (img : MultiChannelImage C1 H1 W1 α) (h1 : C1 = C2) (h2 : H1 = H2) (h3 : W1 = W2) :
    MultiChannelImage C2 H2 W2 α

Explicit-argument version of rwMultiChannelImage: the target indices C2, H2, W2 are passed explicitly instead of being inferred.

This is occasionally convenient when elaboration has trouble inferring C2/H2/W2 from context.

def Spec.getValueAtPosition {α : Type} [Context α] {H W : ℕ} (img : Image H W α) (x y : ℕ) :
    Tensor α Shape.scalar

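Read pixel (x, y) from an Image, returning 0 when out of bounds. Since an Image has shape [H, W] and the access corresponds to indices [x, y], x indexes the height axis and y the width axis.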

This helper is used by window-extraction and padding utilities for conv/pooling specs.
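The zero-fallback read can be sketched as follows. This is a toy model, not the library's code: images are functions on `Fin` indices, entries are natural numbers, and `ToyRead.getOrZero` is a hypothetical stand-in for the accessor described above.

```lean
namespace ToyRead

/-- Toy stand-in (assumption): a 2-D image is a function on in-bounds indices. -/
abbrev Image (H W : Nat) (α : Type) : Type := Fin H → Fin W → α

/-- Read entry `[x, y]`, returning `0` when either index is out of bounds. -/
def getOrZero {H W : Nat} (img : Image H W Nat) (x y : Nat) : Nat :=
  if hx : x < H then
    if hy : y < W then img ⟨x, hx⟩ ⟨y, hy⟩ else 0
  else 0

/-- A 2×3 image with entry `10 * i + j + 1` at row `i`, column `j`. -/
def sample : Image 2 3 Nat := fun i j => 10 * i.val + j.val + 1

#eval getOrZero sample 1 2  -- 13 (in bounds)
#eval getOrZero sample 2 0  -- 0 (row 2 is out of bounds when H = 2)
#eval getOrZero sample 0 3  -- 0 (column 3 is out of bounds when W = 3)

end ToyRead
```

Taking plain natural-number indices and checking bounds inside the accessor means callers never have to construct bounds proofs, which is what makes it convenient for window and padding code.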

theorem Spec.get_at_or_zero_getValueAtPosition {α : Type} [Context α] {H W : ℕ} (img : Image H W α) (x y : ℕ) :
    getAtOrZero (getValueAtPosition img x y) [] = getAtOrZero img [x, y]

getValueAtPosition agrees with the generic list-indexing helper getAtOrZero.

In particular, reading a scalar via the specialized (x, y) accessor is the same as reading with indices [x, y], where both return 0 out of bounds.

def Spec.extractWindow {α : Type} [Context α] {H W : ℕ} (kW kH : ℕ) (img : Image H W α) (start_i start_j : ℕ) :
    Tensor α (Shape.dim kH (Shape.dim kW Shape.scalar))

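Extract a kH × kW patch from an image starting at (start_i, start_j). Note that the explicit size arguments are passed in the order kW, kH, while the result shape is [kH, kW].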

Out-of-bounds pixels are treated as 0, matching the behavior of getValueAtPosition. This is spec-level "im2col"-style logic (cf. PyTorch nn.Unfold, conceptually).
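As a rough illustration of window extraction with zero fallback, consider the sketch below. It assumes a toy function-based image model with natural-number entries, and it assumes the window entry at offset (i, j) is read from (start_i + i, start_j + j); `ToyWindow.window` is a hypothetical name and takes its size arguments in the order kH, kW, unlike the declaration above.

```lean
namespace ToyWindow

/-- Toy stand-in (assumption): a 2-D image is a function on in-bounds indices. -/
abbrev Image (H W : Nat) (α : Type) : Type := Fin H → Fin W → α

/-- Read entry `[x, y]`, returning `0` when out of bounds. -/
def getOrZero {H W : Nat} (img : Image H W Nat) (x y : Nat) : Nat :=
  if hx : x < H then
    if hy : y < W then img ⟨x, hx⟩ ⟨y, hy⟩ else 0
  else 0

/-- Extract a `kH × kW` patch whose entry `(i, j)` is read from `(si + i, sj + j)`. -/
def window {H W : Nat} (kH kW : Nat) (img : Image H W Nat) (si sj : Nat) :
    Image kH kW Nat :=
  fun i j => getOrZero img (si + i.val) (sj + j.val)

/-- Dump an image as nested lists, for display only. -/
def toLists {H W : Nat} (img : Image H W Nat) : List (List Nat) :=
  (List.range H).map fun i => (List.range W).map fun j => getOrZero img i j

/-- A 2×3 image with entry `10 * i + j + 1` at row `i`, column `j`. -/
def sample : Image 2 3 Nat := fun i j => 10 * i.val + j.val + 1

#eval toLists sample                   -- [[1, 2, 3], [11, 12, 13]]
#eval toLists (window 2 2 sample 0 1)  -- [[2, 3], [12, 13]]
#eval toLists (window 2 2 sample 1 2)  -- [[13, 0], [0, 0]]

end ToyWindow
```

The last call starts near the bottom-right corner, so three of the four window entries fall outside the image and read as 0.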

def Spec.padMultiChannel {α : Type} [Context α] {inC inH inW : ℕ} (img : MultiChannelImage inC inH inW α) (padding : ℕ) :
    MultiChannelImage inC (inH + 2 * padding) (inW + 2 * padding) α

Zero-pad a channels-first image by padding pixels on every side of both spatial axes (top, bottom, left, and right); the channel axis is left unchanged.

This is the spec analogue of torch.nn.functional.pad (with constant 0 padding). The output shape is [inC, inH + 2*padding, inW + 2*padding].
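The shape arithmetic and the border behaviour can be seen in a small Lean sketch. It uses a toy function-based model with natural-number entries; `ToyPad.pad` is a hypothetical stand-in written to mirror the description above, not the library's definition.

```lean
namespace ToyPad

/-- Toy stand-ins (assumption): tensors are functions on in-bounds indices. -/
abbrev Image (H W : Nat) (α : Type) : Type := Fin H → Fin W → α
abbrev MCImage (C H W : Nat) (α : Type) : Type := Fin C → Image H W α

/-- Read entry `[x, y]`, returning `0` when out of bounds. -/
def getOrZero {H W : Nat} (img : Image H W Nat) (x y : Nat) : Nat :=
  if hx : x < H then
    if hy : y < W then img ⟨x, hx⟩ ⟨y, hy⟩ else 0
  else 0

/-- Zero-pad every channel by `p` pixels on each side of both spatial axes. -/
def pad {C H W : Nat} (img : MCImage C H W Nat) (p : Nat) :
    MCImage C (H + 2 * p) (W + 2 * p) Nat :=
  fun c i j =>
    if i.val < p ∨ j.val < p then 0                 -- top/left border
    else getOrZero (img c) (i.val - p) (j.val - p)  -- bottom/right border reads as 0

/-- Dump an image as nested lists, for display only. -/
def toLists {H W : Nat} (img : Image H W Nat) : List (List Nat) :=
  (List.range H).map fun i => (List.range W).map fun j => getOrZero img i j

/-- A single-channel 2×2 image with entries [[1, 2], [3, 4]]. -/
def one : MCImage 1 2 2 Nat := fun _ i j => 2 * i.val + j.val + 1

#eval toLists (pad one 1 ⟨0, by decide⟩)
-- [[0, 0, 0, 0], [0, 1, 2, 0], [0, 3, 4, 0], [0, 0, 0, 0]]

end ToyPad
```

With `padding = 1`, the 2×2 input becomes 4×4, matching the `inH + 2 * padding` and `inW + 2 * padding` indices in the result type.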

theorem Spec.get_at_or_zero_pad_multi_channel {α : Type} [Context α] {inC inH inW padding : ℕ} (img : MultiChannelImage inC inH inW α) (c : Fin inC) (p q : ℕ) :
    getAtOrZero (padMultiChannel img padding) [↑c, p, q] = if _h : p < padding ∨ q < padding then 0 else getAtOrZero img [↑c, p - padding, q - padding]

Characterization lemma for padMultiChannel under list-indexing (getAtOrZero).

Reading the padded tensor at [c, p, q] yields 0 in the top/left padding region, and otherwise reads the original tensor at [c, p - padding, q - padding] (with out-of-bounds falling back to 0 on both sides).

theorem Spec.get_at_or_zero_pad_multi_channel_shift {α : Type} [Context α] {inC inH inW padding : ℕ} (img : MultiChannelImage inC inH inW α) (c : Fin inC) (i : Fin inH) (j : Fin inW) :
    getAtOrZero (padMultiChannel img padding) [↑c, ↑i + padding, ↑j + padding] = getAtOrZero img [↑c, ↑i, ↑j]

Index-shift lemma for padMultiChannel.

If (i, j) is in-bounds for the original image, then reading the padded image at (i + padding, j + padding) returns the same value.
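Both padding lemmas can be sanity-checked numerically on a small example. The sketch below is not a proof and does not use the library: it assumes a toy function-based model, with hypothetical `ToyPadCheck.pad` and `ToyPadCheck.getOrZero` playing the roles of the padding function and the zero-fallback accessor, and evaluates both sides over a finite range of indices.

```lean
namespace ToyPadCheck

/-- Toy stand-ins (assumption): tensors are functions on in-bounds indices. -/
abbrev Image (H W : Nat) (α : Type) : Type := Fin H → Fin W → α
abbrev MCImage (C H W : Nat) (α : Type) : Type := Fin C → Image H W α

/-- Read entry `[x, y]`, returning `0` when out of bounds. -/
def getOrZero {H W : Nat} (img : Image H W Nat) (x y : Nat) : Nat :=
  if hx : x < H then
    if hy : y < W then img ⟨x, hx⟩ ⟨y, hy⟩ else 0
  else 0

/-- Zero-pad every channel by `p` pixels on each side of both spatial axes. -/
def pad {C H W : Nat} (img : MCImage C H W Nat) (p : Nat) :
    MCImage C (H + 2 * p) (W + 2 * p) Nat :=
  fun c i j =>
    if i.val < p ∨ j.val < p then 0
    else getOrZero (img c) (i.val - p) (j.val - p)

/-- A single-channel 2×2 image with entries [[1, 2], [3, 4]]. -/
def one : MCImage 1 2 2 Nat := fun _ i j => 2 * i.val + j.val + 1

def c0 : Fin 1 := ⟨0, by decide⟩
def padded := pad one 1 c0

-- Characterization: compare both sides for all p, q < 6 (including out-of-bounds reads).
#eval (List.range 6).all fun p => (List.range 6).all fun q =>
  getOrZero padded p q ==
    (if p < 1 ∨ q < 1 then 0 else getOrZero (one c0) (p - 1) (q - 1))
-- true

-- Index shift: reading at (i + 1, j + 1) recovers the original entry at (i, j).
#eval (List.range 2).all fun i => (List.range 2).all fun j =>
  getOrZero padded (i + 1) (j + 1) == getOrZero (one c0) i j
-- true

end ToyPadCheck
```

The guard `p < padding ∨ q < padding` matters because subtraction on natural numbers truncates at 0: without it, an index in the top or left border would wrongly map to row or column 0 of the original image.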

def Spec.extractMultiWindow {α : Type} [Context α] {inC kH kW inH inW padding : ℕ} (img : MultiChannelImage inC (inH + 2 * padding) (inW + 2 * padding) α) (start_i start_j : ℕ) :
    Tensor α (Shape.dim inC (Shape.dim kH (Shape.dim kW Shape.scalar)))

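Extract a kH × kW window from each channel of a channels-first image, starting at (start_i, start_j).

The input type is a padded image of shape [inC, inH + 2*padding, inW + 2*padding], and the result has shape [inC, kH, kW]. The window sizes kH and kW are implicit arguments, so they are usually determined by the expected result type.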

def Spec.padChannelsZero {α : Type} [Zero α] {inChannels outChannels height width : ℕ} (_h : inChannels ≤ outChannels) (img : MultiChannelImage inChannels height width α) :
    MultiChannelImage outChannels height width α

Increase the channel dimension by zero-padding extra channels.

This is used in some ResNet-style skip connections when inChannels < outChannels. Existing channels are copied; newly introduced channels are identically zero.
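Channel padding can be sketched in the same toy style. The model below (tensors as functions on `Fin` indices, natural-number entries) and the name `ToyChannels.padChannels` are assumptions made for illustration only.

```lean
namespace ToyChannels

/-- Toy stand-ins (assumption): tensors are functions on in-bounds indices. -/
abbrev Image (H W : Nat) (α : Type) : Type := Fin H → Fin W → α
abbrev MCImage (C H W : Nat) (α : Type) : Type := Fin C → Image H W α

/-- Copy the existing channels; channels with index `≥ Cin` are identically zero. -/
def padChannels {Cin Cout H W : Nat} (_h : Cin ≤ Cout)
    (img : MCImage Cin H W Nat) : MCImage Cout H W Nat :=
  fun c i j => if hc : c.val < Cin then img ⟨c.val, hc⟩ i j else 0

/-- Dump an image as nested lists, for display only. -/
def toLists {H W : Nat} (img : Image H W Nat) : List (List Nat) :=
  (List.range H).map fun i => (List.range W).map fun j =>
    if hi : i < H then (if hj : j < W then img ⟨i, hi⟩ ⟨j, hj⟩ else 0) else 0

/-- A single-channel 2×2 image with entries [[1, 2], [3, 4]]. -/
def one : MCImage 1 2 2 Nat := fun _ i j => 2 * i.val + j.val + 1

/-- Widen from 1 to 3 channels; the proof of `1 ≤ 3` is found by `decide`. -/
def wide : MCImage 3 2 2 Nat := padChannels (by decide) one

#eval toLists (wide ⟨0, by decide⟩)  -- [[1, 2], [3, 4]]
#eval toLists (wide ⟨2, by decide⟩)  -- [[0, 0], [0, 0]]

end ToyChannels
```

In this sketch the hypothesis `Cin ≤ Cout` is not needed to define the function, which is consistent with the underscore-prefixed `_h` in the declaration above; it documents the intended use rather than driving the computation.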

def Spec.channelIdentity {α : Type} {channels height width : ℕ} (img : MultiChannelImage channels height width α) :
    MultiChannelImage channels height width α

Identity on MultiChannelImage (useful as a "no-op" branch in higher-level specs).

def Spec.setValueAtPosition {α : Type} {H W : ℕ} (img : Image H W α) (x y : ℕ) (value : α) :
    Image H W α

Write a value at pixel (x, y) if it is in-bounds; otherwise return the original image.

This uses update_tensor_spec under the hood and is intended for small spec-level utilities.

def Spec.addValueAtPosition {α : Type} [Add α] {H W : ℕ} (img : Image H W α) (x y : ℕ) (value : α) :
    Image H W α

Add value to pixel (x, y) if it is in-bounds; otherwise return the original image.

This is a small helper for accumulation-style specs (e.g. naive convolution).
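The write and accumulate helpers share one pattern: change a single entry when the position is in bounds, and otherwise leave the image alone. A minimal sketch, assuming a toy function-based image model with natural-number entries (`ToyUpdate.setAt`, `ToyUpdate.addAt`, and `ToyUpdate.zeros` are hypothetical names):

```lean
namespace ToyUpdate

/-- Toy stand-in (assumption): a 2-D image is a function on in-bounds indices. -/
abbrev Image (H W : Nat) (α : Type) : Type := Fin H → Fin W → α

/-- An `H × W` image filled with zeros. -/
def zeros (H W : Nat) : Image H W Nat := fun _ _ => 0

/-- Overwrite entry `(x, y)`; an out-of-bounds position matches no index. -/
def setAt {H W : Nat} (img : Image H W Nat) (x y v : Nat) : Image H W Nat :=
  fun i j => if i.val = x ∧ j.val = y then v else img i j

/-- Add `v` to entry `(x, y)`; an out-of-bounds position matches no index. -/
def addAt {H W : Nat} (img : Image H W Nat) (x y v : Nat) : Image H W Nat :=
  fun i j => if i.val = x ∧ j.val = y then img i j + v else img i j

/-- Dump an image as nested lists, for display only. -/
def toLists {H W : Nat} (img : Image H W Nat) : List (List Nat) :=
  (List.range H).map fun i => (List.range W).map fun j =>
    if hi : i < H then (if hj : j < W then img ⟨i, hi⟩ ⟨j, hj⟩ else 0) else 0

#eval toLists (addAt (setAt (zeros 2 2) 0 1 5) 0 1 2)  -- [[0, 7], [0, 0]]
#eval toLists (addAt (zeros 2 2) 7 7 9)                -- [[0, 0], [0, 0]]

end ToyUpdate
```

In the function-based model the out-of-bounds case needs no explicit check, because no valid index can equal an out-of-range position; a data-backed tensor would need the bounds test spelled out.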

def Spec.createZeroImage {α : Type} [Zero α] (H W : ℕ) :
    Image H W α

Construct an H × W image filled with zeros.

def Spec.createZeroMultiChannelImage (α : Type) [Zero α] (C H W : ℕ) :
    MultiChannelImage C H W α

Construct a C × H × W channels-first image filled with zeros.
