Unet #
U-Net (2-level) model.

This file defines a small U-Net style architecture (a single downsample + upsample):

- down path: two `Conv2d(3x3, stride=1, padding=1) + ReLU` blocks,
- downsample: `MaxPool2d(kernel=2, stride=2)`,
- bottleneck: two more conv blocks,
- upsample: `ConvTranspose2d(kernel=2, stride=2)`,
- skip connection: concatenate channels and run two conv blocks,
- output head: `Conv2d(1x1)` to map `baseC -> outC`.
PyTorch mental model:

- this matches the common "U-Net block diagram" but written without a batch axis, so our tensor convention is `(C,H,W)` rather than `(N,C,H,W)`;
- the skip connection concatenates on the channel axis (in PyTorch with a batch axis that would be `torch.cat([skip, up], dim=1)`; here it is `concat_dim0_spec` because channels are axis 0).
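As a concrete illustration of why channel concatenation is plain axis-0 concatenation in the `(C,H,W)` convention, here is a minimal plain-Python sketch (tensors modeled as nested lists; the helper name `concat_channels` is illustrative, not from the spec):

```python
def concat_channels(skip, up):
    """Axis-0 concatenation: with no batch axis, channels are axis 0,
    so concatenating feature maps is just list concatenation."""
    # Both inputs must share the same spatial size H x W.
    assert len(skip[0]) == len(up[0]) and len(skip[0][0]) == len(up[0][0])
    return skip + up

skip = [[[1, 2], [3, 4]]]         # shape (1, 2, 2)
up   = [[[5, 6], [7, 8]]]         # shape (1, 2, 2)
cat  = concat_channels(skip, up)  # shape (2, 2, 2): channels stack on axis 0
```

With a batch axis present, the same operation would instead target `dim=1`, which is exactly the `torch.cat([skip, up], dim=1)` idiom mentioned above.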
Shape notes:

- the 3x3 conv blocks are set up to preserve `H×W` (stride=1, padding=1),
- the pool/upsample pair is the usual `2x` down then `2x` up, but for odd spatial sizes the `ConvTranspose2d` formula can produce an off-by-one; we surface this as explicit equalities (`h_upH`, `h_upW`) so the caller can pick compatible `inH`, `inW` (typically even).
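The off-by-one can be checked directly from PyTorch's documented output-size formulas; a plain-Python sketch (function names `pool_out`/`convT_out` are illustrative):

```python
def pool_out(n, kernel=2, stride=2):
    # MaxPool2d output size (no padding, no dilation):
    # floor((n - kernel) / stride) + 1
    return (n - kernel) // stride + 1

def convT_out(n, kernel=2, stride=2, padding=0):
    # ConvTranspose2d output size (no dilation, no output_padding):
    # (n - 1) * stride - 2 * padding + kernel
    return (n - 1) * stride - 2 * padding + kernel

# Even sizes round-trip exactly; odd sizes come back one short.
even = convT_out(pool_out(8))  # 8 -> 4 -> 8
odd  = convT_out(pool_out(9))  # 9 -> 4 -> 8, one short of the input
```

This is why the forward spec exposes the `h_upH`/`h_upW` equalities instead of silently assuming the round-trip holds.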
References:

- Ronneberger et al., "U-Net: Convolutional Networks for Biomedical Image Segmentation" (MICCAI 2015).

PyTorch docs (for API intuition, not semantics):

- `torch.nn.Conv2d`: https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html
- `torch.nn.MaxPool2d`: https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html
- `torch.nn.ConvTranspose2d`: https://pytorch.org/docs/stable/generated/torch.nn.ConvTranspose2d.html
Configuration #
Architectural hyperparameters live in a dedicated config record.

PyTorch mental model:

- this mirrors the way you would pass `kernel_size`/`stride`/`padding` to `nn.Conv2d`, `nn.MaxPool2d`, and `nn.ConvTranspose2d`, plus the base channel width.
U-Net (2-level) architectural hyperparameters (spec layer).

- poolKernel : ℕ
  `kernel_size` for the max-pool layer (typical: `2`).
- poolStride : ℕ
  `stride` for the max-pool layer (typical: `2`).
- convKernel : ℕ
  `kernel_size` for the 2D conv blocks (typical: `3`).
- convStride : ℕ
  `stride` for the 2D conv blocks (typical: `1`).
- convPadding : ℕ
  symmetric zero `padding` for the 2D conv blocks (typical: `1`).
- upKernel : ℕ
  `kernel_size` for the transposed-convolution upsampler (typical: `2`).
- upStride : ℕ
  `stride` for the transposed-convolution upsampler (typical: `2`).
- upPadding : ℕ
  `padding` for the transposed-convolution upsampler (typical: `0`).
- headKernel : ℕ
  `kernel_size` for the final output head conv (typical: `1`).
- headStride : ℕ
  `stride` for the final output head conv (typical: `1`).
- headPadding : ℕ
  `padding` for the final output head conv (typical: `0`).
- baseC : ℕ
  Base channel count (typical: `64`).
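For intuition, the record could be mirrored by a plain Python dataclass (a sketch only; field names follow the spec record, the class name `UNet2Config` is illustrative, and the defaults are the "classic U-Net-ish" values listed above):

```python
from dataclasses import dataclass

@dataclass
class UNet2Config:
    # Max-pool layer.
    poolKernel: int = 2
    poolStride: int = 2
    # 3x3 conv blocks (stride 1, padding 1 preserves H x W).
    convKernel: int = 3
    convStride: int = 1
    convPadding: int = 1
    # Transposed-conv upsampler.
    upKernel: int = 2
    upStride: int = 2
    upPadding: int = 0
    # Final 1x1 output head.
    headKernel: int = 1
    headStride: int = 1
    headPadding: int = 0
    # Base channel width.
    baseC: int = 64

cfg = UNet2Config()  # the default configuration
```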
Instances For
Well-formedness conditions for UNet2Config (the few nonzero facts needed by layer specs).
Canonical "classic U-Net-ish" defaults for our 2-level spec.
unet2DefaultConfig satisfies the nonzero facts required by the spec layer.
Output height after MaxPool2d(kernel=2, stride=2) (no padding).
Output width after MaxPool2d(kernel=2, stride=2) (no padding).
Output height after MaxPool2d(2,2) then ConvTranspose2d(2,2) (with padding=0).
Output width after MaxPool2d(2,2) then ConvTranspose2d(2,2) (with padding=0).
2-level U-Net parameter record (spec).
This is a compact U-Net with one downsample and one upsample step:
- two conv + ReLU blocks at full resolution (with a skip),
- max-pooling, then two conv + ReLU blocks at the lower resolution,
- a transposed-conv upsampler,
- channel concatenation with the skip feature map,
- two more conv + ReLU blocks,
- a final `1×1` conv head.
Shape convention: tensors are (C,H,W) (no batch axis).
PyTorch analogue: a small U-Net built from nn.Conv2d, nn.MaxPool2d, nn.ConvTranspose2d,
and torch.cat along the channel axis.
- down1_1 : Spec.Conv2DSpec inC cfg.baseC cfg.convKernel cfg.convKernel cfg.convStride cfg.convPadding α h_inC ⋯ ⋯
  First 3×3 conv in the first down block (`inC -> baseC`).
- down1_2 : Spec.Conv2DSpec cfg.baseC cfg.baseC cfg.convKernel cfg.convKernel cfg.convStride cfg.convPadding α ⋯ ⋯ ⋯
  Second 3×3 conv in the first down block (`baseC -> baseC`).
- down2_1 : Spec.Conv2DSpec cfg.baseC (2 * cfg.baseC) cfg.convKernel cfg.convKernel cfg.convStride cfg.convPadding α ⋯ ⋯ ⋯
  First 3×3 conv in the bottleneck block (`baseC -> 2*baseC`).
- down2_2 : Spec.Conv2DSpec (2 * cfg.baseC) (2 * cfg.baseC) cfg.convKernel cfg.convKernel cfg.convStride cfg.convPadding α ⋯ ⋯ ⋯
  Second 3×3 conv in the bottleneck block (`2*baseC -> 2*baseC`).
- upT : Spec.ConvTranspose2DSpec (2 * cfg.baseC) cfg.baseC cfg.upKernel cfg.upKernel cfg.upStride cfg.upPadding α ⋯ ⋯ ⋯
  Transposed-convolution upsampler (`2*baseC -> baseC`, `kernel=2`, `stride=2`).
- up1_1 : Spec.Conv2DSpec (cfg.baseC + cfg.baseC) cfg.baseC cfg.convKernel cfg.convKernel cfg.convStride cfg.convPadding α ⋯ ⋯ ⋯
  First 3×3 conv after skip concatenation (`(baseC+baseC) -> baseC`).
- up1_2 : Spec.Conv2DSpec cfg.baseC cfg.baseC cfg.convKernel cfg.convKernel cfg.convStride cfg.convPadding α ⋯ ⋯ ⋯
  Second 3×3 conv after skip concatenation (`baseC -> baseC`).
- out1x1 : Spec.Conv2DSpec cfg.baseC outC cfg.headKernel cfg.headKernel cfg.headStride cfg.headPadding α ⋯ ⋯ ⋯
  Final 1×1 conv head (`baseC -> outC`).
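The channel and spatial bookkeeping in these fields can be traced end-to-end with a small plain-Python sketch (shape arithmetic only, no real tensors; the helper names `conv_out` and `unet2_shapes` are illustrative, with the default `3x3/s1/p1` conv, `2x2/s2` pool and upsample, and `1x1` head):

```python
def conv_out(n, kernel, stride, padding):
    # Conv2d output size: floor((n + 2p - k) / s) + 1
    return (n + 2 * padding - kernel) // stride + 1

def unet2_shapes(inC, outC, H, W, baseC=64):
    """Trace (C, H, W) shapes through the 2-level U-Net."""
    c3 = lambda n: conv_out(n, 3, 1, 1)  # 3x3/s1/p1 preserves size
    shapes = {}
    shapes["down1"] = (baseC, c3(c3(H)), c3(c3(W)))           # two 3x3 convs
    pH, pW = (H - 2) // 2 + 1, (W - 2) // 2 + 1               # 2x2/s2 pool
    shapes["bottleneck"] = (2 * baseC, c3(c3(pH)), c3(c3(pW)))
    uH, uW = (pH - 1) * 2 + 2, (pW - 1) * 2 + 2               # 2x2/s2 convT
    shapes["up"] = (baseC, uH, uW)
    shapes["concat"] = (baseC + baseC, uH, uW)                # skip ++ up
    shapes["out"] = (outC, uH, uW)                            # 1x1 head
    return shapes

s = unet2_shapes(inC=3, outC=1, H=64, W=64)
```

For even `H`, `W` the output spatial size matches the input, so the skip concatenation is well-typed without any cropping.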
Gradients #
This U-Net is small enough that we can write a fully explicit backward pass in a "mirror the forward" style: rebuild the same intermediates, then walk back through them using the existing layer-level backward specs.
Key details:

- `concat_dim0_spec` is split via `concat_dim0_backward_spec`,
- pooling backward uses `max_pool2d_multi_backward_spec`,
- ReLU is handled via elementwise gating `dZ = dY ⊙ ReLU'(Z)`.

PyTorch analogy:

- each `conv2d_backward_spec` call corresponds to the gradients PyTorch computes for `Conv2d` (weight, bias);
- `max_pool2d_multi_backward_spec` corresponds to max-pool backward using the argmax locations from the forward (our spec computes it from the inputs).
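The ReLU gating rule `dZ = dY ⊙ ReLU'(Z)` is just an elementwise mask; a minimal plain-Python sketch (flattened tensors, illustrative helper name, and the usual `ReLU'(0) = 0` convention):

```python
def relu_backward(dY, Z):
    """Gate the upstream gradient by ReLU'(Z): pass it through
    where the pre-activation was positive, zero it elsewhere."""
    return [dy if z > 0 else 0.0 for dy, z in zip(dY, Z)]

Z  = [1.5, -0.2, 0.0, 3.0]   # pre-activations from the forward pass
dY = [0.1, 0.4, 0.7, -0.5]   # upstream gradient
dZ = relu_backward(dY, Z)    # [0.1, 0.0, 0.0, -0.5]
```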
Parameter-gradient container for UNet2Spec.
This mirrors the parameter layout of UNet2Spec, recording kernel and bias gradients for each
convolution and transposed-convolution layer.
- d_down1_1_kernel : Spec.Tensor α (Spec.Shape.dim cfg.baseC (Spec.Shape.dim inC (Spec.Shape.dim cfg.convKernel (Spec.Shape.dim cfg.convKernel Spec.Shape.scalar))))
d down 1 1 kernel.
- d_down1_1_bias : Spec.Tensor α (Spec.Shape.dim cfg.baseC Spec.Shape.scalar)
d down 1 1 bias.
- d_down1_2_kernel : Spec.Tensor α (Spec.Shape.dim cfg.baseC (Spec.Shape.dim cfg.baseC (Spec.Shape.dim cfg.convKernel (Spec.Shape.dim cfg.convKernel Spec.Shape.scalar))))
d down 1 2 kernel.
- d_down1_2_bias : Spec.Tensor α (Spec.Shape.dim cfg.baseC Spec.Shape.scalar)
d down 1 2 bias.
- d_down2_1_kernel : Spec.Tensor α (Spec.Shape.dim (2 * cfg.baseC) (Spec.Shape.dim cfg.baseC (Spec.Shape.dim cfg.convKernel (Spec.Shape.dim cfg.convKernel Spec.Shape.scalar))))
d down 2 1 kernel.
- d_down2_1_bias : Spec.Tensor α (Spec.Shape.dim (2 * cfg.baseC) Spec.Shape.scalar)
d down 2 1 bias.
- d_down2_2_kernel : Spec.Tensor α (Spec.Shape.dim (2 * cfg.baseC) (Spec.Shape.dim (2 * cfg.baseC) (Spec.Shape.dim cfg.convKernel (Spec.Shape.dim cfg.convKernel Spec.Shape.scalar))))
d down 2 2 kernel.
- d_down2_2_bias : Spec.Tensor α (Spec.Shape.dim (2 * cfg.baseC) Spec.Shape.scalar)
d down 2 2 bias.
d up T kernel.
- d_upT_bias : Spec.Tensor α (Spec.Shape.dim cfg.baseC Spec.Shape.scalar)
d up T bias.
- d_up1_1_kernel : Spec.Tensor α (Spec.Shape.dim cfg.baseC (Spec.Shape.dim (cfg.baseC + cfg.baseC) (Spec.Shape.dim cfg.convKernel (Spec.Shape.dim cfg.convKernel Spec.Shape.scalar))))
d up 1 1 kernel.
- d_up1_1_bias : Spec.Tensor α (Spec.Shape.dim cfg.baseC Spec.Shape.scalar)
d up 1 1 bias.
- d_up1_2_kernel : Spec.Tensor α (Spec.Shape.dim cfg.baseC (Spec.Shape.dim cfg.baseC (Spec.Shape.dim cfg.convKernel (Spec.Shape.dim cfg.convKernel Spec.Shape.scalar))))
d up 1 2 kernel.
- d_up1_2_bias : Spec.Tensor α (Spec.Shape.dim cfg.baseC Spec.Shape.scalar)
d up 1 2 bias.
- d_out1x1_kernel : Spec.Tensor α (Spec.Shape.dim outC (Spec.Shape.dim cfg.baseC (Spec.Shape.dim cfg.headKernel (Spec.Shape.dim cfg.headKernel Spec.Shape.scalar))))
d out 1 x 1 kernel.
- d_out1x1_bias : Spec.Tensor α (Spec.Shape.dim outC Spec.Shape.scalar)
d out 1 x 1 bias.
Forward pass for UNet2Spec.
Inputs/outputs use MultiChannelImage tensors of shape (C,H,W) (no batch axis).
The many h_* equalities are shape-rewrite hints: layer specs compute output sizes using explicit
arithmetic (matching PyTorch's formulas), and these equalities let callers assert "this 3×3 conv
preserves spatial size" or "pool then upsample returns to the original size" for a particular
choice of `inH`, `inW` (typically even).
Backward pass for UNet2Spec.forward.
Given:

- the model parameters `m`,
- the forward input image `x`,
- an upstream gradient `grad_output = dL/dy`,

returns:

- parameter gradients (`UNet2Grads`), and
- the gradient w.r.t. the input image (`dL/dx`).
Implementation note: this is an explicit "recompute intermediates then walk backward" spec (no mutable tape), mirroring the math behind PyTorch autograd and standard conv/pool backward rules.
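The "recompute intermediates then walk backward" style can be illustrated on a toy two-weight chain (a hypothetical stand-in for the full conv/pool network; function names are illustrative), with a finite-difference check of one parameter gradient:

```python
def forward(w1, w2, x):
    # Recompute all intermediates explicitly (no mutable tape).
    z1 = w1 * x
    a1 = z1 if z1 > 0 else 0.0   # ReLU
    y  = w2 * a1
    return z1, a1, y

def backward(w1, w2, x, grad_output):
    # Mirror the forward: rebuild the intermediates, then walk back.
    z1, a1, y = forward(w1, w2, x)
    d_w2 = grad_output * a1               # y = w2 * a1
    d_a1 = grad_output * w2
    d_z1 = d_a1 if z1 > 0 else 0.0        # ReLU gate on the pre-activation
    d_w1 = d_z1 * x
    d_x  = d_z1 * w1
    return d_w1, d_w2, d_x

# Compare the explicit d_w1 against a central finite difference.
w1, w2, x, g = 0.5, -1.5, 2.0, 1.0
d_w1, d_w2, d_x = backward(w1, w2, x, g)
eps = 1e-6
num = (forward(w1 + eps, w2, x)[2] - forward(w1 - eps, w2, x)[2]) / (2 * eps)
```

The same pattern scales up: each layer's backward consumes the recomputed forward intermediates, exactly as the conv/pool backward specs do above.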