CNN (spec model) #
This file wires together a small CNN in two styles:
- A SpecChain description, which is great for "model wiring" and exporters:
  - cnn_spec: Conv2D → MaxPool2D → Conv2D → MaxPool2D → Flatten → Linear
  - cnn_with_relu_spec: same, but with ReLU inserted after each conv
- A fully explicit forward/backward pair (Models.Full.CNN2Spec) for the classic training setup:
  Conv → ReLU → MaxPool → Conv → ReLU → MaxPool → Flatten → Linear
PyTorch mental model (single image, no batch):

```python
nn.Sequential(
    nn.Conv2d(inC, c1, (kH, kW), stride=stride1, padding=padding1),
    nn.ReLU(),
    nn.MaxPool2d((poolKH, poolKW), stride=poolStride1),
    nn.Conv2d(c1, c2, (kH, kW), stride=stride2, padding=padding2),
    nn.ReLU(),
    nn.MaxPool2d((poolKH, poolKW), stride=poolStride2),
    nn.Flatten(0),  # flatten from dim 0, since the input is an unbatched (C, H, W) image
    nn.Linear(c2 * H2 * W2, outDim),
)
```
All shapes are tracked at the type level; the feature dimension for the final LinearSpec is
computed as Shape.size of the post-pooling feature map.
This is intended as a reference/specification of model structure, not as a tuned implementation.
Output size for a conv along one spatial axis.
Matches the standard PyTorch formula (for dilation = 1, groups = 1), with / as natural-number (floor) division:
out = (in + 2*padding - k) / stride + 1.
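As a quick sanity check, here is a standalone sketch of the formula (convOutSize is an illustrative name, not this file's declaration); since division on ℕ is floor division, the values match PyTorch:

```lean
-- Illustrative standalone version of the conv output-size formula.
def convOutSize (inSz k stride padding : Nat) : Nat :=
  (inSz + 2 * padding - k) / stride + 1

#eval convOutSize 28 3 1 1  -- 28: a 3×3 conv with padding 1 preserves the size
#eval convOutSize 28 5 2 0  -- 12: (28 - 5) / 2 + 1
```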
Output size for a pooling op along one spatial axis (no padding).
Matches the standard formula, again with floor division:
out = (in - k) / stride + 1.
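The same kind of standalone sketch for pooling (again, poolOutSize is illustrative only):

```lean
-- Illustrative standalone version of the pooling output-size formula.
def poolOutSize (inSz k stride : Nat) : Nat :=
  (inSz - k) / stride + 1

#eval poolOutSize 28 2 2  -- 14: 2×2 pooling with stride 2 halves the axis
```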
Shape helpers for the two conv/pool stages:
- Output height after the first convolution stage.
- Output width after the first convolution stage.
- Output height after the first pooling stage.
- Output width after the first pooling stage.
- Output height after the second convolution stage (after pool1).
- Output width after the second convolution stage (after pool1).
- Output height after the second pooling stage.
- Output width after the second pooling stage.
- Feature-map shape after the second pooling stage: (c2, H2, W2).
- Flattened feature size after the second pooling stage: c2 * H2 * W2 (featSize).
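Chaining the two sketches above shows how the flattened size comes out for a hypothetical 28×28 input with 3×3 convs (stride 1, padding 1) and 2×2 pools (stride 2):

```lean
-- conv1: 28 → 28, pool1: 28 → 14, conv2: 14 → 14, pool2: 14 → 7
#eval poolOutSize (convOutSize (poolOutSize (convOutSize 28 3 1 1) 2 2) 3 1 1) 2 2
-- 7, so the flattened feature size is c2 * 7 * 7
```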
cnn_spec: CNN SpecChain wiring (no activations): Conv2D → MaxPool2D → Conv2D → MaxPool2D → Flatten → Linear.
If you want the "classic" ReLU-after-conv variant, use cnn_with_relu_spec.
cnn_with_relu_spec: a slightly more "classic" CNN SpecChain with ReLU after each convolution.
PyTorch analogy: insert nn.ReLU() between the conv and pool blocks.
Run cnn_spec forward on a single input image.
This is a convenience wrapper around SpecChain.forward.
A fully explicit CNN spec with backward pass #
The SpecChain wiring above is convenient for model diagrams. For training/verification workflows we also want an explicit reverse-mode spec that returns parameter gradients.
This section provides a small CNN in the classic
Conv → ReLU → MaxPool → Conv → ReLU → MaxPool → Flatten → Linear
form, with a complete backward pass using the per-layer backward specs:
- conv2d_backward_spec
- max_pool2d_multi_backward_spec
- linear_backward_spec
- elementwise gating for ReLU
Configuration #
The explicit CNN2Spec below is intended to be a readable, end-to-end training reference.
To keep call sites ergonomic (and avoid hiding numeric architecture choices in long argument lists),
we bundle the architectural hyperparameters into a config record.
Hyperparameters for the small 2-layer CNN (CNN2Spec).
- c1 : ℕ
Channels after the first conv.
- c2 : ℕ
Channels after the second conv.
- outDim : ℕ
Output dimension of the linear head (e.g. number of classes).
- kH : ℕ
Convolution kernel height.
- kW : ℕ
Convolution kernel width.
- stride1 : ℕ
Stride of the first conv.
- padding1 : ℕ
Padding of the first conv.
- stride2 : ℕ
Stride of the second conv.
- padding2 : ℕ
Padding of the second conv.
- poolKH : ℕ
Pooling kernel height.
- poolKW : ℕ
Pooling kernel width.
- poolStride1 : ℕ
Stride of the first pooling op.
- poolStride2 : ℕ
Stride of the second pooling op.
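For illustration, a config can be built with ordinary structure-instance syntax; the numbers below are hypothetical and are not the values of cnn2DefaultConfig:

```lean
-- A hypothetical CNN2Config value (field names as documented above).
def exampleConfig : CNN2Config :=
  { c1 := 8, c2 := 16, outDim := 10,
    kH := 3, kW := 3,
    stride1 := 1, padding1 := 1,
    stride2 := 1, padding2 := 1,
    poolKH := 2, poolKW := 2,
    poolStride1 := 2, poolStride2 := 2 }
```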
Well-formedness conditions for CNN2Config (the nonzero facts needed by some layer specs).
Default CNN2Config (a small "classic" CNN shape).
cnn2DefaultConfig is well-formed.
A small 2-block CNN with an explicit linear head.
Parameters:
- conv1 : Conv2d(inC → c1)
- conv2 : Conv2d(c1 → c2)
- pool1, pool2 : MaxPool2d (no padding in this spec)
- head : Linear(c2 * H2 * W2 → outDim)
PyTorch mental model:
Conv → ReLU → MaxPool → Conv → ReLU → MaxPool → Flatten → Linear.
This structure is used by forward and backward below.
- pool1 : Spec.MaxPool2DSpec cfg.poolKH cfg.poolKW cfg.poolStride1 ⋯ ⋯ ⋯
- pool2 : Spec.MaxPool2DSpec cfg.poolKH cfg.poolKW cfg.poolStride2 ⋯ ⋯ ⋯
- head : Spec.LinearSpec α (CNN.featSize cfg.c2 inH inW cfg.kH cfg.kW cfg.stride1 cfg.padding1 cfg.stride2 cfg.padding2 cfg.poolKH cfg.poolKW cfg.poolStride1 cfg.poolStride2) cfg.outDim
Gradients for CNN2Spec parameters (returned by CNN2Spec.backward).
- d_conv1_kernel : Spec.Tensor α (Spec.Shape.dim cfg.c1 (Spec.Shape.dim inC (Spec.Shape.dim cfg.kH (Spec.Shape.dim cfg.kW Spec.Shape.scalar))))
- d_conv1_bias : Spec.Tensor α (Spec.Shape.dim cfg.c1 Spec.Shape.scalar)
- d_conv2_kernel : Spec.Tensor α (Spec.Shape.dim cfg.c2 (Spec.Shape.dim cfg.c1 (Spec.Shape.dim cfg.kH (Spec.Shape.dim cfg.kW Spec.Shape.scalar))))
- d_conv2_bias : Spec.Tensor α (Spec.Shape.dim cfg.c2 Spec.Shape.scalar)
- d_head_W : Spec.Tensor α (Spec.Shape.dim cfg.outDim (Spec.Shape.dim (CNN.featSize cfg.c2 inH inW cfg.kH cfg.kW cfg.stride1 cfg.padding1 cfg.stride2 cfg.padding2 cfg.poolKH cfg.poolKW cfg.poolStride1 cfg.poolStride2) Spec.Shape.scalar))
- d_head_b : Spec.Tensor α (Spec.Shape.dim cfg.outDim Spec.Shape.scalar)
Forward pass for CNN2Spec.
This is the "full implementation" version of the model, written directly in terms of the
layer-level specs, rather than via SpecChain.
Backward pass (reverse-mode / VJP) for CNN2Spec.
Returns parameter gradients plus the input gradient.
Each step uses the corresponding layer backward spec, applied in reverse order of the forward pass:
- linear_backward_spec
- max_pool2d_multi_backward_spec
- conv2d_backward_spec

and ReLU uses the standard pointwise gate dY = dR ⊙ relu'(preact).
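A minimal sketch of that gate on flat data (standalone lists of Float, not the tensor-level spec; relu' is taken to be 0 at 0):

```lean
-- Zero the upstream gradient wherever the pre-activation was non-positive.
def reluGate (preact dR : List Float) : List Float :=
  List.zipWith (fun x g => if x > 0 then g else 0) preact dR

#eval reluGate [1.5, -2.0, 0.0, 3.0] [0.1, 0.2, 0.3, 0.4]
-- ≈ [0.1, 0.0, 0.0, 0.4]
```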