GraphSpec model: CNN (2 convs) #

This is a GraphSpec version of the classic small CNN:

Conv2D → ReLU → MaxPool2D → Conv2D → ReLU → MaxPool2D → Flatten → Linear

Notes:

This is still a chain model (no skip connections), so it fits the sequential GraphSpec core (NN.GraphSpec.Core) and can be built using >>>.
Parameter shapes are derived in the type: (K1,b1,K2,b2,W,b) where K are Conv kernels and W is the linear head matrix.

Shape bookkeeping (why all the Nat arithmetic?) #

The point of GraphSpec is that the shape interface is part of the type. Convolution and pooling therefore bake their output shapes into the type, using the standard formulas:

Conv2D: outH = (inH + 2*padding - kH) / stride + 1 (and similarly for outW)
MaxPool2D: outH = (inH - kH) / stride + 1 (and similarly for outW)

This file defines small helper abbreviations (outH/outW/poolH/poolW) so that the overall classifier head shape (the input dimension to the final Linear) is computed once and reused.

Why not put this under `GraphSpec.DAG`? #

You can express a chain model as a DAG term, but the sequential DSL is the simpler interface for pure pipelines. The DAG language is reserved for models that fundamentally need sharing or multi-input nodes (e.g. residual adds).

Conv/pool output size helpers #

source

@[reducible, inline]

abbrev NN.GraphSpec.Models.outH (inH kH stride padding : ℕ) :

ℕ

Convolution output height formula (standard DL convention).

Instances For

source

@[reducible, inline]

abbrev NN.GraphSpec.Models.outW (inW kW stride padding : ℕ) :

ℕ

Convolution output width formula (standard DL convention).

Instances For

source

@[reducible, inline]

abbrev NN.GraphSpec.Models.poolH (inH kH stride : ℕ) :

ℕ

MaxPool output height formula.

Instances For

source

@[reducible, inline]

abbrev NN.GraphSpec.Models.poolW (inW kW stride : ℕ) :

ℕ

MaxPool output width formula.

Instances For

source

@[reducible, inline]

abbrev NN.GraphSpec.Models.featH (inH kH stride1 padding1 poolKH poolStride1 stride2 padding2 poolStride2 : ℕ) :

ℕ

Final feature-map height after:

conv1 → pool1 → conv2 → pool2.

Instances For

source

@[reducible, inline]

abbrev NN.GraphSpec.Models.featW (inW kW stride1 padding1 poolKW poolStride1 stride2 padding2 poolStride2 : ℕ) :

ℕ

Final feature-map width after:

conv1 → pool1 → conv2 → pool2.

Instances For

source

@[reducible, inline]

abbrev NN.GraphSpec.Models.featSize (c2 inH inW kH kW stride1 padding1 stride2 padding2 poolKH poolKW poolStride1 poolStride2 : ℕ) :

ℕ

Total flattened feature size for the classifier head.

If the second conv produces c2 channels and the final spatial size is featH × featW, then the flattened vector length is (CHW c2 featH featW).size.

Instances For

source

def NN.GraphSpec.Models.cnn2 (inC c1 c2 outDim inH inW kH kW stride1 padding1 stride2 padding2 poolKH poolKW poolStride1 poolStride2 : ℕ) {h_inC : inC ≠ 0} {h_c1 : c1 ≠ 0} {_h_c2 : c2 ≠ 0} {h_kH : kH ≠ 0} {h_kW : kW ≠ 0} {h_poolKH : poolKH ≠ 0} {h_poolKW : poolKW ≠ 0} {h_poolStride1 : poolStride1 ≠ 0} {h_poolStride2 : poolStride2 ≠ 0} :

Graph [Shape.OIHW c1 inC kH kW, Shape.Vec c1, Shape.OIHW c2 c1 kH kW, Shape.Vec c2, Shape.Mat outDim (featSize c2 inH inW kH kW stride1 padding1 stride2 padding2 poolKH poolKW poolStride1 poolStride2), Shape.Vec outDim] (Shape.CHW inC inH inW) (Shape.Vec outDim)

2-conv CNN GraphSpec model.

Conventions:

both conv layers share (kH,kW) but can differ in stride/padding.
pooling uses kernel (poolKH,poolKW) and can differ in stride per pooling site.

Parameter layout (type-level):

(K1,b1) for conv1,
(K2,b2) for conv2,
(W,b) for the final linear head.

Input/output:

input x : CHW inC inH inW (no batch dimension),
output logits : Vec outDim.

Instances For

source

def NN.GraphSpec.Models.cnn2DAGModelZeroInit (inC c1 c2 outDim inH inW kH kW stride1 padding1 stride2 padding2 poolKH poolKW poolStride1 poolStride2 : ℕ) {h_inC : inC ≠ 0} {h_c1 : c1 ≠ 0} {h_c2 : c2 ≠ 0} {h_kH : kH ≠ 0} {h_kW : kW ≠ 0} {h_poolKH : poolKH ≠ 0} {h_poolKW : poolKW ≠ 0} {h_poolStride1 : poolStride1 ≠ 0} {h_poolStride2 : poolStride2 ≠ 0} :

DAG.Model [Shape.OIHW c1 inC kH kW, Shape.Vec c1, Shape.OIHW c2 c1 kH kW, Shape.Vec c2, Shape.Mat outDim (featSize c2 inH inW kH kW stride1 padding1 stride2 padding2 poolKH poolKW poolStride1 poolStride2), Shape.Vec outDim] [Shape.CHW inC inH inW] (Shape.Vec outDim)

The same 2-conv CNN, but exposed as a DAG Model via the structural lowering LowerToDAG.Graph.toDAGModelZeroInit.

This lets “DAG-only” downstream tooling consume this architecture even though it is authored as a sequential Graph pipeline.

Initialization: all-zero parameters (see LowerToDAG.Graph.toDAGModelZeroInit).

Instances For

TorchLean API

NN.GraphSpec.Models.Cnn

GraphSpec model: CNN (2 convs) #

Shape bookkeeping (why all the Nat arithmetic?) #

Why not put this under `GraphSpec.DAG`? #

Conv/pool output size helpers #

GraphSpec model: CNN (2 convs) #

Shape bookkeeping (why all the Nat arithmetic?) #

Why not put this under GraphSpec.DAG? #

Conv/pool output size helpers #

Why not put this under `GraphSpec.DAG`? #