TorchLean API

NN.GraphSpec.Models.Resnet18

GraphSpec ResNet-18 #

This file defines a ResNet-18–style convolutional network using GraphSpec's general DAG IR, with BasicBlocks, projection shortcuts, shape-indexed parameters, and TorchLean compilation support.

Why This Is A DAG Model #

Classic ResNet blocks are not purely sequential: the input x flows down two paths:

  1. a “main” path (Conv → BN → ReLU → Conv → BN),
  2. a “skip” path (identity, or a learned projection when shapes change),

and then the two results are added elementwise. In a chain-only representation you must either duplicate x (re-deriving it on the skip path) or bolt on special-case residual combinators; neither expresses the sharing directly.

GraphSpec’s DAG IR takes a different approach: it provides a small SSA-like term language (Term + Args) that can naturally express sharing. A bound value may be referenced by any number of downstream arguments, and each shared node denotes a single value rather than a duplicated computation.
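The sharing story can be sketched with a toy SSA-like term language. This is illustrative only: GraphSpec's actual `Term`/`Args` constructors are richer and shape-indexed, and the names below are invented for the sketch.

```lean
-- Toy SSA-like IR: `letE` binds a value that any later `var` may reuse.
inductive Term where
  | var  : Nat → Term                  -- de Bruijn index into the environment
  | prim : String → List Term → Term   -- primitive op applied to arguments
  | letE : Term → Term → Term          -- bind a value, then continue

open Term

/-- A residual block: the input (`var 0` outside the `letE`) feeds both the
    main path and the final add, which a pure chain cannot express. -/
def residualBlock : Term :=
  letE (prim "mainPath" [var 0])   -- Conv→BN→ReLU→Conv→BN, bound as var 0
       (prim "add" [var 0, var 1]) -- add(main, x): x is shared, not copied
```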

ResNet-18 is written here exactly once as a DAG body; its pure reference semantics, its deterministic initializer, and its TorchLean compilation are all derived from that single definition.

Model Scope #

This is a “CHW, no batch” variant (C×H×W), matching the rest of the Spec/TorchLean vision layers. It is faithful to the core ResNet-18 structure: a downsampling 7×7 stem, a 3×3 max-pool, four stages of two BasicBlocks each, and a final linear classifier over numClasses outputs.

The state records exactly the metadata needed for parameter allocation.

Shapes And Type-Level Arithmetic #

The main practical challenge is typing the residual adds: both operands of an add must have definitionally equal shapes, but the raw output-size expressions produced by the stride-2 convs, the 1×1 projection, and the max-pool are all syntactically different.

We solve this by defining a small family of typed primitives that cast the “raw” conv/pool output shapes into a canonical downsample formula:

down2(h) = (h - 1) / 2 + 1

This is the standard stride-2 output formula once padding is chosen the usual ResNet way, i.e. so that 2·padding − kernel = −1: padding 0 for the 1×1 projection, 1 for the 3×3 convs and the max-pool, and 3 for the 7×7 stem. In each case (h + 2p − k)/2 + 1 collapses to (h − 1)/2 + 1.

By enforcing this “canonical” output shape, the residual add becomes definitional/typecheckable without introducing runtime reshapes.
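As a standalone sanity check (defining a local copy of the formula; the `omega` tactic is assumed available, as in recent Lean 4 toolchains), the three (kernel, padding) pairs used in this file all reduce to the same expression:

```lean
-- Local copy of the canonical downsample formula, for illustration.
def down2 (h : Nat) : Nat := (h - 1) / 2 + 1

-- 1×1 with padding 0; 3×3 with padding 1; 7×7 with padding 3, all stride 2.
-- Nat subtraction truncates at zero, so these hold for every h.
example (h : Nat) : (h + 2 * 0 - 1) / 2 + 1 = (h - 1) / 2 + 1 := by omega
example (h : Nat) : (h + 2 * 1 - 3) / 2 + 1 = (h - 1) / 2 + 1 := by omega
example (h : Nat) : (h + 2 * 3 - 7) / 2 + 1 = (h - 1) / 2 + 1 := by omega

#eval down2 224  -- 112: a 224×224 input halves to 112×112
#eval down2 7    -- 4
```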


@[reducible, inline]
def NN.GraphSpec.Models.ResNet18.imgShape (c h w : ℕ) : Shape

Convenient local abbreviation for channel-first image tensors (C × H × W).

Note on parameter indexing #

Inside model, the environment is Γ = params ++ [x], so parameters are accessed by index.

We avoid clever macros here. In practice, being explicit is faster to debug and (importantly) more stable under refactors: each use site carries its own proof that the index points at the expected shape.
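A hypothetical miniature of the idea, with shapes encoded as plain `List Nat` rather than GraphSpec's real `Shape` type: the closed index and its expected shape are checked by `rfl`, so a refactor that shifts the layout breaks the proof at the use site.

```lean
-- Γ = params ++ [x]: slot 1 is a bias vector, slot 2 the input image
-- (illustrative shapes only, not the real ABI).
def Γ : List (List Nat) := [[64, 3, 7, 7], [64]] ++ [[3, 224, 224]]

example : Γ[1]? = some [64] := rfl
example : Γ[2]? = some [3, 224, 224] := rfl
```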

Canonical stride-2 downsample formula #

@[reducible, inline]
def NN.GraphSpec.Models.ResNet18.down2 (h : ℕ) : ℕ

Canonical stride-2 output-size formula used throughout this file.

We write it once and reuse it for the stem, the downsampling residual blocks, and the max-pool so that all of those paths literally agree on the same type-level height/width expression.

down2 is always positive.

Small typed primitives for ResNet typing #

theorem NN.GraphSpec.Models.ResNet18.conv3_same_out_eq {h : ℕ} (hh : h > 0) :
(h + 2 * 1 - 3) / 1 + 1 = h

Stride-1, padding-1 3×3 conv preserves the spatial size for h > 0.
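The statement is small enough to check independently (standalone restatement; `omega` assumed available). The h > 0 hypothesis matters: at h = 0, Nat subtraction truncates and the left side evaluates to 1, not 0.

```lean
-- Standalone restatement of conv3_same_out_eq.
example {h : Nat} (hh : h > 0) : (h + 2 * 1 - 3) / 1 + 1 = h := by omega

-- Without h > 0 the equation fails: Nat subtraction truncates at zero.
#eval (0 + 2 * 1 - 3) / 1 + 1  -- 1, not 0
```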

def NN.GraphSpec.Models.ResNet18.conv3x3Same (c h w : ℕ) (h_c : c ≠ 0) (hh : h > 0) (hw : w > 0) :
DAG.PrimOp [Shape.OIHW c c 3 3, Shape.Vec c, imgShape c h w] (imgShape c h w)

Typed 3×3 convolution (stride 1, padding 1) whose output is cast to the canonical imgShape c h w.
def NN.GraphSpec.Models.ResNet18.conv7x7Down (inC outC h w : ℕ) (h_inC : inC ≠ 0) (_h_outC : outC ≠ 0) :
DAG.PrimOp [Shape.OIHW outC inC 7 7, Shape.Vec outC, imgShape inC h w] (imgShape outC (down2 h) (down2 w))

Typed 7×7 stem convolution (stride 2, padding 3), cast to the canonical down2 spatial sizes.

def NN.GraphSpec.Models.ResNet18.conv3x3Down (inC outC h w : ℕ) (h_inC : inC ≠ 0) (_h_outC : outC ≠ 0) :
DAG.PrimOp [Shape.OIHW outC inC 3 3, Shape.Vec outC, imgShape inC h w] (imgShape outC (down2 h) (down2 w))

Typed 3×3 convolution (stride 2, padding 1), cast to the canonical down2 spatial sizes.

def NN.GraphSpec.Models.ResNet18.conv1x1Down (inC outC h w : ℕ) (h_inC : inC ≠ 0) (_h_outC : outC ≠ 0) :
DAG.PrimOp [Shape.OIHW outC inC 1 1, Shape.Vec outC, imgShape inC h w] (imgShape outC (down2 h) (down2 w))

Typed 1×1 projection convolution (stride 2), cast to the canonical down2 spatial sizes.

Typed 3×3 max-pool (stride 2, padding 1), cast to the canonical down2 spatial sizes.

Parameter layout #

@[reducible, inline]

Parameter layout for a convolution layer: kernel first, then bias.

@[reducible, inline]

Parameter layout for affine batch norm: (gamma, beta).

@[reducible, inline]

Parameter ABI for a single BasicBlock.

When downsample = true, this includes the projection shortcut (1×1 conv + BN) parameters after the two main-path conv/BN pairs.

@[reducible, inline]

Parameter ABI for one ResNet-18 stage.

A stage is two BasicBlocks. The first block downsamples exactly when inC ≠ outC; the second always keeps the same number of channels.

@[reducible, inline]

Full parameter ABI for the GraphSpec ResNet-18 model.

This list is deliberately written in a flat, explicit order. The model body indexes parameters by closed numerals, and the flat ABI keeps those index proofs predictable for simp and easy to audit.
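A hypothetical reconstruction of those layout helpers, with shapes flattened to `List Nat`; the names and the encoding are illustrative, not the real shape-indexed definitions.

```lean
-- Kernel first, then bias, matching the documented conv layout.
def convLayout (outC inC k : Nat) : List (List Nat) :=
  [[outC, inC, k, k], [outC]]

-- Affine batch norm: (gamma, beta).
def bnLayout (c : Nat) : List (List Nat) :=
  [[c], [c]]

-- BasicBlock ABI: two conv/BN pairs, then the optional projection shortcut.
def basicBlockLayout (inC outC : Nat) (downsample : Bool) : List (List Nat) :=
  convLayout outC inC 3 ++ bnLayout outC ++
  convLayout outC outC 3 ++ bnLayout outC ++
  (if downsample then convLayout outC inC 1 ++ bnLayout outC else [])

-- Stage ABI: the first block downsamples exactly when inC ≠ outC.
def stageLayout (inC outC : Nat) : List (List Nat) :=
  basicBlockLayout inC outC (inC != outC) ++ basicBlockLayout outC outC false

#eval (stageLayout 64 64).length    -- 16 tensors (no projection)
#eval (stageLayout 64 128).length   -- 20 tensors (with projection)
```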
@[simp]
theorem NN.GraphSpec.Models.ResNet18.params_length (inC numClasses : ℕ) :
(params inC numClasses).length = 82

The ResNet-18 GraphSpec ABI contains exactly 82 parameter tensors.

This is a small but useful structural theorem: the executable wrapper, deterministic initializer, and DAG body all rely on the same flat ABI. Keeping the count as a named theorem makes accidental parameter-layout edits visible during review instead of hiding them inside a local simp.
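The count can be audited by hand. Assuming the stem is one conv plus one BN and the classifier is a weight/bias pair (consistent with the initializers below), the breakdown is:

```lean
-- stem: conv (kernel + bias) + bn (gamma + beta)       = 4
-- plain BasicBlock: 2 × (conv + bn)                    = 8
-- downsampling BasicBlock: 8 + projection conv + bn    = 12
-- stage 1 (64→64): two plain blocks                    = 16
-- stages 2–4: one downsampling + one plain block each  = 3 × 20
-- final linear layer: weights + bias                   = 2
#eval 4 + 16 + 3 * (12 + 8) + 2  -- 82
```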

Deterministic initialization #

def NN.GraphSpec.Models.ResNet18.initConv (outC inC kH kW seedK seedB : ℕ) :

Deterministically initialize a convolution kernel (uniform) and bias (zeros).

Deterministically initialize BatchNorm parameters (gamma, beta) as (ones, zeros).

def NN.GraphSpec.Models.ResNet18.initLinear (inDim outDim seedW seedB : ℕ) :

Deterministically initialize linear weights (uniform) and bias (zeros).

Model #

def NN.GraphSpec.Models.ResNet18.model (inC h w numClasses : ℕ) (h_inC : inC > 0) (_h_h : h > 0) (_h_w : w > 0) (_h_cls : numClasses > 0) :
DAG.Model (params inC numClasses) [imgShape inC h w] (Shape.Vec numClasses)

GraphSpec ResNet-18 model.

This is the public entrypoint for the DAG-authored ResNet in this directory. It packages together:

• the full typed parameter ABI (params inC numClasses),
• deterministic initialization for all 82 parameter tensors,
• and a DAG body whose pure semantics can be interpreted via DAG.Model.specFwd and compiled via DAG.Model.torchProgram.

The model is channel-first (CHW) and batch-free, matching the rest of TorchLean's vision-side Spec and runtime layers.
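A hypothetical instantiation, left as comments because the proof style and the exact downstream signatures are assumptions beyond what this page shows:

```lean
-- Instantiate for 3-channel 224×224 inputs and 1000 classes; the four
-- positivity side conditions are closed numerals, so a decision-procedure
-- tactic such as `by decide` (or `by omega`) should discharge them.
-- def m := NN.GraphSpec.Models.ResNet18.model 3 224 224 1000
--            (by decide) (by decide) (by decide) (by decide)
--
-- Interpret the pure semantics:     DAG.Model.specFwd m …
-- Compile to a TorchLean program:   DAG.Model.torchProgram m
```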