# Residual Linear Block
This is the smallest “ResNet-like” example in the directory, and it is intentionally chosen to be easy to read.
It shows the structural reason we need the DAG IR without dragging in convolution arithmetic:
```
y   = Linear(x)
out = ReLU(y + x)
```
Because x is consumed by both the main path and the skip path, a pure chain would have to
recompute the path that produces x or hide the sharing inside a special-purpose combinator.
In GraphSpec.DAG we express the sharing directly with `let1`.
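To make this concrete, here is a minimal sketch of such a `let1`-shared term. The inductive `Term` and its constructor names (`input`, `var`, `linear`, `add`, `relu`, `let1`) are illustrative stand-ins, not the actual constructors exported by NN.GraphSpec.DAG.Core:

```lean
/- Toy term language, for illustration only (not the repository's IR).
   `let1 bound body` binds `bound`; `body` refers to it as `var 0`. -/
inductive Term where
  | input  : Term                -- the block's input x
  | var    : Nat → Term          -- de Bruijn reference to a `let1` binder
  | linear : Term → Term         -- main path: W · + b (parameters elided)
  | add    : Term → Term → Term  -- multi-input primitive: elementwise sum
  | relu   : Term → Term         -- pointwise nonlinearity
  | let1   : Term → Term → Term  -- share one value across both uses

/-- The residual block: `x` is bound once and referenced by both paths. -/
def residualTerm : Term :=
  .let1 .input (.relu (.add (.linear (.var 0)) (.var 0)))
```

A chain IR would have to duplicate `.input` (or the subgraph feeding it) at both use sites; `let1` keeps a single occurrence.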
This file is best read as a “hello world” for DAG-authored GraphSpec examples:
- one explicit parameter ABI,
- one shared intermediate,
- one multi-input primitive (`add`),
- one final nonlinearity.
References / citations:
- He et al. (2016), “Deep Residual Learning for Image Recognition” (ResNets).
- `NN.GraphSpec.DAG.Core` for the term language and semantics.
Parameter ABI for the residual block.
The layout is exactly:
```
W : Mat d d
b : Vec d
```
The skip path is parameter-free; it simply reuses the input x.
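As a sketch, that layout can be written down as a structure. The `Vec`, `Mat`, and `Params` names below are hypothetical stand-ins for the repository's own types:

```lean
-- Hypothetical stand-ins for the repository's `Vec`/`Mat` types.
def Vec (d : Nat) : Type := Fin d → Float
def Mat (r c : Nat) : Type := Fin r → Fin c → Float

/-- Sketch of the parameter ABI: one d×d weight matrix and one bias
    vector. The skip path contributes no fields. -/
structure Params (d : Nat) where
  W : Mat d d
  b : Vec d
```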
Residual linear block in DAG form.
In ordinary math notation, this is `x ↦ relu((W x + b) + x)`.
This is a good first DAG example because the only genuinely DAG-specific feature is sharing the input between the main branch and the skip branch.
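Read denotationally, a functional sketch of that map (reusing the toy `Vec`/`Mat`/`Params` stand-ins above; `dot` and `relu` here are likewise illustrative helpers, not the repository's API) is:

```lean
-- Illustrative helpers over the toy types above, not the repository's API.
def dot {d : Nat} (u v : Vec d) : Float :=
  (List.range d).foldl
    (fun acc j => if h : j < d then acc + u ⟨j, h⟩ * v ⟨j, h⟩ else acc) 0.0

def relu (t : Float) : Float := if t > 0.0 then t else 0.0

/-- Pointwise meaning of the block: `x ↦ relu ((W x + b) + x)`. -/
def residualFwd {d : Nat} (p : Params d) (x : Vec d) : Vec d :=
  fun i => relu ((dot (p.W i) x + p.b i) + x i)
```

Note that the skip term `x i` is added after the affine map, matching the single shared use of x in the DAG term.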