GraphSpec DAG Entry Point #
Umbrella import for the canonical general DAG-shaped GraphSpec surface.
If you are building:
- a plain pipeline like Linear >>> ReLU >>> Linear, author it with NN.GraphSpec.Core and lower it to DAG when needed;
- anything with skip connections, shared subexpressions, or true multi-input nodes, author it directly with NN.GraphSpec.DAG.
This import gives you the DAG term language plus the standard DAG-side primitive pack. The GraphSpec-specific example architectures live under NN.GraphSpec.Models.
So the intended reading order is:
- NN.GraphSpec.Core for the small sequential surface,
- NN.GraphSpec.DAG.Core when you need explicit sharing,
- NN.GraphSpec.Models for concrete examples.
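To make the Core-versus-DAG distinction concrete, here is a toy sketch. ToyTerm is invented for illustration and is not the library's actual Term type; it only shows that a plain chain has one consumer per node, while a residual block consumes the same input twice.

```lean
-- Toy illustration only: ToyTerm is a made-up stand-in for the real term type.
inductive ToyTerm where
  | input  : Nat → ToyTerm                 -- the i-th graph input
  | linear : ToyTerm → ToyTerm             -- unary stage, fine for a chain
  | relu   : ToyTerm → ToyTerm
  | add    : ToyTerm → ToyTerm → ToyTerm   -- genuinely multi-input

-- Linear >>> ReLU >>> Linear: a straight chain, expressible sequentially.
def pipeline : ToyTerm :=
  .linear (.relu (.linear (.input 0)))

-- Residual block out = main(x) + x: the input feeds two consumers,
-- so the graph is a DAG rather than a chain.
def residual : ToyTerm :=
  .add (.relu (.linear (.input 0))) (.input 0)
```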
Where are the DAG primitives? #
They live in this entrypoint, not in a separate NN.GraphSpec.DAG.Primitives module. That keeps the
DAG surface compact and keeps primitive constructors next to the public DAG import.
The key reason is a dependency constraint: DAG.Core defines the term calculus and is imported by the
sequential GraphSpec.Core lowering code. The DAG primitive pack, however, is mostly derived from
sequential primitives such as Primitive.linear, Primitive.relu, and Primitive.conv2d. Putting
those derived definitions into DAG.Core would create an import cycle:
DAG.Core → GraphSpec.Core → DAG.Core.
So the split is:
- NN.GraphSpec.DAG.Core: calculus only (PrimOp, Term, Args, Model, eval/compile).
- NN.GraphSpec.DAG: public DAG entrypoint plus the derived primitive constructors.
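As a rough illustration of why the derived constructors sit in the entrypoint rather than the calculus module, here is a hypothetical sketch; SeqPrim and DagOp are invented stand-ins, not the library's types.

```lean
namespace SplitSketch

/-- "Sequential side": a unary primitive as authored in GraphSpec.Core. -/
structure SeqPrim where
  name : String

/-- "DAG.Core side": a general n-ary operation node. -/
structure DagOp where
  name  : String
  arity : Nat

/-- Derived constructors like this one only wrap sequential primitives, so they
live in the public entrypoint; putting them in DAG.Core would make DAG.Core
depend on the sequential surface and close the import cycle. -/
def ofSeq (p : SeqPrim) : DagOp :=
  { name := p.name, arity := 1 }

def reluOp : DagOp := ofSeq { name := "relu" }

end SplitSketch
```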
Basic DAG primitives #
Dense linear layer in DAG form.
Inputs are ordered as [W, b, x]:
W : Mat outDim inDim, b : Vec outDim, x : Vec inDim.
The output is Vec outDim. This is the DAG embedding of Primitive.linear, so the DAG and
sequential authoring surfaces share the same Spec semantics and TorchLean compiler path.
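A minimal, self-contained sketch of the intended semantics, assuming y = W·x + b with the dimensions above. Vec, Mat, and the helper names below are stand-ins for illustration, not the library's definitions.

```lean
namespace LinearSketch

abbrev Vec (n : Nat) := Fin n → Float
abbrev Mat (m n : Nat) := Fin m → Fin n → Float

/-- Dot product, folding over Nat indices with a bound check. -/
def dot {n : Nat} (u v : Vec n) : Float :=
  (List.range n).foldl
    (fun acc j => if h : j < n then acc + u ⟨j, h⟩ * v ⟨j, h⟩ else acc) 0.0

/-- y i = b i + Σ_j W i j * x j, matching the [W, b, x] input order. -/
def linearEval {outDim inDim : Nat}
    (W : Mat outDim inDim) (b : Vec outDim) (x : Vec inDim) : Vec outDim :=
  fun i => b i + dot (fun j => W i j) x

end LinearSketch
```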
Flatten a tensor to a one-dimensional vector in DAG form.
Input: [x : Tensor s].
Output: Vec (Shape.size s).
This is the DAG embedding of Primitive.flatten, so it has exactly the same row-major view
semantics as the sequential primitive.
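For reference, a tiny sketch of the shape arithmetic; shapeSizeSketch is a hypothetical stand-in for Shape.size.

```lean
-- The flattened length is the product of all dimensions (row-major traversal).
def shapeSizeSketch (s : List Nat) : Nat := s.foldl (· * ·) 1

#eval shapeSizeSketch [3, 28, 28]   -- 2352, so a 3x28x28 input flattens to Vec 2352
```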
Vision / residual DAG primitives #
ReLU activation in DAG form.
Input: [x : Tensor s], output: Tensor s.
Semantics: elementwise max(x, 0). This is parameter-free and derived from Primitive.relu.
Reference: Nair and Hinton (2010), "Rectified Linear Units Improve Restricted Boltzmann Machines".
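A minimal elementwise sketch of that semantics; a plain Array Float stands in for a tensor of arbitrary shape.

```lean
-- relu keeps the shape and clamps negatives to 0, entry by entry.
def reluElem (x : Float) : Float := if x < 0.0 then 0.0 else x

def reluArray (xs : Array Float) : Array Float := xs.map reluElem

#eval reluArray #[-1.5, 0.0, 2.0]   -- negatives become 0.0
```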
Add two tensors of the same shape.
Input shapes: [s, s], output shape: s.
This is the primitive used for residual/skip connections: out = main(x) + x. It is defined
directly because the sequential surface is unary, while residual addition is genuinely multi-input.
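A shape-level sketch of that rule, with FlatTensor as an invented row-major stand-in for a tensor of shape s.

```lean
namespace AddSketch

abbrev FlatTensor (s : List Nat) := Fin (s.foldl (· * ·) 1) → Float

/-- Both inputs share the shape s and the output keeps it. -/
def addT {s : List Nat} (a b : FlatTensor s) : FlatTensor s :=
  fun i => a i + b i

/-- Residual connection out = main(x) + x: the same x feeds two consumers,
which is exactly the multi-input pattern the sequential surface cannot express. -/
def residualBlock {s : List Nat} (main : FlatTensor s → FlatTensor s)
    (x : FlatTensor s) : FlatTensor s :=
  addT (main x) x

end AddSketch
```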
2D convolution in DAG form, using channel-first CHW tensors without an explicit batch dimension.
Inputs are ordered as [kernel, bias, x]:
kernel : OIHW outC inC kH kW, bias : Vec outC, x : CHW inC inH inW.
The output shape uses the standard convolution formula:
outH = (inH + 2 * padding - kH) / stride + 1
and similarly for outW. This is derived from the sequential Primitive.conv2d.
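A small helper spelling out that arithmetic (Nat division, i.e. floor); convOutDim is illustrative, not part of the library.

```lean
def convOutDim (inDim k stride padding : Nat) : Nat :=
  (inDim + 2 * padding - k) / stride + 1

#eval convOutDim 32 3 1 1   -- 32: a 3x3 kernel with stride 1 and padding 1 preserves size
```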
Max pooling in DAG form for channel-first CHW tensors.
Input: [x : CHW inC inH inW].
Output shape uses the standard pooling formula:
outH = (inH - kH) / stride + 1
and similarly for outW. This is derived from the sequential Primitive.maxPool2d.
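And the analogous helper for the pooling formula, again illustrative only.

```lean
-- No padding term in this formula.
def poolOutDim (inDim k stride : Nat) : Nat :=
  (inDim - k) / stride + 1

#eval poolOutDim 28 2 2   -- 14: 2x2 pooling with stride 2 halves a 28x28 map
```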
Batch normalization on CHW tensors in DAG form.
Inputs are [gamma, beta, x], where gamma, beta : Vec channels and
x : CHW channels height width.
This version models the learnable affine parameters but does not carry running mean/variance state in the graph; stateful training statistics belong in an explicit runtime/state model.
Reference: Ioffe and Szegedy (2015), "Batch Normalization: Accelerating Deep Network Training...".
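A per-element sketch of the normalize-then-affine computation; mean and variance are passed explicitly here precisely because the graph node carries no running statistics, and all names and the eps value are illustrative.

```lean
namespace BatchNormSketch

def normalize (eps mean variance x : Float) : Float :=
  (x - mean) / Float.sqrt (variance + eps)

/-- The learnable per-channel affine step y = gamma * xhat + beta. -/
def affine (gamma beta xhat : Float) : Float :=
  gamma * xhat + beta

#eval affine 1.0 0.0 (normalize 0.00001 0.0 1.0 2.0)   -- ≈ 2.0 when the variance is 1

end BatchNormSketch
```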
Global average pooling over spatial dimensions for CHW tensors.
Input: [x : CHW c h w].
Output: Vec c, where each channel is averaged over the h x w grid.
Reference: Lin, Chen, and Yan (2013), "Network In Network".
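A one-channel sketch of the reduction; channelMean is illustrative only, with the h x w grid flattened into a plain array.

```lean
-- Each channel's grid reduces to its mean, so a CHW tensor becomes a Vec of length c.
def channelMean (grid : Array Float) : Float :=
  if grid.isEmpty then 0.0
  else grid.foldl (· + ·) 0.0 / Float.ofNat grid.size

#eval channelMean #[1.0, 2.0, 3.0, 6.0]   -- 3.0
```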