Semantics #
Denotational semantics for NN.IR.Graph.
This file defines an evaluator for the current IR fragment:
- it evaluates nodes in SSA/topological order,
- each node applies the corresponding spec-layer tensor operation to its parents,
- parameter payloads for const, linear, and conv2d are supplied by an explicit Payload.
The evaluator is total on well-formed, well-shaped graphs, and returns an Except String error on malformed graphs or missing payloads.
Softmax and layer norm:
- softmax axis is interpreted as softmax along the given axis, but the spec primitive we have is last-axis softmax (Activation.softmax_spec). We therefore interpret non-last-axis softmax by permuting the requested axis to the last position, applying last-axis softmax, then permuting back (see the sketch after this list). This matches the meaning of torch.softmax(x, dim=axis) in PyTorch.
- layernorm axis matches PyTorch's F.layer_norm(x, normalized_shape=x.shape[axis:]) convention: axis selects the start of the normalized suffix. We implement this by reshaping the tensor into a 2D view (seqLen, embedDim), applying the spec 2D LayerNorm (Spec.layerNorm), then reshaping back. This keeps the spec primitive small while supporting arbitrary ranks.
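To make the permute-then-apply strategy concrete, here is a minimal Lean sketch on 2-D lists of Floats. Everything here (softmaxLast, transpose, softmax2d) is an illustrative stand-in, not the real Spec.Tensor API:

```lean
-- Toy illustration of the lowering strategy on 2-D lists of Floats.
def softmaxLast (row : List Float) : List Float :=
  -- subtract the running max for numerical stability
  let m := row.foldl (fun a b => if a < b then b else a) (-(1.0 / 0.0))
  let exps := row.map fun x => Float.exp (x - m)
  let z := exps.foldl (· + ·) 0.0
  exps.map (· / z)

partial def transpose : List (List Float) → List (List Float)
  | [] => []
  | [] :: _ => []
  | rows => rows.map (·.headD 0.0) :: transpose (rows.map (·.tailD []))

/-- Softmax along `axis` of a 2-D table: move the requested axis to the last
position (here: transpose), apply last-axis softmax, then move it back. -/
def softmax2d (axis : Nat) (t : List (List Float)) : List (List Float) :=
  if axis == 1 then t.map softmaxLast
  else transpose ((transpose t).map softmaxLast)
```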
How this relates to PyTorch:
- Graph.nodes is analogous to a topologically-sorted IR like FX/TorchScript.
- Payload is analogous to “parameters / buffers / constants” that live outside the pure graph structure.
- The evaluator is a pure, denotational model of running the graph. It is designed for clarity and for connecting to proofs and verification passes (not for performance).
References / related systems:
- PyTorch FX: https://pytorch.org/docs/stable/fx.html
- TorchScript: https://pytorch.org/docs/stable/jit.html
- ONNX (graph + initializers): https://onnx.ai/
Parameter payloads #
The IR graph stores OpKind and outShape, but it does not embed tensor values for parameters.
Instead, evaluation is parameterized by a Payload:
- const? id supplies a constant tensor for a const node,
- linear? id supplies W, b for a linear node,
- conv2d? id supplies the convolution spec/weights for a conv2d node.
This matches how most graph formats work in practice: structure is one artifact, parameters are another.
Payload record for a const node.
Constants are stored in a “flat” (1-D) representation so backends can keep a uniform container
(e.g. an array). During evaluation we check the flat length against Shape.size and then
unflatten to the requested Shape.
- n : ℕ
Number of scalar entries stored in the flat constant payload.
- v : Spec.Tensor α (Spec.Shape.dim self.n Spec.Shape.scalar)
Constant values stored as a vector before evaluation reshapes them to the IR node shape.
Instances For
Payload record for a linear node: weight matrix W and bias vector b.
The node's input x comes from the graph edge; W, b live in the external Payload (similar to
ONNX initializers or a PyTorch state_dict).
- outDim : ℕ
Output dimension.
- inDim : ℕ
Input dimension.
- W : Spec.Tensor α (Spec.Shape.dim self.outDim (Spec.Shape.dim self.inDim Spec.Shape.scalar))
Weight matrix applied to the input vector before the bias is added.
- b : Spec.Tensor α (Spec.Shape.dim self.outDim Spec.Shape.scalar)
Bias vector added after matrix-vector multiplication.
Instances For
Payload record for a conv2d node.
We store the spec-layer Conv2DSpec together with the dimension parameters needed to reconstruct
it. The nonzero proofs are required by the spec-layer definition and ensure the convolution is
well-formed.
- inC : ℕ
Input channels.
- outC : ℕ
Output channels.
- kH : ℕ
Kernel height.
- kW : ℕ
Kernel width.
- stride : ℕ
Stride.
- padding : ℕ
Padding size.
- inH : ℕ
Input height.
- inW : ℕ
Input width.
Proof that the input channel count is nonzero, required by the spec convolution layer.
Proof that the kernel height is nonzero.
Proof that the kernel width is nonzero.
Spec-layer convolution package containing weights, bias, and convolution metadata.
Instances For
External parameter payloads keyed by IR node id.
This is deliberately small; different backends may store parameters differently.
- conv2d? : ℕ → Option (Conv2DParams α)
Convolution payload lookup: conv2d? id supplies the parameters for the conv2d node with that id, if any.
Instances For
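As an illustration of the structure-vs-parameters split, here is a hypothetical mini-payload in the same style (ToyPayload and its fields are invented for this sketch; the real Payload uses ConstParams, LinearParams, and Conv2DParams):

```lean
-- Hypothetical mini-payload: parameters live outside the graph and are
-- looked up by node id, defaulting to `none` (missing payload = eval error).
structure ToyPayload where
  const?  : Nat → Option (List Float)                       -- flat constants
  linear? : Nat → Option (List (List Float) × List Float)   -- (W, b)

def emptyPayload : ToyPayload :=
  { const? := fun _ => none, linear? := fun _ => none }

-- Supply a constant for node 0 only, in the spirit of ONNX initializers.
def payloadWithConst0 : ToyPayload :=
  { emptyPayload with
    const? := fun id => if id == 0 then some [1.0, 2.0, 3.0] else none }
```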
Dynamic (shape-tagged) values #
During evaluation we keep values in a dependent pair Σ s, Tensor α s so we can store a heterogeneous table of intermediate tensors while still recovering precise shapes when we need them.
Dynamic (shape-tagged) tensor value used by the IR evaluator.
This is a dependent pair Σ s, Tensor α s, which lets us store heterogeneously-shaped intermediate
values in one table while still recovering exact shapes when needed.
Instances For
The shape tag carried by a dynamic value.
Instances For
The underlying tensor, with its shape recovered from the dependent pair.
Instances For
Construct a dynamic value from a shape and a tensor of that shape.
Instances For
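A stripped-down model of this pattern, using plain length-indexed vectors instead of Spec.Tensor (all names here are hypothetical); expectLen is the same move the expectShape helper makes later in this file:

```lean
-- Toy shape-tagged value: a dependent pair of a length and a vector of that
-- length (`Vec` stands in for `Spec.Tensor`, `Nat` for `Shape`).
def Vec (α : Type) (n : Nat) : Type := Fin n → α

structure DynToy (α : Type) where
  len : Nat
  val : Vec α len

/-- Recover a statically-typed vector from a dynamic one, succeeding only
when the runtime tag matches the expected length. -/
def expectLen {α : Type} (d : DynToy α) (n : Nat) : Option (Vec α n) :=
  if h : d.len = n then some (h ▸ d.val) else none
```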
Small proof-helpers used by evaluation #
The evaluator frequently needs evidence that an axis is valid or that a broadcast is legal so it can
call the spec-layer operations, which are typed with these preconditions. We build these witnesses
from runtime data (Nat axis values and shapes) using Option:
- returning none means the IR node is ill-formed for the given shapes,
- returning some h provides the witness needed to call the spec operator.
Build a witness that axis is a valid axis for shape s.
Many spec-layer ops (e.g. reductions, softmax, layernorm) are typed with a Shape.valid_axis
precondition. Since the IR stores axes as raw Nat, we reconstruct the witness at runtime.
Returns none when axis is out of bounds.
Instances For
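The general recipe is a dependent if on a decidable proposition. A self-contained sketch (using PLift to store a Prop witness inside Option; the real helper's exact return type may differ):

```lean
-- Bounds checks on Nat are decidable, so a dependent `if` hands us the
-- witness whenever it exists.
def validAxis? (axis rank : Nat) : Option (PLift (axis < rank)) :=
  if h : axis < rank then some ⟨h⟩ else none

#eval (validAxis? 1 3).isSome   -- true:  axis 1 is valid for rank 3
#eval (validAxis? 3 3).isSome   -- false: axes are 0-based, so 3 is out of bounds
```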
Build a witness that s₁ can be broadcast to s₂ (NumPy/PyTorch-style broadcasting).
The spec-layer broadcasting operator is typed with Shape.CanBroadcastTo. Since the IR stores only
runtime shapes, we reconstruct this witness on demand.
Returns none when broadcasting is not possible.
Instances For
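A toy version of the runtime check, returning a plain Bool rather than a typed Shape.CanBroadcastTo witness:

```lean
-- NumPy/PyTorch-style broadcastability: align shapes at the trailing axis;
-- every source dim must equal the target dim or be 1.
def canBroadcastTo (s₁ s₂ : List Nat) : Bool :=
  if s₁.length ≤ s₂.length then
    (s₁.zip (s₂.drop (s₂.length - s₁.length))).all
      fun p => p.1 == p.2 || p.1 == 1
  else
    false

#eval canBroadcastTo [1, 4] [3, 2, 4]   -- true:  1 broadcasts, 4 matches
#eval canBroadcastTo [3]    [2, 4]      -- false: 3 ≠ 4 and 3 ≠ 1
```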
Compute a sequence of adjacent swaps that realizes a target permutation.
This is used to implement .permute by repeatedly applying swapAdjacentAtDepth, which is already
available in the spec tensor library. If the permutation is ill-formed, this returns an error
explaining what went wrong.
Instances For
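One standard way to compute such a sequence is bubble sort: record the depth of every adjacent swap needed to sort the permutation. A hedged sketch of that idea (the real helper also validates the permutation and reports errors; this toy version assumes it is well-formed):

```lean
-- One bubble-sort pass: perform each out-of-order adjacent swap and record
-- its depth, returning the partially sorted list and the swap depths.
partial def bubblePass : List Nat → Nat → List Nat × List Nat
  | a :: b :: rest, i =>
    if a > b then
      let (l', swaps) := bubblePass (a :: rest) (i + 1)
      (b :: l', i :: swaps)
    else
      let (l', swaps) := bubblePass (b :: rest) (i + 1)
      (a :: l', swaps)
  | l, _ => (l, [])

-- Repeat passes until no swap is needed; the concatenated swap depths are
-- the sequence to feed to swapAdjacentAtDepth.
partial def adjacentSwaps (perm : List Nat) : List Nat :=
  let (sorted, swaps) := bubblePass perm 0
  if swaps.isEmpty then [] else swaps ++ adjacentSwaps sorted

#eval adjacentSwaps [2, 0, 1]   -- [0, 1]: the swap depths that sort the axes
```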
Apply one adjacent-swap-at-depth to a dynamic tensor value.
This is the execution-level building block used to implement .permute in terms of repeated
adjacent swaps, reusing the spec tensor library's swapAdjacentAtDepth.
Instances For
Permute a dynamic tensor value according to perm.
This checks that perm is a valid permutation for the input shape (using Shape.permute?), then
lowers it to a sequence of adjacent swaps and applies them to the tensor.
Instances For
Evaluation helpers #
The evaluator itself (evalAt / denoteAll) is a fold over nodes. These helpers keep the fold
readable:
- expectShape enforces “dynamic shape agrees with declared outShape” at each step.
- evalConst / evalLinear / evalConv2D fetch and apply external payloads keyed by node id.
Check a dynamic value has the expected shape and return it as a statically-typed tensor.
Instances For
Evaluate MSE loss on two dynamic values, checking that their runtime shapes agree.
Instances For
Transport a Tensor α (dim n scalar) across an equality n = n' (helper for payload casts).
Instances For
Evaluate a const node from the external payload.
Constants are stored “flat” (1D) for convenience, so we check the flattened length matches
Shape.size s and then unflatten to the requested shape.
Instances For
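A toy version of the check-then-unflatten step on plain lists (shapeSize and unflatten2d are illustrative stand-ins for Shape.size and the real unflatten):

```lean
-- Flat-constant convention: validate the flat length against the shape's
-- size, then unflatten row-major into a 2-D view.
def shapeSize (shape : List Nat) : Nat := shape.foldl (· * ·) 1

def unflatten2d (rows cols : Nat) (flat : List Float) :
    Except String (List (List Float)) :=
  if flat.length == rows * cols then
    .ok <| (List.range rows).map fun r =>
      (List.range cols).map fun c => flat.getD (r * cols + c) 0.0
  else
    .error s!"const payload has {flat.length} entries, expected {rows * cols}"
```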
Evaluate a linear node from the external payload.
We enforce:
- the input dynamic value has shape (inDim), and
- the node's declared outShape matches (outDim).
The actual math is the usual affine map: y = W·x + b.
Instances For
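The same step on plain lists, as a hedged sketch: validate the payload dimensions against the input, then compute y = W·x + b:

```lean
-- Toy matrix-vector product: dot each row of W with x.
def matVecToy (W : List (List Float)) (x : List Float) : List Float :=
  W.map fun row => (row.zip x).foldl (fun acc p => acc + p.1 * p.2) 0.0

-- Toy linear step: check shapes, then apply the affine map.
def evalLinearToy (W : List (List Float)) (b x : List Float) :
    Except String (List Float) :=
  if W.all (fun row => row.length == x.length) && b.length == W.length then
    .ok <| ((matVecToy W x).zip b).map fun p => p.1 + p.2
  else
    .error "linear: payload/input shape mismatch"
```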
Evaluate a conv2d node from the external payload.
The output shape is computed with the standard (no dilation) formula:
out = ⌊(in + 2*pad - k)/stride⌋ + 1 for each spatial dimension.
Instances For
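For concreteness, the formula as a small Nat function with a worked example:

```lean
-- The standard no-dilation output-size formula (Nat division is floor).
def convOutDim (inDim k stride padding : Nat) : Nat :=
  (inDim + 2 * padding - k) / stride + 1

-- e.g. a 3×3 kernel, stride 2, padding 1 on a 32-pixel axis:
#eval convOutDim 32 3 2 1   -- (32 + 2 - 3) / 2 + 1 = 16
```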
Deterministic LayerNorm used by the IR evaluator (gamma=1, beta=0).
Instances For
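A toy version over a flat list, with an assumed ε = 1e-5 (the real Spec.layerNorm may differ in how it guards the denominator):

```lean
-- Deterministic LayerNorm with gamma = 1, beta = 0: normalize to zero mean
-- and unit variance, with a small epsilon for stability.
def layerNormToy (xs : List Float) : List Float :=
  let n := Float.ofNat xs.length
  let mean := xs.foldl (· + ·) 0.0 / n
  let var := (xs.map fun x => (x - mean) * (x - mean)).foldl (· + ·) 0.0 / n
  xs.map fun x => (x - mean) / Float.sqrt (var + 1e-5)
```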
Evaluate node i given already computed parent values vals.
This is the core “one step” of the denotational semantics:
- look up the node,
- read its parent values from vals (using the topo/id invariant),
- apply the corresponding spec-layer operation,
- enforce that the produced shape matches the node's declared outShape.
This function assumes the graph is structurally well-formed (ids are in bounds and parents are
strictly smaller ids). denoteAll performs that check up front.
Instances For
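The shape of this step, sketched on a toy node type (ToyNode, applyOp, and the list-valued tensors are all invented for the sketch):

```lean
structure ToyNode where
  parents : List Nat   -- ids of already-evaluated predecessors
  outLen  : Nat        -- declared output shape (here: a length)

def evalAtToy (applyOp : List (List Float) → Except String (List Float))
    (node : ToyNode) (vals : Array (List Float)) :
    Except String (List Float) := do
  -- read parent values from the prefix table
  let args ← node.parents.mapM (fun p =>
    match vals[p]? with
    | some v => Except.ok v
    | none   => Except.error s!"parent {p} out of range")
  -- apply the node's operation
  let out ← applyOp args
  -- enforce the declared output shape
  if out.length == node.outLen then
    return out
  else
    throw s!"node produced length {out.length}, expected {node.outLen}"
```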
Evaluate nodes i, i+1, ... given already computed prefix values vals.
This is written as a structurally recursive function so it is easy to reason about in proofs (evaluation is “a simple loop over node ids”).
Instances For
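The loop itself, as a generic structurally recursive sketch (reusing the toy step shape from above via the step argument):

```lean
-- Consume the remaining nodes one by one, threading the table of
-- already-computed values; the recursion is structural on the node list.
def denoteFrom {Node Val : Type}
    (step : Node → Array Val → Except String Val) :
    List Node → Array Val → Except String (Array Val)
  | [],        vals => .ok vals
  | n :: rest, vals => do
    let v ← step n vals
    denoteFrom step rest (vals.push v)
```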
Evaluate a graph to a table of node values.
This returns an array vals of length g.size where vals[i] is the value of node i.
We do a structural well-formedness check once up front (ids/arity/topology). For compiler-produced
graphs, the boolean Graph.wellFormed check is a fast path; if it fails we fall back to the
exception-producing Graph.checkWellFormed so callers get a readable error message.
The evaluator is total in the sense that it always returns either:
- .ok vals (all nodes evaluated successfully), or
- .error msg describing the first failure (malformed IR, missing payload, or a local shape error).
Instances For
Scoped notation #
Scoped notation for evaluating a graph to all node values.
Use with:
open scoped IR
g⟦payload, input⟧
Instances For
ASCII alternative to g⟦payload, input⟧.
Instances For
Evaluate the graph and return the value at outputId.