Sequential GraphSpec #
This file defines the sequential authoring surface for GraphSpec.
The important design decision is:
- `DAG.Model` is the canonical general GraphSpec model representation.
- `Graph ps σ τ` is lightweight syntax for the common special case where the model is just a chain of layers.
So Graph is not a competing graph IR. It is a pleasant way to write:
`Linear >>> ReLU >>> Linear`
`Conv >>> ReLU >>> Pool >>> Flatten >>> Linear`
and then lower that chain to the general DAG representation when downstream tooling wants one model shape for everything.
GraphSpec as a whole is a typed DSL for describing neural-network computations, with the explicit goal of being usable in two complementary ways:
- Reference / proof semantics: interpret the graph as a pure Lean function on tensors (`Interp.spec`). This is the semantics we want to reason about: shape safety, algebraic identities, equivalence of model refactorings, etc.
- Executable semantics: compile the same graph into a backend-generic `TorchLean.Program` (`Compile.torchProgram`) so it can run on the TorchLean runtime (which can target eager or compiled execution backends).
Shapes and parameter shapes are part of the graph type. Concretely, a graph is indexed by:
- `σ τ : Shape` — input/output tensor shapes, and
- `ps : List Shape` — the ordered list of parameter tensor shapes the graph expects.
That “parameter interface” is not a convention (like “whatever state_dict() happens to return”);
it is baked into the model type. Sequential composition concatenates parameter lists, and
evaluation splits them canonically.
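As a rough illustration of what “baked into the model type” means, here is a minimal, self-contained Lean sketch. `Shape`, `Tensor`, and `Params` below are simplified stand-ins for the corresponding GraphSpec/TorchLean types (the real typed parameter list is `TList α ps`), so treat this as a picture of the indexing discipline rather than the library's definitions:

```lean
namespace ParamInterfaceSketch
-- Toy stand-ins: a shape is a dimension list, and a "tensor" is just a flat
-- buffer tagged (at the type level) with its shape.
abbrev Shape := List Nat

structure Tensor (σ : Shape) where
  data : Array Float

/-- A typed parameter list: exactly one tensor per shape in `ps`, in order
(a toy analogue of `TList α ps`). -/
inductive Params : List Shape → Type where
  | nil  : Params []
  | cons : Tensor σ → Params ps → Params (σ :: ps)

/-- A model indexed by `(ps, σ, τ)` denotes a function of this type, so
"which tensor is weight #2?" is answered by the type, not by convention. -/
def Denotation (ps : List Shape) (σ τ : Shape) : Type :=
  Params ps → Tensor σ → Tensor τ
end ParamInterfaceSketch
```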
Why do this if PyTorch already exists? #
PyTorch is excellent at running and training neural networks:
- `nn.Module` packages parameters and a `forward` method.
- Autograd records an operation tape during execution and provides gradients.
- Modern tooling can capture/transform graphs (`torch.fx`, `torch.export`) and compile them (`torch.compile`).
GraphSpec is not trying to replace any of that. Instead, it focuses on the pieces PyTorch does not give us inside Lean:
- A machine-checkable mathematical semantics we can use for proofs.
- Static shape discipline: shapes appear in the type, not as runtime asserts.
- A typed parameter interface: parameter shapes and ordering are explicit, so “which tensor is weight #2?” is not an out-of-band convention.
In practice, the expected workflow is:
- use GraphSpec to write down a model architecture in a shape-typed way,
- run it via TorchLean for concrete execution/training experiments, and
- use `Interp.spec` and proof libraries to prove properties of the same architecture.
Why do this if TorchLean already exists? #
TorchLean is the runtime and operator layer: it gives us typed tensors, a backend interface,
and executable programs (TorchLean.Program) that can run under the autograd/training runtime.
GraphSpec is the architecture/specification layer: it gives us a small typed syntax for model structure that comes with two linked meanings:
- a pure semantics (`Interp.spec`) that is amenable to proofs in Lean, and
- a compiler (`Compile.torchProgram`) that turns the same description into something runnable.
You can write models directly in TorchLean, but then the “thing you reason about” is already in
the executable world (monadic references + backend ops). For many proofs, it is much cleaner to
reason about a pure function Params → Tensor → Tensor and separately prove that compilation to
the runtime preserves that meaning.
In other words:
- TorchLean answers: “Given ops, how do we run/train them?”
- GraphSpec answers: “How do we describe models so we can both run them and prove things about them?”
Mathematical View For Sequential Chains #
For g : Graph ps σ τ, think of g as denoting a function
⟦g⟧ : Params(ps) → Tensor σ → Tensor τ.
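For example, a two-layer perceptron built from the linear and relu primitives defined below denotes
⟦linear >>> relu >>> linear⟧ [W₁, b₁, W₂, b₂] x = W₂ * relu(W₁ * x + b₁) + b₂,
since each linear node contributes its [W, b] pair to the concatenated parameter list and relu contributes none.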
In this file, that semantics is implemented by Interp.spec, and it is defined structurally:
- `⟦id⟧ params x = x`
- `⟦prim p⟧ params x = p.specFwd params x`
- `⟦g₁ >>> g₂⟧ params x`: split `params : Params(ps₁ ++ ps₂)` into `(params₁ : Params ps₁, params₂ : Params ps₂)`, then compute `⟦g₂⟧ params₂ (⟦g₁⟧ params₁ x)`.
The compiler Compile.torchProgram follows the same structure, but targets a monadic Torch
interface and expects arguments as params ++ [input] (matching TorchLean.NN.Seq.program).
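The following self-contained Lean sketch mirrors that structural definition. It is a toy, not the library code: `Shape`, `Tensor`, `Params`, `Graph`, and `spec` below stand in for the real `TList`, `Graph`, and `Interp.spec`, and the toy `prim` simply carries its spec function directly.

```lean
namespace ChainSemanticsSketch
-- Self-contained toy stand-ins (not the real library definitions).
abbrev Shape := List Nat

structure Tensor (σ : Shape) where
  data : Array Float

/-- One parameter tensor per shape in `ps`, in order (toy analogue of `TList α ps`). -/
inductive Params : List Shape → Type where
  | nil  : Params []
  | cons : Tensor σ → Params ps → Params (σ :: ps)

/-- Canonical split of a parameter list along `ps₁ ++ ps₂`. -/
def Params.split : (ps₁ : List Shape) → Params (ps₁ ++ ps₂) → Params ps₁ × Params ps₂
  | [],       ps           => (.nil, ps)
  | _ :: ps₁, .cons p rest =>
      let (l, r) := Params.split ps₁ rest
      (.cons p l, r)

/-- The sequential chain language (toy version of `Graph ps σ τ`). -/
inductive Graph : List Shape → Shape → Shape → Type where
  | id   (s : Shape) : Graph [] s s
  | seq  : Graph ps₁ σ τ → Graph ps₂ τ υ → Graph (ps₁ ++ ps₂) σ υ
  | prim : (Params ps → Tensor σ → Tensor τ) → Graph ps σ τ

/-- Structural evaluation, mirroring the clauses above (toy analogue of `Interp.spec`). -/
def Graph.spec : Graph ps σ τ → Params ps → Tensor σ → Tensor τ
  | .id _,      _,      x => x
  | .prim f,    params, x => f params x
  | .seq g₁ g₂, params, x =>
      let (params₁, params₂) := Params.split _ params
      g₂.spec params₂ (g₁.spec params₁ x)
end ChainSemanticsSketch
```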
Scope of Core.lean #
This file defines only the sequential core:
- `Primitive` — a single typed operation with both a pure spec and a TorchLean implementation.
- `Graph` — sequential composition (`>>>`) of primitives with a typed parameter list.
- `Interp.spec` — pure interpreter.
- `Compile.torchProgram` — compiler to `TorchLean.Program`.
- `LowerToDAG.Graph.toDAGTerm` / `toDAGModelZeroInit` — the bridge from chain syntax to the canonical DAG representation.
For skip connections, shared intermediates, residual adds, or other multi-input nodes, use
NN.GraphSpec.DAG directly.
Direction #
GraphSpec is intended to grow into a hygienic “write once, run/prove many” layer:
- richer primitive packs (vision, language, classical ML, …),
- richer DAG structure (limited control flow where it can be compiled),
- verified compilation passes (fusion, constant folding, layout transforms) with proofs that they preserve `Interp.spec`,
- a better parameter/initialization interface (explicit RNG threading, serialization, interop with PyTorch/ONNX exports),
- and a library of reusable theorems about common architectures (e.g. invariants of residual blocks, bounds for Lipschitz constants, etc.).
References / citations #
- PyTorch `nn.Module` and graph tooling (`torch.fx`, `torch.export`, `torch.compile`) for the practical “execution/training first” baseline.
- Automatic differentiation background: Baydin et al. (2018), “Automatic Differentiation in Machine Learning: a Survey”.
- ReLU: Nair & Hinton (2010).
Core graph language #
A primitive node in the GraphSpec language.
GraphSpec primitives package both sides of the “spec vs runtime” interface:
- a pure spec forward function (`specFwd`) used by the reference interpreter, and
- a TorchLean program (`torchProgram`) used by the compiler.
Optionally, a primitive may also provide a lowering to a TorchLean LayerDef (used to build a
TorchLean.NN.Seq for training ergonomics + deterministic parameter initialization). Not every
primitive needs this (e.g. control-flow-ish nodes kept outside the sequential layer).
Why a record?
- It lets us grow the op set by adding new primitives in new files (rather than editing a single global inductive just to extend the vocabulary).
- It keeps the “spec vs TorchLean” linkage explicit: when you add an op, you must define both interpretations side-by-side.
Type indices:
- `ps : List Shape` are the parameter tensor shapes this primitive expects, in order.
- `σ τ : Shape` are input/output tensor shapes.
- name : String
Human-readable name used mainly for debugging / error messages.
- specFwd {α : Type} [Context α] : Runtime.Autograd.Torch.TList α ps → Tensor.Tensor α σ → Tensor.Tensor α τ
Pure reference semantics (forward pass).
This is the function used by `Interp.spec`.
- torchProgram {α : Type} [Context α] [DecidableEq Shape] : Runtime.Autograd.TorchLean.Program α (ps ++ [σ]) τ
Executable TorchLean program.
The program expects its arguments as `ps ++ [σ]` (all parameters first, then the input). This convention aligns with how sequential TorchLean models (`TorchLean.NN.Seq`) expose their program interfaces.
- toLayerDefM? : Option (ℕ → { l : Runtime.Autograd.TorchLean.NN.LayerDef σ τ // l.paramShapes = ps })
Optional lowering to a TorchLean `LayerDef`.
We thread an occurrence index (`Nat`) so primitives can implement deterministic “per-layer” initialization (e.g. seed = f(index)).
- countsAsLayer : Bool
Whether encountering this primitive should increment the layer-occurrence counter.
Instances For
Graph ps σ τ is a (restricted) model that:
- takes an input tensor of shape `σ`,
- produces an output tensor of shape `τ`,
- and uses parameters whose shapes are listed in `ps` (in order).
This is a sequential (chain) graph language: the only composition operator is seq (>>>).
For sharing/skip connections, use NN.GraphSpec.DAG.
Implementation note:
- We encode the parameter list at the type level so composition automatically concatenates parameter lists (`ps := ps₁ ++ ps₂`).
- This means every graph has a canonical “ABI” for parameters: a single typed list `TList α ps`. When composing `g₁ : Graph ps₁ σ τ` and `g₂ : Graph ps₂ τ υ`, the composite graph expects parameters of shape list `ps₁ ++ ps₂`, and evaluation splits that list into the pieces needed by each subgraph.
- id
(s : Shape)
: Graph [] s s
Identity graph: passes the input through unchanged and requires no parameters.
- seq
{ps₁ ps₂ : List Shape}
{σ τ υ : Shape}
: Graph ps₁ σ τ → Graph ps₂ τ υ → Graph (ps₁ ++ ps₂) σ υ
Sequential composition. Parameter lists concatenate.
- prim
{ps : List Shape}
{σ τ : Shape}
: Primitive ps σ τ → Graph ps σ τ
Embed a single primitive node as a graph.
Instances For
Standard primitives (initial op set) #
Primitive linear layer.
Mathematical semantics (vector case):
Let x : Vec inDim, W : Mat outDim inDim, and b : Vec outDim. Then:
linear(W,b,x) = W * x + b.
This matches the standard dense layer as in PyTorch torch.nn.Linear / torch.nn.functional.linear
(up to the usual row/column convention; here the shape indices make the intended dimensions
explicit).
Type-level parameter interface:
- parameter shapes are `[Mat outDim inDim, Vec outDim]`,
- input shape is `Vec inDim`,
- output shape is `Vec outDim`.
So a graph containing a linear node forces you to supply exactly a weight matrix and bias
vector of the right shapes, and it fixes their ordering in the model’s parameter list.
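As a toy rendering of that semantics (not the library's `Spec.linear_spec`; `dot` and `linear` here are illustrative helpers over plain lists), the dense-layer math looks like:

```lean
namespace LinearSketch
/-- Dot product of two equal-length rows (toy helper). -/
def dot (xs ys : List Float) : Float :=
  (List.zipWith (· * ·) xs ys).foldl (· + ·) 0.0

/-- `linear W b x = W * x + b`: `W` is a list of `outDim` rows of length `inDim`,
`b` and the result have length `outDim`. -/
def linear (W : List (List Float)) (b x : List Float) : List Float :=
  List.zipWith (fun row bi => dot row x + bi) W b

#eval linear [[1, 0, 0], [0, 1, 1]] [0.5, -1] [2, 3, 4]  -- ≈ [2.5, 6.0]
end LinearSketch
```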
References:
- Dense layers are standard; for PyTorch behavior see the `torch.nn.Linear` documentation.
- For the semantics used by the spec interpreter, see `NN.Spec.Module.Linear` (`Spec.linear_spec`).
Initialization semantics:
- we attach a TorchLean `LayerDef` so graphs can be lowered to `TorchLean.NN.Seq`,
- and we seed `W`, `b` deterministically from the layer-occurrence index: `seedW = 2*i`, `seedB = 2*i + 1`.
The deterministic occurrence-index rule keeps end-to-end examples reproducible while preserving a single GraphSpec → TorchLean → training path.
Instances For
ReLU activation (parameter-free).
Mathematical semantics: elementwise relu(x) = max(x, 0).
This is shape-preserving and parameter-free, so its parameter list is [] and its input/output
shape indices are both s.
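A minimal sketch of the elementwise semantics (illustrative only; the library's definition is `Activation.relu_spec`):

```lean
namespace ReluSketch
/-- Scalar ReLU: `max x 0`. -/
def relu (x : Float) : Float := max x 0

/-- Elementwise ReLU over a flat buffer: shape-preserving and parameter-free. -/
def reluMap (xs : Array Float) : Array Float := xs.map relu

#eval reluMap #[-1.5, 0.0, 2.0]  -- negatives clamp to 0, the rest pass through
end ReluSketch
```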
References:
- Nair & Hinton (2010), “Rectified Linear Units Improve Restricted Boltzmann Machines”.
- Spec definition: `Activation.relu_spec` in `NN.Spec.Layers.Activation`.
Instances For
Last-axis softmax (parameter-free).
Softmax turns “logits” into a probability distribution along the last axis:
softmax(x)_i = exp(x_i) / (∑_j exp(x_j)).
In TorchLean’s spec layer, this is implemented as a genuine last-axis tensor softmax (recursing
over outer dimensions), analogous to torch.softmax(x, dim=-1) in PyTorch.
Notes:
- Softmax is not elementwise; it normalizes across an axis, so it is a canonical example of a non-pointwise activation.
- For numerical stability, practical implementations often rewrite softmax using `logsumexp`. The spec semantics here follows the dedicated `Activation.softmax_spec`.
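A single-row sketch of that computation, including the usual max-subtraction trick for stability (illustrative only; the library's `Activation.softmax_spec` additionally recurses over outer dimensions):

```lean
namespace SoftmaxSketch
/-- Softmax over one row of logits, shifting by the row maximum so that the
largest exponent is `exp 0 = 1` (the standard stability trick). -/
def softmaxRow (xs : List Float) : List Float :=
  let m    := xs.foldl max (xs.headD 0.0)
  let exps := xs.map fun x => Float.exp (x - m)
  let z    := exps.foldl (· + ·) 0.0
  exps.map (· / z)

#eval softmaxRow [1.0, 2.0, 3.0]  -- entries are positive and sum to 1 (up to rounding)
end SoftmaxSketch
```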
References:
- Spec definition: `Activation.softmax_spec` in `NN.Spec.Layers.Activation`.
- PyTorch API analogy: `torch.softmax(x, dim=-1)`.
Instances For
Graph constructor for Primitive.linear.
Instances For
Graph constructor for Primitive.relu.
Instances For
Graph constructor for Primitive.softmax.
Instances For
Lowering: sequential Graph → DAG term/model #
GraphSpec has two surface syntaxes:
- `NN.GraphSpec.Core`: a sequential DSL (`Graph` + `>>>`), ideal for pure pipelines.
- `NN.GraphSpec.DAG.Core`: a general SSA/A-normal-form term language, ideal for sharing/skip connections.
The DAG term/model language is GraphSpec’s “general graph” core: it is the representation that can express sharing and skip connections.
Sequential Graph exists because it is the clearest way to write pipelines, and it has its own
direct Spec semantics (Interp.spec) and compiler (Compile.torchProgram).
This lowering is still useful whenever you want to embed a sequential pipeline into the DAG world (e.g. to reuse DAG-only tooling, or to keep a single GraphSpec example surface that can export DAG models).
This section provides a structural lowering:
- `Graph.toDAGTerm` produces a `DAG.Term (ps ++ [σ]) τ`, i.e. a DAG term whose environment starts with the parameter list `ps` and ends with the (single) data input `σ`.
- `Graph.toDAGModelZeroInit` wraps that term into a `DAG.Model ps [σ] τ` with a simple default parameter initialization (all zeros).
Notes:
- The lowering is purely structural: it introduces `let1` binders between stages to make the sequential flow explicit in SSA form (see the sketch below).
- Each sequential `Primitive ps σ τ` is embedded as a DAG primitive op with inputs `ps ++ [σ]`. This embedding is generic: any custom GraphSpec primitive automatically becomes usable in the DAG world.
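To make “introduce `let1` binders between stages” concrete, here is a much-simplified, self-contained sketch. The toy `Expr` below is untyped and uses string-named variables, unlike the real `DAG.Term`, which is shape-indexed with an environment of `ps ++ [σ]`; the point is only the shape of the lowering.

```lean
namespace LowerSketch
/-- A toy A-normal-form expression language: variables, unary primitive
applications, and single `let` binders (a drastically simplified stand-in
for `DAG.Term`). -/
inductive Expr where
  | var  : String → Expr
  | op1  : String → String → Expr          -- primitive name, argument variable
  | let1 : String → Expr → Expr → Expr     -- let x := e₁; e₂
  deriving Repr

/-- Lower a chain of unary stages into nested `let1` binders, threading the
name of the previous stage's output (initially the data input). -/
def lowerChain : String → Nat → List String → Expr
  | cur, _, []        => .var cur
  | cur, i, op :: ops =>
      let v := s!"t{i}"
      .let1 v (.op1 op cur) (lowerChain v (i + 1) ops)

-- `Linear >>> ReLU >>> Linear` becomes
--   let t0 := Linear x; let t1 := ReLU t0; let t2 := Linear t1; t2
#eval lowerChain "x" 0 ["Linear", "ReLU", "Linear"]
end LowerSketch
```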
Lowering internals #
The definitions in this section (castTerm, toTerm, …) are internal adapters for the structural
lowering. The intended public API is Graph.toDAGTerm / Graph.toDAGModelZeroInit.
Embed a sequential GraphSpec primitive as a DAG primitive op.
The resulting op has input shapes ps ++ [σ] (parameters followed by the data input).
Instances For
Building well-typed DAG arguments for a primitive call #
Build a typed DAG.Args list from an index-based family of argument terms.
This is the bridge from “arguments as a function of Fin ins.length” to the inductive DAG.Args
encoding used by DAG.Term.op.
Instances For
Reference the ith parameter block inside a larger environment layout.
The surrounding environment is split as pre ++ ps ++ post ++ extra; this helper returns the term
that points at parameter i : Fin ps.length while keeping the full ambient environment explicit.
Instances For
Lower a unary Primitive application into the DAG term language.
Parameters are read from the middle ps segment of the ambient environment, in the same order as
the primitive's parameter ABI, and the final data input is supplied by x.
Instances For
Graph lowering #
Public API #
Initialize a parameter list by filling every tensor with zeros (useful for proofs and examples).
Instances For
Deterministic initialization for sequential graphs #
Graph.toDAGModelZeroInit is total, but its parameters are all-zero tensors, which is convenient
for proofs and shape-only examples but not representative of training setups.
For graphs whose primitives provide Primitive.toLayerDefM?, we can reuse TorchLean’s deterministic
initializers (e.g. Xavier init for linear weights) in a way that matches ToTorchLean.toSeq:
- we thread an occurrence index `i : Nat`,
- primitives with `countsAsLayer = true` increment it,
- and each primitive's `LayerDef.initParams` uses seeds derived from `i`.
We expose this as Graph.toDAGModelDetInit? : Except String (DAG.Model ...):
it fails if any primitive lacks a toLayerDefM? lowering.
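A hedged sketch of that control flow (toy types only: `Prim`, `init?`, and `detInit` below are illustrative stand-ins for `Primitive`, `toLayerDefM?`, and `Graph.toDAGModelDetInit?`, with parameters flattened to `List Float`):

```lean
namespace DetInitSketch
/-- Toy primitive record: `init?` plays the role of `toLayerDefM?`, mapping a
layer-occurrence index to that primitive's (flattened) parameters. -/
structure Prim where
  name          : String
  countsAsLayer : Bool
  init?         : Option (Nat → List Float)

/-- Thread the occurrence index through a chain, failing if any primitive lacks
an initializer (the shape of `Graph.toDAGModelDetInit?`'s error behaviour). -/
def detInit : Nat → List Prim → Except String (List (List Float))
  | _, []      => .ok []
  | i, p :: ps => do
      let some init := p.init?
        | Except.error s!"primitive {p.name} has no LayerDef lowering"
      let i' := if p.countsAsLayer then i + 1 else i
      let rest ← detInit i' ps
      return init i :: rest
end DetInitSketch
```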
Compute deterministic initialization tensors for a sequential Graph, threading a “layer
occurrence index”.
This matches ToTorchLean.toSeq’s notion of “occurrence”: only primitives with
countsAsLayer = true advance the counter.
Instances For
Convenience wrapper: start the occurrence index at 0 and discard its final value.
Instances For
Lower a sequential Graph to a DAG Model with a simple default init (all zeros).
This is mainly a convenience for GraphSpec example organization; for training-oriented init,
see NN.GraphSpec.ToTorchLean (Seq lowering) and/or provide your own initializer.
Instances For
Lower a sequential Graph to a DAG Model, using deterministic initialization.
This is the DAG analogue of ToTorchLean.toSeq’s initialization semantics: it uses each primitive’s
toLayerDefM? to obtain a TorchLean LayerDef, then reuses the LayerDef.initParams.
This returns Except String because not every primitive necessarily admits a LayerDef lowering.
Instances For
Semantics (sequential core) #
The sequential DSL (Graph with >>>) has direct semantics:
- `Interp.spec` evaluates a sequential graph as a pure function on tensors, and
- `Compile.torchProgram` compiles it to a backend-generic `TorchLean.Program`.
Even though a sequential graph is semantically a path-shaped DAG, we keep the sequential interpreter/compiler direct for two pragmatic reasons:
- Proof ergonomics. For chain graphs, definitional reduction is much simpler when we evaluate directly rather than going through an SSA lowering.
- Engineering clarity. The sequential and DAG languages have different invariants (parameter concatenation vs explicit `let1` sharing). Keeping each semantics close to its syntax makes the code easier to audit.
We still provide a structural lowering LowerToDAG.Graph.toDAGModelZeroInit so that DAG-only
tooling
can consume sequential models. The DAG path becomes the canonical execution route when a caller
wants explicit sharing together with the corresponding simp lemmas / proof infrastructure.
Split a typed parameter list for a sequential composition.
If ps = ps₁ ++ ps₂, then a value of type TList α ps can be split into the prefix parameters for
the left subgraph and the remaining parameters for the right subgraph.
Instances For
Pure Spec semantics of a sequential Graph.
Instances For
Compile a sequential Graph to a backend-generic TorchLean Program.