Gnn #
GNN models (spec layer).
We provide a compact 2-layer GCN with a graph-level mean pooling readout.
This file is intentionally "model glue": the actual message-passing math lives in
NN.Spec.Layers.Gnn (in particular Spec.GCNLayerSpec, Spec.gcn_layer_spec,
and Spec.gcn_layer_backward_spec). Here we focus on wiring layers together and documenting
the end-to-end shapes.
Reference (GCN):
- Kipf and Welling, "Semi-Supervised Classification with Graph Convolutional Networks" (2017): https://arxiv.org/abs/1609.02907
PyTorch ecosystem analogies:
- torch_geometric.nn.GCNConv for the layer-level GCN operator,
- global mean pooling as in torch_geometric.nn.global_mean_pool.
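For concreteness, here is a minimal sketch of that analogy in PyTorch Geometric. The toy graph and the dimension names in_dim/hid_dim/out_dim are illustrative assumptions, not part of TorchLean:

```python
# Minimal torch_geometric sketch of the analogy: GCNConv → ReLU → GCNConv → global_mean_pool.
import torch
from torch_geometric.nn import GCNConv, global_mean_pool

n, in_dim, hid_dim, out_dim = 5, 8, 16, 4
x = torch.randn(n, in_dim)                                 # node features, (n × inDim)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])    # COO edge list (toy graph)
batch = torch.zeros(n, dtype=torch.long)                   # single graph: all nodes in graph 0

conv1, conv2 = GCNConv(in_dim, hid_dim), GCNConv(hid_dim, out_dim)
h1 = torch.relu(conv1(x, edge_index))                      # (n × hidDim)
h2 = conv2(h1, edge_index)                                 # (n × outDim)
y = global_mean_pool(h2, batch)                            # (1 × outDim) graph embedding
```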
2-layer GCN with gradients #
Forward is:
- Z₁ = GCN₁(X)   (linear message passing)
- H₁ = ReLU(Z₁)
- H₂ = GCN₂(H₁)
- y = mean_nodes(H₂)   (graph-level readout)
Diagram (single graph, no batching):
X : (n × inDim)
  └─ GCN₁ ─→ Z₁ : (n × hidDim) ─ ReLU ─→ H₁ : (n × hidDim)
       └─ GCN₂ ─→ H₂ : (n × outDim)
            └─ mean over nodes ─→ y : (outDim)
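The same wiring in plain tensor operations, assuming each GCN layer has the affine form Z = Â·X·W + b for some fixed adjacency-like operator Â (the exact operator and normalization are determined by Spec.GCNLayerSpec; A_hat and the weights below are illustrative placeholders):

```python
# Hedged sketch of the forward pass, assuming the layer form Z = A_hat @ X @ W + b.
import torch

n, in_dim, hid_dim, out_dim = 5, 8, 16, 4
A_hat = torch.rand(n, n)                          # stand-in for the adjacency-like operator
X = torch.randn(n, in_dim)
W1, b1 = torch.randn(in_dim, hid_dim), torch.zeros(hid_dim)
W2, b2 = torch.randn(hid_dim, out_dim), torch.zeros(out_dim)

Z1 = A_hat @ X @ W1 + b1      # (n × hidDim)  GCN₁, linear message passing
H1 = torch.relu(Z1)           # (n × hidDim)  ReLU
H2 = A_hat @ H1 @ W2 + b2     # (n × outDim)  GCN₂
y = H2.mean(dim=0)            # (outDim)      mean over nodes, graph-level readout
```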
Backward follows this structure literally:
- "mean nodes" broadcasts the
outDimgradient back ton×outDimand scales by1/n, - each GCN layer uses the matrix-calculus rules in
Spec.gcn_layer_backward_spec, - ReLU gates gradients by
ReLU'.
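In matrix form, with g = ∂L/∂y the incoming graph-level gradient and 1ₙ the all-ones vector of length n, the readout and ReLU steps amount to:

$$
\frac{\partial L}{\partial H_2} = \frac{1}{n}\,\mathbf{1}_n\,g^{\top},
\qquad
\frac{\partial L}{\partial Z_1} = \operatorname{ReLU}'(Z_1) \odot \frac{\partial L}{\partial H_1}.
$$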
A 2-layer GCN "model spec" for a fixed graph with n nodes.
GCNLayerSpec packages the per-layer parameters (including the adjacency/normalization choice),
so the model here is just two such layers composed with a nonlinearity and a readout.
- l1 : Spec.GCNLayerSpec n inDim hidDim α
  First GCN layer: inDim → hidDim.
- l2 : Spec.GCNLayerSpec n hidDim outDim α
  Second GCN layer: hidDim → outDim.
Forward pass for the 2-layer GCN with a graph-level mean pooling readout.
Input:
- x : n × inDim node features.
Output:
- y : outDim graph embedding produced by averaging node embeddings.
The h_n : n > 0 assumption is only used to make the mean pooling well-defined (division by n).
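Concretely, the readout averages each embedding coordinate over the n nodes, which is where the division by n (and hence h_n) enters:

$$
y_j \;=\; \frac{1}{n}\sum_{i=1}^{n} (H_2)_{ij}, \qquad j = 1,\dots,\mathrm{outDim}.
$$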
Per-layer gradients returned by GCNLayerSpec backward.
This mirrors the tuple returned by Spec.gcn_layer_backward_spec:
- dA: gradient w.r.t. the adjacency-like operator used by the layer,
- dW: gradient w.r.t. the weight matrix,
- db: gradient w.r.t. the bias vector.
- dA : Spec.Tensor α (Spec.Shape.dim n (Spec.Shape.dim n Spec.Shape.scalar))
  Gradient w.r.t. A.
- dW : Spec.Tensor α (Spec.Shape.dim inDim (Spec.Shape.dim outDim Spec.Shape.scalar))
  Gradient w.r.t. W.
- db : Spec.Tensor α (Spec.Shape.dim outDim Spec.Shape.scalar)
  Gradient w.r.t. b.
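For orientation, if a layer computes Z = Â X W + 1ₙ bᵀ (a common GCN layer form; the exact operator, bias broadcast, and conventions are assumptions here, the authoritative definitions live in Spec.gcn_layer_spec and Spec.gcn_layer_backward_spec), then for an upstream gradient G = ∂L/∂Z the standard matrix-calculus rules give:

$$
\frac{\partial L}{\partial \hat{A}} = G\,(XW)^{\top},\qquad
\frac{\partial L}{\partial W} = (\hat{A}X)^{\top} G,\qquad
\frac{\partial L}{\partial b} = G^{\top}\mathbf{1}_n,\qquad
\frac{\partial L}{\partial X} = \hat{A}^{\top} G\, W^{\top}.
$$

The first three correspond to dA, dW, and db above; the ∂L/∂X term is what gets propagated back to the previous layer.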
Gradients for both layers of GCN2Spec.
- l1 : GCNLayerGrads n inDim hidDim α
  Gradients for the first layer.
- l2 : GCNLayerGrads n hidDim outDim α
  Gradients for the second layer.
Backward/VJP for GCN2Spec.forward.
This is written in the same "spec style" as the rest of TorchLean:
- recompute small intermediates instead of depending on runtime caches,
- apply VJPs in reverse order,
- keep shapes explicit.
PyTorch analogy: this corresponds to what autograd would do for a graph built from
GCNConv → ReLU → GCNConv → global_mean_pool, but expressed as a pure function.
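A plain-tensor sketch of that pure-function backward, under the same assumed layer form Z = Â·X·W + b as above (all names and the exact return convention are illustrative, not the TorchLean API):

```python
# Hedged sketch of a manual VJP for the 2-layer GCN, in spec style:
# recompute intermediates, apply VJPs in reverse order, keep shapes explicit.
# Assumes each layer computes Z = A_hat @ X @ W + b; names are illustrative only.
import torch

def gcn2_backward(A_hat, X, W1, b1, W2, b2, g):
    """g is the incoming graph-level gradient dL/dy, shape (out_dim,)."""
    n = X.shape[0]

    # Recompute small intermediates instead of depending on runtime caches.
    Z1 = A_hat @ X @ W1 + b1
    H1 = torch.relu(Z1)

    # Readout VJP: broadcast g back to (n × out_dim) and scale by 1/n.
    dH2 = g.expand(n, -1) / n

    # Second GCN layer VJP (dA, dW, db as in GCNLayerGrads, plus the input gradient).
    dA2 = dH2 @ (H1 @ W2).T            # (n × n)
    dW2 = (A_hat @ H1).T @ dH2         # (hid_dim × out_dim)
    db2 = dH2.sum(dim=0)               # (out_dim)
    dH1 = A_hat.T @ dH2 @ W2.T         # (n × hid_dim)

    # ReLU gates the gradient.
    dZ1 = dH1 * (Z1 > 0).to(dH1.dtype)

    # First GCN layer VJP.
    dA1 = dZ1 @ (X @ W1).T             # (n × n)
    dW1 = (A_hat @ X).T @ dZ1          # (in_dim × hid_dim)
    db1 = dZ1.sum(dim=0)               # (hid_dim)
    dX = A_hat.T @ dZ1 @ W1.T          # (n × in_dim)

    return (dA1, dW1, db1), (dA2, dW2, db2), dX
```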