GraphM Elementwise And Scalar Ops
Arithmetic, activations, scalar reductions, and MSE loss builders for proof-compiled graphs.
JVP vs VJP in this module
Each compiled node stores both:
vjp: reverse-mode vector-Jacobian product (used by backprop), and
jvp: forward-mode Jacobian-vector product (directional derivative).
The .compiled runtime path is exercised primarily through reverse mode (VJP) and compilation to the
eager tape. Basic elementwise and bilinear ops provide real JVP rules. Shape-structural ops (for
example slice/concat) apply the same transformation to the tangent. Heavier ops should expose
named spec-layer JVP helpers before being wired here. A reverse-only op
must be listed in reverseOnlyJvpOps and call unsupportedJvp rather than returning a silent
zero tangent.
Forward-mode coverage is expanded by adding concrete jvp rules next to the corresponding
forward and vjp definitions.
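The relationship between the two rules can be sketched in plain Python for elementwise multiplication. This is only an illustration: mul_jvp, mul_vjp, and unsupported_jvp are hypothetical names chosen here, not this module's API. The check at the end uses the standard identity that pairing a cotangent with the JVP result equals pairing the VJP results with the input tangents.

```python
# Illustrative sketch only; all function names here are hypothetical.

def mul_jvp(a, b, da, db):
    # Forward mode: the tangent of a * b is da * b + a * db.
    return [dx * y + x * dy for x, y, dx, dy in zip(a, b, da, db)]

def mul_vjp(a, b, g):
    # Reverse mode: pull the output cotangent g back to both inputs.
    return [gi * y for gi, y in zip(g, b)], [gi * x for gi, x in zip(g, a)]

def unsupported_jvp(op_name):
    # A reverse-only op fails loudly instead of returning a zero tangent.
    raise NotImplementedError(f"JVP is not implemented for {op_name}")

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

a, b = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
da, db = [0.5, -1.0, 2.0], [1.0, 0.0, -0.5]
g = [1.0, 2.0, -1.0]

tangent = mul_jvp(a, b, da, db)
ga, gb = mul_vjp(a, b, g)

# <g, J(da, db)> should equal <J^T g, (da, db)>.
lhs = dot(g, tangent)
rhs = dot(ga, da) + dot(gb, db)
```

Raising in unsupported_jvp mirrors the rule stated above: a missing forward-mode rule should surface as an error, because a zero tangent would look like a valid derivative.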
Elementwise addition node (y = a + b).
PyTorch comparison: torch.add(a, b).
Elementwise subtraction node (y = a - b).
PyTorch comparison: torch.sub(a, b).
Elementwise multiplication node (y = a ⊙ b).
PyTorch comparison: torch.mul(a, b).
Elementwise square (x ↦ x ⊙ x).
Scale a tensor by a scalar constant c (y = c * x).
PyTorch comparison: c * x or torch.mul(x, c).
Elementwise absolute value.
PyTorch comparison: torch.abs(x).
Elementwise square root.
PyTorch comparison: torch.sqrt(x).
Elementwise clamp to [minVal, maxVal].
PyTorch comparison: torch.clamp(x, min=minVal, max=maxVal).
Elementwise maximum.
At ties we split the gradient equally (0.5 / 0.5), matching the tie-handling documented in
the eager tape (NN.Runtime.Autograd.Engine.Core).
PyTorch comparison: torch.maximum(a, b).
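The tie-splitting rule can be sketched as a reverse-mode rule over plain Python lists. The function name maximum_vjp is hypothetical; the code only illustrates the 0.5 / 0.5 split described above and is not the module's implementation.

```python
# Illustrative sketch only; maximum_vjp is a hypothetical name.

def maximum_vjp(a, b, g):
    """Route the cotangent g to whichever input won; split it evenly at ties."""
    ga, gb = [], []
    for x, y, gi in zip(a, b, g):
        if x > y:
            ga.append(gi)
            gb.append(0.0)
        elif x < y:
            ga.append(0.0)
            gb.append(gi)
        else:
            # Tie: each input receives half of the incoming gradient.
            ga.append(0.5 * gi)
            gb.append(0.5 * gi)
    return ga, gb

a = [1.0, 3.0, 2.0]
b = [2.0, 3.0, 1.0]
g = [1.0, 1.0, 1.0]
ga, gb = maximum_vjp(a, b, g)
```

With this split, the two input gradients always sum to the incoming gradient, including at ties.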
Elementwise minimum.
At ties we split the gradient equally (0.5 / 0.5).
PyTorch comparison: torch.minimum(a, b).
Elementwise ReLU.
PyTorch comparison: torch.nn.functional.relu(x).
Elementwise sigmoid. PyTorch comparison: torch.sigmoid(x).
Elementwise tanh. PyTorch comparison: torch.tanh(x).
Softmax along the last axis (recursing over outer dimensions).
PyTorch comparison: torch.softmax(x, dim=-1).
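The "recursing over outer dimensions" structure can be sketched on nested Python lists, where the innermost lists play the role of the last axis. The name softmax_last_axis is hypothetical, and subtracting the row maximum before exponentiating is an assumption made here for numerical safety, not something this page states about the node.

```python
import math

# Illustrative sketch only; softmax_last_axis is a hypothetical name.

def softmax_last_axis(x):
    # Outer dimensions: recurse until we reach a flat list of numbers.
    if x and isinstance(x[0], list):
        return [softmax_last_axis(row) for row in x]
    # Last axis: shift by the maximum (an assumption), then normalize.
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax_last_axis([[0.0, 0.0], [1.0, 3.0]])
```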
Stable log-softmax along the last axis.
This is a primitive in the compiled graph, not the composition log ∘ softmax, so proof/IR
execution and eager CUDA share the same PyTorch-style numerical contract.
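Why a dedicated primitive matters can be seen in a small pure-Python sketch. The stable version below uses the common log-sum-exp shift by the maximum; that this matches the node's exact formula is an assumption. Both function names are hypothetical.

```python
import math

# Illustrative sketch only; both function names are hypothetical.

def naive_log_softmax(x):
    # Composition log(softmax(x)) without any shift: overflows for large inputs.
    exps = [math.exp(v) for v in x]
    total = sum(exps)
    return [math.log(e / total) for e in exps]

def log_softmax(x):
    # log_softmax(x)_i = x_i - (m + log(sum_j exp(x_j - m))), with m = max(x).
    m = max(x)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in x))
    return [v - log_sum_exp for v in x]

logits = [1000.0, 1001.0]
stable = log_softmax(logits)

try:
    naive = naive_log_softmax(logits)
except OverflowError:
    naive = None  # math.exp(1000.0) does not fit in a float
```

The shifted form only ever exponentiates non-positive numbers, so it stays finite where the naive composition fails.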
Elementwise softplus. PyTorch comparison: torch.nn.functional.softplus(x).
Elementwise exponential. PyTorch comparison: torch.exp(x).
Elementwise natural logarithm. PyTorch comparison: torch.log(x).
Elementwise reciprocal x ↦ 1/x. PyTorch comparison: torch.reciprocal(x).
Elementwise numerically-stable log (uses an internal ε).
PyTorch comparison: commonly written torch.log(x + eps).
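A minimal sketch of the log(x + ε) pattern follows. The value of ε used by this node is not given on this page, so EPS = 1e-8 below is purely an assumed placeholder, and safe_log is a hypothetical name.

```python
import math

# Illustrative sketch only. EPS is an assumed placeholder value;
# the node's internal epsilon is not documented here.
EPS = 1e-8

def safe_log(x, eps=EPS):
    # Adding eps keeps the argument strictly positive when x is zero.
    return [math.log(v + eps) for v in x]

values = safe_log([0.0, 1.0])

try:
    math.log(0.0)
    plain_log_failed = False
except ValueError:
    plain_log_failed = True  # log(0) is a domain error in plain Python
```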
Reduce-sum over all entries, producing a scalar.
PyTorch comparison: torch.sum(x).
Mean-squared error loss with "mean" reduction, producing a scalar.
PyTorch comparison: torch.nn.functional.mse_loss(yhat, target, reduction="mean").
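The "mean" reduction and its gradient with respect to the prediction can be sketched as follows. The function names are hypothetical, and the gradient formula 2 (yhat - target) / n is the standard derivative of the mean of squared differences, not a quote from this module.

```python
# Illustrative sketch only; function names are hypothetical.

def mse_loss(yhat, target):
    # Mean over all n entries of the squared difference.
    n = len(yhat)
    return sum((p - t) ** 2 for p, t in zip(yhat, target)) / n

def mse_grad_yhat(yhat, target):
    # d/d yhat_i of the mean squared error is 2 * (yhat_i - target_i) / n.
    n = len(yhat)
    return [2.0 * (p - t) / n for p, t in zip(yhat, target)]

yhat = [1.0, 2.0, 3.0]
target = [1.0, 0.0, 5.0]
loss = mse_loss(yhat, target)
grad = mse_grad_yhat(yhat, target)
```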
Affine layer y = W x + b in the compiled graph.
PyTorch comparison: torch.nn.functional.linear / torch.nn.Linear.
The JVP is the usual product rule:
d(Wx+b) = dW*x + W*dx + db.
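The product rule above can be checked numerically with a small pure-Python sketch. All names are hypothetical. A central difference is used because the affine map is at most quadratic in the joint perturbation, so the second-order term cancels and the comparison is limited only by rounding.

```python
# Illustrative sketch only; all names are hypothetical.

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def affine(W, x, b):
    return [y + bi for y, bi in zip(matvec(W, x), b)]

def affine_jvp(W, x, b, dW, dx, db):
    # d(Wx + b) = dW x + W dx + db
    return [p + q + r for p, q, r in zip(matvec(dW, x), matvec(W, dx), db)]

def perturbed(W, x, b, dW, dx, db, h):
    # Evaluate the layer at (W + h dW, x + h dx, b + h db).
    Wh = [[w + h * dw for w, dw in zip(rw, rdw)] for rw, rdw in zip(W, dW)]
    xh = [xi + h * dxi for xi, dxi in zip(x, dx)]
    bh = [bi + h * dbi for bi, dbi in zip(b, db)]
    return affine(Wh, xh, bh)

W = [[1.0, 2.0], [3.0, 4.0]]
x = [1.0, -1.0]
b = [0.5, 0.5]
dW = [[0.1, 0.0], [0.0, -0.2]]
dx = [1.0, 2.0]
db = [1.0, -1.0]

h = 1e-4
plus = perturbed(W, x, b, dW, dx, db, h)
minus = perturbed(W, x, b, dW, dx, db, -h)
numeric = [(p - m) / (2.0 * h) for p, m in zip(plus, minus)]
analytic = affine_jvp(W, x, b, dW, dx, db)
```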
Matrix multiplication ((m×n) @ (n×p) → (m×p)).
PyTorch comparison: torch.matmul.
The JVP is the bilinear product rule d(A @ B) = dA @ B + A @ dB.
Batched matrix multiplication ((batch×m×n) @ (batch×n×p) → (batch×m×p)).
PyTorch comparison: torch.bmm.
The JVP is the batched bilinear product rule d(A @ B) = dA @ B + A @ dB.
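The bilinear rule, applied once per batch entry, can be sketched with nested lists. The names matmul_jvp and bmm_jvp are hypothetical, and treating a batch as a Python list of matrices is an assumption made only for illustration.

```python
# Illustrative sketch only; all names are hypothetical.

def matmul(A, B):
    # (m x n) @ (n x p) -> (m x p), with matrices as lists of rows.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def matmul_jvp(A, B, dA, dB):
    # d(A @ B) = dA @ B + A @ dB
    return mat_add(matmul(dA, B), matmul(A, dB))

def bmm_jvp(As, Bs, dAs, dBs):
    # The batched rule is the matrix rule applied independently per batch entry.
    return [matmul_jvp(A, B, dA, dB) for A, B, dA, dB in zip(As, Bs, dAs, dBs)]

As = [[[1.0, 2.0], [3.0, 4.0]], [[1.0, 0.0], [0.0, 1.0]]]
Bs = [[[0.0, 1.0], [1.0, 0.0]], [[5.0, 6.0], [7.0, 8.0]]]
dAs = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
dBs = [[[2.0, 0.0], [0.0, 2.0]], [[1.0, 1.0], [1.0, 1.0]]]

tangents = bmm_jvp(As, Bs, dAs, dBs)
```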
Concatenate two vectors (dim-0 concat).
PyTorch comparison: torch.cat([a, b], dim=0) for 1D tensors.
Concatenate along the leading dimension (dim=0) for tensors of shape .dim n s.
PyTorch comparison: torch.cat([a, b], dim=0).
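For a shape-structural op such as concat, the forward-mode rule applies the same transformation to the tangents, and the reverse-mode rule splits the cotangent back at the seam. A sketch over 1D Python lists, with hypothetical function names:

```python
# Illustrative sketch only; all names are hypothetical.

def concat(a, b):
    return a + b

def concat_jvp(da, db):
    # Forward mode: concatenate the tangents exactly like the values.
    return concat(da, db)

def concat_vjp(g, len_a):
    # Reverse mode: split the cotangent where the inputs were joined.
    return g[:len_a], g[len_a:]

a, b = [1.0, 2.0], [3.0, 4.0, 5.0]
da, db = [0.1, 0.2], [0.3, 0.4, 0.5]
g = [10.0, 20.0, 30.0, 40.0, 50.0]

y = concat(a, b)
tangent = concat_jvp(da, db)
ga, gb = concat_vjp(g, len(a))
```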
Slice a contiguous range along dim=0.
PyTorch comparison: x[start : start+len] for tensors where the leading dimension is indexed.
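Slicing follows the same shape-structural pattern: the tangent is sliced the same way as the value, while the reverse rule writes the cotangent back into a zero tensor of the input's length. The sketch below uses 1D Python lists and hypothetical function names, and checks the two rules against each other with the usual inner-product identity.

```python
# Illustrative sketch only; all names are hypothetical.

def slice0(x, start, length):
    return x[start:start + length]

def slice0_jvp(dx, start, length):
    # Forward mode: slice the tangent exactly like the value.
    return slice0(dx, start, length)

def slice0_vjp(g, start, input_len):
    # Reverse mode: scatter the cotangent into zeros at the sliced positions.
    out = [0.0] * input_len
    out[start:start + len(g)] = g
    return out

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x = [1.0, 2.0, 3.0, 4.0, 5.0]
dx = [0.5, -1.0, 2.0, 0.0, 3.0]
g = [1.0, -2.0]

y = slice0(x, 1, 2)
tangent = slice0_jvp(dx, 1, 2)
gx = slice0_vjp(g, 1, len(x))

# <g, slice(dx)> should equal <scatter(g), dx>.
lhs = dot(g, tangent)
rhs = dot(gx, dx)
```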