Training-facing TapeM helpers #
The core tape builder lives in NN.Runtime.Autograd.Engine.TapeM; that file owns the operation
vocabulary and reverse-mode execution. This module is intentionally smaller: it contains the
training conveniences that make loss construction read cleanly without defining a second tape API.
The main helpers are:
- param for trainable leaves (requires_grad := true);
- const for data or frozen leaves (requires_grad := false);
- meanScalarOver, meanScalarOverArray, and meanScalarOverDataset for averaged scalar losses.
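A minimal sketch of how these pieces fit together. Only param, const, and meanScalarOver are helpers from this module; Tensor, NodeId, the op names mul, sub, and sumToScalar, and the argument order of meanScalarOver are assumptions made for illustration.

```lean
-- Sketch only: `param`, `const`, and `meanScalarOver` are documented on this page;
-- `Tensor`, `NodeId`, `mul`, `sub`, `sumToScalar`, and the argument order of
-- `meanScalarOver` are hypothetical names used for illustration.
def squaredErrorLoss (w : Tensor) (batch : List (Tensor × Tensor)) : TapeM NodeId := do
  let wId ← param w                       -- trainable leaf (requires_grad := true)
  meanScalarOver batch fun (x, y) => do
    let xId  ← const x                    -- data leaf (requires_grad := false)
    let yId  ← const y
    let pred ← mul wId xId
    let diff ← sub pred yId
    let sq   ← mul diff diff
    sumToScalar sq                        -- per-sample loss must have shape .scalar
```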
Higher derivatives #
Tape.backwardScalar is a first-order reverse pass over a completed tape. It returns gradient
values, but it does not record the backward pass itself as a differentiable graph. This layer is
therefore the right place for ordinary training losses, not for Hessians or for anything that
needs to differentiate through the backward pass.
For higher derivatives, use the functional autodiff surface in NN.Runtime.Autograd.TorchLean
(hvpInputs, hessian1, and the public API wrappers). That path rebuilds the program over dual
numbers / compiled graph structure and is the correct architecture for JVP-over-VJP style
derivatives.
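A rough sketch of the split. TapeM.run, the Grads result type, and the exact argument shapes of Tape.backwardScalar and hvpInputs are assumptions; only the names and the division of responsibility come from this page.

```lean
-- Sketch of the intended split; `TapeM.run`, `Grads`, and the argument shapes
-- of `Tape.backwardScalar` and `hvpInputs` are assumptions.
def firstOrderGrads (w : Tensor) (batch : List (Tensor × Tensor)) : Grads :=
  let (lossId, tape) := TapeM.run (squaredErrorLoss w batch)
  -- Plain gradient values; the reverse pass itself is not recorded as a graph.
  Tape.backwardScalar tape lossId

-- For second-order quantities, do not differentiate firstOrderGrads through the
-- tape; call the functional surface instead, e.g. (assumed shape)
--   hvpInputs (fun w => squaredErrorLoss w batch) w v
```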
Create a trainable leaf node.
This is the training alias for Runtime.Autograd.TapeM.leaf with requires_grad := true, matching
the role of a parameter tensor in a PyTorch-style eager tape.
Instances For
Create a constant/data leaf node.
Use this for minibatch inputs, labels, masks, and frozen tensors. The value is still used by the
forward computation, but backwardScalar will not accumulate a gradient for it as a leaf.
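To make the behavior of the two leaf kinds concrete, a hedged sketch; the TapeM.run and Grads signatures are assumptions, while the gradient claims in the comments restate the docstrings above.

```lean
-- Sketch with assumed signatures for `TapeM.run` and the gradient map returned
-- by `Tape.backwardScalar`.
def leafDemo (w x : Tensor) : Grads :=
  let (lossId, tape) := TapeM.run do
    let wId ← param w            -- trainable: backwardScalar accumulates d(loss)/d(w)
    let xId ← const x            -- data/frozen: drives the forward value, no leaf gradient
    sumToScalar (← mul wId xId)
  Tape.backwardScalar tape lossId
```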
Instances For
Compute the mean of a dataset of scalar-valued losses.
lossOf x must return a node id whose value has shape .scalar.
Mean reduction for a list of scalar-valued losses, written in TapeM.
This is a common pattern in training loops: compute a scalar loss per sample, sum, then scale by
1/N.
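Not the real definition, but a sketch of the sum-then-scale shape described above; zeroScalar, add, and scale are assumed op names, and the empty-batch case is not handled here.

```lean
-- Sketch of the sum-then-scale pattern; `zeroScalar`, `add`, and `scale` are
-- assumed op names, and the empty-batch case is ignored.
def meanScalarOverSketch (xs : List α) (lossOf : α → TapeM NodeId) : TapeM NodeId := do
  let z ← zeroScalar
  let total ← xs.foldlM (fun acc x => do add acc (← lossOf x)) z
  scale (1.0 / xs.length.toFloat) total
```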
Instances For
Mean reduction for an array-backed batch.
Arrays are common at runtime boundaries, while meanScalarOver uses lists because foldlM over
lists keeps the implementation and error behavior simple. This wrapper makes that conversion
explicit and keeps call sites tidy.
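Given the lemma at the end of this page (array-batch mean loss equals list-batch mean loss after Array.toList), the wrapper amounts to a conversion plus delegation; a sketch with an assumed argument order:

```lean
-- Sketch consistent with the Array.toList lemma below; argument order is assumed.
def meanScalarOverArraySketch (xs : Array α) (lossOf : α → TapeM NodeId) : TapeM NodeId :=
  meanScalarOver xs.toList lossOf
```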
Instances For
Mean reduction for a Dataset.
This is the natural bridge from Train.Dataset batches to a scalar loss node. It does not shuffle,
batch, or otherwise mutate the dataset; it simply materializes the current dataset order as a list
and delegates to meanScalarOver.
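Per the description above and the Dataset.toList lemma below, the bridge is again a conversion plus delegation; the Train.Dataset type parameter and argument order are assumptions.

```lean
-- Sketch consistent with the Dataset.toList lemma below: no shuffling or batching,
-- just the current dataset order materialized as a list.
def meanScalarOverDatasetSketch (ds : Train.Dataset α) (lossOf : α → TapeM NodeId) : TapeM NodeId :=
  meanScalarOver ds.toList lossOf
```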
Instances For
Array-batch mean loss is exactly list-batch mean loss after Array.toList.
Dataset-batch mean loss is exactly list-batch mean loss after Dataset.toList.
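Stated as Lean goals, the two lemmas have roughly the following shape; the actual lemma names, argument orders, and the equality used over TapeM programs are not shown on this page, so this is only an illustration.

```lean
-- Shape of the two lemmas described above; names and signatures are assumptions.
example (xs : Array α) (lossOf : α → TapeM NodeId) :
    meanScalarOverArray xs lossOf = meanScalarOver xs.toList lossOf := by
  sorry  -- proved in the library; sketch only

example (ds : Train.Dataset α) (lossOf : α → TapeM NodeId) :
    meanScalarOverDataset ds lossOf = meanScalarOver ds.toList lossOf := by
  sorry  -- proved in the library; sketch only
```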