# Sequence utilities (spec layer)
These helpers manipulate sequence tensors, typically of shape
`Tensor α (.dim seqLen (.dim featureDim .scalar))`. They are intentionally simple
(structural recursion on `Tensor.dim`) and are used by the RNN/LSTM/GRU specs.
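The library's actual `Tensor` definition lives elsewhere; purely as a reading aid, here is a minimal hypothetical model that is consistent with the shapes quoted in this file (both type names and constructors below are illustrative, not this library's API):

```lean
/-- Illustrative nested shape: `.dim n s` adds an axis of length `n` outside `s`. -/
inductive Shape where
  | scalar : Shape
  | dim : Nat → Shape → Shape

/-- Illustrative spec-level tensor: one constructor per `Shape` constructor,
    so helpers can recurse structurally on `.dim`. -/
inductive Tensor (α : Type) : Shape → Type where
  | scalar : α → Tensor α .scalar
  | dim : {n : Nat} → {s : Shape} → (Fin n → Tensor α s) → Tensor α (.dim n s)

/-- A `(seqLen, featureDim)` sequence has exactly the shape quoted above. -/
example (α : Type) (seqLen featureDim : Nat) : Type :=
  Tensor α (.dim seqLen (.dim featureDim .scalar))
```

The sketches in the rest of this file build on this model.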
Why we have a dedicated "sequence" module:

- In the spec layer we write models the way they appear in papers: "for each timestep t, apply ...". These helpers let us express that directly, without constantly unpacking `Tensor.dim`.
- Many proofs about RNN-style models want to reason step by step. Definitions that recurse structurally on `Tensor` tend to be easier to induct over than definitions that go through arrays or ad-hoc lists.
- We deliberately keep this file free of runtime concerns (no `IO`, no mutable state). When we care about performance, we use the runtime layer; here we care about "the clean mathematical meaning".
PyTorch analogies:

- A tensor of shape `(seqLen, dim)` is written here as `.dim seqLen (.dim dim .scalar)`.
- `mapSequenceSpec f` is like `torch.vmap(f)` over the leading time dimension (conceptually).
- `reduceSumSequenceSpec` is like `seq.sum(dim=0)` (but we require `seqLen ≠ 0` as evidence).
Naming note:

- This file defines `Spec.concatSequenceSpec` for concatenating along the feature dimension (inner axis, like `torch.cat(..., dim=1)` for `(seqLen, dim)` tensors). `NN.Spec.Core.TensorReductionShape` also defines `Spec.Tensor.concatSequenceSpec`, which concatenates along the time dimension (axis 0). The names are similar on purpose (both are "sequence concatenation"), but they are different ops.
References / analogies:

- PyTorch `torch.cat` (shape intuition): https://pytorch.org/docs/stable/generated/torch.cat.html
- PyTorch `torch.flip` (reverse/flip intuition): https://pytorch.org/docs/stable/generated/torch.flip.html
Map a per‑step function over a sequence (vector input → vector output).
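A sketch of what such a per-step map can look like against the hypothetical model above (the name matches the docstring, but the real definition may differ):

```lean
/-- Sketch: apply `f` independently at every timestep. -/
def mapSequenceSpec {α : Type} {seqLen dIn dOut : Nat}
    (f : Tensor α (.dim dIn .scalar) → Tensor α (.dim dOut .scalar)) :
    Tensor α (.dim seqLen (.dim dIn .scalar)) →
      Tensor α (.dim seqLen (.dim dOut .scalar))
  | .dim steps => .dim fun t => f (steps t)
```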
Map a per‑step function over a scalar sequence, producing vectors.
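The scalar-input variant differs only in how each timestep is unpacked; same hypothetical model, illustrative name:

```lean
/-- Sketch: each timestep is a bare scalar, and `f` expands it to a vector. -/
def mapScalarSequenceSpec {α : Type} {seqLen dOut : Nat}
    (f : α → Tensor α (.dim dOut .scalar)) :
    Tensor α (.dim seqLen .scalar) →
      Tensor α (.dim seqLen (.dim dOut .scalar))
  | .dim steps => .dim fun t =>
      match steps t with
      | .scalar a => f a
```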
Zip two sequences together, then apply a vector × vector function at each step.

We pass `outShape` explicitly so the result type is easy for Lean to elaborate at call sites. This avoids a lot of "stuck" metavariables in larger model definitions.
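A sketch against the hypothetical model, with `outShape` explicit exactly as described (illustrative name and signature):

```lean
/-- Sketch: zip two sequences pointwise. `outShape` is an explicit argument,
    so call sites pin down the result type before `f` is elaborated. -/
def zipWithSequenceSpec {α : Type} {seqLen : Nat} {s₁ s₂ : Shape}
    (outShape : Shape)
    (f : Tensor α s₁ → Tensor α s₂ → Tensor α outShape) :
    Tensor α (.dim seqLen s₁) → Tensor α (.dim seqLen s₂) →
      Tensor α (.dim seqLen outShape)
  | .dim xs, .dim ys => .dim fun t => f (xs t) (ys t)
```

Because `outShape` is an explicit argument, the result type at a call site is fixed before `f` is elaborated, which is what keeps metavariables from getting stuck.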
Zip a vector sequence and a scalar sequence, then apply a vector × scalar function per step.
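The vector × scalar variant in the same style (again a sketch, not the library's definition):

```lean
/-- Sketch: a sequence of vectors zipped with a sequence of scalars
    (e.g. a per-timestep gate value). -/
def zipWithScalarSequenceSpec {α : Type} {seqLen : Nat} {s : Shape}
    (outShape : Shape)
    (f : Tensor α s → α → Tensor α outShape) :
    Tensor α (.dim seqLen s) → Tensor α (.dim seqLen .scalar) →
      Tensor α (.dim seqLen outShape)
  | .dim xs, .dim cs => .dim fun t =>
      match cs t with
      | .scalar c => f (xs t) c
```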
Reduce a sequence by summing along the time axis (axis = 0).

We ask for `seqLen ≠ 0` because downstream uses often combine this with mean-like operations or want to avoid degenerate "empty sequence" cases in proofs. (PyTorch defines behavior for zero-length dims, but we prefer making those cases explicit in the spec layer.)
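A sketch of how the `seqLen ≠ 0` evidence can be consumed, using the hypothetical model; `addT` and `sumPrefix` are illustrative helpers, and the `Add α` assumption stands in for whatever algebraic structure the library actually uses:

```lean
/-- Pointwise addition at any shape, by structural recursion on the shape. -/
def addT {α : Type} [Add α] : {s : Shape} → Tensor α s → Tensor α s → Tensor α s
  | .scalar, .scalar a, .scalar b => .scalar (a + b)
  | .dim _ _, .dim xs, .dim ys => .dim fun i => addT (xs i) (ys i)

/-- Sum timesteps `0, …, k`; the bound keeps every index in range. -/
def sumPrefix {α : Type} [Add α] {seqLen : Nat} {s : Shape}
    (steps : Fin seqLen → Tensor α s) : (k : Nat) → k < seqLen → Tensor α s
  | 0, hk => steps ⟨0, hk⟩
  | k + 1, hk => addT (sumPrefix steps k (Nat.lt_of_succ_lt hk)) (steps ⟨k + 1, hk⟩)

/-- Sketch: `seq.sum(dim=0)`, with `seqLen ≠ 0` consumed by the match,
    so the empty case never needs a default value. -/
def reduceSumSequenceSpec {α : Type} [Add α] {seqLen : Nat} {s : Shape}
    (h : seqLen ≠ 0) : Tensor α (.dim seqLen s) → Tensor α s
  | .dim steps =>
    match seqLen, h, steps with
    | 0, h, _ => absurd rfl h
    | n + 1, _, steps => sumPrefix steps n (Nat.lt_succ_self n)
```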
Like `reduceSumSequenceSpec`, but for 3-D tensors (`seqLen × outputSize × hiddenSize`).
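With the inner shape left generic as in the sketch above, the 3-D case is just an instantiation:

```lean
/-- The generic sketch covers the 3-D case with a matrix-shaped inner axis. -/
example {α : Type} [Add α] {seqLen outputSize hiddenSize : Nat} (h : seqLen ≠ 0) :
    Tensor α (.dim seqLen (.dim outputSize (.dim hiddenSize .scalar))) →
      Tensor α (.dim outputSize (.dim hiddenSize .scalar)) :=
  reduceSumSequenceSpec h
```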
Reverse a sequence along its time axis.

PyTorch analogy: `seq.flip(dims=[0])`.
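A sketch of the reversal (the index arithmetic shown is one plausible choice):

```lean
/-- Sketch: timestep `t` of the output reads timestep `seqLen - 1 - t` of the
    input, mirroring `seq.flip(dims=[0])`. -/
def reverseSequenceSpec {α : Type} {seqLen : Nat} {s : Shape} :
    Tensor α (.dim seqLen s) → Tensor α (.dim seqLen s)
  | .dim steps => .dim fun t =>
      steps ⟨seqLen - 1 - t.val, by have := t.isLt; omega⟩
```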
Concatenate two sequences along the feature dimension.

PyTorch analogy: `torch.cat([seq1, seq2], dim=1)` when shapes are `(seqLen, dim1)` and `(seqLen, dim2)`.

Do not confuse this with `Spec.Tensor.concatSequenceSpec` (defined in `NN.Spec.Core.TensorReductionShape`), which concatenates along the time axis (axis 0).
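A sketch of the feature-axis version; the point to notice is that the output's feature length is `d₁ + d₂` at the type level, whereas the time-axis version would instead add the sequence lengths (names and proof details are illustrative):

```lean
/-- Sketch: feature index `i < d₁` reads from the first input, the rest from
    the second, so the output feature axis has length `d₁ + d₂` by construction. -/
def concatSequenceSpec {α : Type} {seqLen d₁ d₂ : Nat} :
    Tensor α (.dim seqLen (.dim d₁ .scalar)) →
      Tensor α (.dim seqLen (.dim d₂ .scalar)) →
      Tensor α (.dim seqLen (.dim (d₁ + d₂) .scalar))
  | .dim xs, .dim ys => .dim fun t =>
      match xs t, ys t with
      | .dim a, .dim b => .dim fun i =>
          if h : i.val < d₁ then a ⟨i.val, h⟩
          else b ⟨i.val - d₁, by have := i.isLt; omega⟩
```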
Reduce a batch of vectors by summing over the batch axis (explicit axis = 0).

This axis-explicit helper takes an `axis` argument plus a proof that it equals 0. Prefer `reduce_sum_vec2` when the axis is fixed by the surrounding code.
Same as `reduce_sum_vec`, but avoids the explicit axis argument.

We define this with a small `Nat` loop so the recursion is obvious to Lean and doesn't depend on `List.finRange`. It's the same mathematical operation: sum all batch vectors pointwise.
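A sketch of the pair in the hypothetical model. The `OfNat α 0` assumption (giving the empty batch a sum) and the helper names are ours, not necessarily the library's; the point is the shape of the `Nat` loop:

```lean
/-- Nat-loop sketch: sum the first `k` batch entries at one coordinate. -/
def sumUpTo {α : Type} [Add α] [OfNat α 0] {batch : Nat}
    (f : Fin batch → α) : (k : Nat) → k ≤ batch → α
  | 0, _ => 0
  | k + 1, hk => sumUpTo f k (Nat.le_of_succ_le hk) + f ⟨k, hk⟩

/-- Read one scalar entry out of a spec-level vector. -/
def entry {α : Type} {d : Nat} : Tensor α (.dim d .scalar) → Fin d → α
  | .dim xs, i =>
    match xs i with
    | .scalar a => a

/-- Sketch of `reduce_sum_vec2`: a pointwise batch sum whose recursion is a
    plain `Nat` loop, so it is easy for Lean to unfold and induct on. -/
def reduce_sum_vec2 {α : Type} [Add α] [OfNat α 0] {batch d : Nat} :
    Tensor α (.dim batch (.dim d .scalar)) → Tensor α (.dim d .scalar)
  | .dim rows => .dim fun i =>
      .scalar (sumUpTo (fun b => entry (rows b) i) batch (Nat.le_refl batch))

/-- Sketch of the axis-explicit variant: the `axis = 0` proof documents (and
    enforces) which axis is reduced. -/
def reduce_sum_vec {α : Type} [Add α] [OfNat α 0] {batch d : Nat}
    (axis : Nat) (_h : axis = 0) :
    Tensor α (.dim batch (.dim d .scalar)) → Tensor α (.dim d .scalar) :=
  reduce_sum_vec2
```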