# Sequence utilities (spec layer)
These helpers manipulate sequence tensors, typically of shape
`Tensor α (.dim seqLen (.dim featureDim .scalar))`. They are intentionally simple
(structural recursion on `Tensor.dim`) and are used by the RNN/LSTM/GRU specs.
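The library's actual `Tensor` definition lives elsewhere; purely as a reading aid, here is a minimal hypothetical model that is consistent with the shapes quoted in this file (both type names and constructors below are illustrative, not this library's API):

```lean
/-- Illustrative nested shape: `.dim n s` adds an axis of length `n` outside `s`. -/
inductive Shape where
  | scalar : Shape
  | dim : Nat → Shape → Shape

/-- Illustrative spec-level tensor: one constructor per `Shape` constructor,
    so helpers can recurse structurally on `.dim`. -/
inductive Tensor (α : Type) : Shape → Type where
  | scalar : α → Tensor α .scalar
  | dim : {n : Nat} → {s : Shape} → (Fin n → Tensor α s) → Tensor α (.dim n s)

/-- A `(seqLen, featureDim)` sequence has exactly the shape quoted above. -/
example (α : Type) (seqLen featureDim : Nat) : Type :=
  Tensor α (.dim seqLen (.dim featureDim .scalar))
```

The sketches in the rest of this file build on this model.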
Why we have a dedicated "sequence" module:

- In the spec layer we write models the way they appear in papers: "for each timestep t, apply ...". These helpers let us express that directly, without constantly unpacking `Tensor.dim`.
- Many proofs about RNN-style models want to reason step by step. Definitions that recurse structurally on `Tensor` tend to be easier to induct over than definitions that go through arrays or ad-hoc lists.
- We deliberately keep this file free of runtime concerns (no `IO`, no mutable state). When we care about performance, we use the runtime layer; here we care about "the clean mathematical meaning".
PyTorch analogies:

- A tensor of shape `(seqLen, dim)` is written here as `.dim seqLen (.dim dim .scalar)`.
- `mapSequenceSpec f` is like `torch.vmap(f)` over the leading time dimension (conceptually).
- `reduceSumSequenceSpec` is like `seq.sum(dim=0)` (but we require `seqLen ≠ 0` as evidence).
Naming note:

- This file defines `Spec.concatSequenceSpec` for concatenating along the feature dimension (inner axis, like `torch.cat(..., dim=1)` for `(seqLen, dim)` tensors). `NN.Spec.Core.TensorReductionShape` also defines `Spec.Tensor.concatSequenceSpec`, which concatenates along the time dimension (axis 0). The names are similar on purpose (both are "sequence concatenation"), but they are different ops.
References / analogies:

- PyTorch `torch.cat` (shape intuition): https://pytorch.org/docs/stable/generated/torch.cat.html
- PyTorch `torch.flip` (reverse/flip intuition): https://pytorch.org/docs/stable/generated/torch.flip.html
Map a per‑step function over a sequence (vector input → vector output).
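A sketch of what such a per-step map can look like against the hypothetical model above (the name matches the docstring, but the real definition may differ):

```lean
/-- Sketch: apply `f` independently at every timestep. -/
def mapSequenceSpec {α : Type} {seqLen dIn dOut : Nat}
    (f : Tensor α (.dim dIn .scalar) → Tensor α (.dim dOut .scalar)) :
    Tensor α (.dim seqLen (.dim dIn .scalar)) →
      Tensor α (.dim seqLen (.dim dOut .scalar))
  | .dim steps => .dim fun t => f (steps t)
```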
Map a per‑step function over a scalar sequence, producing vectors.
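The scalar-input variant differs only in how each timestep is unpacked; same hypothetical model, illustrative name:

```lean
/-- Sketch: each timestep is a bare scalar, and `f` expands it to a vector. -/
def mapScalarSequenceSpec {α : Type} {seqLen dOut : Nat}
    (f : α → Tensor α (.dim dOut .scalar)) :
    Tensor α (.dim seqLen .scalar) →
      Tensor α (.dim seqLen (.dim dOut .scalar))
  | .dim steps => .dim fun t =>
      match steps t with
      | .scalar a => f a
```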
Zip two sequences together, then apply a vector × vector function at each step.

We pass `outShape` explicitly so the result type is easy for Lean to elaborate at call sites. This avoids a lot of "stuck" metavariables in larger model definitions.
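A sketch against the hypothetical model, with `outShape` explicit exactly as described (illustrative name and signature):

```lean
/-- Sketch: zip two sequences pointwise. `outShape` is an explicit argument,
    so call sites pin down the result type before `f` is elaborated. -/
def zipWithSequenceSpec {α : Type} {seqLen : Nat} {s₁ s₂ : Shape}
    (outShape : Shape)
    (f : Tensor α s₁ → Tensor α s₂ → Tensor α outShape) :
    Tensor α (.dim seqLen s₁) → Tensor α (.dim seqLen s₂) →
      Tensor α (.dim seqLen outShape)
  | .dim xs, .dim ys => .dim fun t => f (xs t) (ys t)
```

Because `outShape` is an explicit argument, the result type at a call site is fixed before `f` is elaborated, which is what keeps metavariables from getting stuck.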
Zip a vector sequence and a scalar sequence, then apply a vector × scalar function per step.
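The vector × scalar variant in the same style (again a sketch, not the library's definition):

```lean
/-- Sketch: a sequence of vectors zipped with a sequence of scalars
    (e.g. a per-timestep gate value). -/
def zipWithScalarSequenceSpec {α : Type} {seqLen : Nat} {s : Shape}
    (outShape : Shape)
    (f : Tensor α s → α → Tensor α outShape) :
    Tensor α (.dim seqLen s) → Tensor α (.dim seqLen .scalar) →
      Tensor α (.dim seqLen outShape)
  | .dim xs, .dim cs => .dim fun t =>
      match cs t with
      | .scalar c => f (xs t) c
```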
Reduce a sequence by summing along the time axis (axis = 0).

We ask for `seqLen ≠ 0` because downstream uses often combine this with mean-like operations or want to avoid degenerate "empty sequence" cases in proofs. (PyTorch defines behavior for zero-length dims, but we prefer making those cases explicit in the spec layer.)
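A sketch of how the `seqLen ≠ 0` evidence can be consumed, using the hypothetical model; `addT` and `sumPrefix` are illustrative helpers, and the `Add α` assumption stands in for whatever algebraic structure the library actually uses:

```lean
/-- Pointwise addition at any shape, by structural recursion on the shape. -/
def addT {α : Type} [Add α] : {s : Shape} → Tensor α s → Tensor α s → Tensor α s
  | .scalar, .scalar a, .scalar b => .scalar (a + b)
  | .dim _ _, .dim xs, .dim ys => .dim fun i => addT (xs i) (ys i)

/-- Sum timesteps `0, …, k`; the bound keeps every index in range. -/
def sumPrefix {α : Type} [Add α] {seqLen : Nat} {s : Shape}
    (steps : Fin seqLen → Tensor α s) : (k : Nat) → k < seqLen → Tensor α s
  | 0, hk => steps ⟨0, hk⟩
  | k + 1, hk => addT (sumPrefix steps k (Nat.lt_of_succ_lt hk)) (steps ⟨k + 1, hk⟩)

/-- Sketch: `seq.sum(dim=0)`, with `seqLen ≠ 0` consumed by the match,
    so the empty case never needs a default value. -/
def reduceSumSequenceSpec {α : Type} [Add α] {seqLen : Nat} {s : Shape}
    (h : seqLen ≠ 0) : Tensor α (.dim seqLen s) → Tensor α s
  | .dim steps =>
    match seqLen, h, steps with
    | 0, h, _ => absurd rfl h
    | n + 1, _, steps => sumPrefix steps n (Nat.lt_succ_self n)
```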
Like `reduceSumSequenceSpec`, but for 3-D tensors (`seqLen × outputSize × hiddenSize`).
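With the inner shape left generic as in the sketch above, the 3-D case is just an instantiation:

```lean
/-- The generic sketch covers the 3-D case with a matrix-shaped inner axis. -/
example {α : Type} [Add α] {seqLen outputSize hiddenSize : Nat} (h : seqLen ≠ 0) :
    Tensor α (.dim seqLen (.dim outputSize (.dim hiddenSize .scalar))) →
      Tensor α (.dim outputSize (.dim hiddenSize .scalar)) :=
  reduceSumSequenceSpec h
```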
Reverse a sequence along its time axis.

PyTorch analogy: `seq.flip(dims=[0])`.
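A sketch of the reversal (the index arithmetic shown is one plausible choice):

```lean
/-- Sketch: timestep `t` of the output reads timestep `seqLen - 1 - t` of the
    input, mirroring `seq.flip(dims=[0])`. -/
def reverseSequenceSpec {α : Type} {seqLen : Nat} {s : Shape} :
    Tensor α (.dim seqLen s) → Tensor α (.dim seqLen s)
  | .dim steps => .dim fun t =>
      steps ⟨seqLen - 1 - t.val, by have := t.isLt; omega⟩
```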
Concatenate two sequences along the feature dimension.

PyTorch analogy: `torch.cat([seq1, seq2], dim=1)` when shapes are `(seqLen, dim1)` and `(seqLen, dim2)`.

Do not confuse this with `Spec.Tensor.concatSequenceSpec` (defined in `NN.Spec.Core.TensorReductionShape`), which concatenates along the time axis (axis 0).
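A sketch of the feature-axis version; the point to notice is that the output's feature length is `d₁ + d₂` at the type level, whereas the time-axis version would instead add the sequence lengths (names and proof details are illustrative):

```lean
/-- Sketch: feature index `i < d₁` reads from the first input, the rest from
    the second, so the output feature axis has length `d₁ + d₂` by construction. -/
def concatSequenceSpec {α : Type} {seqLen d₁ d₂ : Nat} :
    Tensor α (.dim seqLen (.dim d₁ .scalar)) →
      Tensor α (.dim seqLen (.dim d₂ .scalar)) →
      Tensor α (.dim seqLen (.dim (d₁ + d₂) .scalar))
  | .dim xs, .dim ys => .dim fun t =>
      match xs t, ys t with
      | .dim a, .dim b => .dim fun i =>
          if h : i.val < d₁ then a ⟨i.val, h⟩
          else b ⟨i.val - d₁, by have := i.isLt; omega⟩
```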
Reduce a batch of vectors by summing over the batch axis (explicit axis = 0).

This axis-explicit helper takes an `axis` argument plus a proof that it equals 0. Prefer `reduce_sum_vec2` when the axis is fixed by the surrounding code.
Same as `reduce_sum_vec`, but avoids the explicit axis argument.

We define this with a small `Nat` loop so the recursion is obvious to Lean and doesn't depend on `List.finRange`. It's the same mathematical operation: sum all batch vectors pointwise.
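A sketch of the pair in the hypothetical model. The `OfNat α 0` assumption (giving the empty batch a sum) and the helper names are ours, not necessarily the library's; the point is the shape of the `Nat` loop:

```lean
/-- Nat-loop sketch: sum the first `k` batch entries at one coordinate. -/
def sumUpTo {α : Type} [Add α] [OfNat α 0] {batch : Nat}
    (f : Fin batch → α) : (k : Nat) → k ≤ batch → α
  | 0, _ => 0
  | k + 1, hk => sumUpTo f k (Nat.le_of_succ_le hk) + f ⟨k, hk⟩

/-- Read one scalar entry out of a spec-level vector. -/
def entry {α : Type} {d : Nat} : Tensor α (.dim d .scalar) → Fin d → α
  | .dim xs, i =>
    match xs i with
    | .scalar a => a

/-- Sketch of `reduce_sum_vec2`: a pointwise batch sum whose recursion is a
    plain `Nat` loop, so it is easy for Lean to unfold and induct on. -/
def reduce_sum_vec2 {α : Type} [Add α] [OfNat α 0] {batch d : Nat} :
    Tensor α (.dim batch (.dim d .scalar)) → Tensor α (.dim d .scalar)
  | .dim rows => .dim fun i =>
      .scalar (sumUpTo (fun b => entry (rows b) i) batch (Nat.le_refl batch))

/-- Sketch of the axis-explicit variant: the `axis = 0` proof documents (and
    enforces) which axis is reduced. -/
def reduce_sum_vec {α : Type} [Add α] [OfNat α 0] {batch d : Nat}
    (axis : Nat) (_h : axis = 0) :
    Tensor α (.dim batch (.dim d .scalar)) → Tensor α (.dim d .scalar) :=
  reduce_sum_vec2
```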