TorchLean API

NN.Spec.Module.RnnModels

RNN models (spec)

This file builds higher-level RNN architectures by composing module specs (SpecChain): sequence-to-sequence models, many-to-one classifiers, bidirectional models, and language models.

The actual cell dynamics live in NN/Spec/Layers/Rnn.lean; this file is "model wiring".

def Spec.simpleRnnModelSpec {α : Type} [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] {seqLen inputSize hiddenSize outputSize : ℕ} (rnn_spec : RNNSpec α inputSize hiddenSize) (linearSpec : LinearSpec α hiddenSize outputSize) :
ModSpec.SpecChain α (Shape.dim seqLen (Shape.dim inputSize Shape.scalar)) (Shape.dim seqLen (Shape.dim outputSize Shape.scalar))

A simple sequence-to-sequence RNN "model wiring" expressed as a SpecChain.

This composes an RNNModuleSpec with a per-time-step linear projection (LinearSeqModuleSpec), so the overall model maps:

  • input shape: [seqLen, inputSize]
  • output shape: [seqLen, outputSize]

PyTorch analogue: applying nn.RNN (or a custom recurrent cell) followed by a nn.Linear at each time step.

Instances For
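As an illustrative sketch of the intended usage (the concrete dimensions and variable names here are hypothetical, not part of the API):

```lean
-- Hypothetical wiring: length-10 sequences of 8-dim inputs, 16 hidden
-- units, and a 4-dim per-step output head.
example {α : Type} [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2]
    (rnn : RNNSpec α 8 16) (head : LinearSpec α 16 4) :
    ModSpec.SpecChain α (Shape.dim 10 (Shape.dim 8 Shape.scalar))
      (Shape.dim 10 (Shape.dim 4 Shape.scalar)) :=
  Spec.simpleRnnModelSpec rnn head
```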
def Spec.rnnClassifierModelSpec {α : Type} [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] {seqLen inputSize hiddenSize numClasses : ℕ} (rnn_spec : RNNSpec α inputSize hiddenSize) (classifier_spec : LinearSpec α hiddenSize numClasses) (h : seqLen ≠ 0) :

A many-to-one RNN classifier expressed as a SpecChain.

This runs an RNN over the input sequence and then applies a linear classifier head to the final hidden state.

PyTorch analogue: nn.RNN (or nn.GRU/nn.LSTM) feeding an nn.Linear head, taking the last hidden/output.

Instances For
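A sketch of wiring the classifier chain (dimensions and the definition name are illustrative; the result type is left to be inferred):

```lean
-- Hypothetical wiring: classify length-10 sequences of 8-dim inputs
-- into 3 classes; `by decide` discharges the side condition on seqLen.
def demoClassifierChain {α : Type} [Context α]
    [DecidableRel fun (x1 x2 : α) => x1 > x2]
    (rnn : RNNSpec α 8 16) (cls : LinearSpec α 16 3) :=
  Spec.rnnClassifierModelSpec (seqLen := 10) rnn cls (by decide)
```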
def Spec.bilstmClassifierSpec {α : Type} [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] {seqLen inputSize hiddenSize numClasses : ℕ} (forward_lstm backward_lstm : LSTMSpec α inputSize hiddenSize) (classifier_spec : LinearSpec α (hiddenSize + hiddenSize) numClasses) (h : seqLen ≠ 0) :

A bidirectional LSTM classifier expressed as a SpecChain.

This uses a BiLSTMModuleSpec to combine forward/backward LSTM passes, concatenates the two hidden-state streams, and applies a linear classifier head.

PyTorch analogue: nn.LSTM(bidirectional=True) followed by an nn.Linear classifier.

Instances For
def Spec.multilayerRnnSpec {α : Type} [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] {seqLen inputSize hiddenSize outputSize : ℕ} (rnn1_spec : RNNSpec α inputSize hiddenSize) (rnn2_spec : RNNSpec α hiddenSize hiddenSize) (linearSpec : LinearSpec α hiddenSize outputSize) :
ModSpec.SpecChain α (Shape.dim seqLen (Shape.dim inputSize Shape.scalar)) (Shape.dim seqLen (Shape.dim outputSize Shape.scalar))

A two-layer RNN encoder with a per-step linear projection, expressed as a SpecChain.

The second recurrent layer consumes the hidden stream of the first.

Instances For
def Spec.rnnLanguageModelSpec {α : Type} [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] {seqLen vocabSize hiddenSize : ℕ} (embedding_spec : LinearSpec α vocabSize hiddenSize) (rnn_spec : RNNSpec α hiddenSize hiddenSize) (output_spec : LinearSpec α hiddenSize vocabSize) :

A simple RNN language model spec: "embedding" linear map, RNN core, and output projection.

This file treats embedding/projection as LinearSpecs. A common spec-level usage is that tokens are one-hot vectors of length vocabSize, so the embedding is just a matrix multiply.

PyTorch analogue: nn.Embedding (conceptually) + nn.RNN + nn.Linear vocabulary projection.

Instances For
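A sketch of assembling the language-model chain (sizes and the definition name are hypothetical; the result type is inferred):

```lean
-- Hypothetical wiring: a vocabulary of 100 one-hot tokens, a 32-unit
-- recurrent core, and a projection back to the vocabulary.
def demoLmChain {α : Type} [Context α]
    [DecidableRel fun (x1 x2 : α) => x1 > x2]
    (embed : LinearSpec α 100 32) (rnn : RNNSpec α 32 32)
    (proj : LinearSpec α 32 100) :=
  Spec.rnnLanguageModelSpec (seqLen := 20) embed rnn proj
```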
structure Spec.SimpleRNNModel (α : Type) (inputSize hiddenSize outputSize : ℕ) :

Bundle of parameters for a simple single-layer RNN model with a linear output head.

This is a "record of specs" representation, as opposed to the SpecChain representation used above.

• rnn : RNNSpec α inputSize hiddenSize

  The recurrent core.

• output_layer : LinearSpec α hiddenSize outputSize

  The linear output head.

Instances For
structure Spec.MultiLayerRNNModel (α : Type) (inputSize hiddenSize outputSize numLayers : ℕ) :

Bundle of parameters for a multi-layer RNN model.

layers is indexed by Fin numLayers and selects the appropriate input size for the first layer versus subsequent layers.

• layers (i : Fin numLayers) : RNNSpec α (if i = 0 then inputSize else hiddenSize) hiddenSize

  The layer stack: the first layer reads inputSize-wide vectors, subsequent layers read the hidden stream.

• output_layer : LinearSpec α hiddenSize outputSize

  The linear output head.

Instances For
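The dependent type of layers can be seen by projecting out the first layer (the sizes here are illustrative):

```lean
-- For i = 0 the field's type mentions inputSize; later indices mention
-- hiddenSize. Projecting at index 0 exposes the dependent `if`.
example {α : Type} (m : Spec.MultiLayerRNNModel α 8 16 4 3) :
    RNNSpec α (if (0 : Fin 3) = 0 then 8 else 16) 16 :=
  m.layers 0
```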
structure Spec.RNNClassifier (α : Type) (inputSize hiddenSize numClasses : ℕ) :

Bundle of parameters for a many-to-one RNN classifier.

The classifier head is a linear layer applied to the final hidden state.

• rnn : RNNSpec α inputSize hiddenSize

  The recurrent core.

• classifier : LinearSpec α hiddenSize numClasses

  The linear classifier head.

Instances For
structure Spec.RNNGenerator (α : Type) (vocabSize hiddenSize : ℕ) :

Bundle of parameters for a many-to-many RNN generator (language-model style).

This includes a (linear) embedding, recurrent core, and output projection back to vocabulary.

• embedding : LinearSpec α vocabSize hiddenSize

  The (linear) token embedding.

• rnn : RNNSpec α hiddenSize hiddenSize

  The recurrent core.

• output_projection : LinearSpec α hiddenSize vocabSize

  The projection back to vocabulary space.

Instances For
structure Spec.BiRNNModel (α : Type) (inputSize hiddenSize outputSize : ℕ) :

Bundle of parameters for a bidirectional RNN model with an output head.

The output head consumes the concatenation of forward and backward hidden states.

• forward_rnn : RNNSpec α inputSize hiddenSize

  The forward-direction recurrent core.

• backward_rnn : RNNSpec α inputSize hiddenSize

  The backward-direction recurrent core.

• output_layer : LinearSpec α (hiddenSize + hiddenSize) outputSize

  The output head over concatenated forward/backward states.

Instances For
def Spec.simpleRnnForward {α : Type} [Context α] {inputSize hiddenSize outputSize : ℕ} (model : SimpleRNNModel α inputSize hiddenSize outputSize) (input : Tensor α (Shape.dim inputSize Shape.scalar)) (hidden : Tensor α (Shape.dim hiddenSize Shape.scalar)) :
Tensor α (Shape.dim outputSize Shape.scalar) × Tensor α (Shape.dim hiddenSize Shape.scalar)

One-step forward pass for SimpleRNNModel.

Given an input vector and current hidden state, compute (output, new_hidden) using the RNN cell and the linear head.

Instances For
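A type-level sketch of one step (the dimensions are hypothetical):

```lean
-- One step: feed an 8-dim input and a 16-dim hidden state, get back a
-- 4-dim output together with the updated hidden state.
example {α : Type} [Context α] (m : Spec.SimpleRNNModel α 8 16 4)
    (x : Tensor α (Shape.dim 8 Shape.scalar))
    (h : Tensor α (Shape.dim 16 Shape.scalar)) :
    Tensor α (Shape.dim 4 Shape.scalar) ×
      Tensor α (Shape.dim 16 Shape.scalar) :=
  Spec.simpleRnnForward m x h
```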
def Spec.simpleRnnSequenceForward {α : Type} [Context α] {seqLen inputSize hiddenSize outputSize : ℕ} (model : SimpleRNNModel α inputSize hiddenSize outputSize) (inputs : Tensor α (Shape.dim seqLen (Shape.dim inputSize Shape.scalar))) (initial_hidden : Tensor α (Shape.dim hiddenSize Shape.scalar)) (h : 0 < seqLen) :
Tensor α (Shape.dim seqLen (Shape.dim outputSize Shape.scalar)) × Tensor α (Shape.dim hiddenSize Shape.scalar)

Sequence forward pass for SimpleRNNModel.

Runs rnn_sequence_spec over the full sequence, applies the output layer at each time step, and returns both the per-step outputs and the final hidden state.

Instances For
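A sketch of unrolling over a concrete length (sizes are hypothetical):

```lean
-- Unroll over a length-10 sequence; `by decide` discharges 0 < seqLen.
example {α : Type} [Context α] (m : Spec.SimpleRNNModel α 8 16 4)
    (xs : Tensor α (Shape.dim 10 (Shape.dim 8 Shape.scalar)))
    (h0 : Tensor α (Shape.dim 16 Shape.scalar)) :
    Tensor α (Shape.dim 10 (Shape.dim 4 Shape.scalar)) ×
      Tensor α (Shape.dim 16 Shape.scalar) :=
  Spec.simpleRnnSequenceForward m xs h0 (by decide)
```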
def Spec.rnnClassifierForward {α : Type} [Context α] {seqLen inputSize hiddenSize numClasses : ℕ} (model : RNNClassifier α inputSize hiddenSize numClasses) (inputs : Tensor α (Shape.dim seqLen (Shape.dim inputSize Shape.scalar))) (initial_hidden : Tensor α (Shape.dim hiddenSize Shape.scalar)) (h : 0 < seqLen) :
Tensor α (Shape.dim numClasses Shape.scalar)

Forward pass for an RNNClassifier (many-to-one).

This runs the recurrent core over the input sequence and feeds the last hidden state to the classifier head.

Instances For
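A type-level sketch of the many-to-one shape (the dimensions are hypothetical):

```lean
-- Many-to-one: a length-10 sequence collapses to a 3-class score vector.
example {α : Type} [Context α] (m : Spec.RNNClassifier α 8 16 3)
    (xs : Tensor α (Shape.dim 10 (Shape.dim 8 Shape.scalar)))
    (h0 : Tensor α (Shape.dim 16 Shape.scalar)) :
    Tensor α (Shape.dim 3 Shape.scalar) :=
  Spec.rnnClassifierForward m xs h0 (by decide)
```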
def Spec.rnnGeneratorForward {α : Type} [Context α] {seqLen vocabSize hiddenSize : ℕ} (model : RNNGenerator α vocabSize hiddenSize) (input_tokens : Tensor α (Shape.dim seqLen (Shape.dim vocabSize Shape.scalar))) (initial_hidden : Tensor α (Shape.dim hiddenSize Shape.scalar)) (h : 0 < seqLen) :
Tensor α (Shape.dim seqLen (Shape.dim vocabSize Shape.scalar)) × Tensor α (Shape.dim hiddenSize Shape.scalar)

Forward pass for an RNNGenerator (many-to-many).

This applies an "embedding" linear map to each token, runs the RNN, and projects each hidden state back into vocabulary space.

Instances For
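A type-level sketch of the many-to-many shape (vocabulary and hidden sizes are hypothetical):

```lean
-- Many-to-many: one-hot tokens in, per-step vocabulary scores out,
-- plus the final hidden state.
example {α : Type} [Context α] (m : Spec.RNNGenerator α 100 32)
    (toks : Tensor α (Shape.dim 20 (Shape.dim 100 Shape.scalar)))
    (h0 : Tensor α (Shape.dim 32 Shape.scalar)) :
    Tensor α (Shape.dim 20 (Shape.dim 100 Shape.scalar)) ×
      Tensor α (Shape.dim 32 Shape.scalar) :=
  Spec.rnnGeneratorForward m toks h0 (by decide)
```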
def Spec.birnnForward {α : Type} [Context α] {seqLen inputSize hiddenSize outputSize : ℕ} (model : BiRNNModel α inputSize hiddenSize outputSize) (inputs : Tensor α (Shape.dim seqLen (Shape.dim inputSize Shape.scalar))) (forward_hidden backward_hidden : Tensor α (Shape.dim hiddenSize Shape.scalar)) :
Tensor α (Shape.dim seqLen (Shape.dim outputSize Shape.scalar))

Forward pass for a bidirectional RNN model.

This runs a forward RNN on the sequence, a backward RNN on the reversed sequence, concatenates the two state streams per time step, and applies the output head.

Instances For
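A type-level sketch (the dimensions are hypothetical); note that both direction-initial hidden states are supplied explicitly:

```lean
-- Forward and backward initial hidden states share the same shape.
example {α : Type} [Context α] (m : Spec.BiRNNModel α 8 16 4)
    (xs : Tensor α (Shape.dim 10 (Shape.dim 8 Shape.scalar)))
    (hf hb : Tensor α (Shape.dim 16 Shape.scalar)) :
    Tensor α (Shape.dim 10 (Shape.dim 4 Shape.scalar)) :=
  Spec.birnnForward m xs hf hb
```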
def Spec.multilayerRnnForwardSingle {α : Type} [Context α] {inputSize hiddenSize outputSize : ℕ} (layers : RNNSpec α inputSize hiddenSize) (output_layer : LinearSpec α hiddenSize outputSize) (inputs : Tensor α (Shape.dim inputSize Shape.scalar)) (hidden : Tensor α (Shape.dim hiddenSize Shape.scalar)) :
Tensor α (Shape.dim outputSize Shape.scalar) × Tensor α (Shape.dim hiddenSize Shape.scalar)

One-step helper: run a single RNN cell update and apply an output projection.

This is used by some small examples that do not build a full SpecChain or multi-layer bundle.

Instances For
def Spec.simpleRnnBackward {α : Type} [Context α] {seqLen inputSize hiddenSize outputSize : ℕ} (model : SimpleRNNModel α inputSize hiddenSize outputSize) (inputs : Tensor α (Shape.dim seqLen (Shape.dim inputSize Shape.scalar))) (hidden_states : Tensor α (Shape.dim seqLen (Shape.dim hiddenSize Shape.scalar))) (grad_outputs : Tensor α (Shape.dim seqLen (Shape.dim outputSize Shape.scalar))) (h : 0 < seqLen) :
Tensor α (Shape.dim hiddenSize (Shape.dim (inputSize + hiddenSize) Shape.scalar)) × Tensor α (Shape.dim hiddenSize Shape.scalar) × Tensor α (Shape.dim outputSize (Shape.dim hiddenSize Shape.scalar)) × Tensor α (Shape.dim outputSize Shape.scalar) × Tensor α (Shape.dim seqLen (Shape.dim inputSize Shape.scalar))

Backward pass for SimpleRNNModel over a full sequence.

Returns gradients for:

• the RNN cell parameters,
• the output head parameters, and
• the input sequence.

This is a spec-level reference implementation; performance is not a goal here.

Instances For
def Spec.map2SequenceSpec {α : Type} {seqLen dim1 dim2 : ℕ} (f : Tensor α (Shape.dim dim1 Shape.scalar) → Tensor α (Shape.dim dim2 Shape.scalar) → α) (seq1 : Tensor α (Shape.dim seqLen (Shape.dim dim1 Shape.scalar))) (seq2 : Tensor α (Shape.dim seqLen (Shape.dim dim2 Shape.scalar))) :

Map a scalar-valued function over two aligned sequences, producing a sequence of scalars.

This is a lightweight helper used in the example loss definitions below.

Instances For
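A sketch of the intended usage (the definition name, score function, and dimensions are illustrative; the result type is inferred):

```lean
-- Hypothetical: a per-step scalar score between two aligned streams.
def demoScores {α : Type}
    (score : Tensor α (Shape.dim 4 Shape.scalar) →
      Tensor α (Shape.dim 6 Shape.scalar) → α)
    (xs : Tensor α (Shape.dim 10 (Shape.dim 4 Shape.scalar)))
    (ys : Tensor α (Shape.dim 10 (Shape.dim 6 Shape.scalar))) :=
  Spec.map2SequenceSpec score xs ys
```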
def Spec.sequenceClassificationLoss {α : Type} [Context α] {seqLen numClasses : ℕ} (predictions targets : Tensor α (Shape.dim seqLen (Shape.dim numClasses Shape.scalar))) :
α

Mean cross-entropy loss over a sequence of class-probability predictions.

This is the spec-level analogue of a per-time-step classification loss, averaged across steps. PyTorch analogue: torch.nn.CrossEntropyLoss applied per step and then averaged.

Instances For
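Up to the spec's exact conventions (treating predictions as per-step probability vectors over T steps and C classes), this is the usual averaged cross-entropy:

```latex
% Assumed form: the mean over T time steps of per-step cross-entropy.
\mathcal{L}(p, t) \;=\; \frac{1}{T} \sum_{s=1}^{T}
  \Bigl( -\sum_{c=1}^{C} t_{s,c} \,\log p_{s,c} \Bigr)
```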
def Spec.languageModelingLoss {α : Type} [Context α] {seqLen vocabSize : ℕ} (predictions targets : Tensor α (Shape.dim seqLen (Shape.dim vocabSize Shape.scalar))) :
α

Language-modeling loss (alias of sequenceClassificationLoss).

This treats each time step as a vocabulary classification task.

Instances For
def Spec.simpleRNNToModuleSpec {α : Type} [Context α] {seqLen inputSize hiddenSize outputSize : ℕ} (model : SimpleRNNModel α inputSize hiddenSize outputSize) (h : 0 < seqLen) :

Package a SimpleRNNModel as an NNModuleSpec for use with the module-spec tooling.

The export_func.toPyTorch string is documentation-oriented: it indicates the intended PyTorch analogue, but is not itself an executable exporter.

Instances For
def Spec.rnnClassifierToModuleSpec {α : Type} [Context α] {seqLen inputSize hiddenSize numClasses : ℕ} (model : RNNClassifier α inputSize hiddenSize numClasses) (h : 0 < seqLen) :

Package an RNNClassifier as an NNModuleSpec.

As with simpleRNNToModuleSpec, this is primarily used to plug the spec model into the common module pipeline (and to generate a PyTorch-analogue string for documentation).

Instances For