TorchLean API

NN.Spec.Module.RnnModels

RNN models (spec)

This file builds higher-level RNN architectures by composing module specs (SpecChain): sequence-to-sequence models, many-to-one classifiers, bidirectional models, and language models.

The actual cell dynamics live in NN/Spec/Layers/Rnn.lean; this file is "model wiring".

def Spec.simpleRnnModelSpec {α : Type} [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] {seqLen inputSize hiddenSize outputSize : ℕ} (rnn_spec : RNNSpec α inputSize hiddenSize) (linearSpec : LinearSpec α hiddenSize outputSize) :
ModSpec.SpecChain α (Shape.dim seqLen (Shape.dim inputSize Shape.scalar)) (Shape.dim seqLen (Shape.dim outputSize Shape.scalar))

A simple sequence-to-sequence RNN "model wiring" expressed as a SpecChain.

This composes an RNNModuleSpec with a per-time-step linear projection (LinearSeqModuleSpec), so the overall model maps:

  • input shape: [seqLen, inputSize]
  • output shape: [seqLen, outputSize]

PyTorch analogue: applying nn.RNN (or a custom recurrent cell) followed by a nn.Linear at each time step.

Instances For
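As an illustrative sketch of the intended usage (the concrete dimensions and variable names here are hypothetical, not part of the API):

```lean
-- Hypothetical wiring: length-10 sequences of 8-dim inputs, 16 hidden
-- units, and a 4-dim per-step output head.
example {α : Type} [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2]
    (rnn : RNNSpec α 8 16) (head : LinearSpec α 16 4) :
    ModSpec.SpecChain α (Shape.dim 10 (Shape.dim 8 Shape.scalar))
      (Shape.dim 10 (Shape.dim 4 Shape.scalar)) :=
  Spec.simpleRnnModelSpec rnn head
```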
def Spec.rnnClassifierModelSpec {α : Type} [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] {seqLen inputSize hiddenSize numClasses : ℕ} (rnn_spec : RNNSpec α inputSize hiddenSize) (classifier_spec : LinearSpec α hiddenSize numClasses) (h : seqLen ≠ 0) :

A many-to-one RNN classifier expressed as a SpecChain.

This runs an RNN over the input sequence and then applies a linear classifier head to the final hidden state.

PyTorch analogue: nn.RNN (or nn.GRU/nn.LSTM) feeding an nn.Linear head, taking the last hidden/output.

Instances For
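A sketch of wiring the classifier chain (dimensions and the definition name are illustrative; the result type is left to be inferred):

```lean
-- Hypothetical wiring: classify length-10 sequences of 8-dim inputs
-- into 3 classes; `by decide` discharges the side condition on seqLen.
def demoClassifierChain {α : Type} [Context α]
    [DecidableRel fun (x1 x2 : α) => x1 > x2]
    (rnn : RNNSpec α 8 16) (cls : LinearSpec α 16 3) :=
  Spec.rnnClassifierModelSpec (seqLen := 10) rnn cls (by decide)
```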
def Spec.bilstmClassifierSpec {α : Type} [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] {seqLen inputSize hiddenSize numClasses : ℕ} (forward_lstm backward_lstm : LSTMSpec α inputSize hiddenSize) (classifier_spec : LinearSpec α (hiddenSize + hiddenSize) numClasses) (h : seqLen ≠ 0) :

A bidirectional LSTM classifier expressed as a SpecChain.

This uses a BiLSTMModuleSpec to combine forward/backward LSTM passes, concatenates the two hidden-state streams, and applies a linear classifier head.

PyTorch analogue: nn.LSTM(bidirectional=True) followed by an nn.Linear classifier.

Instances For
def Spec.multilayerRnnSpec {α : Type} [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] {seqLen inputSize hiddenSize outputSize : ℕ} (rnn1_spec : RNNSpec α inputSize hiddenSize) (rnn2_spec : RNNSpec α hiddenSize hiddenSize) (linearSpec : LinearSpec α hiddenSize outputSize) :
ModSpec.SpecChain α (Shape.dim seqLen (Shape.dim inputSize Shape.scalar)) (Shape.dim seqLen (Shape.dim outputSize Shape.scalar))

A two-layer RNN encoder with a per-step linear projection, expressed as a SpecChain.

The second recurrent layer consumes the hidden stream of the first.

Instances For
def Spec.rnnLanguageModelSpec {α : Type} [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] {seqLen vocabSize hiddenSize : ℕ} (embedding_spec : LinearSpec α vocabSize hiddenSize) (rnn_spec : RNNSpec α hiddenSize hiddenSize) (output_spec : LinearSpec α hiddenSize vocabSize) :

A simple RNN language model spec: "embedding" linear map, RNN core, and output projection.

This file treats embedding/projection as LinearSpecs. A common spec-level usage is that tokens are one-hot vectors of length vocabSize, so the embedding is just a matrix multiply.

PyTorch analogue: nn.Embedding (conceptually) + nn.RNN + nn.Linear vocabulary projection.

Instances For
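A sketch of assembling the language-model chain (sizes and the definition name are hypothetical; the result type is inferred):

```lean
-- Hypothetical wiring: a vocabulary of 100 one-hot tokens, a 32-unit
-- recurrent core, and a projection back to the vocabulary.
def demoLmChain {α : Type} [Context α]
    [DecidableRel fun (x1 x2 : α) => x1 > x2]
    (embed : LinearSpec α 100 32) (rnn : RNNSpec α 32 32)
    (proj : LinearSpec α 32 100) :=
  Spec.rnnLanguageModelSpec (seqLen := 20) embed rnn proj
```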
structure Spec.SimpleRNNModel (α : Type) (inputSize hiddenSize outputSize : ℕ) :

Bundle of parameters for a simple single-layer RNN model with a linear output head.

This is a "record of specs" representation, as opposed to the SpecChain representation used above.

• rnn : RNNSpec α inputSize hiddenSize

  The recurrent core.

• output_layer : LinearSpec α hiddenSize outputSize

  The linear output head.

Instances For
structure Spec.MultiLayerRNNModel (α : Type) (inputSize hiddenSize outputSize numLayers : ℕ) :

Bundle of parameters for a multi-layer RNN model.

layers is indexed by Fin numLayers and selects the appropriate input size for the first layer versus subsequent layers.

• layers (i : Fin numLayers) : RNNSpec α (if i = 0 then inputSize else hiddenSize) hiddenSize

  The layer stack: the first layer reads inputSize-wide vectors, subsequent layers read the hidden stream.

• output_layer : LinearSpec α hiddenSize outputSize

  The linear output head.

Instances For
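The dependent type of layers can be seen by projecting out the first layer (the sizes here are illustrative):

```lean
-- For i = 0 the field's type mentions inputSize; later indices mention
-- hiddenSize. Projecting at index 0 exposes the dependent `if`.
example {α : Type} (m : Spec.MultiLayerRNNModel α 8 16 4 3) :
    RNNSpec α (if (0 : Fin 3) = 0 then 8 else 16) 16 :=
  m.layers 0
```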
structure Spec.RNNClassifier (α : Type) (inputSize hiddenSize numClasses : ℕ) :

Bundle of parameters for a many-to-one RNN classifier.

The classifier head is a linear layer applied to the final hidden state.

• rnn : RNNSpec α inputSize hiddenSize

  The recurrent core.

• classifier : LinearSpec α hiddenSize numClasses

  The linear classifier head.

Instances For
structure Spec.RNNGenerator (α : Type) (vocabSize hiddenSize : ℕ) :

Bundle of parameters for a many-to-many RNN generator (language-model style).

This includes a (linear) embedding, recurrent core, and output projection back to vocabulary.

• embedding : LinearSpec α vocabSize hiddenSize

  The (linear) token embedding.

• rnn : RNNSpec α hiddenSize hiddenSize

  The recurrent core.

• output_projection : LinearSpec α hiddenSize vocabSize

  The projection back to vocabulary space.

Instances For
structure Spec.BiRNNModel (α : Type) (inputSize hiddenSize outputSize : ℕ) :

Bundle of parameters for a bidirectional RNN model with an output head.

The output head consumes the concatenation of forward and backward hidden states.

• forward_rnn : RNNSpec α inputSize hiddenSize

  The forward-direction recurrent core.

• backward_rnn : RNNSpec α inputSize hiddenSize

  The backward-direction recurrent core.

• output_layer : LinearSpec α (hiddenSize + hiddenSize) outputSize

  The output head over concatenated forward/backward states.

Instances For
def Spec.simpleRnnForward {α : Type} [Context α] {inputSize hiddenSize outputSize : ℕ} (model : SimpleRNNModel α inputSize hiddenSize outputSize) (input : Tensor α (Shape.dim inputSize Shape.scalar)) (hidden : Tensor α (Shape.dim hiddenSize Shape.scalar)) :
Tensor α (Shape.dim outputSize Shape.scalar) × Tensor α (Shape.dim hiddenSize Shape.scalar)

One-step forward pass for SimpleRNNModel.

Given an input vector and current hidden state, compute (output, new_hidden) using the RNN cell and the linear head.

Instances For
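A type-level sketch of one step (the dimensions are hypothetical):

```lean
-- One step: feed an 8-dim input and a 16-dim hidden state, get back a
-- 4-dim output together with the updated hidden state.
example {α : Type} [Context α] (m : Spec.SimpleRNNModel α 8 16 4)
    (x : Tensor α (Shape.dim 8 Shape.scalar))
    (h : Tensor α (Shape.dim 16 Shape.scalar)) :
    Tensor α (Shape.dim 4 Shape.scalar) ×
      Tensor α (Shape.dim 16 Shape.scalar) :=
  Spec.simpleRnnForward m x h
```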
def Spec.simpleRnnSequenceForward {α : Type} [Context α] {seqLen inputSize hiddenSize outputSize : ℕ} (model : SimpleRNNModel α inputSize hiddenSize outputSize) (inputs : Tensor α (Shape.dim seqLen (Shape.dim inputSize Shape.scalar))) (initial_hidden : Tensor α (Shape.dim hiddenSize Shape.scalar)) (h : 0 < seqLen) :
Tensor α (Shape.dim seqLen (Shape.dim outputSize Shape.scalar)) × Tensor α (Shape.dim hiddenSize Shape.scalar)

Sequence forward pass for SimpleRNNModel.

Runs rnn_sequence_spec over the full sequence, applies the output layer at each time step, and returns both the per-step outputs and the final hidden state.

Instances For
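A sketch of unrolling over a concrete length (sizes are hypothetical):

```lean
-- Unroll over a length-10 sequence; `by decide` discharges 0 < seqLen.
example {α : Type} [Context α] (m : Spec.SimpleRNNModel α 8 16 4)
    (xs : Tensor α (Shape.dim 10 (Shape.dim 8 Shape.scalar)))
    (h0 : Tensor α (Shape.dim 16 Shape.scalar)) :
    Tensor α (Shape.dim 10 (Shape.dim 4 Shape.scalar)) ×
      Tensor α (Shape.dim 16 Shape.scalar) :=
  Spec.simpleRnnSequenceForward m xs h0 (by decide)
```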
def Spec.rnnClassifierForward {α : Type} [Context α] {seqLen inputSize hiddenSize numClasses : ℕ} (model : RNNClassifier α inputSize hiddenSize numClasses) (inputs : Tensor α (Shape.dim seqLen (Shape.dim inputSize Shape.scalar))) (initial_hidden : Tensor α (Shape.dim hiddenSize Shape.scalar)) (h : 0 < seqLen) :
Tensor α (Shape.dim numClasses Shape.scalar)

Forward pass for an RNNClassifier (many-to-one).

This runs the recurrent core over the input sequence and feeds the last hidden state to the classifier head.

Instances For
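A type-level sketch of the many-to-one shape (the dimensions are hypothetical):

```lean
-- Many-to-one: a length-10 sequence collapses to a 3-class score vector.
example {α : Type} [Context α] (m : Spec.RNNClassifier α 8 16 3)
    (xs : Tensor α (Shape.dim 10 (Shape.dim 8 Shape.scalar)))
    (h0 : Tensor α (Shape.dim 16 Shape.scalar)) :
    Tensor α (Shape.dim 3 Shape.scalar) :=
  Spec.rnnClassifierForward m xs h0 (by decide)
```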
def Spec.rnnGeneratorForward {α : Type} [Context α] {seqLen vocabSize hiddenSize : ℕ} (model : RNNGenerator α vocabSize hiddenSize) (input_tokens : Tensor α (Shape.dim seqLen (Shape.dim vocabSize Shape.scalar))) (initial_hidden : Tensor α (Shape.dim hiddenSize Shape.scalar)) (h : 0 < seqLen) :
Tensor α (Shape.dim seqLen (Shape.dim vocabSize Shape.scalar)) × Tensor α (Shape.dim hiddenSize Shape.scalar)

Forward pass for an RNNGenerator (many-to-many).

This applies an "embedding" linear map to each token, runs the RNN, and projects each hidden state back into vocabulary space.

Instances For
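A type-level sketch of the many-to-many shape (vocabulary and hidden sizes are hypothetical):

```lean
-- Many-to-many: one-hot tokens in, per-step vocabulary scores out,
-- plus the final hidden state.
example {α : Type} [Context α] (m : Spec.RNNGenerator α 100 32)
    (toks : Tensor α (Shape.dim 20 (Shape.dim 100 Shape.scalar)))
    (h0 : Tensor α (Shape.dim 32 Shape.scalar)) :
    Tensor α (Shape.dim 20 (Shape.dim 100 Shape.scalar)) ×
      Tensor α (Shape.dim 32 Shape.scalar) :=
  Spec.rnnGeneratorForward m toks h0 (by decide)
```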
def Spec.birnnForward {α : Type} [Context α] {seqLen inputSize hiddenSize outputSize : ℕ} (model : BiRNNModel α inputSize hiddenSize outputSize) (inputs : Tensor α (Shape.dim seqLen (Shape.dim inputSize Shape.scalar))) (forward_hidden backward_hidden : Tensor α (Shape.dim hiddenSize Shape.scalar)) :
Tensor α (Shape.dim seqLen (Shape.dim outputSize Shape.scalar))

Forward pass for a bidirectional RNN model.

This runs a forward RNN on the sequence, a backward RNN on the reversed sequence, concatenates the two state streams per time step, and applies the output head.

Instances For
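A type-level sketch (the dimensions are hypothetical); note that both direction-initial hidden states are supplied explicitly:

```lean
-- Forward and backward initial hidden states share the same shape.
example {α : Type} [Context α] (m : Spec.BiRNNModel α 8 16 4)
    (xs : Tensor α (Shape.dim 10 (Shape.dim 8 Shape.scalar)))
    (hf hb : Tensor α (Shape.dim 16 Shape.scalar)) :
    Tensor α (Shape.dim 10 (Shape.dim 4 Shape.scalar)) :=
  Spec.birnnForward m xs hf hb
```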
def Spec.multilayerRnnForwardSingle {α : Type} [Context α] {inputSize hiddenSize outputSize : ℕ} (layers : RNNSpec α inputSize hiddenSize) (output_layer : LinearSpec α hiddenSize outputSize) (inputs : Tensor α (Shape.dim inputSize Shape.scalar)) (hidden : Tensor α (Shape.dim hiddenSize Shape.scalar)) :
Tensor α (Shape.dim outputSize Shape.scalar) × Tensor α (Shape.dim hiddenSize Shape.scalar)

One-step helper: run a single RNN cell update and apply an output projection.

This is used by some small examples that do not build a full SpecChain or multi-layer bundle.

Instances For
def Spec.simpleRnnBackward {α : Type} [Context α] {seqLen inputSize hiddenSize outputSize : ℕ} (model : SimpleRNNModel α inputSize hiddenSize outputSize) (inputs : Tensor α (Shape.dim seqLen (Shape.dim inputSize Shape.scalar))) (hidden_states : Tensor α (Shape.dim seqLen (Shape.dim hiddenSize Shape.scalar))) (grad_outputs : Tensor α (Shape.dim seqLen (Shape.dim outputSize Shape.scalar))) (h : 0 < seqLen) :
Tensor α (Shape.dim hiddenSize (Shape.dim (inputSize + hiddenSize) Shape.scalar)) × Tensor α (Shape.dim hiddenSize Shape.scalar) × Tensor α (Shape.dim outputSize (Shape.dim hiddenSize Shape.scalar)) × Tensor α (Shape.dim outputSize Shape.scalar) × Tensor α (Shape.dim seqLen (Shape.dim inputSize Shape.scalar))

Backward pass for SimpleRNNModel over a full sequence.

Returns gradients for:

• the RNN cell parameters,
• the output head parameters, and
• the input sequence.

This is a spec-level reference implementation; performance is not a goal here.

Instances For
def Spec.map2SequenceSpec {α : Type} {seqLen dim1 dim2 : ℕ} (f : Tensor α (Shape.dim dim1 Shape.scalar) → Tensor α (Shape.dim dim2 Shape.scalar) → α) (seq1 : Tensor α (Shape.dim seqLen (Shape.dim dim1 Shape.scalar))) (seq2 : Tensor α (Shape.dim seqLen (Shape.dim dim2 Shape.scalar))) :

Map a scalar-valued function over two aligned sequences, producing a sequence of scalars.

This is a lightweight helper used in the example loss definitions below.

Instances For
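A sketch of the intended usage (the definition name, score function, and dimensions are illustrative; the result type is inferred):

```lean
-- Hypothetical: a per-step scalar score between two aligned streams.
def demoScores {α : Type}
    (score : Tensor α (Shape.dim 4 Shape.scalar) →
      Tensor α (Shape.dim 6 Shape.scalar) → α)
    (xs : Tensor α (Shape.dim 10 (Shape.dim 4 Shape.scalar)))
    (ys : Tensor α (Shape.dim 10 (Shape.dim 6 Shape.scalar))) :=
  Spec.map2SequenceSpec score xs ys
```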
def Spec.sequenceClassificationLoss {α : Type} [Context α] {seqLen numClasses : ℕ} (predictions targets : Tensor α (Shape.dim seqLen (Shape.dim numClasses Shape.scalar))) :
α

Mean cross-entropy loss over a sequence of class-probability predictions.

This is the spec-level analogue of a per-time-step classification loss, averaged across steps. PyTorch analogue: torch.nn.CrossEntropyLoss applied per step and then averaged.

Instances For
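Up to the spec's exact conventions (treating predictions as per-step probability vectors over T steps and C classes), this is the usual averaged cross-entropy:

```latex
% Assumed form: the mean over T time steps of per-step cross-entropy.
\mathcal{L}(p, t) \;=\; \frac{1}{T} \sum_{s=1}^{T}
  \Bigl( -\sum_{c=1}^{C} t_{s,c} \,\log p_{s,c} \Bigr)
```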
def Spec.languageModelingLoss {α : Type} [Context α] {seqLen vocabSize : ℕ} (predictions targets : Tensor α (Shape.dim seqLen (Shape.dim vocabSize Shape.scalar))) :
α

Language-modeling loss (alias of sequenceClassificationLoss).

This treats each time step as a vocabulary classification task.

Instances For
def Spec.simpleRNNToModuleSpec {α : Type} [Context α] {seqLen inputSize hiddenSize outputSize : ℕ} (model : SimpleRNNModel α inputSize hiddenSize outputSize) (h : 0 < seqLen) :

Package a SimpleRNNModel as an NNModuleSpec for use with the module-spec tooling.

The export_func.toPyTorch string is documentation-oriented: it indicates the intended PyTorch analogue, but is not itself an executable exporter.

Instances For
def Spec.rnnClassifierToModuleSpec {α : Type} [Context α] {seqLen inputSize hiddenSize numClasses : ℕ} (model : RNNClassifier α inputSize hiddenSize numClasses) (h : 0 < seqLen) :

Package an RNNClassifier as an NNModuleSpec.

As with simpleRNNToModuleSpec, this is primarily used to plug the spec model into the common module pipeline (and to generate a PyTorch-analogue string for documentation).

Instances For