TorchLean API

NN.Examples.Models.Sequence.Mamba

Mamba Text Training #

Runnable byte-level language-model training with the public Mamba API constructor.

The model is trainable end-to-end:

mamba(seqLen, vocab, stateDim) → linear(stateDim → vocab)

and the same code runs on CPU or CUDA through TorchLean autograd.

python3 scripts/datasets/download_example_data.py --tiny-shakespeare
lake exe -K cuda=true torchlean mamba --cuda --tiny-shakespeare --steps 1 --windows 1 --generate 0

CLI subcommand name used in terminal banners and error messages.

Instances For

    Default JSON loss-curve path for this command.

    Instances For

      Training and generation context length for the Mamba text example.

      Instances For

        Byte tokenizer used by this sequence model.

        Instances For

          Mamba text-model configuration shared by shapes and the constructor.

          Instances For
            @[reducible, inline]

            Input shape: one sequence of one-hot byte tokens.

            Instances For
              @[reducible, inline]

              Output shape: one vocabulary-logit row per input position.

              Instances For

                Public Mamba language-model constructor specialized to the example config.

                Instances For

                  Convert a token window into the one-hot next-token sample consumed by the Mamba model.

                  Instances For

                    Build a finite cyclic training set from corpus text, biased toward the prompt when present.

                    Instances For

                      Print the current argmax prediction beside the prompt and shifted target text.

                      Instances For

                        Convert a prompt window into the typed one-hot input tensor used during generation.

                        Instances For
                          def NN.Examples.Models.Sequence.Mamba.generateSampled (predict : TorchLean.Tensor.T Float σIO (TorchLean.Tensor.T Float τ)) (prompt : String) (steps : ) (temperature : Float) (topK seed : ) :

                          Autoregressively extend a prompt using the trained Mamba parameters.

                          Instances For

                            Train the Mamba language model and print before/after prediction and generation reports.

                            Instances For

                              CLI entrypoint for the Mamba text command.

                              Instances For