TorchLean API

NN.Examples.Models.Sequence.CharGpt

Char-GPT (minGPT-style) Example

This example mirrors the classic "character-level GPT on a single text file" walkthrough popularized by Andrej Karpathy's minGPT/nanoGPT teaching material.

It uses TorchLean's one-hot token interface (batch × seqLen × vocab) so the whole example stays in the same typed tensor world as the rest of the codebase.
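
To make the batch × seqLen × vocab layout concrete, here is a minimal Python sketch of one-hot encoding a batch of token ids. It is an illustration only, not TorchLean code: the function names `one_hot` and `one_hot_batch` are hypothetical, and plain nested lists stand in for typed tensors.

```python
def one_hot(token_ids, vocab_size):
    """Return a seqLen x vocab nested list with a single 1.0 per row."""
    rows = []
    for tid in token_ids:
        if not 0 <= tid < vocab_size:
            raise ValueError(f"token id {tid} outside vocab of size {vocab_size}")
        row = [0.0] * vocab_size
        row[tid] = 1.0
        rows.append(row)
    return rows


def one_hot_batch(batch_ids, vocab_size):
    """Return a batch x seqLen x vocab nested list."""
    return [one_hot(ids, vocab_size) for ids in batch_ids]


# Two sequences of length 3 over a 3-character vocabulary.
batch = one_hot_batch([[0, 2, 1], [1, 1, 0]], vocab_size=3)
shape = (len(batch), len(batch[0]), len(batch[0][0]))
```

Here `shape` follows the (batch, seqLen, vocab) ordering described above, and each innermost row contains exactly one nonzero entry.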

Implementation note: training draws a fresh deterministic random window each step, following the minGPT/nanoGPT batching pattern. The --windows flag is accepted as a corpus-scale hint for shared scripts, but this command does not precompute a fixed window table.
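
The "fresh deterministic random window each step" pattern can be sketched in Python as follows. This is a hedged illustration, not the command's implementation: the names `sample_window_starts` and `make_batch`, and the way the seed and step are mixed into one RNG seed, are assumptions made for the example.

```python
import random


def sample_window_starts(corpus_len, seq_len, batch, step, seed=0):
    """Pick `batch` window start offsets for one training step.

    The RNG is seeded from (seed, step), so a given step always yields
    the same windows without precomputing a table of windows.
    """
    if corpus_len < seq_len + 1:
        raise ValueError("corpus too short for one input/target window")
    rng = random.Random(seed * 1_000_003 + step)  # assumed mixing scheme
    max_start = corpus_len - seq_len - 1
    return [rng.randint(0, max_start) for _ in range(batch)]


def make_batch(token_ids, seq_len, batch, step, seed=0):
    """Return (inputs, targets), where targets are inputs shifted by one."""
    starts = sample_window_starts(len(token_ids), seq_len, batch, step, seed)
    inputs = [token_ids[s:s + seq_len] for s in starts]
    targets = [token_ids[s + 1:s + seq_len + 1] for s in starts]
    return inputs, targets


# A stand-in corpus of 50 token ids.
token_ids = list(range(50))
inputs, targets = make_batch(token_ids, seq_len=8, batch=4, step=3)
```

Calling `make_batch` again with the same `step` and `seed` reproduces the same windows, which is the property that makes a run repeatable without a fixed window table.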

Quick run:

lake build -R -K cuda=true torchlean:exe
lake exe -K cuda=true torchlean chargpt --cuda --tiny-shakespeare --steps 1 --batch 1 --seq-len 1 --generate 0

chargpt is the character-tokenizer teaching path. It rebuilds deterministic training windows from the corpus, so it is not part of the 10-step CUDA check tier. Use gpt2 or text_gpt2 for the compact GPT-style 10-step checks.

CLI subcommand name used in terminal banners and error messages.

Parse corpus flags and return the UTF-8 training text plus remaining CLI arguments.

Build a deterministic character alphabet from the corpus.
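
One common way to get a deterministic alphabet is to sort the distinct characters of the corpus, so the same text always produces the same id assignment. The Python sketch below assumes code-point ordering. The source says only that the alphabet is deterministic, so the ordering and the names `build_alphabet` and `encode` are illustrative.

```python
def build_alphabet(text):
    """Return (chars, char_to_id) with ids assigned in code-point order."""
    chars = sorted(set(text))
    char_to_id = {ch: i for i, ch in enumerate(chars)}
    return chars, char_to_id


def encode(text, char_to_id):
    """Map each character to its id; raises KeyError on unseen characters."""
    return [char_to_id[ch] for ch in text]


chars, char_to_id = build_alphabet("hello world")
ids = encode("hello", char_to_id)
```

Because the ids depend only on the set of characters present, re-running on the same corpus yields the same vocabulary size and the same encoding.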

Default JSON loss-curve path for this command.

Help text for character-level GPT training.

Decode token ids for terminal output with control characters escaped.
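
Escaping matters because a sampled newline or bell character would otherwise disturb the terminal output. The Python sketch below shows one plausible escaping scheme. The exact escape format used by the command is not stated in the source, so the `\n`/`\t`/`\r` and `\xNN` forms and the names `escape_char` and `decode_escaped` are assumptions.

```python
_NAMED_ESCAPES = {"\n": "\\n", "\t": "\\t", "\r": "\\r"}


def escape_char(ch):
    """Return a visible representation of a single character."""
    if ch in _NAMED_ESCAPES:
        return _NAMED_ESCAPES[ch]
    code = ord(ch)
    if code < 0x20 or code == 0x7F:  # remaining ASCII control characters
        return f"\\x{code:02x}"
    return ch


def decode_escaped(token_ids, chars):
    """Decode ids through the alphabet `chars`, escaping control characters."""
    return "".join(escape_char(chars[t]) for t in token_ids)


# A tiny alphabet containing a newline and a bell character.
chars = ["\n", " ", "a", "b", "\x07"]
shown = decode_escaped([2, 0, 3, 4], chars)
```

The result `shown` contains a literal backslash followed by `n` instead of a real line break, so each sample stays on one terminal line.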

Printable-ASCII generation filter used by --ascii-only.
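
A filter of this kind can be expressed as a per-token boolean mask over the alphabet. The Python sketch below treats "printable ASCII" as the range from space (0x20) through tilde (0x7E). Whether the real filter also admits newline or tab is not stated in the source, and the names `is_printable_ascii` and `ascii_allow_mask` are hypothetical.

```python
def is_printable_ascii(ch):
    """True for a single character in the range space..tilde."""
    return len(ch) == 1 and 0x20 <= ord(ch) <= 0x7E


def ascii_allow_mask(chars):
    """Return one boolean per token id: may this id be generated?"""
    return [is_printable_ascii(ch) for ch in chars]


chars = ["\n", " ", "a", "~", "é"]
mask = ascii_allow_mask(chars)
```

A mask like `mask` can then be consulted during sampling so that disallowed ids are never chosen.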

@[reducible, inline]

Fitted predictor for a runtime-sized character GPT model.

def NN.Examples.Models.Sequence.CharGpt.generateSampledFromIds (batch seqLen vocab : ℕ) (predict : Predictor batch seqLen vocab) (promptIds : List ℕ) (steps : ℕ) (temperature : Float) (topK seed repeatWindow : ℕ) (repeatPenalty : Float) (allowId : ℕ → Bool := fun (x : ℕ) => true) (padId : ℕ := 0) :

Autoregressively extend character token ids using a trained CharGPT model.
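
The parameters in the signature (temperature, topK, seed, repeatWindow, repeatPenalty, allowId, padId) suggest a fairly standard sampling loop. The Python sketch below shows one way such a loop can fit together. It is not the TorchLean implementation: it handles a single sequence rather than a batch, the predictor is assumed to return next-token logits directly, and the left-padding, the CTRL-style repetition penalty, and the greedy fallback for non-positive temperature are all assumptions.

```python
import math
import random


def sample_next(logits, recent, temperature, top_k, repeat_penalty, allow_id, rng):
    """Choose one token id from next-token logits."""
    scored = []
    for tid, logit in enumerate(logits):
        if not allow_id(tid):
            continue  # filtered out, e.g. by an ASCII-only mask
        if tid in recent:
            # Assumed CTRL-style penalty: make recently used ids less likely.
            logit = logit / repeat_penalty if logit > 0 else logit * repeat_penalty
        scored.append((logit, tid))
    if not scored:
        raise ValueError("allow_id rejected every token id")
    scored.sort(key=lambda p: (-p[0], p[1]))  # best first, ties by lowest id
    if top_k > 0:
        scored = scored[:top_k]
    if temperature <= 0:
        return scored[0][1]  # assumed greedy fallback
    best = scored[0][0]
    weights = [math.exp((logit - best) / temperature) for logit, _ in scored]
    return rng.choices([tid for _, tid in scored], weights=weights, k=1)[0]


def generate(predict, prompt_ids, steps, seq_len, temperature=1.0, top_k=0,
             seed=0, repeat_window=0, repeat_penalty=1.0,
             allow_id=lambda _tid: True, pad_id=0):
    """Extend prompt_ids by `steps` sampled token ids."""
    rng = random.Random(seed)
    ids = list(prompt_ids)
    for _ in range(steps):
        context = ids[-seq_len:]
        context = [pad_id] * (seq_len - len(context)) + context  # assumed left-pad
        recent = set(ids[-repeat_window:]) if repeat_window > 0 else set()
        logits = predict(context)
        ids.append(sample_next(logits, recent, temperature, top_k,
                               repeat_penalty, allow_id, rng))
    return ids


def toy_predict(context):
    """Hypothetical stand-in model: prefers (last id + 1) mod 4."""
    last = context[-1]
    return [5.0 if tid == (last + 1) % 4 else 0.0 for tid in range(4)]


greedy = generate(toy_predict, [0], steps=5, seq_len=4, temperature=0.0)
```

With the toy predictor and greedy decoding, `greedy` cycles through the ids in order. With a positive temperature, fixing `seed` makes the sampled continuation repeatable.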

CLI entrypoint for character-level GPT training and sampling.
