TorchLean API


NN.API.Text.Core

API Text

Text and NLP helpers for TorchLean examples.

TorchLean’s executable runtime expects inputs as floating-point tensors, so runtime and autograd code can handle them with the same typed tensor APIs used for parameters. For language models this means token ids are commonly represented as one-hot / token-distribution tensors of shape:

(batch × seqLen × vocab)

and implement “token embeddings” as a matrix multiply against an embedding table.
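
To see why a matrix multiply acts as an embedding lookup, here is a minimal sketch using plain lists rather than TorchLean tensors. The names `oneHot`, `vecMat`, and `table` are hypothetical and exist only for this illustration.

```lean
namespace EmbeddingSketch

/-- One-hot row of length `vocab`; ids outside `[0, vocab)` give all zeros. -/
def oneHot (vocab tokenId : Nat) : List Float :=
  (List.range vocab).map fun v => if v == tokenId then 1.0 else 0.0

/-- Row vector times a `vocab × dModel` table stored as a list of rows. -/
def vecMat (x : List Float) (table : List (List Float)) (dModel : Nat) : List Float :=
  (List.range dModel).map fun j =>
    (List.zipWith (fun (xi : Float) (row : List Float) => xi * row.getD j 0.0) x table).foldl
      (· + ·) 0.0

/-- A toy embedding table with 3 tokens and 2 features per token. -/
def table : List (List Float) := [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

-- Multiplying by the one-hot row for token 1 returns row 1 of `table`.
#eval vecMat (oneHot 3 1) table 2

end EmbeddingSketch
```

Because every other entry of the one-hot row is zero, the sum collapses to the single table row selected by the token id.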

This module provides:

a tokenizer interface (with a byte-level tokenizer),
helpers to turn token streams into one-hot tensors,
“next-token prediction” sample builders used by GPT-style examples,
display helpers for turning model logits back into readable token predictions.

Tokenizers

structure NN.API.text.Tokenizer :

Tokenizer interface (encode/decode).

vocabSize : ℕ
Vocabulary size (token ids are expected to be in [0, vocabSize)).
encode : String → List ℕ
Encode a string into token ids.
decode : List ℕ → String
Decode token ids back into a string.


def NN.API.text.Tokenizer.byteArrayOfIds (ids : List ℕ) :

Convert token ids to bytes, reducing each id modulo 256.


def NN.API.text.Tokenizer.decodeByteIds (ids : List ℕ) :

Decode byte tokens as UTF-8 when possible, falling back to a byte-wise display mode for generated byte streams that are not valid UTF-8. For valid UTF-8 strings, decode (encode s) = s; model output remains printable even when the byte stream is invalid UTF-8.


def NN.API.text.Tokenizer.byte :

Byte-level UTF-8 tokenizer: each byte is one token in [0,256).
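
The byte-level behavior described above can be sketched with core Lean only. This is an illustration under stated assumptions, not the library source: the definitions live in a hypothetical `ByteTokenizerSketch` namespace, and the fallback display format for invalid UTF-8 is invented for this example.

```lean
namespace ByteTokenizerSketch

/-- Token ids to bytes, reducing each id modulo 256. -/
def byteArrayOfIds (ids : List Nat) : ByteArray :=
  ids.foldl (fun (acc : ByteArray) (tok : Nat) => acc.push (UInt8.ofNat (tok % 256)))
    ByteArray.empty

/-- One token per UTF-8 byte. -/
def encode (s : String) : List Nat :=
  s.toUTF8.data.toList.map fun b => b.toNat

/-- Decode as UTF-8 when valid; otherwise fall back to a printable listing
    (the listing format is an assumption made for this sketch). -/
def decode (ids : List Nat) : String :=
  match String.fromUTF8? (byteArrayOfIds ids) with
  | some s => s
  | none   => toString (ids.map fun tok => tok % 256)

#eval encode "hé"            -- [104, 195, 169]
#eval decode (encode "hé")   -- round-trips to the original string
#eval decode [104, 195]      -- truncated UTF-8 sequence, so the fallback is used

end ByteTokenizerSketch
```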


def NN.API.text.Tokenizer.ofAlphabet (alphabet : Array Char) (unkId : ℕ := 0) (unkChar : Char := '?') :

Build a character-level tokenizer from an explicit alphabet.

This is the TorchLean analogue of the stoi/itos tables used in character-level GPT examples (including Karpathy's "char-gpt" / minGPT walkthroughs): encode maps characters to ids 0..alphabet.size-1, and decode maps ids back to characters.

Notes:

This tokenizer is deterministic given alphabet; callers are responsible for choosing how to construct the alphabet (e.g. sorted(set(data))).
Characters not present in the alphabet map to unkId (default 0), so encode is total.
Ids outside [0, vocabSize) decode to the unkChar (default ?).
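
The notes above can be illustrated with a small stand-alone model of the same contract. The `Tokenizer` structure below is a local copy declared only for this sketch, and `findIndexFrom` and `ofAlphabetSketch` are hypothetical names.

```lean
namespace AlphabetSketch

structure Tokenizer where
  vocabSize : Nat
  encode : String → List Nat
  decode : List Nat → String

/-- Position of `c` in `cs`, counting from `i`. -/
def findIndexFrom (cs : List Char) (c : Char) (i : Nat) : Option Nat :=
  match cs with
  | [] => none
  | x :: rest => if x == c then some i else findIndexFrom rest c (i + 1)

def ofAlphabetSketch (alphabet : Array Char) (unkId : Nat := 0)
    (unkChar : Char := '?') : Tokenizer :=
  { vocabSize := alphabet.size
    -- Unknown characters map to `unkId`, so `encode` is total.
    encode := fun s =>
      s.toList.map fun c => (findIndexFrom alphabet.toList c 0).getD unkId
    -- Out-of-range ids map to `unkChar`.
    decode := fun ids =>
      ids.foldl (fun (acc : String) (i : Nat) => acc.push ((alphabet[i]?).getD unkChar)) "" }

def tok := ofAlphabetSketch #['a', 'b', 'c']

#eval tok.encode "cabz"        -- [2, 0, 1, 0]
#eval tok.decode [2, 0, 1, 9]  -- "cab?"

end AlphabetSketch
```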


def NN.API.text.Tokenizer.encodeVec (t : Tokenizer) (n : ℕ) (s : String) (padId : ℕ := 0) :

Encode and pad/truncate to a fixed length, returning a length-indexed Vector.
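
As a rough sketch of the pad/truncate step, the following works on a plain `List Nat` instead of a length-indexed `Vector`. The name `padTruncate` is hypothetical, and padding on the right is an assumption of this sketch.

```lean
namespace EncodeVecSketch

/-- Truncate to `n` ids, then right-pad with `padId` up to length `n`. -/
def padTruncate (n : Nat) (ids : List Nat) (padId : Nat := 0) : List Nat :=
  let kept := ids.take n
  kept ++ List.replicate (n - kept.length) padId

#eval padTruncate 4 [7, 8]      -- [7, 8, 0, 0]
#eval padTruncate 2 [7, 8, 9]   -- [7, 8]

end EncodeVecSketch
```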


def NN.API.text.Tokenizer.encodeBatchVec (t : Tokenizer) (batch seqLen : ℕ) (ss : List String) (padId : ℕ := 0) :

Vector (Vector ℕ seqLen) batch

Encode a batch of strings, padding/truncating each to length seqLen.


One-Hot Token Tensors

def NN.API.text.oneHotTokenFloat (vocab tokenId : ℕ) :

Tensor.Tensor Float (Shape.Vec vocab)

One-hot vector for a single token id (Vec vocab). Out-of-range ids map to all-zeros.


def NN.API.text.tokensToOneHotMatFloat {seqLen vocab : ℕ} (tokens : Vector ℕ seqLen) :

Tensor.Tensor Float (Shape.Mat seqLen vocab)

One-hot encode a fixed-length token sequence as a matrix (seqLen × vocab).


def NN.API.text.tokensToOneHotBatchFloat {batch seqLen vocab : ℕ} (tokens : Vector (Vector ℕ seqLen) batch) :

Tensor.Tensor Float (Spec.Shape.dim batch (Shape.Mat seqLen vocab))

One-hot encode a fixed-size batch of token sequences as (batch × seqLen × vocab).
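
The three one-hot helpers above differ only in how many dimensions they add. A minimal list-based sketch (nested lists stand in for `Vec`, `Mat`, and the batched shape; all names are hypothetical) could look like this:

```lean
namespace OneHotSketch

/-- `Vec vocab`: a single 1.0 at `tokenId`, or all zeros when out of range. -/
def oneHotRow (vocab tokenId : Nat) : List Float :=
  (List.range vocab).map fun v => if v == tokenId then 1.0 else 0.0

/-- `seqLen × vocab`: one row per token. -/
def oneHotMat (vocab : Nat) (tokens : List Nat) : List (List Float) :=
  tokens.map (oneHotRow vocab)

/-- `batch × seqLen × vocab`: one matrix per sequence. -/
def oneHotBatch (vocab : Nat) (batch : List (List Nat)) : List (List (List Float)) :=
  batch.map (oneHotMat vocab)

-- Rows for ids 0 and 2, then an all-zero row for the out-of-range id 5.
#eval oneHotMat 3 [0, 2, 5]

end OneHotSketch
```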


Causal LM Samples

def NN.API.text.causalLmXYOneHotMatFloat (seqLen vocab : ℕ) (tokens : List ℕ) (padId : ℕ := 0) :

Tensor.Tensor Float (Shape.Mat seqLen vocab) × Tensor.Tensor Float (Shape.Mat seqLen vocab)

Build a (x, y) pair for next-token prediction from a token stream.

x[t] = oneHot(tokens[t])
y[t] = oneHot(tokens[t+1])

If the stream is too short, we pad with padId.
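
The shift-by-one construction can be sketched on raw token ids, leaving out the one-hot step. The names `window` and `causalXY` are hypothetical.

```lean
namespace CausalXYSketch

/-- `n` ids starting at `offset`, padded with `padId` past the end of the stream. -/
def window (n offset : Nat) (tokens : List Nat) (padId : Nat := 0) : List Nat :=
  (List.range n).map fun t => tokens.getD (offset + t) padId

/-- Inputs start at offset 0; targets are the same stream shifted by one. -/
def causalXY (seqLen : Nat) (tokens : List Nat) (padId : Nat := 0) : List Nat × List Nat :=
  (window seqLen 0 tokens padId, window seqLen 1 tokens padId)

#eval causalXY 4 [10, 11, 12]   -- ([10, 11, 12, 0], [11, 12, 0, 0])

end CausalXYSketch
```

A stream of `seqLen + 1` ids is the shortest input that fills both windows without padding.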


def NN.API.text.causalLmXYOneHotBatchRowsFloat (batch seqLen vocab : ℕ) (tokensAt : Fin batch → List ℕ) (padId : ℕ := 0) :

Tensor.Tensor Float (Spec.Shape.dim batch (Shape.Mat seqLen vocab)) × Tensor.Tensor Float (Spec.Shape.dim batch (Shape.Mat seqLen vocab))

Build a batched causal-LM (x, y) pair from one token window per batch row.

This is the text analogue of image/tabular minibatching:

row i receives its own token window tokensAt i;
x[i,t] is the one-hot encoding of (tokensAt i)[t];
y[i,t] is the one-hot encoding of (tokensAt i)[t+1];
short rows are padded with padId.

GPT-style examples share this batching logic. The contract is explicit: a text batch is a typed tensor of shape (batch, seqLen, vocab), just like the vision loader collates rows into (batch, C, H, W).


def NN.API.text.causalLmXOneHotBatch {α : Type} [Semantics.Scalar α] [Runtime.Scalar α] (batch seqLen vocab : ℕ) (tokens : List ℕ) (padId : ℕ := 0) :

Tensor.Tensor α (Spec.Shape.dim batch (Shape.Mat seqLen vocab))

One-hot encode a causal-LM input window as a batched tensor.

Token ids are read from tokens, missing positions use padId, and every batch row receives the same window. Use causalLmSampleOneHotBatchRows when rows should come from different corpus offsets.


def NN.API.text.causalLmXOneHotBatchRows {α : Type} [Semantics.Scalar α] [Runtime.Scalar α] (batch seqLen vocab : ℕ) (tokensAt : Fin batch → List ℕ) (padId : ℕ := 0) :

Tensor.Tensor α (Spec.Shape.dim batch (Shape.Mat seqLen vocab))

One-hot encode one causal-LM input window per batch row.

This is the input-only companion to causalLmSampleOneHotBatchRows, used by generation code that has prefixes but no shifted training targets.


def NN.API.text.causalLmSampleOneHotBatch {α : Type} [Semantics.Scalar α] [Runtime.Scalar α] (batch seqLen vocab : ℕ) (tokens : List ℕ) (padId : ℕ := 0) :

SupervisedSample α (Spec.Shape.dim batch (Shape.Mat seqLen vocab)) (Spec.Shape.dim batch (Shape.Mat seqLen vocab))

Build a batched supervised next-token sample from a token stream.

The target is shifted by one position: x[t] = tokens[t], y[t] = tokens[t+1]. Every batch row receives the same window, which is useful for prompt evaluation, deterministic checks, and synthetic sequence tasks.


def NN.API.text.causalLmSampleOneHotBatchRows {α : Type} [Semantics.Scalar α] [Runtime.Scalar α] (batch seqLen vocab : ℕ) (tokensAt : Fin batch → List ℕ) (padId : ℕ := 0) :

SupervisedSample α (Spec.Shape.dim batch (Shape.Mat seqLen vocab)) (Spec.Shape.dim batch (Shape.Mat seqLen vocab))

Build a batched supervised causal-LM sample from one token window per batch row.

Use this for GPT-style minibatches with distinct corpus windows. causalLmSampleOneHotBatch remains useful when every batch row should repeat a fixed prompt or synthetic sequence.


Byte-Corpus Windows

def NN.API.text.byteAtD (bytes : ByteArray) (i : ℕ) (padId : ℕ := 0) :

Read one byte token from a raw corpus, returning padId past the end.

This is byte-level rather than BPE-level: examples can train causal language models directly from a text file without depending on an external tokenizer artifact. GPT-2 BPE support lives in NN.API.Text.Bpe.


def NN.API.text.byteTokenWindow (bytes : ByteArray) (n : ℕ) (offset padId : ℕ := 0) :

Extract a fixed-length byte-token window from a raw corpus.

offset is measured in bytes, not Unicode characters. That is the right behavior for byte-level causal language modeling and avoids hidden UTF-8 slicing assumptions.
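
To make the byte-versus-character distinction concrete, here is a small sketch of the two helpers on a string that contains a two-byte character. The `ByteWindowSketch` namespace and both function bodies are illustrative assumptions, not the library source.

```lean
namespace ByteWindowSketch

/-- One byte as a token id, or `padId` past the end of the corpus. -/
def byteAt (bytes : ByteArray) (i : Nat) (padId : Nat := 0) : Nat :=
  match bytes.data[i]? with
  | some b => b.toNat
  | none   => padId

/-- `n` byte tokens starting at byte offset `offset`. -/
def byteWindow (bytes : ByteArray) (n : Nat) (offset : Nat := 0) (padId : Nat := 0) : List Nat :=
  (List.range n).map fun t => byteAt bytes (offset + t) padId

-- "héllo" is 6 bytes: 104, 195, 169, 108, 108, 111.
#eval byteWindow ("héllo".toUTF8) 4 2   -- [169, 108, 108, 111], starting inside "é"
#eval byteWindow ("héllo".toUTF8) 4 4   -- [108, 111, 0, 0], padded past the end

end ByteWindowSketch
```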


Corpus Helpers

def NN.API.text.Corpus.readUtf8File (exeName : String) (path : System.FilePath) (missingHint : String) :

Read a UTF-8 text file with a caller-supplied preparation hint.

The examples pass their executable name and a concrete hint so failures point users to the exact download or conversion command for that dataset.


def NN.API.text.Corpus.readByteFile (exeName : String) (path : System.FilePath) (allowSmallData : Bool) (minBytes seqLen : ℕ) :

Read a raw byte corpus and optionally enforce a minimum size.

allowSmallData is an explicit override for bounded local runs. Corpus-training commands can set minBytes to the scale they expect and require users to acknowledge smaller local files.


partial def NN.API.text.Corpus.takeUtf8Input (exeName : String) (defaultPath : System.FilePath) (aliases : List (String × System.FilePath)) (missingHint : String) :

List String → IO (String × List String)

Parse a text-corpus flag set and return (text, remainingArgs).

Supported forms:

--data-file PATH
any named alias in aliases, such as ("--tiny-shakespeare", path)
no data flag, which uses defaultPath

def NN.API.text.Corpus.byteOffset (bytes : ByteArray) (i seqLen : ℕ) :

Deterministic sliding-window offset for a byte corpus.


def NN.API.text.Corpus.tokenOffset (tokens : Array ℕ) (i seqLen : ℕ) :

Deterministic sliding-window offset for an already-tokenized corpus.


def NN.API.text.Corpus.usableTokenStarts (tokenCount seqLen : ℕ) :

Number of legal start positions for a (seqLen + 1) next-token window.

We return at least one start position so bounded corpora stay total; callers can still enforce a minimum corpus size before training.
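
One formula consistent with this description is sketched below. It is a guess at the arithmetic, not the library definition: a window of `seqLen + 1` tokens fits at `tokenCount - seqLen` start positions, and the result is clamped to at least 1. The name `usableStarts` is hypothetical.

```lean
namespace UsableStartsSketch

/-- Natural-number subtraction truncates at 0, and `max 1` enforces the floor. -/
def usableStarts (tokenCount seqLen : Nat) : Nat :=
  max 1 (tokenCount - seqLen)

#eval usableStarts 10 4   -- 6
#eval usableStarts 3 8    -- 1

end UsableStartsSketch
```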


def NN.API.text.Corpus.tokenArrayWindow (tokens : Array ℕ) (n offset : ℕ) (padId : ℕ := 0) :

Extract a fixed token window from an array-backed token corpus.


def NN.API.text.Corpus.randomBatchOffsets (tokenCount seqLen batch seed step : ℕ) :

Fin batch → ℕ

Deterministic minGPT-style random offsets for one training batch.

The result is a function Fin batch → Nat: one corpus start offset per row. We derive the random key from (seed, step) and then draw row offsets by the row index, so the run is reproducible without using ambient IO randomness. This is the text equivalent of a shuffled DataLoader epoch.


def NN.API.text.Corpus.randomBatchTokenWindows (tokens : Array ℕ) (batch seqLen seed step : ℕ) (padId : ℕ := 0) :

Fin batch → List ℕ

Build token windows for one deterministic random text batch.

Each row gets seqLen + 1 ids so downstream causal-LM helpers can form both x and shifted y. The helper is token-array based, so byte, character, BPE, and synthetic tokenizers can all produce an Array Nat and reuse the same batching semantics.
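
A minimal sketch of reproducible batching without IO randomness follows. The integer mixer `mix` is a toy stand-in for TorchLean's keyed RNG, and `usableStarts`, `rows`, and `corpus` are hypothetical helpers for the demonstration.

```lean
namespace RandomBatchSketch

/-- Toy integer mixer; an assumption, not the library's random generator. -/
def mix (a b : Nat) : Nat := (a * 1103515245 + b + 12345) % 2147483648

def usableStarts (tokenCount seqLen : Nat) : Nat := max 1 (tokenCount - seqLen)

/-- Key from `(seed, step)`, then one draw per row index. -/
def randomBatchOffsets (tokenCount seqLen batch seed step : Nat) : Fin batch → Nat :=
  fun row => mix (mix seed step) row.val % usableStarts tokenCount seqLen

/-- Each row receives `seqLen + 1` ids so that x and the shifted y can both be formed. -/
def randomBatchTokenWindows (tokens : Array Nat) (batch seqLen seed step : Nat)
    (padId : Nat := 0) : Fin batch → List Nat :=
  fun row =>
    let off := randomBatchOffsets tokens.size seqLen batch seed step row
    (List.range (seqLen + 1)).map fun t => (tokens[off + t]?).getD padId

/-- Materialize the per-row function as a list for printing. -/
def rows (batch : Nat) (f : Fin batch → List Nat) : List (List Nat) :=
  (List.range batch).filterMap fun i =>
    if h : i < batch then some (f ⟨i, h⟩) else none

def corpus : Array Nat := (List.range 50).toArray

-- The same (seed, step) always gives the same windows; changing `step` moves them.
#eval rows 3 (randomBatchTokenWindows corpus 3 4 42 0)
#eval rows 3 (randomBatchTokenWindows corpus 3 4 42 1)

end RandomBatchSketch
```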


def NN.API.text.Corpus.startsWithAt (xs pat : Array ℕ) (off : ℕ) :

Check whether pat occurs in xs at offset off.


def NN.API.text.Corpus.findWindow? (xs pat : Array ℕ) :

Find the first offset where pat appears in xs.
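
The two search helpers can be sketched as a direct scan. This is an illustrative version inside a hypothetical namespace; the library may implement the search differently.

```lean
namespace FindWindowSketch

/-- Does `pat` occur in `xs` starting at offset `off`? -/
def startsWithAt (xs pat : Array Nat) (off : Nat) : Bool :=
  decide (off + pat.size ≤ xs.size) &&
    ((List.range pat.size).all fun j => xs[off + j]? == pat[j]?)

/-- First offset at which `pat` occurs, scanning left to right. -/
def findWindow? (xs pat : Array Nat) : Option Nat :=
  (List.range (xs.size + 1)).find? fun off => startsWithAt xs pat off

#eval findWindow? #[5, 1, 2, 3, 1, 2] #[1, 2]   -- some 1
#eval findWindow? #[5, 1, 2] #[2, 1]            -- none

end FindWindowSketch
```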


def NN.API.text.Corpus.promptAwareOffsets (tokenCount seqLen windows : ℕ) (promptOffset? : Option ℕ) :

Choose training-window offsets, biased toward a prompt occurrence when the corpus contains it.

If the prompt is present in the corpus, a portion of the sampled windows covers nearby text. That keeps generation reports tied to text the model actually saw during training.


Causal LM Display Helpers

def NN.API.text.tokenWindow (t : Tokenizer) (n : ℕ) (input : String) (offset padId : ℕ := 0) :

Return a fixed-length token window from a text string.

offset = 0 is the model prompt window; offset = 1 is the usual next-token target window for causal language modeling. Missing tokens are padded with padId, matching causalLmXYOneHotMatFloat.


def NN.API.text.decodeWindow (t : Tokenizer) (n : ℕ) (input : String) (offset padId : ℕ := 0) :

Decode a fixed token window extracted by tokenWindow.


def NN.API.text.escapeForDisplay (s : String) :

Escape a short text fragment for one-line terminal output.

Display-only: this does not change tokenizer semantics. It keeps examples readable when a predicted byte sequence contains quotes, backslashes, tabs, or newlines.
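
As an illustration of this kind of escaping, the sketch below handles the four cases the description names. The exact escape set used by the library is not shown on this page, so treat `escapeChar` and `escapeForDisplaySketch` as hypothetical.

```lean
namespace EscapeSketch

/-- Replace characters that would break a one-line quoted display. -/
def escapeChar (c : Char) : String :=
  if c == '\n' then "\\n"
  else if c == '\t' then "\\t"
  else if c == '\"' then "\\\""
  else if c == '\\' then "\\\\"
  else c.toString

def escapeForDisplaySketch (s : String) : String :=
  s.toList.foldl (fun (acc : String) (c : Char) => acc ++ escapeChar c) ""

-- Prints the fragment on one line with visible escapes.
#eval IO.println (escapeForDisplaySketch "say \"hi\"\tnow\n")

end EscapeSketch
```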


Sampling Helpers (Top-k)

structure NN.API.text.GenerationOptions :

Shared text-generation flags for GPT-style examples.

prompt : String
Prompt used to seed autoregressive generation.
generate : ℕ
Number of new tokens to append.
temperature : Float
Softmax temperature. Must be positive.
topK : ℕ
Top-k cutoff. 1 gives greedy decoding.
repeatPenalty : Float
Penalty subtracted for repeated recent tokens. 0 disables it.
repeatWindow : ℕ
Number of recent tokens considered by the repeat penalty. 0 disables the window.
seed : ℕ
Deterministic RNG seed for sampling.
asciiOnly : Bool
Restrict generated ids to a model-specific ASCII allow-list.


@[implicit_reducible]

instance NN.API.text.instReprGenerationOptions :

Repr GenerationOptions

def NN.API.text.instReprGenerationOptions.repr :

GenerationOptions → ℕ → Std.Format


structure NN.API.text.GenerationDefaults :

Defaults for parseGenerationOptions.

prompt : String
generate : ℕ
temperature : Float
topK : ℕ
repeatPenalty : Float
repeatWindow : ℕ
seed : ℕ
asciiOnly : Bool


def NN.API.text.instReprGenerationDefaults.repr :

GenerationDefaults → ℕ → Std.Format


@[implicit_reducible]

instance NN.API.text.instReprGenerationDefaults :

Repr GenerationDefaults

def NN.API.text.parseAsciiOnlyFlag (exeName : String) (args : List String) :

Except String (Bool × List String)

Parse --ascii-only, accepting either a bare flag or true/false value.


def NN.API.text.parseGenerationOptions (exeName : String) (args : List String) (defaults : GenerationDefaults := { }) :

Except String (GenerationOptions × List String)

Parse the generation flags shared by GPT-style examples.

The model file still owns its training/data flags. This helper only handles prompt, sampling, repeat penalty, deterministic seed, and ASCII restriction.


def NN.API.text.GenerationOptions.toDefaults (opts : GenerationOptions) :

GenerationDefaults


def NN.API.text.GenerationOptions.parse (exeName : String) (args : List String) (defaults : GenerationOptions) :

Except String (GenerationOptions × List String)

Parse generation flags using a full GenerationOptions value as defaults.

This is the public API shape used by model commands: they provide a concrete default prompt and sampling policy, and the shared parser handles the stable CLI surface.


Text Workflow Option Records

structure NN.API.text.TextCorpusOptions :

Required text-corpus path plus the explicit small-data option used by local corpus trainers.

dataFile : System.FilePath
UTF-8 or raw-byte corpus path selected by --data-file.
allowSmallData : Bool
Allow local runs below the normal corpus-size floor.


def NN.API.text.instReprTextCorpusOptions.repr :

TextCorpusOptions → ℕ → Std.Format


@[implicit_reducible]

instance NN.API.text.instReprTextCorpusOptions :

Repr TextCorpusOptions

def NN.API.text.TextCorpusOptions.parse (exeName : String) (args : List String) :

Except String (TextCorpusOptions × List String)

Parse the required --data-file corpus flag and optional --allow-small-data switch.


structure NN.API.text.TextCorpusPathOptions :

Optional text-corpus path selected by --data-file, with caller-supplied default.

path : System.FilePath
Local text corpus path.


@[implicit_reducible]

instance NN.API.text.instReprTextCorpusPathOptions :

Repr TextCorpusPathOptions

def NN.API.text.instReprTextCorpusPathOptions.repr :

TextCorpusPathOptions → ℕ → Std.Format


def NN.API.text.TextCorpusPathOptions.parse (args : List String) (defaultPath : System.FilePath) :

Except String (TextCorpusPathOptions × List String)

Parse an optional --data-file flag using the supplied default path.


structure NN.API.text.FinetuneOptions :

Optional second corpus pass after the main training run.

finetuneFile? : Option System.FilePath
Optional corpus used for a second fine-tuning pass.
finetuneSteps : ℕ
Number of optimizer steps used on that second corpus when present.


@[implicit_reducible]

instance NN.API.text.instReprFinetuneOptions :

Repr FinetuneOptions

def NN.API.text.instReprFinetuneOptions.repr :

FinetuneOptions → ℕ → Std.Format


def NN.API.text.FinetuneOptions.parse (args : List String) (defaultSteps : ℕ) :

Except String (FinetuneOptions × List String)

Parse the optional --finetune-file / --finetune-steps pair.

The caller supplies the default step count so commands can reuse their main training-step default.


structure NN.API.text.BpeCorpusOptions :

Optional GPT-2 BPE tokenizer bundle plus an optional bounded-text cap.

bpeVocab? : Option System.FilePath
Optional GPT-2 vocab.json path. Must be paired with bpeMerges?.
bpeMerges? : Option System.FilePath
Optional GPT-2 merges.txt path. Must be paired with bpeVocab?.
maxChars? : Option ℕ
Optional text-character cap for bounded local BPE runs.


@[implicit_reducible]

instance NN.API.text.instReprBpeCorpusOptions :

Repr BpeCorpusOptions

def NN.API.text.instReprBpeCorpusOptions.repr :

BpeCorpusOptions → ℕ → Std.Format


def NN.API.text.BpeCorpusOptions.parse (args : List String) :

Except String (BpeCorpusOptions × List String)

Parse the optional GPT-2 BPE tokenizer bundle.

--bpe-vocab and --bpe-merges must appear together; --max-chars is independent.
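
The pairing rule can be expressed as a small check over two optional values. This sketch uses `String` paths and a hypothetical `pairBpeFlags` helper with an invented error message; it shows only the rule, not the library's parser.

```lean
namespace BpeFlagSketch

/-- Both paths or neither; a lone flag is an error. -/
def pairBpeFlags (vocab? merges? : Option String) :
    Except String (Option (String × String)) :=
  match vocab?, merges? with
  | some v, some m => .ok (some (v, m))
  | none,   none   => .ok none
  | _,      _      => .error "--bpe-vocab and --bpe-merges must be given together"

example : pairBpeFlags (some "vocab.json") (some "merges.txt")
    = .ok (some ("vocab.json", "merges.txt")) := rfl

example : pairBpeFlags (some "vocab.json") none
    = .error "--bpe-vocab and --bpe-merges must be given together" := rfl

end BpeFlagSketch
```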


structure NN.API.text.InteractiveOptions :

Shared terminal-REPL toggle used by interactive text examples.

interactive : Bool
Keep the trained model alive and read prompts from stdin.


@[implicit_reducible]

instance NN.API.text.instReprInteractiveOptions :

Repr InteractiveOptions

def NN.API.text.instReprInteractiveOptions.repr :

InteractiveOptions → ℕ → Std.Format


def NN.API.text.InteractiveOptions.parse (args : List String) :

Except String (InteractiveOptions × List String)

Parse the shared --interactive flag used by text examples with a terminal prompt loop.


structure NN.API.text.PromptGenerationOptions :

Shared prompt plus continuation-length options for simple text-generation commands.

prompt : String
Prompt used for before/after reports and generation.
generate : ℕ
Number of generated tokens or characters after training.


@[implicit_reducible]

instance NN.API.text.instReprPromptGenerationOptions :

Repr PromptGenerationOptions

def NN.API.text.instReprPromptGenerationOptions.repr :

PromptGenerationOptions → ℕ → Std.Format


def NN.API.text.PromptGenerationOptions.parse (args : List String) (defaults : PromptGenerationOptions) :

Except String (PromptGenerationOptions × List String)

Parse the shared --prompt / --generate flags.
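
The parsers in this section share one contract: consume the flags they own and hand back the untouched arguments. A minimal sketch of that contract, assuming a local `PromptOpts` record and a hypothetical `parsePromptOpts` (error wording and handling of dangling flags are choices of this sketch), might be:

```lean
namespace PromptParseSketch

structure PromptOpts where
  prompt : String
  generate : Nat

/-- Consume `--prompt V` and `--generate N`; pass everything else through in order.
    A flag with no value after it is passed through unchanged in this sketch. -/
def parsePromptOpts : List String → PromptOpts → Except String (PromptOpts × List String)
  | [], opts => .ok (opts, [])
  | "--prompt" :: v :: rest, opts => parsePromptOpts rest { opts with prompt := v }
  | "--generate" :: v :: rest, opts =>
      match v.toNat? with
      | some n => parsePromptOpts rest { opts with generate := n }
      | none   => .error s!"--generate expects a natural number, got '{v}'"
  | arg :: rest, opts => do
      let (o, remaining) ← parsePromptOpts rest opts
      pure (o, arg :: remaining)

def defaults : PromptOpts := { prompt := "", generate := 0 }

-- `--steps 10` is not owned by this parser, so it stays in the remaining args.
#eval
  match (parsePromptOpts ["--prompt", "To be", "--steps", "10", "--generate", "32"] defaults) with
  | .ok (o, rest) => s!"prompt={o.prompt} generate={o.generate} rest={rest}"
  | .error e      => s!"error: {e}"

end PromptParseSketch
```

Returning the remaining arguments lets a command chain several such parsers, each handling its own flags.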


Text TrainLog Notes

def NN.API.text.generationNotes (gen : GenerationOptions) (generated? : Option String := none) (extra : Array String := #[]) :

TrainLog note fields for generation-capable text commands.

The stable generation surface is prompt, continuation length, temperature/top-k, repetition control, RNG seed, and ASCII-only filtering. Model commands can prepend dataset or architecture notes through extra.


def NN.API.text.promptGenerationNotes (gen : PromptGenerationOptions) (generated? : Option String := none) (extra : Array String := #[]) :

TrainLog note fields for prompt-based text commands that do not expose the full sampling surface.


def NN.API.text.writeGenerationTrainLog (log : Runtime.Training.LogDestination) (title : String) (steps : ℕ) (loss0 loss1 : Float) (gen : GenerationOptions) (generated? : Option String := none) (extra : Array String := #[]) :

Write a before/after loss log for a generation-capable text training command.


def NN.API.text.writePromptTrainLog (log : Runtime.Training.LogDestination) (title : String) (steps : ℕ) (loss0 loss1 : Float) (gen : PromptGenerationOptions) (generated? : Option String := none) (extra : Array String := #[]) :

Write a before/after loss log for a prompt-based text training command.


structure NN.API.text.SavedParamsGenerationOptions extends NN.API.text.GenerationOptions :

Shared "load one parameter pack, then sample" option surface.

prompt : String
generate : ℕ
temperature : Float
topK : ℕ
repeatPenalty : Float
repeatWindow : ℕ
seed : ℕ
asciiOnly : Bool
paramsPath : System.FilePath
JSON bits checkpoint loaded before sampling starts.


@[implicit_reducible]

instance NN.API.text.instReprSavedParamsGenerationOptions :

Repr SavedParamsGenerationOptions

def NN.API.text.instReprSavedParamsGenerationOptions.repr :

SavedParamsGenerationOptions → ℕ → Std.Format


def NN.API.text.SavedParamsGenerationOptions.parse (exeName : String) (args : List String) (defaults : GenerationOptions) :

Except String (SavedParamsGenerationOptions × List String)

Parse the shared saved-parameter sampling flags used by inference-only text commands.


Text Training Option Combinators

structure NN.API.text.LoggedInteractiveOptions extends NN.API.Common.LoggedTrainFlags, NN.API.text.InteractiveOptions :

Logged-training options plus the terminal-REPL toggle.


@[implicit_reducible]

instance NN.API.text.instReprLoggedInteractiveOptions :

Repr LoggedInteractiveOptions

def NN.API.text.instReprLoggedInteractiveOptions.repr :

LoggedInteractiveOptions → ℕ → Std.Format


def NN.API.text.mkLoggedInteractiveOptions (train : Common.LoggedTrainFlags) (interactive : InteractiveOptions) :

LoggedInteractiveOptions

Build the shared logged-training + interactive option record.


structure NN.API.text.InteractiveTrainOptions extends NN.API.Common.ModelTrainFlags, NN.API.text.InteractiveOptions :

Standard training flags plus the terminal-REPL toggle.


def NN.API.text.instReprInteractiveTrainOptions.repr :

InteractiveTrainOptions → ℕ → Std.Format


@[implicit_reducible]

instance NN.API.text.instReprInteractiveTrainOptions :

Repr InteractiveTrainOptions

def NN.API.text.mkInteractiveTrainOptions (train : Common.ModelTrainFlags) (interactive : InteractiveOptions) :

InteractiveTrainOptions

Build the shared train-flags + interactive option record.


def NN.API.text.InteractiveTrainOptions.parse (exeName : String) (args : List String) (defaultLogJson : System.FilePath) (defaultSteps : ℕ) (defaultLr : Float) (allowZeroSteps : Bool := false) :

Except String (InteractiveTrainOptions × List String)

Parse the shared "train + interactive" option surface.


structure NN.API.text.LoggedPromptInteractiveOptions extends NN.API.text.LoggedInteractiveOptions, NN.API.text.PromptGenerationOptions :

Logged-training options for promptable interactive text commands.


def NN.API.text.instReprLoggedPromptInteractiveOptions.repr :

LoggedPromptInteractiveOptions → ℕ → Std.Format


@[implicit_reducible]

instance NN.API.text.instReprLoggedPromptInteractiveOptions :

Repr LoggedPromptInteractiveOptions

def NN.API.text.mkLoggedPromptInteractiveOptions (train : Common.LoggedTrainFlags) (prompt : PromptGenerationOptions) (interactive : InteractiveOptions) :

LoggedPromptInteractiveOptions

Build the shared logged-training + prompt + interactive option record.


def NN.API.text.LoggedPromptInteractiveOptions.parse (exeName : String) (args : List String) (defaultLogJson : System.FilePath) (defaultSteps : ℕ) (promptDefaults : PromptGenerationOptions) :

Except String (LoggedPromptInteractiveOptions × List String)

Parse the shared "logged train + prompt + interactive" option surface.


structure NN.API.text.CorpusLoggedPromptInteractiveOptions extends NN.API.text.LoggedPromptInteractiveOptions :

Corpus-training options for promptable text commands.

This combines the common corpus, fine-tune, BPE, prompt, logging, and interactive controls without tying them to a particular model implementation.

steps : ℕ
batchSize : ℕ
log : Runtime.Training.LogDestination
logPath : System.FilePath
cudaMemWatch : ℕ
interactive : Bool
prompt : String
generate : ℕ
corpus : TextCorpusOptions
Required primary corpus path plus the small-data override.
finetune : FinetuneOptions
Optional second corpus pass after the main training run.
bpe : BpeCorpusOptions
Optional GPT-2 BPE tokenizer bundle.


@[implicit_reducible]

instance NN.API.text.instReprCorpusLoggedPromptInteractiveOptions :

Repr CorpusLoggedPromptInteractiveOptions

def NN.API.text.instReprCorpusLoggedPromptInteractiveOptions.repr :

CorpusLoggedPromptInteractiveOptions → ℕ → Std.Format


def NN.API.text.CorpusLoggedPromptInteractiveOptions.parse (exeName : String) (args : List String) (defaultLogJson : System.FilePath) (defaultSteps : ℕ) (promptDefaults : PromptGenerationOptions) :

Except String (CorpusLoggedPromptInteractiveOptions × List String)

Parse the shared "corpus + logged train + prompt + interactive + optional fine-tune/BPE" surface.


structure NN.API.text.TrainGenerationOptions extends NN.API.Common.ModelTrainFlags, NN.API.text.GenerationOptions :

Training options for text commands that train and then sample.


def NN.API.text.instReprTrainGenerationOptions.repr :

TrainGenerationOptions → ℕ → Std.Format


@[implicit_reducible]

instance NN.API.text.instReprTrainGenerationOptions :

Repr TrainGenerationOptions

def NN.API.text.mkTrainGenerationOptions (train : Common.ModelTrainFlags) (gen : GenerationOptions) :

TrainGenerationOptions

Build the shared train + generation option record.


structure NN.API.text.WindowedTrainGenerationOptions extends NN.API.text.TrainGenerationOptions, NN.API.Common.WindowOptions :

Training options for cyclic text trainers that also expose --windows.


@[implicit_reducible]

instance NN.API.text.instReprWindowedTrainGenerationOptions :

Repr WindowedTrainGenerationOptions

def NN.API.text.instReprWindowedTrainGenerationOptions.repr :

WindowedTrainGenerationOptions → ℕ → Std.Format


def NN.API.text.mkWindowedTrainGenerationOptions (train : Common.ModelTrainFlags) (gen : GenerationOptions) (window : Common.WindowOptions) :

WindowedTrainGenerationOptions

Build the shared train + generation + windows option record.


def NN.API.text.WindowedTrainGenerationOptions.parse (exeName : String) (args : List String) (defaultLogJson : System.FilePath) (defaultSteps : ℕ) (defaultLr : Float) (defaultWindows : ℕ) (genDefaults : GenerationOptions) (allowZeroSteps : Bool := false) :

Except String (WindowedTrainGenerationOptions × List String)

Parse the standard "train + generate + windows" option surface.


structure NN.API.text.CheckpointedWindowedTrainGenerationOptions extends NN.API.text.WindowedTrainGenerationOptions, NN.API.Common.CheckpointOptions :

Training options for text commands that support save/load checkpoints.


def NN.API.text.instReprCheckpointedWindowedTrainGenerationOptions.repr :

CheckpointedWindowedTrainGenerationOptions → ℕ → Std.Format


@[implicit_reducible]

instance NN.API.text.instReprCheckpointedWindowedTrainGenerationOptions :

Repr CheckpointedWindowedTrainGenerationOptions

def NN.API.text.mkCheckpointedWindowedTrainGenerationOptions (train : Common.ModelTrainFlags) (gen : GenerationOptions) (window : Common.WindowOptions) (checkpoint : Common.CheckpointOptions) :

CheckpointedWindowedTrainGenerationOptions

Build the shared train + generation + windows + checkpoint option record.


def NN.API.text.CheckpointedWindowedTrainGenerationOptions.parse (exeName : String) (args : List String) (defaultLogJson : System.FilePath) (defaultSteps : ℕ) (defaultLr : Float) (defaultWindows : ℕ) (genDefaults : GenerationOptions) (allowZeroSteps : Bool := false) :

Except String (CheckpointedWindowedTrainGenerationOptions × List String)

Parse the shared "train + generate + windows + checkpoint" option surface.


structure NN.API.text.BatchedCheckpointedWindowedTrainGenerationOptions extends NN.API.text.CheckpointedWindowedTrainGenerationOptions :

Training options for text commands with generic batch and context-length controls.

steps : ℕ
batchSize : ℕ
log : Runtime.Training.LogDestination
logPath : System.FilePath
cudaMemWatch : ℕ
lr : Float
prompt : String
generate : ℕ
temperature : Float
topK : ℕ
repeatPenalty : Float
repeatWindow : ℕ
seed : ℕ
asciiOnly : Bool
windows : ℕ
loadParams? : Option System.FilePath
saveParams? : Option System.FilePath
batch : ℕ
Number of independently sampled training windows per optimizer step.
seqLen : ℕ
Context length in tokens or characters.


@[implicit_reducible]

instance NN.API.text.instReprBatchedCheckpointedWindowedTrainGenerationOptions :

Repr BatchedCheckpointedWindowedTrainGenerationOptions

def NN.API.text.instReprBatchedCheckpointedWindowedTrainGenerationOptions.repr :

BatchedCheckpointedWindowedTrainGenerationOptions → ℕ → Std.Format


def NN.API.text.BatchedCheckpointedWindowedTrainGenerationOptions.parse (exeName : String) (args : List String) (defaultLogJson : System.FilePath) (defaultSteps : ℕ) (defaultLr : Float) (defaultWindows defaultBatch defaultSeqLen : ℕ) (genDefaults : GenerationOptions) (allowZeroSteps : Bool := false) :

Except String (BatchedCheckpointedWindowedTrainGenerationOptions × List String)

Parse the shared "train + generate + windows + checkpoint + batch + seq-len" option surface.


structure NN.API.text.InteractiveCheckpointedWindowedTrainGenerationOptions extends NN.API.text.CheckpointedWindowedTrainGenerationOptions, NN.API.text.InteractiveOptions :

Training options for text commands with sampling, checkpointing, and an interactive prompt loop.


def NN.API.text.instReprInteractiveCheckpointedWindowedTrainGenerationOptions.repr :

InteractiveCheckpointedWindowedTrainGenerationOptions → ℕ → Std.Format


@[implicit_reducible]

instance NN.API.text.instReprInteractiveCheckpointedWindowedTrainGenerationOptions :

Repr InteractiveCheckpointedWindowedTrainGenerationOptions

def NN.API.text.mkInteractiveCheckpointedWindowedTrainGenerationOptions (train : Common.ModelTrainFlags) (gen : GenerationOptions) (window : Common.WindowOptions) (checkpoint : Common.CheckpointOptions) (interactive : InteractiveOptions) :

InteractiveCheckpointedWindowedTrainGenerationOptions

Build the full train + generation + windows + checkpoint + interactive option record.


def NN.API.text.InteractiveCheckpointedWindowedTrainGenerationOptions.parse (exeName : String) (args : List String) (defaultLogJson : System.FilePath) (defaultSteps : ℕ) (defaultLr : Float) (defaultWindows : ℕ) (genDefaults : GenerationOptions) (allowZeroSteps : Bool := false) :

Except String (InteractiveCheckpointedWindowedTrainGenerationOptions × List String)

Parse the full "train + generate + windows + checkpoint + interactive" option surface.


def NN.API.text.topKIndices (scores : Array Float) (k : ℕ) :

Return the indices of the top k scores (largest first).

This deterministic utility is used by the GPT-style examples. The direct O(k*vocab) implementation is adequate for the vocabulary sizes and top-k values used by these executable examples.
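
A direct O(k*vocab) selection can be sketched as k passes, each taking the best index not yet chosen. This is an illustrative version with hypothetical names; ties go to the smaller index here, which may or may not match the library.

```lean
namespace TopKSketch

/-- Index of the largest score among indices not in `taken` (one pass over the array). -/
def argmaxExcluding (scores : Array Float) (taken : List Nat) : Option Nat :=
  (List.range scores.size).foldl
    (fun (best : Option Nat) (i : Nat) =>
      if taken.contains i then best
      else
        match best with
        | none   => some i
        | some b => if scores[i]! > scores[b]! then some i else best)
    none

/-- Run `k` passes, appending one new index per pass (largest first). -/
def topKIndicesSketch (scores : Array Float) (k : Nat) : List Nat :=
  (List.range k).foldl
    (fun (acc : List Nat) (_ : Nat) =>
      match argmaxExcluding scores acc with
      | some i => acc ++ [i]
      | none   => acc)
    []

#eval topKIndicesSketch #[0.1, 2.0, -1.0, 1.5] 2   -- [1, 3]

end TopKSketch
```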


def NN.API.text.greedyIndex (scores : Array Float) :

Greedy argmax index.


def NN.API.text.penalizeRepeats (scores : Array Float) (recent : List ℕ) (repeatPenalty : Float) :

Apply a repetition penalty by subtracting repeatPenalty * count(token) for tokens appearing in recent.

This is a local sampling heuristic; it is not the same as the presence or frequency penalties used by hosted APIs, but it gives examples a deterministic way to discourage immediate repetition.
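
The subtraction rule stated above can be sketched as follows, assuming a hypothetical `penalizeRepeatsSketch` that counts occurrences with a filter:

```lean
namespace RepeatPenaltySketch

/-- `scores[i] - repeatPenalty * count(i in recent)` for every token id `i`. -/
def penalizeRepeatsSketch (scores : Array Float) (recent : List Nat)
    (repeatPenalty : Float) : Array Float :=
  (List.range scores.size).toArray.map fun i =>
    scores[i]! - repeatPenalty * ((recent.filter (· == i)).length).toFloat

-- Token 0 appears once and loses 0.5; token 2 appears twice and loses 1.0.
#eval penalizeRepeatsSketch #[1.0, 1.0, 1.0] [2, 2, 0] 0.5

end RepeatPenaltySketch
```

With `repeatPenalty = 0` every score is returned unchanged, which matches the "0 disables it" note on `GenerationOptions`.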


def NN.API.text.restrictScores (scores : Array Float) (allowId : ℕ → Bool) :

Mask scores by an allow-list predicate (disallowed ids get a very negative score).

This is mainly used by byte-level examples to optionally restrict output to printable ASCII.
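
A minimal sketch of allow-list masking follows. The sentinel `-1.0e30` and the exact printable range (bytes 32 to 126 plus newline) are assumptions made for this example; the page does not state the values the library uses.

```lean
namespace RestrictSketch

/-- Assumed allow-list: printable ASCII (32..126) plus newline (10). -/
def printableAscii (i : Nat) : Bool :=
  (decide (32 ≤ i) && decide (i < 127)) || i == 10

/-- Keep allowed scores; push disallowed ones to a very negative sentinel. -/
def restrictScoresSketch (scores : Array Float) (allowId : Nat → Bool) : Array Float :=
  (List.range scores.size).toArray.map fun i =>
    if allowId i then scores[i]! else -1.0e30

-- Only ids 1 and 3 keep their scores under this toy predicate.
#eval restrictScoresSketch #[0.5, 0.7, 0.9, 0.2] (fun i => i % 2 == 1)

end RestrictSketch
```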

def NN.API.text.prepareScoresForGeneration (scores : Array Float) (recent : List ℕ) (repeatPenalty : Float) (allowId : ℕ → Bool := fun (x : ℕ) => true) :

Apply repeat penalty and an allow-list mask before sampling.

def NN.API.text.printableAsciiByte (i : ℕ) :

Printable ASCII bytes plus newline.

def NN.API.text.escapeByteId (b : ℕ) :

Escape one byte token for display inside a quoted string.

def NN.API.text.escapeByteIdsForDisplay (ids : List ℕ) :

Escape byte ids as a one-line quoted display string.

def NN.API.text.sampleTopKIndex (scores : Array Float) (temperature : Float) (topK seed counter : ℕ) :

Sample one token id from scores using temperature + top-k sampling.

The randomness is deterministic given (seed, counter), so a run with the same flags produces the same sampled text.
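One way to obtain sampling that is reproducible from (seed, counter) is sketched below. Everything here is an assumption made for illustration: the mixer `unitFloatSketch` (a single linear-congruential step), the name `sampleFromSketch`, and the choice to pass the top-k candidate indices in as a list. It also assumes `temperature > 0`. TorchLean's actual generator may differ.

```lean
/-- Hypothetical mixer: maps `(seed, counter)` to a value in [0, 1). -/
def unitFloatSketch (seed counter : Nat) : Float :=
  let m : Nat := 2147483648  -- 2^31
  let x := (1103515245 * (seed * 7919 + counter + 1) + 12345) % m
  x.toFloat / m.toFloat

/-- Hypothetical sketch: sample one index from `candidates` (for example a
    top-k selection) with softmax weights at the given temperature. -/
def sampleFromSketch (scores : Array Float) (candidates : List Nat)
    (temperature : Float) (seed counter : Nat) : Nat :=
  -- Subtract the largest candidate score before exponentiating (stability).
  let mx := candidates.foldl
    (fun (m : Float) i => if scores[i]! > m then scores[i]! else m) (-1.0e30)
  let weights := candidates.map
    (fun i => Float.exp ((scores[i]! - mx) / temperature))
  let total := weights.foldl (fun (a : Float) w => a + w) 0.0
  let target := unitFloatSketch seed counter * total
  -- Walk the cumulative weights; stop at the first bucket containing `target`.
  let step := fun (st : Float × Option Nat) (p : Nat × Float) =>
    match st.2 with
    | some _ => st
    | none =>
      let acc := st.1 + p.2
      if target < acc then (acc, some p.1) else (acc, none)
  match ((candidates.zip weights).foldl step (0.0, none)).2 with
  | some i => i
  | none => candidates.getLast?.getD 0

-- Returns 1 or 3; the same (seed, counter) pair always gives the same answer.
#eval sampleFromSketch #[0.1, 2.5, -1.0, 2.0] [1, 3] 1.0 42 0
```

Because the function is pure, advancing `counter` once per generated token yields a fresh but repeatable draw at every step.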

def NN.API.text.chooseNextToken (scores : Array Float) (opts : GenerationOptions) (counter : ℕ) (recent : List ℕ := []) (allowId : ℕ → Bool := fun (x : ℕ) => true) :

Select the next token from prepared logits using either greedy argmax or temperature/top-k sampling.

def NN.API.text.autoregressiveTokenIds (seqLen padId : ℕ) (promptIds : List ℕ) (opts : GenerationOptions) (scoreWindow : List ℕ → ℕ → IO (Array Float)) (allowId : ℕ → Bool := fun (x : ℕ) => true) (sanitize : ℕ → ℕ := fun (tok : ℕ) => tok) :

Autoregressively extend token ids with a model-provided score callback.

The callback receives the padded context window and the sequence position whose logits should be used for the next token. The shared policy crops to the last seqLen tokens, pads, applies repeat penalties, samples by top-k/temperature, and appends one token per step.
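The crop-and-pad step of this policy can be sketched as follows. The sketch assumes right-padding and assumes that the position handed to the callback is the index of the last real token in the window. Neither detail is confirmed by the text above, and `cropAndPadSketch` is a hypothetical name.

```lean
/-- Hypothetical sketch: keep the last `seqLen` ids, right-pad with `padId`,
    and return the index of the last real token in the window. -/
def cropAndPadSketch (seqLen padId : Nat) (ids : List Nat) : List Nat × Nat :=
  -- Natural-number subtraction truncates at 0, so short inputs drop nothing.
  let cropped := ids.drop (ids.length - seqLen)
  let pos := cropped.length - 1
  (cropped ++ List.replicate (seqLen - cropped.length) padId, pos)

#eval cropAndPadSketch 4 0 [7, 8]              -- ([7, 8, 0, 0], 1)
#eval cropAndPadSketch 4 0 [1, 2, 3, 4, 5, 6]  -- ([3, 4, 5, 6], 3)
```

Each generation step would then call the score callback with this window and position, choose one token, append it to the id list, and repeat.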

partial def NN.API.text.autoregressiveTokenIds.loop (seqLen padId : ℕ) (opts : GenerationOptions) (scoreWindow : List ℕ → ℕ → IO (Array Float)) (allowId : ℕ → Bool) (sanitize : ℕ → ℕ) (ids : List ℕ) :

ℕ → IO (List ℕ)

def NN.API.text.logitScoresAt {seqLen vocab : ℕ} (logits : Tensor.Tensor Float (Shape.Mat seqLen vocab)) (pos : ℕ) :

Extract the vocabulary-score row at one sequence position.

def NN.API.text.batchLogitScoresAt {batch seqLen vocab : ℕ} (logits : Tensor.Tensor Float (Spec.Shape.dim batch (Shape.Mat seqLen vocab))) (batchIdx : Fin batch) (pos : ℕ) :

Extract a vocabulary-score row from batched logits.

def NN.API.text.argmaxTokenIdsFromLogits {α : Type} [LT α] [DecidableRel fun (x1 x2 : α) => x1 > x2] {seqLen vocab : ℕ} (logits : Tensor.Tensor α (Shape.Mat seqLen vocab)) :

Decode a matrix of token logits by taking argmax independently at each sequence position.

The shape is (seqLen × vocab), i.e. one logits vector per token position. This helper is for inspection/debugging and is not differentiable.
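To make the per-position argmax concrete, here is a sketch that uses a plain `Array (Array Float)` in place of the typed `Tensor` (an assumption made to keep the example free of TorchLean imports). The names `argmaxRowSketch` and `argmaxPerPositionSketch` are hypothetical, and ties resolve to the lowest index in this sketch.

```lean
/-- Hypothetical sketch: index of the largest entry in one logits row
    (returns 0 for an empty row). -/
def argmaxRowSketch (row : Array Float) : Nat := Id.run do
  let mut best : Nat := 0
  for i in [0:row.size] do
    if row[i]! > row[best]! then
      best := i
  return best

/-- One argmax per sequence position, each taken independently. -/
def argmaxPerPositionSketch (logits : Array (Array Float)) : Array Nat :=
  logits.map argmaxRowSketch

#eval argmaxPerPositionSketch #[#[0.1, 0.9], #[2.0, -1.0], #[0.0, 0.5]]
-- #[1, 0, 1]
```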

def NN.API.text.decodeArgmaxLogits {α : Type} [LT α] [DecidableRel fun (x1 x2 : α) => x1 > x2] (t : Tokenizer) {seqLen vocab : ℕ} (logits : Tensor.Tensor α (Shape.Mat seqLen vocab)) :

Decode (seqLen × vocab) logits as text using a tokenizer.

def NN.API.text.argmaxTokenIdsFromBatchLogits {α : Type} [LT α] [DecidableRel fun (x1 x2 : α) => x1 > x2] {batch seqLen vocab : ℕ} (logits : Tensor.Tensor α (Spec.Shape.dim batch (Shape.Mat seqLen vocab))) (batchIdx : Fin batch) :

Extract batchIdx from batched logits and return the per-position argmax token ids.

def NN.API.text.decodeArgmaxBatchLogits {α : Type} [LT α] [DecidableRel fun (x1 x2 : α) => x1 > x2] (t : Tokenizer) {batch seqLen vocab : ℕ} (logits : Tensor.Tensor α (Spec.Shape.dim batch (Shape.Mat seqLen vocab))) (batchIdx : Fin batch) :

Decode one batch row of (batch × seqLen × vocab) logits as text.

def NN.API.text.causalMask (seqLen : ℕ) :

Tensor.Tensor Bool (Spec.Shape.dim seqLen (Spec.Shape.dim seqLen Spec.Shape.scalar))

Causal (autoregressive) attention mask of shape (seqLen × seqLen).

Entry (i, j) is true iff j ≤ i, meaning position i may attend to itself and earlier positions but not to future positions.
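The rule j ≤ i yields a lower-triangular pattern. The sketch below builds it as nested lists instead of a `Tensor Bool` (an assumption made for a dependency-free example); `causalRowsSketch` is a hypothetical name.

```lean
/-- Hypothetical sketch: row `i`, column `j` is `true` exactly when `j ≤ i`. -/
def causalRowsSketch (seqLen : Nat) : List (List Bool) :=
  (List.range seqLen).map fun i =>
    (List.range seqLen).map fun j => decide (j ≤ i)

#eval causalRowsSketch 3
-- [[true, false, false], [true, true, false], [true, true, true]]
```

Row 0 sees only itself, while the last row sees every position, which is the behaviour an autoregressive model needs during training.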