Transformer Text Example #
Runnable torchlean transformer example. It reads a local text corpus, builds a byte-level
sequence reconstruction sample, and trains one transformer encoder block on that real text window.
The reusable model wiring lives in NN.API.Models.Transformer
(nn.models.transformerEncoder). This file is the runnable wrapper.
Download the example corpus: python3 scripts/datasets/download_example_data.py --tiny-shakespeare
Build and run: lake build -R -K cuda=true && lake exe torchlean transformer --cuda --tiny-shakespeare --steps 1
Number of identical rows in the small batch used by this encoder check.
Short sequence length: enough to exercise attention without making CPU runs painful.
Number of attention heads.
Feed-forward hidden width inside the encoder block.
API-level encoder configuration shared by the shape definitions and the model constructor.
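To make the roles of these constants concrete, here is a minimal Lean sketch of how a batch size, sequence length, head count, and feed-forward width could be collected into one encoder configuration. The structure name, field names, and numeric values are illustrative assumptions, not the actual `NN.API.Models.Transformer` definitions.

```lean
-- Illustrative sketch only: names and values below are hypothetical,
-- not torchlean's actual API. It shows how the per-example constants
-- (batch size, sequence length, heads, FFN width) could feed one config value.
structure EncoderConfig where
  batchSize : Nat
  seqLen    : Nat
  numHeads  : Nat
  ffnDim    : Nat
  deriving Repr

/-- Placeholder values in the spirit of this file: a tiny batch and a short
sequence so CPU runs stay fast. -/
def exampleConfig : EncoderConfig :=
  { batchSize := 4, seqLen := 32, numHeads := 4, ffnDim := 128 }
```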
Build one batch by repeating a real-text causal sample.
This is intentionally an encoder-block reconstruction example, not autoregressive generation. The causal GPT/Mamba files cover language-model decoding; this file keeps the attention block itself small and easy to sanity-check.
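As a rough illustration of the batch construction described above, the sketch below slices one window of `seqLen + 1` bytes from a corpus, splits it into a causally shifted input/target pair, and repeats that single sample across the batch. The helper names and the use of plain `List`/`ByteArray` are assumptions made for readability; the real code presumably builds torchlean tensors instead.

```lean
-- Illustrative sketch only (hypothetical helpers, not torchlean's tensor API):
-- take one window of `seqLen + 1` bytes starting at `offset`, split it into a
-- causally shifted (input, target) pair, then repeat that single sample
-- `batchSize` times to form the batch this encoder check trains on.
def causalSample (corpus : ByteArray) (offset seqLen : Nat) :
    List UInt8 × List UInt8 :=
  let window := (List.range (seqLen + 1)).map (fun i => corpus.get! (offset + i))
  -- input = first `seqLen` bytes, target = the same bytes shifted by one
  (window.take seqLen, window.drop 1)

def repeatedBatch (corpus : ByteArray) (offset seqLen batchSize : Nat) :
    List (List UInt8 × List UInt8) :=
  List.replicate batchSize (causalSample corpus offset seqLen)
```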
Shared runner configuration for torchlean transformer.
We intentionally reuse the same training infrastructure as torchlean rnn and torchlean lstm:
the goal here is to compare the architectures (attention/norm/FFN) rather than maintain three
copies of the same CLI/runtime wrapper.
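For orientation, here is a hedged sketch of a runner configuration covering the flags shown in the run command above (`--cuda`, `--tiny-shakespeare`, `--steps`). The structure name, fields, and defaults are assumptions, not the actual torchlean CLI definitions.

```lean
-- Illustrative sketch only: a runner configuration mirroring the CLI flags
-- used above. Field names and defaults are hypothetical, not torchlean's
-- actual option parser.
structure RunnerConfig where
  useCuda         : Bool := false
  tinyShakespeare : Bool := false
  steps           : Nat  := 1
  deriving Repr
```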