Transformer Text Example #
Runnable torchlean transformer example. It reads a local text corpus, builds a byte-level
sequence reconstruction sample, and trains one transformer encoder block on that real text window.
The reusable model wiring lives in NN.API.Models.Transformer
(nn.models.transformerEncoder). This file is the runnable wrapper.
Download the example corpus: python3 scripts/datasets/download_example_data.py --tiny-shakespeare
Build and run: lake build -R -K cuda=true && lake exe torchlean transformer --cuda --tiny-shakespeare --steps 1
Number of identical rows in the small batch used by this encoder check.
Short sequence length: enough to exercise attention without making CPU runs painful.
Number of attention heads.
Feed-forward hidden width inside the encoder block.
API-level encoder configuration shared by the shape definitions and the model constructor.
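To make the roles of these constants concrete, here is a minimal Lean sketch of how a batch size, sequence length, head count, and feed-forward width could be collected into one encoder configuration. The structure name, field names, and numeric values are illustrative assumptions, not the actual `NN.API.Models.Transformer` definitions.

```lean
-- Illustrative sketch only: names and values below are hypothetical,
-- not torchlean's actual API. It shows how the per-example constants
-- (batch size, sequence length, heads, FFN width) could feed one config value.
structure EncoderConfig where
  batchSize : Nat
  seqLen    : Nat
  numHeads  : Nat
  ffnDim    : Nat
  deriving Repr

/-- Placeholder values in the spirit of this file: a tiny batch and a short
sequence so CPU runs stay fast. -/
def exampleConfig : EncoderConfig :=
  { batchSize := 4, seqLen := 32, numHeads := 4, ffnDim := 128 }
```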
Build one batch by repeating a real-text causal sample.
This is intentionally an encoder-block reconstruction example, not autoregressive generation. The causal GPT/Mamba files cover language-model decoding; this file keeps the attention block itself small and easy to sanity-check.
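As a rough illustration of the batch construction described above, the sketch below slices one window of `seqLen + 1` bytes from a corpus, splits it into a causally shifted input/target pair, and repeats that single sample across the batch. The helper names and the use of plain `List`/`ByteArray` are assumptions made for readability; the real code presumably builds torchlean tensors instead.

```lean
-- Illustrative sketch only (hypothetical helpers, not torchlean's tensor API):
-- take one window of `seqLen + 1` bytes starting at `offset`, split it into a
-- causally shifted (input, target) pair, then repeat that single sample
-- `batchSize` times to form the batch this encoder check trains on.
def causalSample (corpus : ByteArray) (offset seqLen : Nat) :
    List UInt8 × List UInt8 :=
  let window := (List.range (seqLen + 1)).map (fun i => corpus.get! (offset + i))
  -- input = first `seqLen` bytes, target = the same bytes shifted by one
  (window.take seqLen, window.drop 1)

def repeatedBatch (corpus : ByteArray) (offset seqLen batchSize : Nat) :
    List (List UInt8 × List UInt8) :=
  List.replicate batchSize (causalSample corpus offset seqLen)
```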
Shared runner configuration for torchlean transformer.
We intentionally reuse the same training infrastructure as torchlean rnn and torchlean lstm:
the goal here is to compare the architectures (attention/norm/FFN) rather than maintain three
copies of the same CLI/runtime wrapper.
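For orientation, here is a hedged sketch of a runner configuration covering the flags shown in the run command above (`--cuda`, `--tiny-shakespeare`, `--steps`). The structure name, fields, and defaults are assumptions, not the actual torchlean CLI definitions.

```lean
-- Illustrative sketch only: a runner configuration mirroring the CLI flags
-- used above. Field names and defaults are hypothetical, not torchlean's
-- actual option parser.
structure RunnerConfig where
  useCuda         : Bool := false
  tinyShakespeare : Bool := false
  steps           : Nat  := 1
  deriving Repr
```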