TorchLean API

NN.Examples.Models.Sequence.Transformer

Transformer Text Example #

A runnable torchlean transformer example: it reads a local text corpus, builds a byte-level sequence-reconstruction sample, and trains a single transformer encoder block on that real text window.

The reusable model wiring lives in NN.API.Models.Transformer (nn.models.transformerEncoder). This file is the runnable wrapper.

python3 scripts/datasets/download_example_data.py --tiny-shakespeare
lake build -R -K cuda=true && lake exe torchlean transformer --cuda --tiny-shakespeare --steps 1

Number of identical rows in the small batch used by this encoder check.


Short sequence length: enough to exercise attention without making CPU runs painful.

Transformer feature width.

Number of attention heads.

Feed-forward hidden width inside the encoder block.

API-level encoder configuration shared by shapes and the constructor.
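The hyperparameters above fit naturally into one small configuration record. A minimal Lean sketch, assuming hypothetical structure and field names (the actual torchlean declarations are stripped from this page):

```lean
-- Sketch only: `EncoderConfig` and its field names are illustrative
-- assumptions, not the real torchlean API. Defaults mirror the intent
-- described above: small, CPU-friendly values.
structure EncoderConfig where
  batchRows : Nat := 4    -- identical rows in the small check batch
  seqLen    : Nat := 32   -- short sequence: exercises attention cheaply
  dModel    : Nat := 64   -- transformer feature width
  nHeads    : Nat := 4    -- attention heads; should divide dModel
  dFF       : Nat := 128  -- feed-forward hidden width in the block

-- Per-head width is derived from the two fields above:
#eval (64 / 4 : Nat)  -- dModel / nHeads
```

Sharing one record between shape computation and the constructor keeps the two from drifting apart, which matches the "shared by shapes and the constructor" note above.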

Build one batch by repeating a real-text causal sample.

This is intentionally an encoder-block reconstruction example, not autoregressive generation. The causal GPT/Mamba files cover language-model decoding; this file keeps the attention block itself small and easy to sanity-check.
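The batch-building step above can be sketched in plain Lean, with `List`s standing in for tensors; `repeatSample` is a hypothetical helper, not the torchlean API:

```lean
-- Sketch: replicate one byte-level causal window into a batch of
-- identical rows. The real code builds tensors from the corpus window;
-- here a List UInt8 stands in for that sample.
def repeatSample (sample : List UInt8) (batchRows : Nat) : List (List UInt8) :=
  List.replicate batchRows sample

#eval repeatSample [72, 105] 4  -- four identical rows
```

Using identical rows makes the check deterministic: every row should reconstruct the same target, so any per-row divergence points at a batching or masking bug rather than at the data.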

Shared runner configuration for torchlean transformer.

We intentionally reuse the same training infrastructure as torchlean rnn and torchlean lstm: the goal here is to compare the architectures (attention/norm/FFN) rather than read three copies of the same CLI/runtime wrapper.