NN.Examples.Models.Sequence.Lstm

LSTM Text Example

A runnable torchlean lstm example: it reads a local text corpus, builds one byte-level causal-language-model window, and trains an LSTM plus a time-distributed linear head.

The model constructor lives in NN.API.Models.SimpleSeq so other examples can reuse it. This file keeps only the architecture-specific declarations; the shared corpus loading, CLI parsing, logging, and train loop live in NN.Examples.Models.Sequence.SimpleText.

What This Example Is (And Is Not)

This is a small, layer-level smoke test for the LSTM cell plus the TorchLean training loop. It uses a single fixed text window and a simple MSE-on-one-hot objective to keep runs short and predictable.
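
To make the objective concrete, here is a self-contained sketch in plain Lean, using lists in place of tensors; every name below is illustrative rather than one of the example's actual declarations.

-- One-hot encode a byte token over a given vocabulary size.
def oneHot (vocab : Nat) (tok : Nat) : List Float :=
  (List.range vocab).map (fun v => if v = tok then 1.0 else 0.0)

-- Mean squared error between one step's predicted distribution and the
-- one-hot target for that step's next byte.
def mseOneHot (pred : List Float) (tok : Nat) : Float :=
  let target := oneHot pred.length tok
  let sq := List.zipWith (fun p t => (p - t) * (p - t)) pred target
  sq.foldl (· + ·) 0.0 / Float.ofNat pred.length

#eval mseOneHot [0.25, 0.25, 0.25, 0.25] 2  -- 0.1875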

If you want a real language-model tutorial (proper autoregressive loss, a longer context, and sampling), this example is deliberately not that. To run the smoke test against the Tiny Shakespeare corpus:

python3 scripts/datasets/download_example_data.py --tiny-shakespeare
lake build -R -K cuda=true && lake exe torchlean lstm --cuda --tiny-shakespeare --steps 1

Short byte-window length used for a quick gated-recurrent smoke test.

Byte vocabulary size.

This example uses byte-level tokens (0..255) rather than hashing bytes down to a smaller bucket count. Earlier smoke tests used 32 here for speed, but the full byte vocab avoids unnecessary aliasing and makes the tutorial behavior easier to reason about.
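
The aliasing is easy to see directly: hashing bytes into 32 buckets maps distinct bytes to the same token id, while the full byte vocabulary keeps them apart. A tiny self-contained illustration in plain Lean (names invented for this sketch):

-- Bucketed tokenization aliases distinct bytes: 'a' (97) and 'A' (65)
-- both land in bucket 1 when reduced modulo 32.
def bucketTok (b : UInt8) : Nat := b.toNat % 32
def byteTok (b : UInt8) : Nat := b.toNat

#eval bucketTok 97 == bucketTok 65  -- true: aliased under 32 buckets
#eval byteTok 97 == byteTok 65      -- false: distinct under the full byte vocab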

Hidden state width of the LSTM cell.

Shared shape/config record consumed by the reusable API constructor.
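
The record itself is not reproduced on this page. As a rough sketch of its likely shape, with hypothetical field names and purely illustrative default values (the real declaration in NN.API.Models.SimpleSeq may differ):

-- Hypothetical shape/config record; field names and defaults are
-- invented for illustration only.
structure SeqShape where
  vocabSize : Nat := 256   -- full byte vocabulary, as discussed above
  hiddenSize : Nat := 64   -- LSTM hidden width (illustrative value)
  seqLen : Nat := 32       -- short byte-window length (illustrative value)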

Convert corpus text into one supervised causal sequence window.
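
Concretely, a causal window pairs each byte with the byte that follows it: the target sequence is the input shifted one position to the right. A self-contained list-based sketch (the helper name is invented; the real code produces tensors):

-- Build one (input, target) pair from the first len + 1 corpus bytes;
-- each input position is trained to predict the next byte.
def causalWindow (corpus : ByteArray) (len : Nat) : List Nat × List Nat :=
  let toks := (List.range (len + 1)).map (fun i => (corpus.get! i).toNat)
  (toks.take len, toks.drop 1)

#eval causalWindow "hello".toUTF8 4
-- ([104, 101, 108, 108], [101, 108, 108, 111])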

Shared runner configuration for torchlean lstm.
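
The exact fields are not shown on this page; judging from the CLI flags in the commands above (--cuda, --tiny-shakespeare, --steps), the record plausibly looks something like this hypothetical sketch:

-- Hypothetical runner configuration mirroring the CLI flags shown above;
-- the real record may carry more (or differently named) fields.
structure RunnerConfig where
  cuda : Bool := false             -- run on GPU when built with cuda=true
  tinyShakespeare : Bool := false  -- train on the downloaded example corpus
  steps : Nat := 1                 -- number of training steps to run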
