LSTM Seasonal Regression / Forecasting
Run this when you want a real regression-style sequence example, not just a text-shaped smoke test.
The default data path uses the UCI Individual Household Electric Power Consumption dataset: minute-level power readings from one household over almost four years. The preparation script turns that into hourly one-step forecasting windows:
past 24 hours -> next 24 shifted-by-one-hour targets
Prepare the real data once:
python3 scripts/datasets/download_example_data.py --household-power --household-power-windows 512
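The shifted-by-one-hour windowing can be sketched in NumPy (an illustrative stand-in for what the preparation script produces, not its actual code):

```python
import numpy as np

def make_windows(hourly, seq_len=24):
    # X[i] holds 24 consecutive hourly readings; Y[i] is the same span
    # shifted one hour forward, so each timestep's target is the next hour.
    xs, ys = [], []
    for start in range(len(hourly) - seq_len):
        xs.append(hourly[start : start + seq_len])
        ys.append(hourly[start + 1 : start + seq_len + 1])
    return np.stack(xs)[..., None], np.stack(ys)[..., None]  # (N, seq_len, 1)

hourly = np.sin(np.linspace(0.0, 20.0, 200))  # stand-in for hourly power readings
X, Y = make_windows(hourly)
print(X.shape, Y.shape)  # (176, 24, 1) (176, 24, 1)
```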
We keep the walkthrough small enough to inspect quickly:
- use `--steps 1` when you only want to check CUDA and the loader;
- use `--steps 200 --windows 96` when you want to see the printed forecast move toward the target;
- change `--probe-offset` to look at a different part of the power curve;
- lower `--lr` first if the probe gets worse instead of better.
lake exe -K cuda=true torchlean lstm_regression --cuda --steps 1 --windows 1
lake exe -K cuda=true torchlean lstm_regression --cuda --steps 200 --windows 96
Dataset citation: Hebrail and Berard, "Individual Household Electric Power Consumption", UCI Machine
Learning Repository, DOI 10.24432/C58K54, CC BY 4.0.
Runner subcommand: lake exe torchlean lstm_regression ....
Default JSON path for the before/after loss.
Pass --log PATH to write somewhere else, or --log disabled when you only want terminal output.
Default root for downloaded real datasets. Override with --data-dir.
One day of hourly samples. Try 48 if you want a two-day context, but expect slower runs.
One scalar feature. If you add calendar/weather features, increase this and update scalarRow.
Shared recurrent-model configuration.
This keeps the model constructor, input shape, and output shape tied to the same three numbers:
seqLen, inputSize, and hiddenSize. When a shape error shows up, this is the first place to
check.
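A rough analogue of that shared record (hypothetical names and default hidden width, not the TorchLean API) looks like:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RnnConfig:
    # Single source of truth for every shape in the example.
    seq_len: int = 24      # one day of hourly samples
    input_size: int = 1    # one scalar feature
    hidden_size: int = 32  # assumed width; the tutorial's value may differ

    def io_shape(self):
        # Inputs and targets share (seq_len, input_size).
        return (self.seq_len, self.input_size)

cfg = RnnConfig()
print(cfg.io_shape())  # (24, 1)
```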
Target/prediction shape: one next-step scalar at each of seqLen timesteps.
The actual forecaster.
`nn.models.lstmWithLinearHead cfg` expands to `nn.lstm seqLen inputSize hiddenSize`
followed by a time-distributed `nn.linear hiddenSize inputSize`.
So every timestep emits a scalar forecast. We are not using only the final hidden state here; the loss checks the whole output sequence.
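A time-distributed head is just one shared linear map applied at every timestep; a NumPy sketch:

```python
import numpy as np

seq_len, input_size, hidden_size = 24, 1, 8
rng = np.random.default_rng(0)

h = rng.normal(size=(seq_len, hidden_size))     # LSTM hidden state per timestep
W = rng.normal(size=(hidden_size, input_size))  # shared head weights
b = np.zeros(input_size)

forecast = h @ W + b  # same W, b reused at all 24 timesteps
print(forecast.shape)  # (24, 1): one scalar forecast per timestep
```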
Data source tags for terminal logs and JSON metadata.
Load prepared UCI household-power windows through the shared .npy supervised source.
Fallback sample used for defensive array access.
The CLI rejects --windows 0, so this is only here to keep helper calls total if this file gets
copied into a more experimental tutorial.
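A total accessor in this spirit (illustrative Python, not the Lean helper) looks like:

```python
def window_at(windows, i, fallback):
    # Total lookup: never raises, even on an empty list. The CLI already
    # rejects --windows 0, so the fallback only matters in reuse scenarios.
    if not windows:
        return fallback
    return windows[i % len(windows)]

print(window_at([], 3, 0.0))            # 0.0 (fallback)
print(window_at([10, 20, 30], 4, 0.0))  # 20 (index wraps around)
```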
Representative MSE over at most 32 windows.
This is just the quick "are we moving in the right direction?" number. It keeps evaluation cheap
even when --windows is large. If you want a real validation report, raise the cap or add a
separate held-out window list.
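The capped evaluation amounts to the following (illustrative sketch, not the Lean implementation):

```python
import numpy as np

def probe_mse(preds, targets, cap=32):
    # Mean squared error over at most `cap` windows: cheap even when
    # --windows is large, and stable because the window order is fixed.
    n = min(cap, len(preds))
    diff = np.asarray(preds[:n]) - np.asarray(targets[:n])
    return float(np.mean(diff ** 2))

print(probe_mse([1.0, 2.0], [1.0, 4.0]))  # 2.0
```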
Read t[row,0] from a seqLen × 1 tensor.
Probe printing uses this to show scalar values. The row is clamped so changing seqLen cannot make
the tutorial crash just because the print loop asks for too many rows.
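The clamped read can be sketched as (hypothetical Python mirror of the helper):

```python
def scalar_row(t, row):
    # t is a seqLen x 1 nested list; clamp the row index so shrinking
    # seqLen cannot push the probe's print loop out of bounds.
    r = min(row, len(t) - 1)
    return t[r][0]

t = [[0.1], [0.2], [0.3]]
print(scalar_row(t, 10))  # 0.3 (clamped to the last row)
```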
Print the first few predicted/target pairs for one offset.
This is the easiest way to debug whether the model is actually learning the curve. Try
--probe-offset 0, --probe-offset 96, or --probe-offset 144 to inspect different phases.
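A minimal version of the probe (hypothetical helper name; the real one prints from tensors):

```python
def probe_pairs(pred, target, offset=0, count=4):
    # First `count` (prediction, target) pairs starting at `offset`;
    # the offset selects which phase of the daily curve to inspect.
    return list(zip(pred[offset : offset + count],
                    target[offset : offset + count]))

pred = [0.1, 0.2, 0.3, 0.4, 0.5]
target = [0.0, 0.2, 0.4, 0.6, 0.8]
for p, t in probe_pairs(pred, target, offset=1, count=3):
    print(f"pred={p:.2f} target={t:.2f}")
```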
Example-specific training options after TorchLean.Module.run has handled CPU/CUDA flags.
- steps : ℕ
  Optimizer steps. Use `1` on CUDA for smoke; use around `200` on CUDA to see learning.
- windows : ℕ
  Number of deterministic windows to cycle through. More windows mean more seasonal coverage.
- lr : Float
  Adam learning rate. If the probe oscillates or explodes, lower this.
- probeOffset : ℕ
  Start offset for the printed before/after probe. This does not affect training.
- xPath : System.FilePath
  Prepared UCI household-power input windows. Override with `--x`.
- yPath : System.FilePath
  Prepared UCI household-power target windows. Override with `--y`.
- log
  Logging destination: JSON path, stdout-style destination, or disabled.
- logPath : System.FilePath
  Concrete default path when `log` is path-backed.
Parse example-specific flags.
Runtime flags such as --cpu, --cuda, --dtype, and --backend are handled by
TorchLean.Module.run; this parser handles data, forecasting, and logging knobs.
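An argparse mirror of those knobs (hypothetical; the flag names follow the prose above, not a confirmed CLI definition):

```python
import argparse

def parse_flags(argv):
    p = argparse.ArgumentParser()
    p.add_argument("--steps", type=int, default=1)
    p.add_argument("--windows", type=int, default=1)
    p.add_argument("--lr", type=float, default=1e-3)
    p.add_argument("--probe-offset", dest="probe_offset", type=int, default=0)
    p.add_argument("--x", dest="x_path", default=None)
    p.add_argument("--y", dest="y_path", default=None)
    p.add_argument("--log", default=None)
    args = p.parse_args(argv)
    if args.windows < 1:
        p.error("--windows must be at least 1")  # mirrors the CLI's rejection of 0
    return args

args = parse_flags(["--steps", "200", "--windows", "96"])
print(args.steps, args.windows)  # 200 96
```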
Train the LSTM forecaster and return (lossBefore, lossAfter).
The training recipe is plain on purpose:
- construct the model under `nn.withModel`;
- wrap it as a scalar MSE module;
- load UCI household-power windows;
- print a probe before training;
- run Adam over the loaded windows; and
- print the same probe after training.
Because the data is deterministic, bad changes are easy to spot: the loss should drop and the probe predictions should move toward the target values.
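That contract can be demonstrated with a stand-in model (plain gradient descent on a linear fit; the tutorial uses Adam and an LSTM, but the invariant is the same: deterministic data, so the loss must drop):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))
w_true = np.array([1.0, -2.0, 0.5])
Y = X @ w_true                       # deterministic targets

w = np.zeros(3)
def mse(w): return float(np.mean((X @ w - Y) ** 2))

loss_before = mse(w)
for _ in range(200):
    grad = 2.0 * X.T @ (X @ w - Y) / len(X)
    w -= 0.1 * grad                  # fixed data: a correct step lowers the loss
loss_after = mse(w)
print(loss_before > loss_after)  # True
```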
Executable entrypoint.
This is a numeric training tutorial, so it uses the Float runtime path. If you are proving a layer
property, use the spec/proof files; if you want to see an LSTM actually fit a small forecasting task,
run this file.