TorchLean Cross-Entropy Trainer Implementation #
One-hot cross-entropy training for classifiers, text windows, and structured logit tensors.
Cast a Float checkpoint payload to the runtime scalar selected for the current run.
Instances For
Convert a runtime parameter payload back to the public Float checkpoint format.
Instances For
Load optional checkpoint bits into a cross-entropy runner before training.
The saved file is checked against the model's parameter shapes first, then cast into the selected runtime scalar. That means stale checkpoints fail at the boundary instead of silently perturbing a training run.
Instances For
Save optional trained checkpoint bits from a cross-entropy runner.
Instances For
Run a general cross-entropy trainer directly from a public RunConfig.
This is the same direct runtime seam used by regression/classifier. The important part is what it
does not do: it does not serialize the config back into CLI flags or expose a Runner callback to
ordinary examples.
Instances For
Shared cross-entropy training core for already-parsed public runtime settings.
This core is generic over shapes. It works for byte-level language-model windows, sequence-to-sequence one-hot targets, and other supervised tasks whose target is already encoded as a one-hot tensor with the same shape expected by the model loss.
Instances For
Shared cross-entropy training core for already-parsed runtime settings.
CLI commands may select the scalar type before calling into the public trainer. This entrypoint keeps that path inside the facade instead of exposing manual module calls to examples.
Instances For
Train on an in-memory one-hot cross-entropy dataset using an explicit runtime override.
Use this when one call should temporarily override the optimizer/backend/dtype/device settings attached to the trainer.
Instances For
Train with a runtime scalar that has already been selected by the caller.
Call trainer.train when the scalar should be chosen from trainer.runConfig.dtype.
This method exists for model-zoo dispatchers that already run inside a callback like
ModelZoo.runAnyOrFloatNoCast: at that point Lean has a concrete scalar type α, not merely a
Runtime.DType tag. The important ergonomic point is that even these advanced examples still use
the public trainer surface, so they do not have to open-code module creation, loss calls, or
optimizer steps.
Instances For
Train on an in-memory one-hot cross-entropy dataset using the trainer's attached runtime settings.
This is the sequence-model implementation behind trainer.train: persistent runtime choices live on
the trainer, while step/logging choices live on TrainOptions.