TorchLean Public Datasets #
Core public dataset constructors and file-backed dataset loaders.
Stateful minibatch loader used by public dataset APIs.
Instances For
Runtime-polymorphic supervised dataset for Float tensors.
Use this for most tutorials and file-loader paths: Float data is cast into whichever runtime scalar
the command selected with --dtype.
Instances For
Runtime-polymorphic supervised dataset from an explicit sample builder.
Use this when Lean code generates samples directly rather than loading them from batched tensors,
CSV, or NPY files. Sequence windows, synthetic PDE batches, and task-specific examples can keep
their own sample logic while still returning a standard Trainer.Dataset.
Instances For
Build a singleton dataset from one runtime-polymorphic supervised sample.
Small examples can use the Trainer.Dataset surface without fixing the runtime scalar or backend
in the dataset definition itself.
Instances For
Build a singleton dataset by feeding one explicit argument into a runtime-polymorphic sample constructor.
Use this when the sample construction depends on one user-facing payload, such as a prompt string or one file-backed record.
Instances For
Build a singleton dataset from one Float sample produced inside IO.
Use this when the sample comes from a file-backed or runtime-loaded Float boundary. The public
trainer still owns the scalar/backend choice through Trainer.RunConfig and Trainer.TrainOptions.
Instances For
Runtime-polymorphic dataset from an in-memory list of Float supervised samples.
Several examples build their training windows in ordinary Float first because the source is text,
CSV, NPY, or another external boundary. This constructor keeps those examples on the public trainer
surface: the samples are still cast to the runtime-selected scalar at training time, so callers do not
have to write their own scalar-polymorphic dataset adapter.
Instances For
Array form of floatSamples.
Instances For
Convert an unbatched supervised dataset into a fixed-size batched dataset.
This is the public "minibatch the dataset first" adapter for examples that want the model itself to
own the batch axis. The returned dataset stores samples of shape dim batch σ and dim batch τ,
so it can be passed directly to Trainer.new with a batched model.
Instances For
Split a public dataset into deterministic train/test views.
This is the dataset-level analogue of torch.utils.data.random_split: the split happens after the
trainer materializes the runtime scalar, but callers stay on ordinary Trainer.Dataset values.
Instances For
Load a numeric CSV table as a dataset of fixed-size tabular regression batches.
Each CSV row is interpreted as inDim feature columns followed by outDim target columns.
The returned dataset already has the leading batch dimension expected by a model with input
shape Shape.mat batch inDim and output shape Shape.mat batch outDim.
Instances For
Runtime-polymorphic supervised regression dataset from a tensor source.
This is the public file-data analogue of torch.utils.data.TensorDataset(X, Y) for examples whose
targets are tensors rather than class labels. The source records where batched features and targets
live; the trainer materializes them at the selected scalar type.
Instances For
Runtime-polymorphic supervised dataset backed by paired .npy files.
This is the common file-backed regression/operator-learning path: xPath stores samples with shape
xDims, and yPath stores matching targets with shape yDims.
Instances For
Runtime-polymorphic one-hot classification dataset from a tensor source.
This is the public file-data analogue of torch.utils.data.TensorDataset: the source records where
features and integer labels live, and the trainer materializes them at the selected scalar type.
Instances For
Construct a supervised source from paths using the same file format for x and y.
Instances For
Load a supervised dataset by slicing dim0 from the two tensors.
This is the preferred public loader for regression/operator-learning examples, regardless of
whether the backing files are .npy or small numeric CSV tables.
Instances For
Construct a labeled source from paths using the same file format for x and y.
Instances For
Load a labeled classification dataset by slicing dim0 and one-hot encoding labels.
For CSV label vectors, store labels as a single-column table with dims = [n, 1] and use a custom
TensorSource if needed; the path constructor above is aimed at .npy label vectors.