# CSV loader tutorial (transforms + minibatches + scheduler)
This tutorial mirrors the "data first" workflow people expect from PyTorch:
- Load a dataset from disk (CSV).
- Build a transform pipeline (`Data.Transforms.Compose`).
- Wrap the per-sample dataset in a minibatch loader (`Data.batchLoader`).
- Train with a learning-rate scheduler.
Generate a small deterministic regression dataset with `python3 NN/Examples/Data/generate_toy_data.py`. The script writes `NN/Examples/Data/toy_regression.csv` with columns `x1,x2,y` (25 samples).
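The generator script itself is not reproduced in this tutorial. As a rough illustration only, a deterministic toy-regression writer could look like the following Python sketch; the coefficients, value ranges, and seed are invented here and are not necessarily what `generate_toy_data.py` actually does:

```python
# Sketch of a deterministic toy-regression CSV generator.
# Hypothetical stand-in for generate_toy_data.py; details are assumptions.
import csv
import random

def write_toy_regression(path: str, n: int = 25, seed: int = 0) -> None:
    rng = random.Random(seed)  # fixed seed => identical file on every run
    with open(path, "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(["x1", "x2", "y"])  # header matches the tutorial's columns
        for _ in range(n):
            x1 = rng.uniform(-1.0, 1.0)
            x2 = rng.uniform(-1.0, 1.0)
            y = 3.0 * x1 - 2.0 * x2 + 0.5  # noiseless linear target
            w.writerow([f"{x1:.6f}", f"{x2:.6f}", f"{y:.6f}"])

write_toy_regression("toy_regression.csv")
```

Seeding a private `random.Random` instance (rather than the module-level RNG) keeps the output reproducible even if other code touches `random` elsewhere.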
Build:

```
lake build NN.Examples.Data.Loaders.Csv
```
The tutorial code is compiled with the rest of TorchLean. For command-line model training, use the `torchlean` executable examples in `NN/Examples/Models`.
Optional flags (tutorial-specific):
- `--data-dir PATH` (default: `NN/Examples/Data`)
- `--csv PATH` (override the CSV file)
- `--seed S` (controls shuffling and model initialization)
- `--batch N`
- `--epochs E`
Public API used here:
- `Data.fromCsvSupervised`
- `Data.Transforms.Compose`
- `Data.batchLoader`
- `train.fitLoaderWith`
- `train.stepEpochLR`
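Pieced together, these calls suggest a driver like the following Lean sketch. Everything beyond the five names above is an assumption for illustration (argument names, orders, and the scalar instantiation); consult the actual TorchLean signatures before copying:

```lean
-- Hedged sketch only: argument names/orders are guesses, not the real API.
def runTutorial : IO Unit := do
  -- 1. Load supervised (inputs, target) rows from the CSV as Float samples.
  let ds ← Data.fromCsvSupervised (α := Float) "NN/Examples/Data/toy_regression.csv"
  -- 2. Wrap the per-sample dataset in a shuffled minibatch loader.
  let loader := Data.batchLoader ds (batchSize := 8) (seed := 42)
  -- 3. Fit the model, stepping the learning rate once per epoch.
  train.fitLoaderWith loader (epochs := 20) (schedule := train.stepEpochLR)
```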
```lean
def NN.Examples.Data.Loaders.Csv.loadDataset
    (csvPath : System.FilePath)
    {α : Type}
    [API.Semantics.Scalar α]
    [API.Runtime.Scalar α] :
    -- (return type not shown in this excerpt)
```
Load the CSV dataset, then apply a small input transform pipeline. The transform pipeline is written once for the chosen scalar type `α`:

- normalize (here: mean=0, std=1, so it is an easy-to-read "template"), then
- scale inputs by `0.5`.
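As a sketch, that pipeline might be assembled with `Data.Transforms.Compose` roughly as follows; the `normalize` and `scale` constructor names are hypothetical illustrations, since only `Compose` is named by this tutorial:

```lean
-- Hypothetical constructors; only Data.Transforms.Compose appears in the tutorial.
def inputPipeline : Data.Transforms.Compose Float :=
  .mk
    [ Data.Transforms.normalize (mean := 0.0) (std := 1.0)  -- identity: a template to edit
    , Data.Transforms.scale 0.5 ]                           -- then halve every input
```

With mean 0 and std 1 the normalize stage changes nothing, which is why the tutorial calls it a template: swap in your dataset's real statistics and the rest of the pipeline is unchanged.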