CIFAR10-style image loader tutorial (NPY, offline)
This tutorial mirrors a classic PyTorch recipe:
- Load a labeled image dataset from disk (`.npy` exported from NumPy/PyTorch).
- Split into train/test.
- Build a small CNN by explicitly stacking layers.
- Train for multiple epochs over shuffled minibatches and report loss.
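The split and shuffled-minibatch steps listed above can be sketched in plain Python. This is only an illustration of the idea, not the tutorial's Lean implementation: `split_indices` and `minibatches` are hypothetical helper names, the batch size of 32 is an assumed value, and whether the real loop keeps or drops a short final batch is not stated here.

```python
import random


def split_indices(n_total, train_size, seed):
    """Shuffle row indices once with a seeded RNG, then cut into train/test."""
    rng = random.Random(seed)
    order = list(range(n_total))
    rng.shuffle(order)
    return order[:train_size], order[train_size:]


def minibatches(indices, batch, rng):
    """Yield minibatches of row indices in a fresh random order.

    The last batch is shorter when len(indices) is not a multiple of batch.
    """
    shuffled = list(indices)
    rng.shuffle(shuffled)
    for start in range(0, len(shuffled), batch):
        yield shuffled[start:start + batch]


# 200 rows and a 160-row training split, as in the toy dataset defaults.
train_idx, test_idx = split_indices(n_total=200, train_size=160, seed=0)

epoch_rng = random.Random(0)
for epoch in range(2):
    n_batches = sum(1 for _ in minibatches(train_idx, 32, epoch_rng))
    print(f"epoch {epoch}: {n_batches} minibatches")  # 160 / 32 = 5 each epoch
```

Seeding both the split and the per-epoch shuffle is what makes a run repeatable: the same seed yields the same train/test partition and the same batch order.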
To keep this runnable without network downloads, generate a small deterministic "CIFAR10-shaped" dataset locally:
- `NN/Examples/Data/toy_cifar10like_X.npy`: shape `(200, 3, 32, 32)`, dtype `float32`
- `NN/Examples/Data/toy_cifar10like_y.npy`: shape `(200,)`, dtype `float32`, labels `0..9`
Generate it with:
python3 NN/Examples/Data/generate_toy_data.py
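Before passing a row count such as `--n-total`, it can help to confirm what a `.npy` file actually contains. The sketch below reads only the file header, using the standard library and the published NPY layout (magic string, version bytes, header length, then a Python dict literal). `read_npy_header` and `write_npy_f32` are hypothetical helpers written for this illustration, and the tiny temporary file stands in for the real arrays; this is not how `generate_toy_data.py` is implemented.

```python
import ast
import os
import struct
import tempfile

MAGIC = b"\x93NUMPY"


def write_npy_f32(path, shape, values):
    """Write little-endian float32 values as a version 1.0 .npy file."""
    header = "{'descr': '<f4', 'fortran_order': False, 'shape': %r, }" % (tuple(shape),)
    # Pad with spaces so that magic (6) + version (2) + length (2) + header
    # ends on a 64-byte boundary; the header itself ends with a newline.
    pad = -(len(MAGIC) + 2 + 2 + len(header) + 1) % 64
    header_bytes = (header + " " * pad + "\n").encode("latin1")
    with open(path, "wb") as f:
        f.write(MAGIC + bytes([1, 0]) + struct.pack("<H", len(header_bytes)))
        f.write(header_bytes)
        f.write(struct.pack("<%df" % len(values), *values))


def read_npy_header(path):
    """Return (dtype descr, fortran_order, shape) without reading the data."""
    with open(path, "rb") as f:
        if f.read(6) != MAGIC:
            raise ValueError(f"{path} is not a .npy file")
        major, _minor = f.read(2)
        # Version 1.x stores the header length in 2 bytes, later versions in 4.
        if major == 1:
            (header_len,) = struct.unpack("<H", f.read(2))
        else:
            (header_len,) = struct.unpack("<I", f.read(4))
        header = ast.literal_eval(f.read(header_len).decode("latin1").strip())
    return header["descr"], header["fortran_order"], header["shape"]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_X.npy")
    write_npy_f32(path, (4, 3, 2, 2), [0.0] * 48)
    descr, fortran_order, shape = read_npy_header(path)
    print(descr, fortran_order, shape)  # <f4 False (4, 3, 2, 2)
```

For the toy image file one would expect the descr `<f4` and the shape `(200, 3, 32, 32)`; the first shape entry is the row count. With NumPy installed, `numpy.load(path, mmap_mode="r").shape` gives the same information.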
Build:
lake build NN.Examples.Data.Loaders.Cifar10Images
For command-line CIFAR training, use `torchlean cnn`, `torchlean resnet`, or `torchlean vit` with `--x`, `--y`, and `--n-total`.
Optional flags (tutorial-specific):
- `--data-dir PATH` (default: `NN/Examples/Data`)
- `--real-cifar10` (use `data/real/cifar10/cifar10_train_*.npy`, as prepared by `scripts/datasets/download_example_data.py --cifar10`)
- `--x PATH`, `--y PATH` (override the `.npy` files)
- `--n-total N` (number of rows in the selected `.npy` files; default `200`)
- `--seed S` (controls split + shuffling + model initialization)
- `--batch N`
- `--epochs E`
- `--lr LR` (default: `0.001`)
- `--log-every N` (default: `1`; pass `0` to silence per-step loss)
- `--train-size N` (default: `160`)
- `--quick` (load/split smoke test; skip the expensive eager CPU CNN training loop)
Why this tutorial matters:
- it shows the public `API.Data` file-loading path rather than only in-memory tensors;
- it keeps the model architecture familiar (Conv/ReLU/Pool stack + classifier head);
- it demonstrates the "offline artifact" workflow many PyTorch users already have, where arrays have been pre-exported to `.npy` and training happens without any dataset download step.
def NN.Examples.Data.Loaders.Cifar10Images.mkModel {batch : ℕ} : API.nn.M (API.nn.Sequential (Shape.Images batch channels height width) (Tensor.shapeOfDims [batch, classes]))
Small CNN (no BatchNorm): Conv -> ReLU -> Pool -> Conv -> ReLU -> Pool -> Linear(10).
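To see how a Conv -> ReLU -> Pool -> Conv -> ReLU -> Pool stack determines the input width of the final linear layer, the shape arithmetic can be worked through in a few lines of Python. The hyperparameters here are assumptions chosen for illustration (3x3 convolutions with padding 1, 2x2 pooling with stride 2, 16 output channels after the second convolution); the kernel sizes and channel counts `mkModel` actually uses are not stated on this page.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(height, width, out_channels,
                       conv_kernel=3, conv_padding=1, pool=2):
    """Feature count entering the linear head after two Conv -> ReLU -> Pool stages.

    ReLU is elementwise, so it does not change any shape.
    """
    for _ in range(2):
        height = conv_out(height, conv_kernel, 1, conv_padding)
        width = conv_out(width, conv_kernel, 1, conv_padding)
        height = conv_out(height, pool, pool)
        width = conv_out(width, pool, pool)
    return out_channels * height * width


# 32x32 input: 32 -> 32 -> 16 -> 16 -> 8 per axis, so 16 * 8 * 8 features.
print(flattened_features(32, 32, out_channels=16))  # 1024
```

Under these assumed settings the classifier head would map 1024 features to the 10 class scores per image, matching the `[batch, classes]` output shape in the signature.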
def NN.Examples.Data.Loaders.Cifar10Images.loadDataset (xPath yPath : System.FilePath) (n : ℕ) {α : Type} [API.Semantics.Scalar α] [API.Runtime.Scalar α] :
Load the offline CIFAR10-like .npy dataset at the runtime-selected scalar type α.