Diffusion Training Example #
Runnable torchlean diffusion example.
This is the maintained diffusion command. It supports two real-data modes:
- `--dataset imagenet64` (default): user-provided ImageNet/Imagenette/Tiny-ImageNet-style images converted to `(N, 3, 64, 64)` `.npy` tensors.
- `--dataset cifar10`: prepared CIFAR-10 `(N, 3, 32, 32)` arrays.
The command is one public entrypoint, but the implementation keeps separate typed branches because Lean tracks image height and width in the tensor type.
Why unconditional samples are still rough #
The default epsilon predictor is a small same-resolution residual CNN with a broadcast time channel. That is enough to validate real image loading, CUDA training, logging, reconstruction diagnostics, and DDIM replay from Lean. It is deliberately not advertised as a high-fidelity image generator: good unconditional samples need a full U-Net with multiscale skips, richer timestep embeddings, EMA (exponentially averaged weights), more training, more timesteps, and runtime support that avoids eager-autograd buffer blow-up for wider models.
Examples #
Prepare ImageNet-style data:
```bash
python3 scripts/datasets/torchlean_data_convert.py image-folder \
  --input /path/to/imagenet/train \
  --x-output data/real/imagenet64/imagenet64_train_X.npy \
  --y-output data/real/imagenet64/imagenet64_train_y.npy \
  --height 64 --width 64 --labels-from-dirs --limit 800
```
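After conversion, it is worth sanity-checking the arrays before a long training run. A minimal NumPy check (illustrative; the paths come from the command above, and the `(N, 3, 64, 64)` NCHW layout in `[0, 1]` is what the loaders expect):

```python
import numpy as np

# Converted images: NCHW arrays in [0, 1]; the training path later remaps
# them to the diffusion range [-1, 1].
X = np.load("data/real/imagenet64/imagenet64_train_X.npy")
y = np.load("data/real/imagenet64/imagenet64_train_y.npy")
assert X.ndim == 4 and X.shape[1:] == (3, 64, 64)
assert len(y) == len(X)
print(X.shape, X.dtype, float(X.min()), float(X.max()))
```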
Train on ImageNet64 and save visual artifacts:
```bash
lake build -R -K cuda=true
CUDA_VISIBLE_DEVICES=0 lake exe -K cuda=true torchlean diffusion --cuda --fast-kernels \
  --dataset imagenet64 --n-total 800 --steps 1000 --hidden-c 8 --T 100 --beta-end 0.12 \
  --reference-ppm data/model_zoo/diffusion_reference.ppm \
  --noisy-ppm data/model_zoo/diffusion_noisy.ppm \
  --reconstruct-ppm data/model_zoo/diffusion_reconstruct.ppm \
  --sample-ppm data/model_zoo/diffusion_sample.ppm
```
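The `.ppm` artifacts open in most image viewers; if you want PNGs, a one-off Pillow conversion works (Pillow is an assumption here, not a project dependency):

```python
from PIL import Image

# Convert the sampled artifact to PNG for easier sharing.
Image.open("data/model_zoo/diffusion_sample.ppm").save("diffusion_sample.png")
```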
CIFAR-10 smoke-test path:
```bash
python3 scripts/datasets/download_example_data.py --cifar10
lake exe torchlean diffusion --dataset cifar10 --cuda --fast-kernels --steps 200
```
Build the default epsilon predictor for a specific typed image shape.

We use the residual CNN from `NN.API.Models.Diffusion`: it is still small enough for tutorial-scale CUDA runs, but the skip paths train much better than the plain convolution chain. The plain `epsConvNet` remains in the API as the smaller baseline; this example uses the residual default so the documented command matches the maintained training path.
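For intuition, here is a minimal PyTorch sketch of the same idea: a same-resolution residual CNN whose timestep input is broadcast to a constant extra channel. Layer names, widths, and the exact skip wiring are assumptions for illustration, not the torchlean definition:

```python
import torch
import torch.nn as nn

class TinyEpsNet(nn.Module):
    """Same-resolution residual epsilon predictor with a broadcast time channel."""
    def __init__(self, hidden_c: int = 8):
        super().__init__()
        self.inp = nn.Conv2d(3 + 1, hidden_c, 3, padding=1)  # 3 image + 1 time channel
        self.mid = nn.Conv2d(hidden_c, hidden_c, 3, padding=1)
        self.out = nn.Conv2d(hidden_c, 3, 3, padding=1)

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        n, _, h, w = x.shape
        # Broadcast the normalized timestep to an H×W plane and concatenate.
        t_plane = t.view(n, 1, 1, 1).expand(n, 1, h, w)
        h0 = torch.relu(self.inp(torch.cat([x, t_plane], dim=1)))
        h1 = torch.relu(self.mid(h0)) + h0  # the residual skip path
        return self.out(h1)
```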
Map converted image tensors from [0,1] into the standard diffusion range [-1,1].
The input is already NCHW because the dataset converter and RealData loaders enforce that layout.
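The remap itself is the affine map x ↦ 2x − 1. In NumPy terms (an illustrative one-liner, not the Lean definition):

```python
import numpy as np

def to_diffusion_range(x01: np.ndarray) -> np.ndarray:
    # [0, 1] -> [-1, 1]; the inverse (x + 1.0) / 2.0 recovers displayable pixels.
    return 2.0 * x01 - 1.0
```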
Reverse DDIM from a chosen timestep for reconstruction diagnostics.
This is intentionally separate from unconditional sampling. It lets us corrupt a real image to a moderate timestep, denoise from there, and check whether reconstruction improves over the noisy input.
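This diagnostic is the standard deterministic DDIM (η = 0) recursion: estimate x̂₀ = (x_t − √(1−ᾱ_t)·ε̂) / √ᾱ_t, then step to x_{t−1} = √ᾱ_{t−1}·x̂₀ + √(1−ᾱ_{t−1})·ε̂. A NumPy sketch of the forward corruption plus the reverse loop (`eps_model` and `alpha_bar` are placeholder names, not torchlean identifiers):

```python
import numpy as np

def forward_corrupt(x0, t, alpha_bar, rng):
    # q(x_t | x_0): corrupt a clean image to timestep t.
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

def ddim_reverse_from(x_t, t_start, alpha_bar, eps_model):
    # Deterministic DDIM (eta = 0) from t_start down to 0.
    x = x_t
    for t in range(t_start, 0, -1):
        eps_hat = eps_model(x, t)
        x0_hat = (x - np.sqrt(1.0 - alpha_bar[t]) * eps_hat) / np.sqrt(alpha_bar[t])
        x = np.sqrt(alpha_bar[t - 1]) * x0_hat + np.sqrt(1.0 - alpha_bar[t - 1]) * eps_hat
    return x
```

Reconstruction quality is then judged by comparing `ddim_reverse_from(forward_corrupt(x0, t, ...), t, ...)` against both the clean image and the noisy input.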
Configuration for the shared training loop:

- steps : ℕ
- logEvery : ℕ
- lr : Float
- T : ℕ
- betaStart : Float
- betaEnd : Float
- samplePpm? : Option System.FilePath
- referencePpm? : Option System.FilePath
- noisyPpm? : Option System.FilePath
- reconstructPpm? : Option System.FilePath
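`betaStart`, `betaEnd`, and `T` pin down the noise schedule. Assuming a linear schedule between the endpoints (an assumption; the structure only exposes the two values), the derived ᾱ values used by the forward and reverse passes look like:

```python
import numpy as np

def linear_schedule(beta_start: float, beta_end: float, T: int):
    # Linear beta schedule (assumed form); alpha_bar drives both the forward
    # corruption and the DDIM steps sketched earlier.
    betas = np.linspace(beta_start, beta_end, T)
    alpha_bar = np.cumprod(1.0 - betas)
    return betas, alpha_bar

# --T 100 --beta-end 0.12 from the example command; beta_start is a placeholder.
betas, alpha_bar = linear_schedule(1e-4, 0.12, 100)
```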
Shared training loop for both CIFAR-10 and ImageNet64 branches.
The loop optimizes epsilon prediction and can emit four visual artifacts:
- `reference-ppm`: the clean evaluation image,
- `noisy-ppm`: the clean image after forward diffusion to `reconstruct-step`,
- `reconstruct-ppm`: DDIM denoising from that timestep,
- `sample-ppm`: an unconditional DDIM sample from Gaussian noise.
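The optimization itself is plain epsilon regression. One step in the same NumPy pseudocode as above (a sketch only; the real loop runs through torchlean's CUDA autograd, and `eps_model` stands in for the typed predictor):

```python
import numpy as np

def eps_loss(x0, alpha_bar, eps_model, rng):
    # Pick a timestep, corrupt x0 forward, and regress the injected noise.
    t = int(rng.integers(1, len(alpha_bar)))
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return np.mean((eps_model(x_t, t) - eps) ** 2)  # MSE to minimize
```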
The dataset selector for the public entrypoint:

- imagenet64 : DatasetChoice
- cifar10 : DatasetChoice