
NN.Examples.Models.Generative.Diffusion

Diffusion Training Example #

Runnable torchlean diffusion example.

This is the maintained diffusion command. It supports two real-data modes:

• --dataset cifar10: the CIFAR-10 smoke-test path,
• --dataset imagenet64: ImageNet-style data converted to 64x64 with the dataset converter script.

The command is one public entrypoint, but the implementation keeps separate typed branches because Lean tracks image height and width in the tensor type.
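
A minimal sketch of that split (hypothetical names: Image and trainBranch stand in for the real typed tensors and branch functions, which are not shown on this page): the dataset tag selects a branch whose height and width are fixed as type-level naturals, so the branches cannot mix tensor shapes:

structure Image (c h w : Nat) where
  data : Array Float  -- c * h * w values in NCHW order

def trainBranch {c h w : Nat} (imgs : Array (Image c h w)) : IO Unit :=
  IO.println s!"training on {imgs.size} images of shape {c}x{h}x{w}"

def runDiffusion (dataset : String) : IO Unit :=
  match dataset with
  | "cifar10"    => trainBranch (c := 3) (h := 32) (w := 32) #[]
  | "imagenet64" => trainBranch (c := 3) (h := 64) (w := 64) #[]
  | _            => throw (IO.userError s!"unknown dataset: {dataset}")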

Why unconditional samples are still rough #

The default epsilon predictor is a small same-resolution residual CNN with a broadcast time channel. That is enough to validate real image loading, CUDA training, logging, reconstruction diagnostics, and DDIM replay from Lean. It is deliberately not advertised as a high-fidelity image generator: good unconditional samples need a full U-Net with multiscale skips, richer timestep embeddings, EMA, more training, more timesteps, and runtime support that avoids eager-autograd buffer blow-up for wider models.

Examples #

Prepare ImageNet-style data:

python3 scripts/datasets/torchlean_data_convert.py image-folder \
  --input /path/to/imagenet/train \
  --x-output data/real/imagenet64/imagenet64_train_X.npy \
  --y-output data/real/imagenet64/imagenet64_train_y.npy \
  --height 64 --width 64 --labels-from-dirs --limit 800

Train on ImageNet64 and save visual artifacts:

lake build -R -K cuda=true
CUDA_VISIBLE_DEVICES=0 lake exe -K cuda=true torchlean diffusion --cuda --fast-kernels \
  --dataset imagenet64 --n-total 800 --steps 1000 --hidden-c 8 --T 100 --beta-end 0.12 \
  --reference-ppm data/model_zoo/diffusion_reference.ppm \
  --noisy-ppm data/model_zoo/diffusion_noisy.ppm \
  --reconstruct-ppm data/model_zoo/diffusion_reconstruct.ppm \
  --sample-ppm data/model_zoo/diffusion_sample.ppm
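
In the command above, --T 100 sets the number of diffusion timesteps and --beta-end 0.12 the final per-step noise level. As an illustration only (the linear schedule shape and the 1e-4 starting beta are assumptions, not read from the implementation), the cumulative products alphaBar_t used by noising and sampling can be derived like this:

-- Illustrative only: assumes a linear beta schedule from betaStart to betaEnd.
-- alphaBars[t] is the running product of (1 - beta i) for i <= t.
def mkAlphaBars (T : Nat) (betaStart betaEnd : Float) : Array Float := Id.run do
  let mut bars := Array.mkEmpty T
  let mut prod := 1.0
  for t in [0:T] do
    let beta := betaStart + (betaEnd - betaStart) * t.toFloat / (T - 1).toFloat
    prod := prod * (1.0 - beta)
    bars := bars.push prod
  return bars

-- With T = 100 and betaEnd = 0.12, the final alphaBar is close to zero,
-- i.e. the forward process ends at (almost) pure Gaussian noise.
#eval (mkAlphaBars 100 1e-4 0.12)[99]!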

CIFAR smoke path:

python3 scripts/datasets/download_example_data.py --cifar10
lake exe torchlean diffusion --dataset cifar10 --cuda --fast-kernels --steps 200

def NN.Examples.Models.Generative.Diffusion.mkModel (c h w hiddenC : ℕ) [NeZero c] [NeZero h] [NeZero w] (h_hiddenC : hiddenC ≠ 0) :

Build the default epsilon predictor for a specific typed image shape.

We use the residual CNN from NN.API.Models.Diffusion: it is still small enough for tutorial-scale CUDA runs, but the skip paths train much better than the plain convolution chain. The plain epsConvNet remains in the API as the smaller baseline; this example uses the residual default so the documented command matches the maintained training path.


Map converted image tensors from [0,1] into the standard diffusion range [-1,1].

The input is already NCHW because the dataset converter and RealData loaders enforce that layout.
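
The map is an affine rescale, presumably applied in reverse when writing PPM artifacts back out; on a single scalar value (illustrative):

def toDiffusionRange (x : Float) : Float := 2.0 * x - 1.0   -- [0,1] → [-1,1]
def toImageRange     (x : Float) : Float := (x + 1.0) / 2.0 -- [-1,1] → [0,1]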

def NN.Examples.Models.Generative.Diffusion.mkNoisedSample {c h w : ℕ} (alphaBars : Array Float) (T : ℕ) (x0 : Spec.Tensor Float (x0Shape c h w)) (seed step : ℕ) :
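
Given alphaBar_t from alphaBars, the closed-form forward process is x_t = sqrt(alphaBar_t) * x0 + sqrt(1 - alphaBar_t) * eps. A flat-buffer sketch (illustrative only; the real function works on the typed tensor above):

-- Closed-form forward diffusion on a flat buffer (illustrative).
def noiseTo (alphaBarT : Float) (x0 eps : Array Float) : Array Float := Id.run do
  let a := Float.sqrt alphaBarT
  let b := Float.sqrt (1.0 - alphaBarT)
  let mut xt := Array.mkEmpty x0.size
  for i in [0:x0.size] do
    xt := xt.push (a * x0[i]! + b * eps[i]!)
  return xt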

Reverse DDIM from a chosen timestep for reconstruction diagnostics.

This is intentionally separate from unconditional sampling. It lets us corrupt a real image to a moderate timestep, denoise from there, and check whether reconstruction improves over the noisy input.
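
For reference, one deterministic DDIM step (eta = 0) from timestep t to an earlier timestep s first recovers the clean-image estimate implied by the predicted noise, then re-noises it deterministically to s. A flat-buffer sketch under the same illustrative assumptions as above:

-- One DDIM step t → s (s < t), eta = 0, on a flat buffer (illustrative).
def ddimStep (aBarT aBarS : Float) (xt epsHat : Array Float) : Array Float := Id.run do
  let mut xs := Array.mkEmpty xt.size
  for i in [0:xt.size] do
    -- clean-image estimate implied by the predicted noise at timestep t
    let x0Hat := (xt[i]! - Float.sqrt (1.0 - aBarT) * epsHat[i]!) / Float.sqrt aBarT
    -- deterministic re-noising to the target timestep s
    xs := xs.push (Float.sqrt aBarS * x0Hat + Float.sqrt (1.0 - aBarS) * epsHat[i]!)
  return xs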


Shared training loop for both CIFAR-10 and ImageNet64 branches.

The loop optimizes epsilon prediction (the objective is sketched after this list) and can emit four visual artifacts:

• reference-ppm: clean evaluation image,
• noisy-ppm: clean image after forward diffusion to reconstruct-step,
• reconstruct-ppm: DDIM denoising from that timestep,
• sample-ppm: unconditional DDIM sample from Gaussian noise.
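
Here "optimizes epsilon prediction" means regressing the injected noise: the model sees the noised sample and the timestep and must reproduce the noise that was added. A plain mean-squared-error version of the per-example loss (illustrative; any extra weighting used by the actual loop is not shown on this page):

-- MSE between true and predicted noise over a flat buffer (illustrative).
def epsLoss (eps epsHat : Array Float) : Float := Id.run do
  let mut s := 0.0
  for i in [0:eps.size] do
    let d := eps[i]! - epsHat[i]!
    s := s + d * d
  return s / eps.size.toFloat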