Diffusion Model Helpers (API)
Config-style diffusion model constructors plus reusable, dataset-independent DDPM/DDIM helpers.
The runnable examples decide where data comes from (CIFAR-10, ImageNet-style folders, synthetic fixtures). The definitions here are shape-parametric and can be reused by tests, examples, and future proof-facing specifications.
Build a minimal epsilon-predictor conv net:
conv -> relu -> conv -> relu -> conv -> relu -> conv.
This stays compact enough for the eager CUDA example while giving the CIFAR trainer more denoising capacity than a bare two-layer smoke-test network.
Build a stronger same-resolution residual epsilon predictor.
Architecture:
stem conv -> relu -> residual block -> relu -> residual block -> relu -> output conv
Each residual block maps hiddenC×H×W to hiddenC×H×W and computes
x + conv(relu(conv(x))). This compact residual denoiser omits U-Net downsampling, upsampling,
and multi-scale skip concatenation. It is still a useful tutorial-scale architecture because
residual paths make the denoiser much easier to optimize than a plain conv chain while staying within
the eager CUDA memory envelope used by the examples.
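To make the residual formula x + conv(relu(conv(x))) concrete, here is a minimal pure-Python sketch of one block on a single hiddenC×H×W image stored as nested lists. The 3×3 kernel, zero padding, stride 1, and missing bias are assumptions made for illustration, since the text does not state them, and the names `conv3x3_same`, `relu`, and `residual_block` are hypothetical rather than the library's API.

```python
def conv3x3_same(x, weight):
    """Assumed 3x3, stride-1, zero-padded convolution without bias.

    x has shape [C_in][H][W]; weight has shape [C_out][C_in][3][3].
    """
    c_in, height, width = len(x), len(x[0]), len(x[0][0])
    out = []
    for w_out in weight:
        plane = [[0.0] * width for _ in range(height)]
        for ci in range(c_in):
            for y in range(height):
                for col in range(width):
                    acc = 0.0
                    for ky in range(3):
                        for kx in range(3):
                            yy, xx = y + ky - 1, col + kx - 1
                            # Positions outside the image count as zero.
                            if 0 <= yy < height and 0 <= xx < width:
                                acc += w_out[ci][ky][kx] * x[ci][yy][xx]
                    plane[y][col] += acc
        out.append(plane)
    return out


def relu(x):
    return [[[max(0.0, v) for v in row] for row in channel] for channel in x]


def residual_block(x, w1, w2):
    """Compute x + conv(relu(conv(x))); the output shape equals the input shape."""
    branch = conv3x3_same(relu(conv3x3_same(x, w1)), w2)
    return [
        [[a + b for a, b in zip(row_x, row_b)] for row_x, row_b in zip(ch_x, ch_b)]
        for ch_x, ch_b in zip(x, branch)
    ]


def identity_kernels(channels):
    """Kernels that copy each channel through unchanged (center tap = 1)."""
    return [
        [[[1.0 if (o == i and ky == 1 and kx == 1) else 0.0 for kx in range(3)]
          for ky in range(3)] for i in range(channels)]
        for o in range(channels)
    ]


# Usage on a 2x2x2 image (hiddenC = 2).
x = [[[1.0, -2.0], [3.0, 4.0]], [[0.5, 0.0], [-1.0, 2.0]]]
zero = [[[[0.0] * 3 for _ in range(3)] for _ in range(2)] for _ in range(2)]
same = residual_block(x, zero, zero)  # skip path only
ident = identity_kernels(2)
boosted = residual_block(x, ident, ident)  # x + relu(x)
```

With all-zero weights the block reduces to the identity map. That is the property that lets a freshly initialized residual stack pass the signal through instead of scrambling it.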
Append a constant time channel to an NCHW image batch.
The epsilon predictor consumes (data channels + 1) channels: noisy image channels plus a scalar
timestep broadcast over spatial positions.
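The channel append can be sketched in a few lines of pure Python over nested lists with shape [N][C][H][W]. The function name `append_time_channel` is hypothetical, and how the scalar timestep is scaled before being broadcast (for example t / T) is left to the caller, because the text does not specify it.

```python
def append_time_channel(batch, t_value):
    """Return a copy of an NCHW batch with one extra constant channel.

    batch is a nested list with shape [N][C][H][W]; the result has shape
    [N][C + 1][H][W]. The input batch is not modified.
    """
    out = []
    for image in batch:
        height = len(image[0])
        width = len(image[0][0])
        # One H x W plane holding the same scalar at every spatial position.
        time_plane = [[float(t_value)] * width for _ in range(height)]
        copied = [[row[:] for row in channel] for channel in image]
        out.append(copied + [time_plane])
    return out


# Usage: a 1x3x2x2 RGB batch becomes 1x4x2x2.
batch = [[[[0.0, 0.1], [0.2, 0.3]] for _ in range(3)]]
with_time = append_time_channel(batch, 0.5)
```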
Build an epsilon-prediction training sample from explicit noise.
The caller supplies eps, usually from the runtime RNG. Keeping randomness outside this helper
makes the transformation reusable:
x_t = sqrt(ᾱ_t) * x₀ + sqrt(1 - ᾱ_t) * eps, with eps as the regression target.
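A minimal sketch of this forward-noising step, assuming the image and the noise are flattened into equal-length lists of floats. The name `make_eps_training_sample` and the use of Python's `random` module as a stand-in for the runtime RNG are illustrative assumptions.

```python
import math
import random


def make_eps_training_sample(x0, eps, alpha_bar_t):
    """Return (x_t, target) for epsilon-prediction training.

    x0 and eps are equal-length lists of floats; alpha_bar_t is the
    cumulative schedule value for the chosen timestep, in (0, 1].
    """
    if len(x0) != len(eps):
        raise ValueError("x0 and eps must have the same length")
    signal_scale = math.sqrt(alpha_bar_t)
    noise_scale = math.sqrt(1.0 - alpha_bar_t)
    x_t = [signal_scale * x + noise_scale * e for x, e in zip(x0, eps)]
    # The network is trained to recover the noise that was mixed in.
    return x_t, list(eps)


# Usage: the caller owns the randomness, so a fixed seed reproduces the sample.
rng = random.Random(0)
x0 = [-1.0, 0.0, 0.5, 1.0]
eps = [rng.gauss(0.0, 1.0) for _ in x0]
x_t, target = make_eps_training_sample(x0, eps, alpha_bar_t=0.64)
```

Because the helper takes eps as an argument, a test can pass hand-picked noise and check the mixing weights exactly.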
One deterministic DDIM reverse update (η = 0).
Given x_t, predicted epsilon, and adjacent schedule values, this estimates x₀ and remixes it to
the previous timestep.
We clamp the intermediate x₀ estimate to the training image range [-1,1]. This is the standard
"clipped denoised" stabilizer used by many DDPM/DDIM samplers: without it, a small tutorial model can
drive one color channel far outside the data range and the final PPM exporter merely clips the
damage into saturated color blobs.
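The update can be sketched as follows for flattened tensors, using the standard η = 0 DDIM form. The name `ddim_step` is hypothetical, and this sketch reuses the predicted epsilon unchanged after clamping; whether the actual helper instead recomputes epsilon from the clamped x₀ is not stated in the text.

```python
import math


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM reverse update (eta = 0) on flat lists of floats."""
    sqrt_ab_t = math.sqrt(alpha_bar_t)
    sqrt_one_minus_ab_t = math.sqrt(1.0 - alpha_bar_t)
    sqrt_ab_prev = math.sqrt(alpha_bar_prev)
    sqrt_one_minus_ab_prev = math.sqrt(1.0 - alpha_bar_prev)

    x_prev = []
    for x, e in zip(x_t, eps_pred):
        # Invert the forward mixing to estimate the clean image value.
        x0_hat = (x - sqrt_one_minus_ab_t * e) / sqrt_ab_t
        # Clamp the estimate to the training image range [-1, 1].
        x0_hat = max(-1.0, min(1.0, x0_hat))
        # Remix the estimate at the noise level of the previous timestep.
        x_prev.append(sqrt_ab_prev * x0_hat + sqrt_one_minus_ab_prev * e)
    return x_prev


# Usage: with a perfect epsilon and alpha_bar_prev = 1, the step recovers x0.
x0 = [0.5, -0.25]
eps = [0.3, -1.2]
scale = math.sqrt(0.5)
x_t = [scale * a + scale * b for a, b in zip(x0, eps)]
recovered = ddim_step(x_t, eps, alpha_bar_t=0.5, alpha_bar_prev=1.0)

# An out-of-range estimate (10 / 0.5 = 20) is clamped to 1.
clamped = ddim_step([10.0], [0.0], alpha_bar_t=0.25, alpha_bar_prev=1.0)
```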
Write the first image in an RGB NCHW batch as an ASCII PPM.
This dependency-free writer is for example artifacts and quick visual checks, not high-throughput image export.
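As a rough illustration, an ASCII (P3) PPM writer fits in a short pure-Python function. The name `batch_to_ascii_ppm` is hypothetical, and the linear mapping from [-1, 1] to 0..255 is an assumption based on the training image range mentioned above; the actual exporter may scale differently.

```python
def batch_to_ascii_ppm(batch):
    """Render the first image of an RGB NCHW batch as ASCII PPM (P3) text.

    batch is a nested list with shape [N][3][H][W] and values nominally in
    [-1, 1]; out-of-range values are clipped before conversion.
    """
    image = batch[0]
    if len(image) != 3:
        raise ValueError("expected exactly 3 channels (RGB)")
    red, green, blue = image
    height, width = len(red), len(red[0])

    def to_byte(value):
        clipped = max(-1.0, min(1.0, value))
        return int(round((clipped + 1.0) * 127.5))

    # Header: magic number, "width height", maximum channel value.
    lines = ["P3", f"{width} {height}", "255"]
    for y in range(height):
        for x in range(width):
            # One pixel per line keeps every line short.
            lines.append(" ".join(str(to_byte(c[y][x])) for c in (red, green, blue)))
    return "\n".join(lines) + "\n"


# Usage: a 1x3x1x2 batch holding a red pixel and an over-range green pixel.
batch = [[[[1.0, -1.0]], [[-1.0, 2.0]], [[-1.0, -3.0]]]]
ppm = batch_to_ascii_ppm(batch)
```

The returned string can be written to a `.ppm` file with an ordinary text-mode `open(...).write(ppm)`.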