# Vector-quantized VAE (VQ-VAE) spec
VQ-VAE replaces a continuous latent sample with a discrete codebook lookup. This file exposes the core mechanism in a theorem-friendly way:
- an encoder produces a continuous latent z_e(x);
- a code index selects a codebook vector z_q;
- a decoder reconstructs from z_q;
- the loss combines reconstruction, codebook, and commitment terms.
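In the original paper, these terms combine into the following objective, where sg denotes stop-gradient (at the spec level here the stop-gradients are dropped and the codebook term is written symmetrically):

```latex
\mathcal{L}(x) \;=\;
  \underbrace{\lVert x - \mathrm{dec}(z_q) \rVert^2}_{\text{reconstruction}}
  \;+\;
  \underbrace{\lVert \mathrm{sg}[z_e(x)] - z_q \rVert^2}_{\text{codebook}}
  \;+\;
  \beta \, \underbrace{\lVert z_e(x) - \mathrm{sg}[z_q] \rVert^2}_{\text{commitment}}
```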
The nearest-neighbor assignment is deliberately an explicit Fin numCodes argument. That keeps
the spec total and avoids hiding tie-breaking policy in the mathematical layer; runtime code can
compute the index however it likes and then pass the verified index into this spec.
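A minimal Lean sketch of this design (the name quantize is hypothetical; only the explicit Fin numCodes index is taken from the spec):

```lean
-- Sketch: quantization is a codebook lookup at an explicitly supplied index.
-- The argmin over codes, including any tie-breaking, lives in runtime code;
-- this function stays total and policy-free.
def quantize {numCodes : ℕ} {Latent : Type}
    (codebook : Fin numCodes → Latent) (k : Fin numCodes) : Latent :=
  codebook k
```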
Reference:
- van den Oord, Vinyals, and Kavukcuoglu (2017), "Neural Discrete Representation Learning".
Encoder producing the pre-quantized latent vector z_e(x).
- forward : Spec.Tensor α obs → Spec.Tensor α latent
  Continuous encoder output before codebook lookup.
Decoder mapping a codebook vector back to observation space.
- forward : Spec.Tensor α latent → Spec.Tensor α obs
  Decode a quantized latent vector.
VQ-VAE model: encoder, codebook, and decoder.
- encoder : Encoder α obs latent
  Continuous encoder.
- codebook : Latent.Codebook α numCodes latent
  Finite codebook.
- decoder : Decoder α latent obs
  Decoder from quantized latent vectors.
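The fields above suggest a structure along these lines (a hedged sketch; the parameter order and types are assumptions):

```lean
-- Sketch of the model record, bundling the three components.
structure VQVAE (α : Type) (numCodes : ℕ) (obs latent : Type) where
  encoder  : Encoder α obs latent
  codebook : Latent.Codebook α numCodes latent
  decoder  : Decoder α latent obs
```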
Pre-quantized latent z_e(x).
Quantized latent z_q, using an explicit code index.
VQ-VAE reconstruction from an explicit code assignment.
Reconstruction term ||dec(z_q)-x||².
Codebook term ||z_q-z_e||², written symmetrically at spec level.
Commitment term ||z_e-z_q||², weighted by β in the total objective.
VQ-VAE objective: reconstruction + codebook + β commitment.
The VQ-VAE objective decomposes into the three standard terms.
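The decomposition might be stated as follows (vqLoss, reconLoss, codebookLoss, and commitLoss are hypothetical names for the objective and the three terms defined above):

```lean
-- Sketch: the objective unfolds definitionally into the sum of its three terms.
theorem vqLoss_decomp (m : VQVAE α numCodes obs latent)
    (x : Spec.Tensor α obs) (k : Fin numCodes) (β : α) :
    vqLoss m x k β
      = reconLoss m x k + codebookLoss m x k + β * commitLoss m x k := by
  rfl
```

If the objective is defined as the literal sum, the proof is rfl; otherwise the theorem records the decomposition as an explicit lemma.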