TorchLean API

NN.Spec.Models.Gmm

Gaussian Mixture Model (GMM) (spec model) #

This file defines a basic GMM with nComponents multivariate Gaussians over nFeatures features.

gmm_forward_spec computes per-component log-probabilities for a single input:

log π_k + log N(x | μ_k, Σ_k)

PyTorch analogies are noted alongside each definition below.

Implementation note: determinants/inverses are defined via NN.Spec.Models.CommonHelpers. Those are intended for small feature dimensions and proof/reference usage, not high‑performance clustering on large matrices.

References (background, not required to read the code):

Parameters #

structure Spec.GMMSpec (α : Type) (nComponents nFeatures : ℕ) : Type

Parameters of a Gaussian mixture model (GMM).

Instances For
def Spec.gmmForwardSpec {α : Type} [Context α] {nComponents nFeatures : ℕ} (m : GMMSpec α nComponents nFeatures) (input : Tensor α (Shape.dim nFeatures Shape.scalar)) :
Tensor α (Shape.dim nComponents Shape.scalar)

Per-component log-probabilities for a single input.

Given x : ℝ^d, each component contributes:

log π_k - 1/2 * ( (x-μ_k)^T Σ_k^{-1} (x-μ_k) + log det Σ_k + d * log(2π) )

This is the natural "logit vector" for responsibilities. If you want posterior probabilities P(z=k | x), apply gmm_expectation_spec (a last-axis softmax).

PyTorch analogy: the returned vector is like per-component log_prob values before the final mixture logsumexp.

Instances For
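
As a rough PyTorch sketch of the same computation (illustrative only; the shapes, values, and variable names below are made up and are not part of the TorchLean API):

    import torch
    from torch.distributions import MultivariateNormal

    # Illustrative shapes and parameter values (not taken from the Lean spec).
    K, D = 3, 2                                   # nComponents, nFeatures
    weights = torch.full((K,), 1.0 / K)           # mixture weights π_k
    means = torch.zeros(K, D)                     # component means μ_k
    covs = torch.eye(D).repeat(K, 1, 1)           # component covariances Σ_k
    x = torch.randn(D)                            # a single input

    # y_k = log π_k + log N(x | μ_k, Σ_k): one log-probability per component,
    # before any mixture logsumexp.
    component_log_probs = torch.log(weights) + MultivariateNormal(means, covs).log_prob(x)
    print(component_log_probs.shape)  # torch.Size([3])
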
def Spec.gmmExpectationSpec {α : Type} [Context α] {nComponents nFeatures : ℕ} (m : GMMSpec α nComponents nFeatures) (input : Tensor α (Shape.dim nFeatures Shape.scalar)) (_h : nComponents ≠ 0) :
Tensor α (Shape.dim nComponents Shape.scalar)

E-step responsibilities for a single input.

Mathematically:

γ_k = P(z=k | x) = softmax_k ( log π_k + log N(x | μ_k, Σ_k) )

PyTorch analogy: torch.softmax(component_log_probs, dim=-1) where the logits are the per-component log-probabilities.

Instances For
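
Continuing the forward sketch above, the responsibilities are just a last-axis softmax over those per-component log-probabilities:

    # γ_k = P(z = k | x): softmax of the per-component log-probabilities.
    # Reuses component_log_probs from the previous sketch.
    responsibilities = torch.softmax(component_log_probs, dim=-1)
    print(responsibilities.sum())  # ≈ 1
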
def Spec.gmmBatchedForwardSpec {α : Type} [Context α] {batch nComponents nFeatures : ℕ} (m : GMMSpec α nComponents nFeatures) (input : Tensor α (Shape.dim batch (Shape.dim nFeatures Shape.scalar))) :
Tensor α (Shape.dim batch (Shape.dim nComponents Shape.scalar))

Batched forward pass: apply gmm_forward_spec to each sample in a batch.

Instances For

Backward/VJP (for gmm_forward_spec) #

gmm_forward_spec is vector-valued: it returns one log-probability per component.

The gradients below are the VJP for that vector function. In particular, responsibilities γ = softmax(component_log_probs) do not appear in these formulas by themselves.

Responsibilities show up when you differentiate a scalar objective that aggregates components, like the mixture log-likelihood logsumexp(component_log_probs). In that case, you compute dL/d(component_log_probs) first (which will involve γ), then feed that vector into gmm_backward_spec.
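
A small PyTorch check of that statement (illustrative only): differentiating logsumexp of the per-component log-probabilities produces exactly the softmax, i.e. the responsibilities, as the upstream gradient.

    import torch

    # y stands for the per-component log-probabilities (made-up values).
    y = torch.tensor([-1.2, -0.7, -2.5], requires_grad=True)
    loss = torch.logsumexp(y, dim=-1)   # mixture log-likelihood as a function of y
    loss.backward()

    # dL/dy_k = softmax(y)_k = γ_k; this is the grad_output you would feed into
    # gmm_backward_spec when training against the mixture log-likelihood.
    print(torch.allclose(y.grad, torch.softmax(y.detach(), dim=-1)))  # True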

def Spec.gmmWeightsDerivSpec {α : Type} [Context α] {nComponents nFeatures : ℕ} (m : GMMSpec α nComponents nFeatures) (grad_output : Tensor α (Shape.dim nComponents Shape.scalar)) (_h : nComponents ≠ 0) :
Tensor α (Shape.dim nComponents Shape.scalar)

Gradient/VJP w.r.t. weights π for the output of gmm_forward_spec.

For y_k = log π_k + ..., we have ∂y_k/∂π_k = 1/π_k.

Instances For
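
In array terms (a sketch with made-up values, not the Lean definition), the VJP w.r.t. the weights is an elementwise division of the upstream gradient by π:

    import torch

    weights = torch.full((3,), 1.0 / 3)           # π (illustrative values)
    grad_output = torch.tensor([0.2, 0.5, 0.3])   # upstream gradient g

    # Since ∂y_k/∂π_k = 1/π_k, the VJP w.r.t. π is g_k / π_k componentwise.
    grad_weights = grad_output / weights
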
def Spec.gmmMeansDerivSpec {α : Type} [Context α] {nComponents nFeatures : ℕ} (m : GMMSpec α nComponents nFeatures) (input : Tensor α (Shape.dim nFeatures Shape.scalar)) (grad_output : Tensor α (Shape.dim nComponents Shape.scalar)) (_h : nComponents ≠ 0) :
Tensor α (Shape.dim nComponents (Shape.dim nFeatures Shape.scalar))

Gradient/VJP w.r.t. means μ for the output of gmm_forward_spec.

For a single component:

∂/∂μ log N(x|μ,Σ) = Σ^{-1} (x - μ).

Instances For
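
A sketch of the corresponding array computation (hypothetical values, not the Lean definition): each component's mean gradient is the upstream scalar g_k times Σ_k^{-1}(x - μ_k).

    import torch

    K, D = 3, 2
    means = torch.zeros(K, D)                     # μ_k (illustrative)
    covs = torch.eye(D).repeat(K, 1, 1)           # Σ_k (illustrative)
    x = torch.randn(D)
    g = torch.tensor([0.2, 0.5, 0.3])             # upstream gradient, one scalar per component

    diff = x - means                                                   # (K, D)
    solve = torch.linalg.solve(covs, diff.unsqueeze(-1)).squeeze(-1)   # Σ_k^{-1} (x - μ_k)
    grad_means = g.unsqueeze(-1) * solve                               # (K, D)
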
def Spec.gmmInputDerivSpec {α : Type} [Context α] {nComponents nFeatures : ℕ} (m : GMMSpec α nComponents nFeatures) (input : Tensor α (Shape.dim nFeatures Shape.scalar)) (grad_output : Tensor α (Shape.dim nComponents Shape.scalar)) (h : nComponents ≠ 0) :
Tensor α (Shape.dim nFeatures Shape.scalar)

Gradient/VJP w.r.t. the input x for the output of gmm_forward_spec.

For one component:

∂/∂x log N(x|μ,Σ) = - Σ^{-1} (x - μ).

We sum the contributions from all components, weighted by the upstream gradient g_k.

Instances For
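
Reusing g and solve = Σ_k^{-1}(x - μ_k) from the means sketch above, the input gradient is the negated, g-weighted sum over components:

    # ∂/∂x contribution of component k is -g_k * Σ_k^{-1} (x - μ_k); summing over
    # components gives a single vector of shape (D,).
    grad_input = -(g.unsqueeze(-1) * solve).sum(dim=0)
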
def Spec.gmmCovariancesDerivSpec {α : Type} [Context α] {nComponents nFeatures : ℕ} (m : GMMSpec α nComponents nFeatures) (input : Tensor α (Shape.dim nFeatures Shape.scalar)) (grad_output : Tensor α (Shape.dim nComponents Shape.scalar)) (_h : nComponents ≠ 0) :
Tensor α (Shape.dim nComponents (Shape.dim nFeatures (Shape.dim nFeatures Shape.scalar)))

Gradient/VJP w.r.t. covariances Σ for the output of gmm_forward_spec.

For one component:

∂/∂Σ log N(x|μ,Σ) = 1/2 * ( Σ^{-1} (x-μ)(x-μ)^T Σ^{-1} - Σ^{-1} ).

Instances For
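
Continuing the same sketch (covs, g, and solve = Σ_k^{-1}(x - μ_k) as above), the per-component covariance gradient is:

    # g_k * 1/2 * ( Σ_k^{-1} (x-μ_k)(x-μ_k)^T Σ_k^{-1} - Σ_k^{-1} ), shape (K, D, D).
    cov_inv = torch.linalg.inv(covs)
    outer = solve.unsqueeze(-1) @ solve.unsqueeze(-2)   # Σ^{-1} d d^T Σ^{-1} (Σ symmetric)
    grad_covs = g.view(-1, 1, 1) * 0.5 * (outer - cov_inv)
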
def Spec.gmmBackwardSpec {α : Type} [Context α] {nComponents nFeatures : ℕ} (m : GMMSpec α nComponents nFeatures) (input : Tensor α (Shape.dim nFeatures Shape.scalar)) (grad_output : Tensor α (Shape.dim nComponents Shape.scalar)) (h : nComponents ≠ 0) :
Tensor α (Shape.dim nComponents Shape.scalar) × Tensor α (Shape.dim nComponents (Shape.dim nFeatures Shape.scalar)) × Tensor α (Shape.dim nComponents (Shape.dim nFeatures (Shape.dim nFeatures Shape.scalar))) × Tensor α (Shape.dim nFeatures Shape.scalar)

Backward/VJP for gmm_forward_spec.

Returns gradients with respect to (weights, means, covariances, input).

Instances For
def Spec.gmmInitSpec {α : Type} [Context α] {nComponents nFeatures : ℕ} :
GMMSpec α nComponents nFeatures

Default initialization for a GMM.

This is intentionally simple and deterministic:

• uniform weights,
• zero means,
• identity covariances.

Instances For
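
In array terms the documented defaults look like this (an illustrative sketch; the actual Lean constructor may build the tensors differently):

    import torch

    K, D = 3, 2
    weights = torch.full((K,), 1.0 / K)   # uniform weights
    means = torch.zeros(K, D)             # zero means
    covs = torch.eye(D).repeat(K, 1, 1)   # identity covariances
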
def Spec.logSumExpReduce {α : Type} [Context α] {n : ℕ} (log_probs : Tensor α (Shape.dim n Shape.scalar)) (h : n ≠ 0) :
α

Numerically stable log-sum-exp reduction: log (Σ_i exp(log_probs[i])).

This is the standard max + log(sum(exp(x - max))) trick.

Instances For
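
A sketch of the same trick in PyTorch (illustrative; torch.logsumexp already implements it):

    import torch

    def log_sum_exp(log_probs: torch.Tensor) -> torch.Tensor:
        # Shift by the maximum so the exponentials cannot overflow,
        # then add the maximum back outside the log.
        m = log_probs.max()
        return m + torch.log(torch.exp(log_probs - m).sum())

    x = torch.tensor([-1000.0, -1001.0, -1002.0])
    print(log_sum_exp(x))               # finite, ≈ -999.59 (a naive sum of exp would underflow to 0)
    print(torch.logsumexp(x, dim=-1))   # built-in reference, same value
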
def Spec.gmmLogLikelihoodSpec {α : Type} [Context α] {nComponents nFeatures : ℕ} (m : GMMSpec α nComponents nFeatures) (input : Tensor α (Shape.dim nFeatures Shape.scalar)) (h : nComponents ≠ 0) :
α

Mixture log-likelihood log p(x) computed via log-sum-exp over components.

Mathematically: log p(x) = log (Σ_k exp(log p(x | z_k) + log π_k)).

Instances For
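
In PyTorch terms (a self-contained sketch with made-up parameters), this is a logsumexp over the per-component log-probabilities:

    import torch
    from torch.distributions import MultivariateNormal

    K, D = 3, 2
    weights, means = torch.full((K,), 1.0 / K), torch.zeros(K, D)
    covs, x = torch.eye(D).repeat(K, 1, 1), torch.randn(D)

    comp = torch.log(weights) + MultivariateNormal(means, covs).log_prob(x)
    log_px = torch.logsumexp(comp, dim=-1)   # log p(x)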

Classical training: EM for Gaussian mixtures #

For a GMM, “training” is typically done with the Expectation–Maximization (EM) algorithm: the E-step computes responsibilities γ_{n,k} = P(z=k | x_n) under the current parameters, and the M-step re-estimates the weights, means, and covariances from those responsibilities.

This file already provides gmm_expectation_spec (responsibilities for one sample). The helpers below lift that to a batched dataset and implement a deterministic EM update step.

Numerical notes:

def Spec.gmmResponsibilitiesBatchedSpec {α : Type} [Context α] {nSamples nComponents nFeatures : ℕ} (m : GMMSpec α nComponents nFeatures) (data : Tensor α (Shape.dim nSamples (Shape.dim nFeatures Shape.scalar))) (hK : nComponents ≠ 0) :
Tensor α (Shape.dim nSamples (Shape.dim nComponents Shape.scalar))

Batched responsibilities: apply gmm_expectation_spec to each sample.

Instances For
def Spec.gmmEmStepSpec {α : Type} [Context α] {nSamples nComponents nFeatures : ℕ} (m : GMMSpec α nComponents nFeatures) (data : Tensor α (Shape.dim nSamples (Shape.dim nFeatures Shape.scalar))) (hK : nComponents ≠ 0) :
GMMSpec α nComponents nFeatures

One EM step for a batched dataset.

Instances For
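
A sketch of one EM update with the standard closed-form M-step (the exact update performed by gmm_em_step_spec may differ in details such as covariance regularization; all names below are illustrative):

    import torch
    from torch.distributions import MultivariateNormal

    def em_step(weights, means, covs, data):
        # E-step: responsibilities γ[n, k] = P(z = k | x_n) under the current model.
        logits = torch.log(weights) + MultivariateNormal(means, covs).log_prob(data.unsqueeze(1))
        resp = torch.softmax(logits, dim=-1)                 # (N, K)

        # M-step: standard closed-form re-estimation from the responsibilities.
        nk = resp.sum(dim=0)                                 # (K,) effective counts
        new_weights = nk / nk.sum()
        new_means = (resp.T @ data) / nk.unsqueeze(-1)       # (K, D)
        diff = data.unsqueeze(1) - new_means                 # (N, K, D)
        new_covs = torch.einsum('nk,nki,nkj->kij', resp, diff, diff) / nk.view(-1, 1, 1)
        return new_weights, new_means, new_covs
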
def Spec.gmmNegLogLikelihoodBatchedSpec {α : Type} [Context α] {nSamples nComponents nFeatures : ℕ} (m : GMMSpec α nComponents nFeatures) (data : Tensor α (Shape.dim nSamples (Shape.dim nFeatures Shape.scalar))) (hK : nComponents ≠ 0) :
α

Total negative log-likelihood of a dataset under the current model.

Instances For
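
In the same sketch as above, the dataset NLL is a per-sample logsumexp summed over samples:

    def neg_log_likelihood(weights, means, covs, data):
        # -Σ_n log p(x_n), with log p(x_n) = logsumexp_k(log π_k + log N(x_n | μ_k, Σ_k)).
        logits = torch.log(weights) + MultivariateNormal(means, covs).log_prob(data.unsqueeze(1))
        return -torch.logsumexp(logits, dim=-1).sum()
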
def Spec.gmmEmTrainSpec {α : Type} [Context α] {nSamples nComponents nFeatures : ℕ} (epochs : ℕ) (m : GMMSpec α nComponents nFeatures) (data : Tensor α (Shape.dim nSamples (Shape.dim nFeatures Shape.scalar))) (hK : nComponents ≠ 0) :
GMMSpec α nComponents nFeatures

Run epochs EM steps (deterministic).

Instances For
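
Putting the pieces together (continuing the EM sketches above; synthetic data, illustrative only):

    def em_train(weights, means, covs, data, epochs):
        for _ in range(epochs):
            weights, means, covs = em_step(weights, means, covs, data)
        return weights, means, covs

    # Synthetic 2-D data. Note that the fully symmetric default init
    # (uniform/zero/identity) never breaks symmetry under EM, so the means here
    # are seeded from data points instead.
    data = torch.cat([torch.randn(250, 2) + 3.0, torch.randn(250, 2) - 3.0])
    w0, c0 = torch.full((3,), 1.0 / 3), torch.eye(2).repeat(3, 1, 1)
    m0 = data[::200].clone()   # three rows of the dataset as initial means

    w, m, c = em_train(w0, m0, c0, data, epochs=10)
    print(neg_log_likelihood(w, m, c, data))   # dataset NLL after 10 EM steps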