PCA (spec model) #
Principal Component Analysis is represented as a linear projection onto learned components, plus an explicit mean for centering.
This file primarily models the transform (and inverse transform); a spec-level fit procedure (via the covariance eigendecomposition) is also included for reference.
PyTorch / ecosystem analogies:
- scikit-learn: `sklearn.decomposition.PCA` (fit + transform)
- PyTorch: `torch.pca_lowrank` or `torch.linalg.svd` (common building blocks)
References (background, not required to read the code):
- Pearson (1901), "On Lines and Planes of Closest Fit to Systems of Points in Space". https://doi.org/10.1080/14786440109462720
- Hotelling (1933), "Analysis of a complex of statistical variables into principal components". https://doi.org/10.2307/2333955
Parameters for PCA as a linear map plus centering.
We store:
- components : outDim × inDim (rows are principal directions),
- mean : inDim (for centering),
- explained_variance : outDim (eigenvalues for the selected components).
This matches the typical PCA API: you can transform to outDim coordinates and invert back to inDim.
- components : Tensor α (Shape.dim outDim (Shape.dim inDim Shape.scalar))
  The principal components; each row is a principal direction.
- mean : Tensor α (Shape.dim inDim Shape.scalar)
  The per-feature mean used to center inputs.
- explained_variance : Tensor α (Shape.dim outDim Shape.scalar)
  The explained variance (eigenvalue) of each selected component.
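For intuition, here is a minimal NumPy analogue of this parameter record (the `PCAParams` name and field layout are illustrative, mirroring the Lean fields above, not part of the spec):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class PCAParams:
    components: np.ndarray          # (out_dim, in_dim): rows are principal directions
    mean: np.ndarray                # (in_dim,): per-feature mean used for centering
    explained_variance: np.ndarray  # (out_dim,): eigenvalues of the selected components
```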
Forward pass (center and project): y = components · (x - mean).
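A NumPy sketch of this forward pass, reusing the illustrative `PCAParams` record from above (`pca_forward` is a hypothetical name, not the Lean declaration):

```python
def pca_forward(params: PCAParams, x: np.ndarray) -> np.ndarray:
    """Center and project a single sample: y = components @ (x - mean)."""
    return params.components @ (x - params.mean)
```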
Batched forward pass: apply pca_forward_spec to each row.
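In NumPy terms, mapping pca_forward_spec over rows collapses to a single matrix product (a sketch under the same illustrative names):

```python
def pca_forward_batch(params: PCAParams, X: np.ndarray) -> np.ndarray:
    """Apply the forward pass to each row of X: (n, in_dim) -> (n, out_dim)."""
    return (X - params.mean) @ params.components.T
```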
Inverse transform: reconstruct x ≈ componentsᵀ · y + mean.
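The corresponding NumPy sketch (again with illustrative names):

```python
def pca_inverse(params: PCAParams, y: np.ndarray) -> np.ndarray:
    """Reconstruct a sample: x_hat = components.T @ y + mean."""
    return params.components.T @ y + params.mean
```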
VJP contribution for components: outer product dL/dy ⊗ (x - mean).
VJP contribution for mean: dL/dmean = -componentsᵀ · dL/dy.
VJP contribution for input: dL/dx = componentsᵀ · dL/dy.
Full backward pass returning (dComponents, dMean, dInput).
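A NumPy sketch assembling the three VJP contributions above into one backward pass (`pca_backward` and `g` are illustrative names; `g` stands for the upstream gradient dL/dy):

```python
def pca_backward(params: PCAParams, x: np.ndarray, g: np.ndarray):
    """Return (dComponents, dMean, dInput) for upstream gradient g = dL/dy."""
    centered = x - params.mean
    d_components = np.outer(g, centered)   # dL/dy ⊗ (x - mean), shape (out_dim, in_dim)
    d_mean = -(params.components.T @ g)    # -componentsᵀ · dL/dy
    d_input = params.components.T @ g      #  componentsᵀ · dL/dy
    return d_components, d_mean, d_input
```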
Fit PCA using the (scaled) covariance matrix and eigendecomposition.
Algorithm:
- compute the mean and center the data,
- form the covariance matrix C = (1/(n-1)) Xᵀ X,
- compute eigenpairs of C,
- take the top nComponents eigenvectors,
- orient eigenvectors deterministically (sign convention) so results are reproducible.
Note: this is a spec/reference implementation. In numerical libraries, PCA is often implemented via SVD for stability and performance.
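A NumPy sketch of this fit procedure (one possible sign convention is shown; the spec's exact convention may differ):

```python
def pca_fit(X: np.ndarray, n_components: int) -> PCAParams:
    """Fit PCA via the covariance eigendecomposition (spec-style, not SVD)."""
    mean = X.mean(axis=0)
    Xc = X - mean                                   # center the data
    cov = Xc.T @ Xc / (X.shape[0] - 1)              # C = (1/(n-1)) Xᵀ X
    eigvals, eigvecs = np.linalg.eigh(cov)          # ascending eigenvalues
    top = np.argsort(eigvals)[::-1][:n_components]  # indices of the top eigenpairs
    components = eigvecs[:, top].T                  # rows are principal directions
    # One common deterministic sign convention (an assumption, not the spec's):
    # flip each row so its largest-magnitude entry is positive.
    for i, row in enumerate(components):
        if row[np.argmax(np.abs(row))] < 0:
            components[i] = -row
    return PCAParams(components, mean, eigvals[top])
```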
Apply a fitted PCA transform to a batch of samples.
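An end-to-end usage sketch, chaining the hypothetical helpers above:

```python
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))        # 100 samples, 5 features
params = pca_fit(X, n_components=2)
Y = pca_forward_batch(params, X)     # (100, 2) projected coordinates
```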
Reconstruction error: ||x - inverse(transform(x))||_2^2 (sum of squared coordinates).
PyTorch analogy: `torch.sum((x - x_hat) ** 2)`.
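A NumPy sketch of the same quantity, composing the illustrative forward and inverse helpers:

```python
def reconstruction_error(params: PCAParams, x: np.ndarray) -> float:
    """||x - inverse(transform(x))||_2^2, summed over coordinates."""
    x_hat = pca_inverse(params, pca_forward(params, x))
    return float(np.sum((x - x_hat) ** 2))
```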
Explained variance (eigenvalues of the selected components).
If you want the explained-variance ratio, divide each eigenvalue by the total variance of the
original data (the ratios sum to 1 only when all components are kept); this file keeps just the raw eigenvalues.
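If you do want the ratio, a sketch of the extra step (`explained_variance_ratio` is a hypothetical helper; the total variance equals the sum of all eigenvalues of the covariance matrix):

```python
def explained_variance_ratio(params: PCAParams, X: np.ndarray) -> np.ndarray:
    """Each stored eigenvalue divided by the total variance of the data."""
    Xc = X - X.mean(axis=0)
    total_var = Xc.var(axis=0, ddof=1).sum()  # trace of the covariance matrix
    return params.explained_variance / total_var
```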
Cumulative explained variance (prefix sums of explained_variance).
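In NumPy terms this is just a prefix sum:

```python
cumulative = np.cumsum(params.explained_variance)  # prefix sums of the eigenvalues
```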