Predictive-view semantics for self-supervised learning #
This file is the objective-algebra layer for finite self-supervised learning.
The guiding split is:
- a prediction/alignment term, over compatible finite views; and
- a geometry/non-collapse term, such as variance, covariance, redundancy, negative-sample, or teacher-dynamics regularization.
The statements here are deliberately finite and method-neutral. They do not claim that SSL optimization learns good representations. They prove a smaller semantic fact that is useful for TorchLean and for paper writing: MAE, JEPA, VICReg-style guards, Barlow-style guards, and autoregressive prediction can share one objective shape.
In this finite layer, a view-prediction contract has:
- targetIdxs, the selected masked/target indices;
- a context value;
- a target value at every finite index;
- a target encoder, which chooses the target space;
- a predictor from context and index; and
- a nonnegative geometry guard.
MAE is recovered by choosing the identity target encoder into patch/pixel space. JEPA is recovered by choosing the target representation itself as the target space. VICReg/Barlow-style objectives are represented as geometry guards that can be added orthogonally to either prediction objective.
A finite predictive-view contract.
Target is the raw target-view type, TargetRep is the space after the target encoder, and
Prediction is the predictor output space. Keeping these three types separate is the whole point:
MAE sets TargetRep = Target with an identity encoder; JEPA uses a latent target representation;
contrastive and redundancy-reduction methods can reuse the same contract with different geometry
guards.
- targetIdxs
Selected target/masked indices.
- context : Context
Context view representation.
- target : Fin n → Target
Target view before applying the target encoder.
- targetEncoder : Fin n → Target → TargetRep
Target-space map. MAE uses identity into pixels/patches; JEPA uses a latent target branch.
- predict : Context → Fin n → Prediction
Prediction made from the context view for a selected target index.
- distance : TargetRep → Prediction → ℕ
Per-index predictive distance/alignment loss.
- geometryGuard : ℕ
Geometry, spread, redundancy, or anti-collapse guard.
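A minimal Lean sketch of this contract, with field names taken from the list above; the structure name `PredictiveViewContract` and the `List (Fin n)` container for selected indices are assumptions:

```lean
/-- Sketch of the predictive-view contract; the structure name and the
`List (Fin n)` index container are assumptions. -/
structure PredictiveViewContract (n : ℕ)
    (Context Target TargetRep Prediction : Type) where
  targetIdxs : List (Fin n)                   -- selected masked/target indices
  context : Context                           -- context view representation
  target : Fin n → Target                     -- target view before encoding
  targetEncoder : Fin n → Target → TargetRep  -- MAE: identity; JEPA: latent branch
  predict : Context → Fin n → Prediction      -- prediction per selected index
  distance : TargetRep → Prediction → ℕ       -- per-index predictive distance
  geometryGuard : ℕ                           -- geometry/anti-collapse guard
```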
Prediction/alignment term of a finite predictive-view SSL objective.
Full finite SSL objective: predictive loss plus geometry/non-collapse guard.
The generic SSL objective decomposes into prediction/alignment plus geometry guard.
If the geometry guard is zero, the full objective is exactly the predictive loss.
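A sketch of both terms and the two facts just stated, building on the structure above (`predictiveLoss` and `sslObjective` are assumed names):

```lean
/-- Prediction/alignment term: per-index distances summed over selected indices. -/
def predictiveLoss {n : ℕ} {Context Target TargetRep Prediction : Type}
    (c : PredictiveViewContract n Context Target TargetRep Prediction) : ℕ :=
  (c.targetIdxs.map fun i =>
    c.distance (c.targetEncoder i (c.target i)) (c.predict c.context i)).sum

/-- Full objective: predictive term plus geometry/non-collapse guard. -/
def sslObjective {n : ℕ} {Context Target TargetRep Prediction : Type}
    (c : PredictiveViewContract n Context Target TargetRep Prediction) : ℕ :=
  predictiveLoss c + c.geometryGuard

/-- The decomposition is definitional. -/
theorem sslObjective_decompose {n : ℕ} {Context Target TargetRep Prediction : Type}
    (c : PredictiveViewContract n Context Target TargetRep Prediction) :
    sslObjective c = predictiveLoss c + c.geometryGuard := rfl

/-- With a zero guard, the objective is exactly the predictive loss. -/
theorem sslObjective_zero_guard {n : ℕ} {Context Target TargetRep Prediction : Type}
    (c : PredictiveViewContract n Context Target TargetRep Prediction)
    (h : c.geometryGuard = 0) : sslObjective c = predictiveLoss c := by
  simp [sslObjective, h]
```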
Replace the geometry guard of a predictive contract.
This is how VICReg, Barlow-style redundancy reduction, InfoNCE-style uniformity, or a future Predictive-Hull coverage guard can be bolted onto the same prediction contract without changing the view-selection semantics.
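A sketch of that combinator (`withGeometryGuard` is an assumed name); the accompanying equation shows the prediction side is untouched:

```lean
/-- Replace only the geometry guard; the view-selection semantics are untouched. -/
def PredictiveViewContract.withGeometryGuard
    {n : ℕ} {Context Target TargetRep Prediction : Type}
    (c : PredictiveViewContract n Context Target TargetRep Prediction)
    (g : ℕ) : PredictiveViewContract n Context Target TargetRep Prediction :=
  { c with geometryGuard := g }

/-- Only the guard contribution moves; the predictive term is unchanged. -/
theorem sslObjective_withGeometryGuard
    {n : ℕ} {Context Target TargetRep Prediction : Type}
    (c : PredictiveViewContract n Context Target TargetRep Prediction)
    (g : ℕ) : sslObjective (c.withGeometryGuard g) = predictiveLoss c + g := rfl
```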
MAE as predictive-view SSL with identity target encoder #
MAE as a predictive-view contract.
The context is Unit because the finite MAE skeleton already abstracts away the encoder. The
target encoder is identity into patch/pixel space.
MAE's masked reconstruction loss is the predictive term with identity target encoder.
MAE is the zero-geometry predictive-view objective with pixel/patch identity targets.
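A sketch of the MAE instance and the zero-geometry fact (`maeContract`, `Patch`, and the argument names are assumptions):

```lean
/-- Sketch: MAE with a trivial context and identity encoder into patch space. -/
def maeContract {n : ℕ} {Patch : Type} (mask : List (Fin n))
    (patches recon : Fin n → Patch) (dist : Patch → Patch → ℕ) :
    PredictiveViewContract n Unit Patch Patch Patch where
  targetIdxs := mask
  context := ()                  -- the finite skeleton abstracts the encoder away
  target := patches
  targetEncoder := fun _ t => t  -- identity into pixel/patch space
  predict := fun _ i => recon i  -- reconstruction read off per masked index
  distance := dist
  geometryGuard := 0             -- MAE is the zero-geometry instance

/-- MAE's objective is exactly its masked-reconstruction predictive term. -/
example {n : ℕ} {Patch : Type} (mask : List (Fin n))
    (patches recon : Fin n → Patch) (dist : Patch → Patch → ℕ) :
    sslObjective (maeContract mask patches recon dist)
      = predictiveLoss (maeContract mask patches recon dist) := rfl
```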
JEPA as predictive-view SSL with latent target representation #
JEPA as a predictive-view contract when the finite target is already a target representation.
This matches jepaLoss: target representations are values at the objective boundary, and the
predictor tries to match them at selected target indices.
JEPA's finite target-representation loss is the predictive-view loss.
JEPA is the zero-geometry predictive-view objective with latent target values.
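The matching JEPA sketch, with targets that already live in the target space (`jepaContract` and `Rep` are assumed names):

```lean
/-- Sketch: JEPA with latent targets, so the target encoder is the identity. -/
def jepaContract {n : ℕ} {Context Rep : Type} (idxs : List (Fin n))
    (ctx : Context) (targetRep : Fin n → Rep)
    (pred : Context → Fin n → Rep) (dist : Rep → Rep → ℕ) :
    PredictiveViewContract n Context Rep Rep Rep where
  targetIdxs := idxs
  context := ctx
  target := targetRep
  targetEncoder := fun _ t => t  -- targets already live in the target space
  predict := pred
  distance := dist
  geometryGuard := 0             -- JEPA is also a zero-geometry instance
```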
More general JEPA/predictive-view contract with a separate target encoder.
This is the paper bridge: changing targetEncoder changes the target space while leaving the
finite view-prediction algebra alone. MAE is the special case where this encoder is identity into
pixels/patches; JEPA uses a latent/stopped target branch.
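A sketch of the encoder-parametric contract (all names are assumptions); `enc := fun _ t => t` recovers the MAE shape, while a latent branch recovers JEPA:

```lean
/-- Sketch: the encoder-parametric contract that both MAE and JEPA specialize. -/
def predictiveContractWithEncoder {n : ℕ} {Context Target Rep : Type}
    (idxs : List (Fin n)) (ctx : Context) (target : Fin n → Target)
    (enc : Fin n → Target → Rep) (pred : Context → Fin n → Rep)
    (dist : Rep → Rep → ℕ) :
    PredictiveViewContract n Context Target Rep Rep where
  targetIdxs := idxs
  context := ctx
  target := target
  targetEncoder := enc  -- the only knob that changes the target space
  predict := pred
  distance := dist
  geometryGuard := 0
```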
Geometry guards as reusable SSL modules #
A VICReg-style geometry guard packaged for predictive-view objectives.
- lambda : ℕ
Weight for invariance/alignment summary.
- mu : ℕ
Weight for variance/non-collapse summary.
- nu : ℕ
Weight for covariance/redundancy summary.
- invariance : ℕ
Already-computed invariance summary.
- variance : ℕ
Already-computed variance-floor summary.
- covariance : ℕ
Already-computed covariance/redundancy summary.
Evaluate a finite VICReg-style guard through the existing VICReg objective.
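A sketch of the guard and a weighted-sum evaluation, together with the positivity fact stated later in this section (`VicRegGuard` and `value` are assumed names; the weighted sum is an assumed reading of the VICReg objective):

```lean
/-- Sketch of a finite VICReg-style guard; field names follow the list above. -/
structure VicRegGuard where
  lambda : ℕ      -- weight for invariance/alignment
  mu : ℕ          -- weight for variance/non-collapse
  nu : ℕ          -- weight for covariance/redundancy
  invariance : ℕ  -- already-computed invariance summary
  variance : ℕ    -- already-computed variance-floor summary
  covariance : ℕ  -- already-computed covariance/redundancy summary

/-- Assumed evaluation: the weighted VICReg sum. -/
def VicRegGuard.value (g : VicRegGuard) : ℕ :=
  g.lambda * g.invariance + g.mu * g.variance + g.nu * g.covariance

/-- Positive variance weight and positive variance summary force a positive guard. -/
theorem VicRegGuard.value_pos_of_variance (g : VicRegGuard)
    (hmu : 0 < g.mu) (hvar : 0 < g.variance) : 0 < g.value :=
  Nat.lt_of_lt_of_le (Nat.mul_pos hmu hvar)
    (Nat.le_trans (Nat.le_add_left _ _) (Nat.le_add_right _ _))
```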
A Barlow-Twins-style redundancy guard packaged for predictive-view objectives.
- lambda : ℕ
Weight for off-diagonal redundancy terms.
Diagonal cross-correlation summaries, ideal value 1.
Off-diagonal cross-correlation summaries, ideal value 0.
Evaluate a finite Barlow-style redundancy guard.
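A sketch with an assumed ℕ encoding: the `diag`/`offDiag` field names are hypothetical, the diagonal term pays its truncated distance from the ideal value 1, and off-diagonal redundancy is weighted by `lambda`. The two `example`s mirror the ideal-guard and collapsed-diagonal facts below:

```lean
/-- Sketch of a finite Barlow-style redundancy guard. -/
structure BarlowGuard where
  lambda : ℕ   -- weight for off-diagonal redundancy terms
  diag : ℕ     -- diagonal cross-correlation summary, ideal value 1
  offDiag : ℕ  -- off-diagonal cross-correlation summary, ideal value 0

/-- Assumed evaluation: truncated distance of the diagonal summary from 1,
plus weighted off-diagonal redundancy. -/
def BarlowGuard.value (g : BarlowGuard) : ℕ :=
  (1 - g.diag) + g.lambda * g.offDiag

example : (BarlowGuard.mk 1 1 0).value = 0 := rfl        -- ideal guard is free
example : 0 < (BarlowGuard.mk 1 0 0).value := by decide  -- collapsed diagonal pays
```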
Adding a VICReg guard gives prediction plus the VICReg geometry value.
Adding a Barlow-style guard gives prediction plus the redundancy-reduction geometry value.
A pure variance VICReg guard is positive when both the variance weight and variance summary are positive. This is the finite anti-collapse certificate used by the generic predictive-view algebra.
The ideal Barlow-style guard has zero value.
A collapsed diagonal entry pays a positive Barlow-style redundancy guard.
Finite view-graph reading #
Edge energy for a finite positive-view graph.
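A sketch under stated assumptions (Mathlib's big-operator notation, a `Bool` adjacency, ℕ edge costs; `edgeEnergy` is an assumed name):

```lean
/-- Sketch: total cost over the positive edges of a finite view graph. -/
def edgeEnergy {m : ℕ} (pos : Fin m → Fin m → Bool)
    (cost : Fin m → Fin m → ℕ) : ℕ :=
  ∑ i, ∑ j, if pos i j then cost i j else 0
```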
Concrete finite Euclidean geometry #
The generic objective above is intentionally method-neutral. The definitions below add pressure: representations are finite real vectors, alignment is squared Euclidean energy on a finite positive view graph, and non-collapse is expressed as a real variance-floor guard over coordinate-spread summaries.
These theorems capture an important SSL fact in a checked finite setting:
- graph alignment alone is nonnegative but accepts fully collapsed representations with zero loss;
- a positive variance floor assigns positive penalty to collapsed coordinate-spread summaries.
A finite real embedding vector.
Squared Euclidean distance between two finite real embeddings.
Squared Euclidean distance is nonnegative.
A vector has zero squared distance from itself.
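A sketch of the embedding distance (assuming Mathlib for ℝ and `∑`); writing the square as a self-product keeps both lemmas short:

```lean
/-- Sketch: squared Euclidean distance on finite real embeddings. -/
def sqDist {d : ℕ} (x y : Fin d → ℝ) : ℝ :=
  ∑ k, (x k - y k) * (x k - y k)

/-- Squared Euclidean distance is nonnegative. -/
theorem sqDist_nonneg {d : ℕ} (x y : Fin d → ℝ) : 0 ≤ sqDist x y :=
  Finset.sum_nonneg fun k _ => mul_self_nonneg _

/-- A vector has zero squared distance from itself. -/
theorem sqDist_self {d : ℕ} (x : Fin d → ℝ) : sqDist x x = 0 := by
  simp [sqDist]
```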
Real-valued alignment energy induced by positive edges in a finite view graph.
Finite graph alignment energy is nonnegative.
A collapsed representation maps every view to the same finite vector.
Alignment alone cannot prevent collapse: any constant representation has zero positive-edge energy, no matter what the view graph is.
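Sketches of the alignment energy and the two facts above (names are assumptions):

```lean
/-- Sketch: alignment energy over positive edges of a finite view graph. -/
def alignEnergy {m d : ℕ} (pos : Fin m → Fin m → Bool)
    (f : Fin m → Fin d → ℝ) : ℝ :=
  ∑ i, ∑ j, if pos i j then sqDist (f i) (f j) else 0

/-- Finite graph alignment energy is nonnegative. -/
theorem alignEnergy_nonneg {m d : ℕ} (pos : Fin m → Fin m → Bool)
    (f : Fin m → Fin d → ℝ) : 0 ≤ alignEnergy pos f :=
  Finset.sum_nonneg fun i _ => Finset.sum_nonneg fun j _ => by
    split
    · exact sqDist_nonneg _ _
    · exact le_refl 0

/-- A collapsed (constant) representation has zero energy on any view graph. -/
theorem alignEnergy_const {m d : ℕ} (pos : Fin m → Fin m → Bool)
    (v : Fin d → ℝ) : alignEnergy pos (fun _ => v) = 0 := by
  simp [alignEnergy, sqDist_self]
```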
Coordinate spread is a finite pairwise squared-difference summary for one embedding coordinate.
This avoids asymptotic probability or population assumptions while still expressing the core "does this coordinate vary across the batch/views?" question used by finite anti-collapse guards.
A collapsed representation has zero spread in every coordinate.
Real-valued variance-floor penalty: max(0, gamma - spread).
Zero spread in every coordinate pays exactly d * gamma when gamma is nonnegative.
Collapsed coordinate-spread summaries pay a positive variance-floor guard in nonzero dimension.
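Sketches of the spread summary, the floor penalty, and the two computations just stated (names and the pairwise encoding are assumptions):

```lean
/-- Sketch: pairwise squared-difference spread of one embedding coordinate. -/
def coordSpread {m d : ℕ} (f : Fin m → Fin d → ℝ) (k : Fin d) : ℝ :=
  ∑ i, ∑ j, (f i k - f j k) * (f i k - f j k)

/-- Sketch: `max 0 (γ - spread)` floor penalty, summed over coordinates. -/
def varianceFloor {d : ℕ} (γ : ℝ) (spread : Fin d → ℝ) : ℝ :=
  ∑ k, max 0 (γ - spread k)

/-- A collapsed representation has zero spread in every coordinate. -/
theorem coordSpread_const {m d : ℕ} (v : Fin d → ℝ) (k : Fin d) :
    coordSpread (fun _ : Fin m => v) k = 0 := by
  simp [coordSpread]

/-- Zero spread pays exactly `d * γ` when `γ` is nonnegative. -/
theorem varianceFloor_zero_spread {d : ℕ} {γ : ℝ} (hγ : 0 ≤ γ) :
    varianceFloor γ (fun _ : Fin d => 0) = d * γ := by
  simp [varianceFloor, max_eq_right hγ]
```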
The concrete finite alignment-plus-spread objective.
This is the graph-theoretic SSL reading: compatible views should align along positive edges, while the spread guard prevents the trivial all-views-identical representation from being accepted for free.
For a collapsed representation, the alignment term is zero, so the objective reduces to the variance-floor guard computed from zero coordinate spread.
With positive dimension and positive variance floor, a collapsed representation pays positive finite SSL objective value. This is the concrete theorem version of "alignment needs a spread guard."
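A sketch of the combined objective and both collapse theorems, reusing the sketches above:

```lean
/-- Sketch: alignment energy plus variance-floor guard over coordinate spreads. -/
def alignSpreadObjective {m d : ℕ} (γ : ℝ) (pos : Fin m → Fin m → Bool)
    (f : Fin m → Fin d → ℝ) : ℝ :=
  alignEnergy pos f + varianceFloor γ (fun k => coordSpread f k)

/-- For a collapsed representation only the zero-spread guard remains. -/
theorem alignSpreadObjective_const {m d : ℕ} (γ : ℝ)
    (pos : Fin m → Fin m → Bool) (v : Fin d → ℝ) :
    alignSpreadObjective γ pos (fun _ => v) = varianceFloor γ (fun _ => 0) := by
  simp [alignSpreadObjective, alignEnergy_const, coordSpread_const]

/-- Positive dimension and a positive floor make collapse pay. -/
theorem alignSpreadObjective_const_pos {m d : ℕ} (hd : 0 < d) {γ : ℝ}
    (hγ : 0 < γ) (pos : Fin m → Fin m → Bool) (v : Fin d → ℝ) :
    0 < alignSpreadObjective γ pos (fun _ => v) := by
  rw [alignSpreadObjective_const, varianceFloor_zero_spread hγ.le]
  exact mul_pos (Nat.cast_pos.mpr hd) hγ
```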
Any target-index predictive objective can be read as a graph energy from one context anchor to each selected target index.
The context anchor is represented by the same finite index type. This theorem is intentionally simple: it is the finite bridge between masked/context-target prediction and graph-style SSL alignment energy.
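A minimal sketch of that bridge: the star-graph energy below is by construction definitionally equal to the predictive term (`starEnergy` is an assumed name):

```lean
/-- Sketch: star-graph energy from one context anchor to each selected index. -/
def starEnergy {n : ℕ} {Context Target TargetRep Prediction : Type}
    (c : PredictiveViewContract n Context Target TargetRep Prediction) : ℕ :=
  (c.targetIdxs.map fun i =>
    c.distance (c.targetEncoder i (c.target i)) (c.predict c.context i)).sum

/-- The predictive term is literally the star-graph energy. -/
example {n : ℕ} {Context Target TargetRep Prediction : Type}
    (c : PredictiveViewContract n Context Target TargetRep Prediction) :
    starEnergy c = predictiveLoss c := rfl
```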