Support Vector Machines (spec models) #
This file provides a small linear SVM baseline with explicit gradients.
PyTorch mental model:
- scoring function: `score = X @ w + b` (like `nn.Linear(p, 1)` without an activation),
- loss: hinge loss on signed labels `y ∈ {−1, +1}`: `loss_i = max(0, 1 - y_i * score_i)`,
- optimization: a small deterministic gradient descent loop (not an optimized solver).
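The Lean spec is not PyTorch code, but as a rough analogy only, the mental model corresponds to a loop like the following sketch (shapes, data, and hyperparameters here are made up for illustration):

```python
# Hypothetical PyTorch analogue of the spec model; not the Lean definitions themselves.
import torch

n, p = 8, 3
X = torch.randn(n, p)
y = (torch.randint(0, 2, (n,)) * 2 - 1).float()       # signed labels in {-1, +1}

w = torch.zeros(p, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

for _ in range(100):                                   # small deterministic GD loop
    score = X @ w + b                                  # like nn.Linear(p, 1), no activation
    loss = torch.clamp(1 - y * score, min=0).mean()    # mean hinge loss
    loss.backward()
    with torch.no_grad():
        w -= 0.1 * w.grad
        b -= 0.1 * b.grad
        w.grad.zero_()
        b.grad.zero_()
```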
There are two "layers" in this file:
- `LinearSVM`: the clean mathematical model + objective + backward pass (VJP-style gradients);
- `fitLinearSVM` / `predict`: a small training + prediction wrapper used by smoke tests and demos.
Note on naming: classic SVM literature often uses a parameter C that weights the hinge term.
In this file, fitLinearSVM takes a parameter named C, but we use it as the L2
regularization strength (the lambda in LinearSVM.backward) to keep the baseline small.
References:
- Cortes and Vapnik, "Support-Vector Networks", 1995.
- Vapnik, "The Nature of Statistical Learning Theory", 1995/1998.
Linear SVM (primal) #
Linear SVM parameters: a weight vector w and bias b.
We intentionally keep "training hyperparameters" (regularization strength, learning rate, etc.) out of the parameter record; those are choices about an optimizer, not part of the model itself.
- `w : Spec.Tensor α (Spec.Shape.dim p Spec.Shape.scalar)`: the weight vector `w`.
- `b : α`: the bias `b`.
Decision function f(x) = w·x + b.
Batch decision values for X : (n×p).
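For intuition, the two decision functions amount to the following NumPy-style sketch (array shapes assumed; this is not the `Spec.Tensor` API):

```python
# Illustrative NumPy analogue of the decision functions.
import numpy as np

def decision(w, b, x):
    """f(x) = w·x + b for a single example x of shape (p,)."""
    return float(w @ x + b)

def decision_batch(w, b, X):
    """Row-wise decision values for X of shape (n, p)."""
    return X @ w + b
```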
Hinge loss per example: ℓ_i = max(0, 1 - y_i * f(x_i)).
We write it using if rather than max to make the "active-set" logic explicit.
Mean hinge loss over a dataset.
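The if-based formulation is equivalent to the usual max form; a small Python sketch of both the per-example and the mean version (plain floats and lists assumed for clarity):

```python
# Per-example hinge loss written with an explicit branch, mirroring the if-based spec.
def hinge(y_i, score_i):
    margin = y_i * score_i
    return 1.0 - margin if margin < 1.0 else 0.0   # "active" exactly when margin < 1

def mean_hinge(y, scores):
    return sum(hinge(yi, si) for yi, si in zip(y, scores)) / len(y)
```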
L2-regularized SVM objective (primal, soft-margin style).
We use the common "½λ‖w‖² + mean hinge" form.
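Spelled out in NumPy terms (an illustrative sketch, not the spec code), the objective reads:

```python
# 0.5 * lam * ||w||^2 + mean hinge, over a batch X (n, p) with signed labels y (n,).
import numpy as np

def objective(w, b, X, y, lam):
    margins = y * (X @ w + b)
    hinge = np.maximum(0.0, 1.0 - margins)
    return 0.5 * lam * float(w @ w) + float(hinge.mean())
```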
Backward pass #
For the objective
L(w,b) = ½λ‖w‖² + (1/n) Σ max(0, 1 - y_i (w·x_i + b))
the gradients are:
∂L/∂w = λ w + (1/n) Σ [margin_i < 1] * (-y_i x_i)
∂L/∂b = (1/n) Σ [margin_i < 1] * (-y_i)
We also return ∂L/∂X because it is sometimes useful for sensitivity analysis.
PyTorch analogy: this is what autograd would compute for
0.5*λ*||w||^2 + mean(relu(1 - y*(X@w+b))), except we write it out explicitly.
Backward/VJP for the linear SVM objective.
Returns (dw, db, dX) where:
- `dw` : ∂L/∂w
- `db` : ∂L/∂b
- `dX` : ∂L/∂X (sometimes useful for sensitivity analysis)
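As a concrete reference for the formulas above, here is a NumPy sketch of the same gradients (the shapes and the `backward` name here are assumptions, not the spec's signature):

```python
# Explicit gradients of 0.5*lam*||w||^2 + mean hinge, matching the formulas above.
import numpy as np

def backward(w, b, X, y, lam):
    n = X.shape[0]
    margins = y * (X @ w + b)
    active = (margins < 1.0).astype(X.dtype)   # indicator [margin_i < 1]
    dscore = -(active * y) / n                 # gradient of the mean hinge w.r.t. score_i
    dw = lam * w + X.T @ dscore                # ∂L/∂w
    db = dscore.sum()                          # ∂L/∂b
    dX = np.outer(dscore, w)                   # ∂L/∂X (row i is dscore_i * w)
    return dw, db, dX
```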
A Small Training Wrapper (Gradient Descent) #
The LinearSVM definitions above are enough for "spec math".
For demos/tests, it is convenient to package a trained parameter pair together with a simple
predictor, so we provide:
- `SVM`: a small record holding `(weights, bias)` and a heuristic support-vector index tensor,
- `fitLinearSVM`: deterministic gradient descent using `LinearSVM.backward`,
- `predict`: sign prediction as `±1`.
Small trained SVM bundle for demos/tests.
This is not a full SMO-style solver; it is a deterministic gradient-descent baseline that is useful as a reference model in the TorchLean spec layer.
- `weights : Spec.Tensor α (Spec.Shape.dim p Spec.Shape.scalar)`: normal vector `w` of the separating hyperplane.
- `bias : α`: bias/intercept term `b`.
- `supportVectorIndices : Spec.Tensor ℕ (Spec.Shape.dim n Spec.Shape.scalar)`: heuristic support-vector indices (approximate: margin near 1).
Heuristic support-vector index extractor.
We mark an example as a "support vector" if its margin is close to 1. This is only meant for
introspection and demos (it is not used by the optimizer).
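A NumPy sketch of this heuristic (the tolerance value is an assumption for illustration):

```python
# "Support vectors" picked as examples whose margin is approximately 1.
import numpy as np

def support_vector_indices(w, b, X, y, tol=1e-3):
    margins = y * (X @ w + b)
    return np.flatnonzero(np.abs(margins - 1.0) <= tol)
```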
Fit a linear SVM by deterministic gradient descent on the primal objective.
Parameters:
- `learning_rate`: gradient step size
- `C`: regularization strength (treated as `lambda`)
- `iterations`: number of GD steps
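The training loop is a plain gradient-descent recurrence on the objective above; a self-contained NumPy sketch under the same parameter names (the default values shown are illustrative assumptions):

```python
# Deterministic GD on the primal objective; C is used as the L2 strength (lambda).
import numpy as np

def fit_linear_svm(X, y, learning_rate=0.1, C=0.01, iterations=200):
    n, p = X.shape
    w, b = np.zeros(p), 0.0
    for _ in range(iterations):
        margins = y * (X @ w + b)
        dscore = -((margins < 1.0).astype(X.dtype) * y) / n
        w -= learning_rate * (C * w + X.T @ dscore)   # ∂L/∂w step
        b -= learning_rate * dscore.sum()             # ∂L/∂b step
    return w, b
```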
Predict signed labels ±1 for a batch X using the learned hyperplane.
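Prediction is just the sign of the decision value; for example (mapping a decision value of exactly 0 to +1 is an assumption here):

```python
# Signed-label prediction in {-1, +1} from the learned hyperplane.
import numpy as np

def predict(w, b, X):
    return np.where(X @ w + b >= 0.0, 1, -1)
```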
Linear kernel: k(x, y) = x·y.
Polynomial kernel: k(x, y) = (x·y + c)^degree (naive power for generic α).
RBF kernel: k(x, y) = exp(-gamma * ||x - y||^2).
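The three kernels correspond to the following NumPy one-liners (default hyperparameter values are illustrative assumptions):

```python
# Kernel functions over plain 1-D NumPy vectors.
import numpy as np

def linear_kernel(x, y):
    return float(x @ y)

def poly_kernel(x, y, c=1.0, degree=3):
    return (float(x @ y) + c) ** degree

def rbf_kernel(x, y, gamma=1.0):
    diff = x - y
    return float(np.exp(-gamma * float(diff @ diff)))
```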