TorchLean API

NN.Spec.Models.Svm

Support Vector Machines (spec models) #

This file provides a small linear SVM baseline with explicit gradients.

PyTorch mental model:

There are two "layers" in this file:

• the linear decision function (LinearSVM.decision, roughly a nn.Linear(p, 1) in PyTorch), and
• the hinge loss (hingeLossPerExample / hingeLossMean, roughly a loss criterion).

Note on naming: classic SVM literature often uses a parameter C that weights the hinge term. In this file, fitLinearSVM takes a parameter named C, but we use it as the L2 regularization strength (the lambda in LinearSVM.backward) to keep the baseline small.

Linear SVM (primal) #

structure LinearSVM (p : ℕ) (α : Type) :

Linear SVM parameters: a weight vector w and bias b.

We intentionally keep "training hyperparameters" (regularization strength, learning rate, etc.) out of the parameter record; those are choices about an optimizer, not part of the model itself.

Instances For
    def LinearSVM.decision {α : Type} [Context α] {p : ℕ} (m : LinearSVM p α) (x : Spec.Tensor α (Spec.Shape.dim p Spec.Shape.scalar)) :
    α

    Decision function f(x) = w·x + b.

    Instances For

      Batch decision values for X : (n×p).

      Instances For
        def hingeLossPerExample {α : Type} [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] (score y : α) :
        α

        Hinge loss per example: ℓ_i = max(0, 1 - y_i * f(x_i)).

        We write it using if rather than max to make the "active-set" logic explicit.
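A numeric sketch of the same definition (NumPy here, not the Lean spec; the function name is ours), using the same explicit branch:

```python
def hinge_per_example(score, y):
    # l_i = max(0, 1 - y_i * f(x_i)), written with an explicit branch
    # to mirror the active-set `if` in the Lean definition.
    margin = 1.0 - y * score
    return margin if margin > 0.0 else 0.0
```

A point classified with margin at least 1 contributes nothing: `hinge_per_example(2.0, 1.0)` is `0.0`, while `hinge_per_example(0.5, 1.0)` is `0.5`.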

        Instances For
          def hingeLossMean {α : Type} [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] {n : ℕ} (scores y : Spec.Tensor α (Spec.Shape.dim n Spec.Shape.scalar)) :
          α

          Mean hinge loss over a dataset.

          Instances For
            def LinearSVM.objective {α : Type} [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] {n p : ℕ} (lambda : α) (m : LinearSVM p α) (X : Spec.Tensor α (Spec.Shape.dim n (Spec.Shape.dim p Spec.Shape.scalar))) (y : Spec.Tensor α (Spec.Shape.dim n Spec.Shape.scalar)) :
            α

            L2-regularized SVM objective (primal, soft-margin style).

            We use the common "½λ‖w‖² + mean hinge" form.

            Instances For
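As a NumPy sketch of this objective (the function name is ours, not the TorchLean API):

```python
import numpy as np

def svm_objective(lam, w, b, X, y):
    # 0.5 * lam * ||w||^2 + mean_i max(0, 1 - y_i * (w . x_i + b))
    scores = X @ w + b
    hinge = np.maximum(0.0, 1.0 - y * scores)
    return 0.5 * lam * (w @ w) + hinge.mean()
```

For example, with w = (1, 0), b = 0, lam = 1, X = [[2, 0], [0.5, 0]], y = (1, 1), the hinge terms are (0, 0.5) and the objective is 0.5 + 0.25 = 0.75.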

              Backward pass #

              For the objective

              L(w,b) = ½λ‖w‖² + (1/n) Σ max(0, 1 - y_i (w·x_i + b))

              the gradients are:

              • ∂L/∂w = λ·w − (1/n) Σ_{i ∈ A} y_i · x_i
              • ∂L/∂b = −(1/n) Σ_{i ∈ A} y_i

              where A = { i | 1 − y_i (w·x_i + b) > 0 } is the hinge's active set.

              We also return ∂L/∂X because it is sometimes useful for sensitivity analysis.

              PyTorch analogy: this is what autograd would compute for 0.5*λ*||w||^2 + mean(relu(1 - y*(X@w+b))), except we write it out explicitly.

              Backward/VJP for the linear SVM objective.

              Returns (dw, db, dX) where:

              • dw : ∂L/∂w
              • db : ∂L/∂b
              • dX : ∂L/∂X (sometimes useful for sensitivity analysis)
              Instances For
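The same VJP, sketched in NumPy (names are ours; like the Lean version, it returns the triple (dw, db, dX)):

```python
import numpy as np

def svm_backward(lam, w, b, X, y):
    # Gradients of 0.5*lam*||w||^2 + mean(relu(1 - y*(X @ w + b))).
    n = X.shape[0]
    scores = X @ w + b
    active = (1.0 - y * scores) > 0.0   # active-set mask
    coef = -(active * y) / n            # d(mean hinge)/d(score_i)
    dw = lam * w + X.T @ coef           # lam*w - (1/n) sum_{i in A} y_i x_i
    db = coef.sum()                     # -(1/n) sum_{i in A} y_i
    dX = np.outer(coef, w)             # row i is -(1/n) y_i w when i is active
    return dw, db, dX
```

Only examples inside the margin (the active set) contribute to dw, db, and dX; the regularizer contributes `lam * w` regardless.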

                A Small Training Wrapper (Gradient Descent) #

                The LinearSVM definitions above are enough for "spec math". For demos/tests, it is convenient to package a trained parameter pair together with a simple predictor, so we provide:

                structure SVM (p n : ℕ) (α : Type) :

                Small trained SVM bundle for demos/tests.

                This is not a full SMO-style solver; it is a deterministic gradient-descent baseline that is useful as a reference model in the TorchLean spec layer.

                Instances For

                  Heuristic support-vector index extractor.

                  We mark an example as a "support vector" if its margin is close to 1. This is only meant for introspection and demos (it is not used by the optimizer).

                  Instances For
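One way to sketch that heuristic in NumPy (the function name and the `tol` parameter are ours, for illustration):

```python
import numpy as np

def support_vector_indices(w, b, X, y, tol=1e-3):
    # Mark example i as a "support vector" when its margin
    # y_i * (w . x_i + b) is within tol of 1. Introspection only,
    # as in the spec; the optimizer never consults this.
    margins = y * (X @ w + b)
    return np.flatnonzero(np.abs(margins - 1.0) <= tol)
```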
                    def fitLinearSVM {α : Type} [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] {n p : ℕ} (X : Spec.Tensor α (Spec.Shape.dim n (Spec.Shape.dim p Spec.Shape.scalar))) (y : Spec.Tensor α (Spec.Shape.dim n Spec.Shape.scalar)) (learning_rate C : α) (iterations : ℕ) :
                    SVM p n α

                    Fit a linear SVM by deterministic gradient descent on the primal objective.

                    Parameters:

                    • learning_rate: gradient step size
                    • C: regularization strength (treated as lambda)
                    • iterations: number of GD steps
                    Instances For
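The descent loop itself, as a NumPy sketch of the same iteration (names are ours; as noted above, C plays the role of lambda here):

```python
import numpy as np

def fit_linear_svm(X, y, learning_rate, C, iterations):
    # Deterministic full-batch gradient descent on the primal
    # objective, with C used as the L2 strength (the file's lambda).
    n, p = X.shape
    w = np.zeros(p)
    b = 0.0
    for _ in range(iterations):
        scores = X @ w + b
        active = (1.0 - y * scores) > 0.0
        coef = -(active * y) / n
        w = w - learning_rate * (C * w + X.T @ coef)
        b = b - learning_rate * coef.sum()
    return w, b
```

On linearly separable toy data this recovers a separating hyperplane in a handful of steps.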
                      def fitLinearSVM.gradient_descent {α : Type} [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] {n p : ℕ} (X : Spec.Tensor α (Spec.Shape.dim n (Spec.Shape.dim p Spec.Shape.scalar))) (y : Spec.Tensor α (Spec.Shape.dim n Spec.Shape.scalar)) (learning_rate C : α) (iter : ℕ) (weights : Spec.Tensor α (Spec.Shape.dim p Spec.Shape.scalar)) (bias : α) :
                      Instances For
                        def predict {α : Type} [Context α] [DecidableRel fun (x1 x2 : α) => x1 > x2] {n p : ℕ} (model : SVM p n α) (X : Spec.Tensor α (Spec.Shape.dim n (Spec.Shape.dim p Spec.Shape.scalar))) :

                        Predict signed labels ±1 for a batch X using the learned hyperplane.

                        Instances For
                          def Kernel.linear {α : Type} [Context α] {p : ℕ} (x y : Spec.Tensor α (Spec.Shape.dim p Spec.Shape.scalar)) :
                          α

                          Linear kernel: k(x, y) = x·y.

                          Instances For
                            def Kernel.polynomial {α : Type} [Context α] {p : ℕ} (degree : ℕ) (c : α) (x y : Spec.Tensor α (Spec.Shape.dim p Spec.Shape.scalar)) :
                            α

                            Polynomial kernel: k(x, y) = (x·y + c)^degree (naive power for generic α).

                            Instances For
                              def Kernel.polynomial.pow_rec {α : Type} [Context α] (base : α) (exp : ℕ) :
                              α
                              Instances For
                                def Kernel.rbf {α : Type} [Context α] {p : ℕ} (gamma : α) (x y : Spec.Tensor α (Spec.Shape.dim p Spec.Shape.scalar)) {h : p ≠ 0} :
                                α

                                RBF kernel: k(x, y) = exp(-gamma * ||x - y||^2).

                                Instances For
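All three kernels, sketched in NumPy (function names are ours):

```python
import numpy as np

def k_linear(x, y):
    return x @ y                        # k(x, y) = x . y

def k_poly(degree, c, x, y):
    return (x @ y + c) ** degree        # k(x, y) = (x . y + c)^degree

def k_rbf(gamma, x, y):
    d = x - y
    return np.exp(-gamma * (d @ d))     # k(x, y) = exp(-gamma * ||x - y||^2)
```

Note that `k_rbf(gamma, x, x)` is 1 for any x, since ||x − x||² = 0.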